Hi,
I used TopHat/Cufflinks for a virus transcriptome assembly, running it on the raw reads without any filtering. I have two questions.
(1) Although the reference genome encodes 170 proteins, the transcripts.gtf file contains only 109 transcripts, and they seem to be much longer. Is this a bad assembly, and how can I validate it?
(2) I am interested in several proteins in a specific region of the reference genome. How can I find the abundance of the corresponding transcripts in the genes.fpkm_tracking or isoforms.fpkm_tracking files?
Thank you!
Dawn