I am working with RNA-seq data from an insect under different treatments. I already have the data (Illumina HiSeq paired-end reads) and was planning to analyze it following the protocol from Anders et al. (2013), which I have used before with a model organism. However, the genome of this insect was sequenced only recently, so no GTF file is available yet. The only files available are genome.fasta and CDS.fasta.
So far, I have mapped the reads to the genome using TopHat (without providing a GTF file), which gave me BAM/SAM files. Next, I need to count the reads per gene using HTSeq. However, HTSeq requires a GTF file as input, which I don't have.
How can I count reads per gene without a GTF file? Is it possible to use the genome and the CDS to create a GTF/GFF file for HTSeq?
Any other ideas on how to proceed would be helpful.
Thank you so much!