Old 07-23-2016, 09:55 AM   #44
Simon Anders

If each gene is on its own pseudo-contig, you don't need htseq-count. You just count how often each gene ID appears in the third column of your SAM file, which contains the name of the reference sequence (normally the chromosome) to which the read was mapped. You may want to use a suitable script to skip over multi-mapping reads, though (in the easiest case, grep for all reads with NH:i:1 or something like that).
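A minimal sketch of that counting step in Python, in case it helps. It assumes that your aligner writes the optional NH:i: tag (not every aligner or setting does, so check your SAM file first; if the tag is absent you need another criterion for uniqueness), and the function name is just something I made up for illustration. It tallies the third column (RNAME), skipping header lines, unmapped reads and anything not tagged NH:i:1.

```python
from collections import Counter


def count_reads_per_gene(sam_lines):
    """Count uniquely mapped reads per reference name (SAM column 3).

    Assumption: the aligner emits an NH:i:<n> tag; only reads tagged
    NH:i:1 are counted, mirroring a simple `grep NH:i:1`.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        gene_id = fields[2]
        if flag & 0x4 or gene_id == "*":
            continue  # read is unmapped
        # Optional tags start at column 12; keep only unique hits.
        if "NH:i:1" not in fields[11:]:
            continue
        counts[gene_id] += 1
    return counts
```

You would call it on an open file, e.g. `with open("aligned.sam") as f: counts = count_reads_per_gene(f)` (the file name is a placeholder). Note that with paired-end data this counts each mate separately, so you may want to count only one mate per pair.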

On the other hand, it is not a good idea to use a tool chain in a manner not intended by the developers of the tools unless you know enough about the tools' internals to be sure that this is sound. Bowtie is not meant to map to a reference made up of individual genes rather than complete chromosomes or contigs, and htseq-count is not meant to be used in this manner either.

However, there are now tools designed for your manner of operation, most notably salmon and kallisto. So, the easiest and cleanest solution would be to use one of these two.

Last edited by Simon Anders; 07-23-2016 at 09:57 AM.