Old 10-28-2014, 04:32 AM   #3
Senior Member
Location: USA, Midwest

Join Date: May 2008
Posts: 1,167

What gene are you looking at in IGV? The most likely answer is that you are looking at a duplicated gene, and the hundreds to thousands of reads which map to it also map to the other copies elsewhere in the genome. When a read maps to multiple locations, HTSeq-count will not count it for any of the genes it maps to; instead it will be counted under 'alignment_not_unique'. Your experiment of running HTSeq-count on just one chromosome shows that the additional mappings for these reads fall on other chromosomes. When HTSeq-count is given an incomplete alignment set with only one valid mapping for these reads, it will naturally count them for that one alignment. This demonstrates why it is a bad idea to perform read counting on partial alignment sets unless you are prepared to deal with these types of situations.

If you want to confirm this, pick one of the reads aligned to that gene in IGV and then search for its read name in your accepted_hits.bam to see whether it is multiply mapped.
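
One way to do that search is sketched below. It is only an illustration, not part of HTSeq-count or TopHat: it assumes you feed it SAM-format text lines (for example the output of `samtools view accepted_hits.bam`) and that your aligner writes the optional NH tag (number of reported alignments) on each record. The names `Hit`, `find_alignments` and `is_multiply_mapped`, and the read names in the demo, are made up for this example.

```python
from collections import namedtuple

# One alignment record for a read: mate is 0 (unpaired), 1 or 2.
Hit = namedtuple("Hit", ["chrom", "pos", "mate", "nh"])

def find_alignments(sam_lines, read_name):
    """Collect every mapped alignment of read_name from SAM text lines."""
    hits = []
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11 or fields[0] != read_name:
            continue
        flag = int(fields[1])
        if flag & 0x4:                    # 0x4 = read is unmapped
            continue
        # 0x40 = first mate, 0x80 = second mate; neither bit = unpaired
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        nh = None                         # stays None if no NH tag is present
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
        hits.append(Hit(fields[2], int(fields[3]), mate, nh))
    return hits

def is_multiply_mapped(hits):
    """True if NH > 1 on any record, or any mate has more than one locus."""
    if any(h.nh is not None and h.nh > 1 for h in hits):
        return True
    loci_per_mate = {}
    for h in hits:
        loci_per_mate.setdefault(h.mate, set()).add((h.chrom, h.pos))
    return any(len(loci) > 1 for loci in loci_per_mate.values())

# Tiny made-up demo: read1 aligns to two chromosomes, read2 to one.
demo = [
    "@HD\tVN:1.0",
    "read1\t0\tchr1\t100\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "read1\t256\tchr5\t900\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "read2\t0\tchr1\t200\t50\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
]
for name in ("read1", "read2"):
    hits = find_alignments(demo, name)
    print(name, [(h.chrom, h.pos) for h in hits], is_multiply_mapped(hits))
```

On real data you would pass the lines from `samtools view accepted_hits.bam` in place of `demo` and use a read name copied from IGV. If the read shows up at more than one locus, or carries NH greater than 1, it is a multi-mapper and would land in 'alignment_not_unique'.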
kmcarr