Old 01-27-2014, 11:37 AM   #1
zzhao2
Member
 
Location: Dallas

Join Date: Aug 2012
Posts: 21
htseq-count: reads with letter "N"

Hi,
I've just found that a read starting with the letter "N" in my SAM file was marked as "no_feature" by htseq-count. I hacked that read by changing the "N" to the reference base and modifying the affected fields such as FLAG and MD, and htseq-count then counted the read. My question is: does htseq-count discard reads with "N"s anywhere in the sequence, or only those starting with "N"? In that read, all the other bases matched the reference, so I would rather keep it.
I'm asking because FastQC shows that many of my reads have "N"s in their tails (the 3' end?). Will htseq-count give better counts if I trim those tails?
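
In case it helps anyone check the same thing on their own data, here is a minimal sketch of how the "N"s in a SAM file could be tallied by where they occur in the read (leading, trailing, or internal). It only parses the plain-text SAM columns and says nothing about how htseq-count treats these reads. The function names `n_positions` and `summarize_sam` are made up for illustration and are not part of HTSeq.

```python
from collections import Counter


def n_positions(seq):
    """Return the set of places where 'N' occurs in a read sequence.

    Possible members: 'leading', 'trailing', 'internal'.
    An empty set means the read has no 'N' at all.
    """
    seq = seq.upper()
    places = set()
    if "N" not in seq:
        return places
    if seq.startswith("N"):
        places.add("leading")
    if seq.endswith("N"):
        places.add("trailing")
    # Anything left after stripping N runs from both ends is internal.
    if "N" in seq.strip("N"):
        places.add("internal")
    return places


def summarize_sam(lines):
    """Count reads by 'N' position from an iterable of SAM text lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        seq = fields[9]  # SEQ is the 10th mandatory SAM column
        if seq == "*":
            continue  # sequence not stored in this record
        counts["total"] += 1
        places = n_positions(seq)
        if not places:
            counts["no_N"] += 1
        for place in places:
            counts[place] += 1
    return counts
```

It could be run on a file with something like `summarize_sam(open("aligned.sam"))`, where `aligned.sam` is a placeholder name. A read can fall into more than one category, so the category counts may add up to more than the total.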

Thanks,
Sylvia

Last edited by zzhao2; 01-27-2014 at 11:41 AM.