I am using htseq-count to get gene counts from paired-end Illumina data. The alignment was done with TopHat.
First, I sorted accepted_hits.bam with the default option (sort by position) and ran htseq-count on the result:
samtools sort accepted_hits.bam accepted_hits_test &
samtools view accepted_hits_test.bam | htseq-count --mode=intersection-nonempty - genes.gtf > genecounts1.txt
I got the warning 'Is SAM properly sorted?', but I still got gene counts.
When I instead sorted the BAM file by read name (sort -n) and ran htseq-count with the following commands, I got no warnings except one about the lack of information on Chr M in the GTF file.
samtools sort -n accepted_hits.bam accepted_hits_sortedN &
samtools view accepted_hits_sortedN.bam | htseq-count --mode=intersection-nonempty - genes.gtf > genecounts2.txt
The counts obtained in the two ways differ: the counts in genecounts1.txt are approximately half of those in genecounts2.txt. The counts obtained from Cufflinks match genecounts1.txt (sorted by position).
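To put a number on the difference, a quick check along the following lines could be used. This is only a sketch: it assumes the usual two-column, tab-separated htseq-count output, skips the summary rows at the end of the file (with or without a leading `__`), and the function name `count_ratio` is made up for illustration.

```shell
# Hypothetical helper: ratio of total counts (second file / first file)
# over the genes that appear in both htseq-count outputs.
count_ratio() {
  awk -F'\t' '
    # Skip the htseq-count summary rows; they are not gene counts.
    $1 ~ /^(__)?(no_feature|ambiguous|too_low_aQual|not_aligned|alignment_not_unique)$/ { next }
    # First file: remember the count for each gene.
    NR == FNR { first[$1] = $2; next }
    # Second file: accumulate totals for genes seen in the first file.
    ($1 in first) { sum1 += first[$1]; sum2 += $2 }
    END {
      if (sum1 > 0) printf "%.2f\n", sum2 / sum1
      else print "NA"
    }
  ' "$1" "$2"
}
```

Calling `count_ratio genecounts1.txt genecounts2.txt` prints a single number; a value close to 2 would be consistent with the roughly twofold difference described above.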
Which one should I rely on: genecounts1 or genecounts2?
Thanks,
Sandhya