SEQanswers


ronaldrcutler 12-24-2016 07:00 AM

HTSeq-count: Warning: x reads with missing mate encountered.
 
While running an RNA-Seq pipeline that uses Hisat2 for alignment and HTSeq-count for counting reads in features, I noticed this warning at the bottom of the log file:
Code:

Warning: 284233 reads with missing mate encountered.
Here are the "samtools flagstat" stats for the bam file that produced the HTSeq-count warning:
Code:

76075665 + 0 in total (QC-passed reads + QC-failed reads)
1565341 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
71435955 + 0 mapped (93.90% : N/A)
74510324 + 0 paired in sequencing
37255162 + 0 read1
37255162 + 0 read2
64430312 + 0 properly paired (86.47% : N/A)
67187398 + 0 with itself and mate mapped
2683216 + 0 singletons (3.60% : N/A)
2452092 + 0 with mate mapped to a different chr
2095660 + 0 with mate mapped to a different chr (mapQ>=5)
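
The percentages in this report can be rechecked from the raw counts. Below is a minimal sketch, assuming the flagstat text is pasted into a string; the names `FLAGSTAT` and `parse_flagstat` are made up for illustration and are not part of samtools.

```python
import re

# flagstat report for the Hisat2 bam file, copied from the post above
FLAGSTAT = """\
76075665 + 0 in total (QC-passed reads + QC-failed reads)
1565341 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
71435955 + 0 mapped (93.90% : N/A)
74510324 + 0 paired in sequencing
37255162 + 0 read1
37255162 + 0 read2
64430312 + 0 properly paired (86.47% : N/A)
67187398 + 0 with itself and mate mapped
2683216 + 0 singletons (3.60% : N/A)
2452092 + 0 with mate mapped to a different chr
2095660 + 0 with mate mapped to a different chr (mapQ>=5)
"""


def parse_flagstat(text):
    """Map each flagstat label to its QC-passed count (hypothetical helper)."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"(\d+) \+ (\d+) (.+)", line.strip())
        if not m:
            continue
        # Strip a trailing "(...)" annotation, but keep "(mapQ>=5)" so the
        # two "different chr" lines stay distinct.
        label = re.sub(r"\s*\((?!mapQ).*\)$", "", m.group(3))
        counts[label] = int(m.group(1))
    return counts


counts = parse_flagstat(FLAGSTAT)

# "mapped" is reported relative to all records in the file,
# the pair-based lines relative to "paired in sequencing".
mapped_pct = 100 * counts["mapped"] / counts["in total"]
proper_pct = 100 * counts["properly paired"] / counts["paired in sequencing"]
singleton_pct = 100 * counts["singletons"] / counts["paired in sequencing"]

print(f"mapped: {mapped_pct:.2f}%")
print(f"properly paired: {proper_pct:.2f}%")
print(f"singletons: {singleton_pct:.2f}%")
```

The recomputed values agree with the percentages printed by flagstat, so "mapped" here describes the share of records in this particular file that carry an alignment.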

In my previous RNA-Seq pipeline on the same data, where the only difference was using Tophat2 for alignment, I do not see this warning in the HTSeq-count log files.

Here are the stats for the Tophat2-aligned bam file that came from the same sample as above:
Code:

85046681 + 0 in total (QC-passed reads + QC-failed reads)
18181171 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
85046681 + 0 mapped (100.00% : N/A)
66865510 + 0 paired in sequencing
34237825 + 0 read1
32627685 + 0 read2
16861294 + 0 properly paired (25.22% : N/A)
61704000 + 0 with itself and mate mapped
5161510 + 0 singletons (7.72% : N/A)
4055974 + 0 with mate mapped to a different chr
1899988 + 0 with mate mapped to a different chr (mapQ>=5)
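
The two reports can be put side by side with some simple arithmetic. This is only a sketch: the counts are copied from the two flagstat outputs in this post, and `REPORTS` and `summarize` are hypothetical names chosen for the example.

```python
# Counts copied from the two "samtools flagstat" reports in this post.
REPORTS = {
    "hisat2": {
        "total": 76075665, "secondary": 1565341, "mapped": 71435955,
        "paired": 74510324, "read1": 37255162, "read2": 37255162,
    },
    "tophat2": {
        "total": 85046681, "secondary": 18181171, "mapped": 85046681,
        "paired": 66865510, "read1": 34237825, "read2": 32627685,
    },
}


def summarize(c):
    """Derive a few quantities that flagstat does not print directly."""
    return {
        # records left after removing secondary alignments
        "primary_records": c["total"] - c["secondary"],
        # records in the file that carry no alignment
        "unmapped_records": c["total"] - c["mapped"],
        # imbalance between first and second mates present in the file
        "read1_minus_read2": c["read1"] - c["read2"],
    }


summary = {name: summarize(c) for name, c in REPORTS.items()}
for name, s in summary.items():
    print(name, s)
```

In both files the primary record count equals "paired in sequencing". The Hisat2 file holds 4,639,710 unmapped records and equal numbers of read1 and read2, while the Tophat2 file holds no unmapped records at all and 1,610,140 more read1 than read2 records. That would be consistent with the Tophat2 file containing only aligned records, in which case "100% mapped" describes the file contents and not the mapping rate. This is an inference from the numbers, not something the reports themselves confirm.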


I know this HTSeq-count warning is characteristic of unsorted bam files, as I have run into that problem in the past. However, I confirmed that I still get the warning with name-sorted files and with HTSeq-count set to expect name-sorted input. I can see that the Hisat2 alignment did not have 100% mapping, which may explain the warning. Why do the two aligners differ here? Both were run with default settings.
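
One way to see which read names have only one mate present in the file, independent of sort order, is to scan the SAM text (for example the output of `samtools view -h`). The sketch below is not HTSeq-count's own logic; `names_missing_a_mate` is a hypothetical helper, and it keeps every read name in memory, so it only suits a subsample.

```python
from collections import defaultdict

# SAM FLAG bits as defined in the SAM specification
PAIRED, READ1, READ2 = 0x1, 0x40, 0x80
SECONDARY, SUPPLEMENTARY = 0x100, 0x800


def names_missing_a_mate(sam_lines):
    """Return read names whose primary records lack read1 or read2."""
    seen = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        # Only primary records of paired reads are considered here.
        if not (flag & PAIRED) or (flag & (SECONDARY | SUPPLEMENTARY)):
            continue
        if flag & READ1:
            seen[name].add(1)
        if flag & READ2:
            seen[name].add(2)
    return sorted(n for n, mates in seen.items() if mates != {1, 2})


# Tiny made-up example: r1 has both mates, r2 has only read1 in the file.
example = [
    "@HD\tVN:1.0\tSO:queryname",
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "r1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "r2\t73\tchr1\t500\t60\t50M\t=\t500\t0\t*\t*",
]
missing = names_missing_a_mate(example)
print(missing)
```

If a name-sorted file still yields names from this kind of scan, the mates are absent from the file and not merely out of order.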

Moreover, I am wondering how and why this warning occurs, since as far as I know HTSeq-count needs either paired-end or single-end alignments and cannot deal with both at the same time. Mixing them is normally what produces this error message:
Code:

'pair_alignments' needs a sequence of paired-end alignments
Yet I see singletons in both the Tophat2 and Hisat2 stats.
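
For reference, a "singleton" in flagstat is still a paired-end record: the paired bit is set, the read itself is mapped, and only its mate is flagged as unmapped. That is different from a single-end record, which has no paired bit at all. A small sketch that decodes the FLAG field makes the distinction visible; `decode_flag` and `is_singleton` are hypothetical helper names.

```python
# A subset of the SAM FLAG bits, in ascending order
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
}


def decode_flag(flag):
    """List the names of the bits that are set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def is_singleton(flag):
    """Paired read that is mapped while its mate is unmapped."""
    return bool(flag & 0x1) and not (flag & 0x4) and bool(flag & 0x8)


print(decode_flag(73), is_singleton(73))  # paired read1, mate unmapped
print(decode_flag(99), is_singleton(99))  # paired read1 in a proper pair
print(decode_flag(0), is_singleton(0))    # single-end record, no paired bit
```

So the presence of singletons in both reports does not by itself mean the files mix paired-end and single-end alignments.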

TL;DR: Why isn't there 100% mapping in the Hisat2 output alignments when there is 100% mapping in the Tophat2 output alignments, using default settings for each?

