SEQanswers

12-24-2016, 08:00 AM   #1
ronaldrcutler
Member
 
Location: Virginia

Join Date: May 2016
Posts: 80
HTSeq-count: Warning: x reads with missing mate encountered.

While running an RNA-Seq pipeline that uses HISAT2 for alignment and HTSeq-count for counting reads in features, I noticed this warning at the bottom of the log file:
Code:
Warning: 284233 reads with missing mate encountered.
Here are the "samtools flagstat" stats for the BAM file that produced the HTSeq-count warning:
Code:
76075665 + 0 in total (QC-passed reads + QC-failed reads)
1565341 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
71435955 + 0 mapped (93.90% : N/A)
74510324 + 0 paired in sequencing
37255162 + 0 read1
37255162 + 0 read2
64430312 + 0 properly paired (86.47% : N/A)
67187398 + 0 with itself and mate mapped
2683216 + 0 singletons (3.60% : N/A)
2452092 + 0 with mate mapped to a different chr
2095660 + 0 with mate mapped to a different chr (mapQ>=5)
In my previous RNA-Seq pipeline on the same data, where the only difference was TopHat2 for alignment, I do not see this warning in the HTSeq-count log files.

Here are the stats for the TopHat2-aligned BAM file from the same sample as above:
Code:
85046681 + 0 in total (QC-passed reads + QC-failed reads)
18181171 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
85046681 + 0 mapped (100.00% : N/A)
66865510 + 0 paired in sequencing
34237825 + 0 read1
32627685 + 0 read2
16861294 + 0 properly paired (25.22% : N/A)
61704000 + 0 with itself and mate mapped
5161510 + 0 singletons (7.72% : N/A)
4055974 + 0 with mate mapped to a different chr
1899988 + 0 with mate mapped to a different chr (mapQ>=5)
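
The two flagstat reports above can be cross-checked with a little arithmetic. Below is a minimal sketch, assuming the report text is pasted in as a string and that every secondary and supplementary record is also counted under "mapped". The helper names `parse_flagstat` and `summarize` are hypothetical and are not part of samtools or HTSeq.

```python
import re

# Lines copied from the two "samtools flagstat" reports above (only those needed here).
HISAT2_REPORT = """\
76075665 + 0 in total (QC-passed reads + QC-failed reads)
1565341 + 0 secondary
0 + 0 supplementary
71435955 + 0 mapped (93.90% : N/A)
37255162 + 0 read1
37255162 + 0 read2
"""

TOPHAT2_REPORT = """\
85046681 + 0 in total (QC-passed reads + QC-failed reads)
18181171 + 0 secondary
0 + 0 supplementary
85046681 + 0 mapped (100.00% : N/A)
34237825 + 0 read1
32627685 + 0 read2
"""

TOTAL = "in total (QC-passed reads + QC-failed reads)"


def parse_flagstat(text):
    """Map each flagstat category label to its QC-passed count."""
    counts = {}
    for line in text.strip().splitlines():
        m = re.match(r"\s*(\d+) \+ (\d+) (.+)$", line)
        if not m:
            continue
        # Drop a trailing percentage such as "(93.90% : N/A)" from the label.
        label = re.sub(r"\s*\([^()]*%[^()]*\)\s*$", "", m.group(3))
        counts[label] = int(m.group(1))
    return counts


def summarize(counts):
    """Derive a few figures that flagstat does not print directly."""
    extra = counts["secondary"] + counts["supplementary"]
    return {
        # Records left after removing secondary/supplementary alignments.
        "primary_records": counts[TOTAL] - extra,
        # Records in the file that carry the "unmapped" flag.
        "unmapped_records": counts[TOTAL] - counts["mapped"],
        # Mapped records that are primary alignments (assumption stated above).
        "mapped_primary": counts["mapped"] - extra,
        # A non-zero value means some read1/read2 records have no partner record.
        "read1_minus_read2": counts["read1"] - counts["read2"],
    }


hisat2 = summarize(parse_flagstat(HISAT2_REPORT))
tophat2 = summarize(parse_flagstat(TOPHAT2_REPORT))
print("HISAT2: ", hisat2)
print("TopHat2:", tophat2)
```

With these figures, the HISAT2 file contains 4639710 records flagged as unmapped and equal read1 and read2 counts, while the TopHat2 file contains no unmapped records and 1610140 more read1 than read2 records. That pattern would be consistent with one file retaining unmapped reads and the other omitting them, although the reports alone do not show why.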

I know this HTSeq-count warning is characteristic of unsorted BAM files, as I have run into that problem in the past. However, I confirmed that I still get the warning with name-sorted files when HTSeq-count is told to expect name-sorted input. I can see that the HISAT2 alignment does not show 100% mapping, which may explain the warning. Why do the two outputs differ? Both aligners were run with default settings.

Moreover, I am wondering how and why this warning occurs, since as far as I know HTSeq-count accepts either paired-end or single-end alignments and cannot deal with both at the same time. Mixing the two is characteristic of this error message instead:
Code:
'pair_alignments' needs a sequence of paired-end alignments
Yet both the TopHat2 and HISAT2 stats show singletons.
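
One way to see which records lack a partner is to walk a name-sorted SAM stream and compare read1 and read2 records per read name. The sketch below is a simplified illustration only, not HTSeq-count's actual pairing logic, which may apply further checks. The function name `count_missing_mates` and the example records are hypothetical; the FLAG bit values come from the SAM format.

```python
from itertools import groupby

# FLAG bits defined by the SAM format.
PAIRED = 0x1
READ1 = 0x40
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_missing_mates(sam_lines):
    """Count primary paired records with no partner record under the same name.

    Assumes the input is name-sorted, so all records of a read are adjacent.
    """
    records = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        records.append((fields[0], int(fields[1])))  # (QNAME, FLAG)

    missing = 0
    for _, group in groupby(records, key=lambda rec: rec[0]):
        first = second = 0
        for _, flag in group:
            if not (flag & PAIRED):
                continue  # single-end record, not part of a pair
            if flag & (SECONDARY | SUPPLEMENTARY):
                continue  # only primary alignments are compared here
            if flag & READ1:
                first += 1
            elif flag & READ2:
                second += 1
        missing += abs(first - second)
    return missing


# Hypothetical records: r1 and r3 are complete pairs, r2 has only read1 present.
EXAMPLE_SAM = [
    "@HD\tVN:1.0\tSO:queryname",
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "r1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "r2\t73\tchr1\t500\t60\t50M\t=\t500\t0\t*\t*",
    "r3\t99\tchr2\t100\t1\t50M\t=\t300\t250\t*\t*",
    "r3\t147\tchr2\t300\t1\t50M\t=\t100\t-250\t*\t*",
    "r3\t355\tchr3\t700\t1\t50M\t=\t900\t250\t*\t*",
]

print(count_missing_mates(EXAMPLE_SAM))
```

In the example, r2 is flagged as paired with an unmapped mate (FLAG 73), but the mate's record is absent, so it is the only read counted. The secondary record of r3 (FLAG 355) is ignored.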

TL;DR: Why do the HISAT2 output alignments not show 100% mapping when the TopHat2 output alignments do, with default settings for each?

Last edited by ronaldrcutler; 12-24-2016 at 08:04 AM.

Tags
hisat2, htseq count, samtools, tophat2

All times are GMT -8.

