  • HTSeq-count: Warning: x reads with missing mate encountered.

    I am running an RNA-Seq pipeline that uses HISAT2 for alignment and HTSeq-count for counting reads in features, and I noticed this warning at the bottom of the log file:
    Code:
    Warning: 284233 reads with missing mate encountered.
    Here are the "samtools flagstat" stats for the BAM file that triggered the HTSeq-count warning:
    Code:
    76075665 + 0 in total (QC-passed reads + QC-failed reads)
    1565341 + 0 secondary
    0 + 0 supplementary
    0 + 0 duplicates
    71435955 + 0 mapped (93.90% : N/A)
    74510324 + 0 paired in sequencing
    37255162 + 0 read1
    37255162 + 0 read2
    64430312 + 0 properly paired (86.47% : N/A)
    67187398 + 0 with itself and mate mapped
    2683216 + 0 singletons (3.60% : N/A)
    2452092 + 0 with mate mapped to a different chr
    2095660 + 0 with mate mapped to a different chr (mapQ>=5)
    In the previous RNA-Seq pipeline on the same data, where the only difference was TopHat2 for alignment, I do not see this warning in the HTSeq-count log files.

    Here are the stats for the TopHat2-aligned BAM file from the same sample:
    Code:
    85046681 + 0 in total (QC-passed reads + QC-failed reads)
    18181171 + 0 secondary
    0 + 0 supplementary
    0 + 0 duplicates
    85046681 + 0 mapped (100.00% : N/A)
    66865510 + 0 paired in sequencing
    34237825 + 0 read1
    32627685 + 0 read2
    16861294 + 0 properly paired (25.22% : N/A)
    61704000 + 0 with itself and mate mapped
    5161510 + 0 singletons (7.72% : N/A)
    4055974 + 0 with mate mapped to a different chr
    1899988 + 0 with mate mapped to a different chr (mapQ>=5)
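
To make the two flagstat reports easier to compare, here is a minimal Python sketch that pulls out a few counts and derives some ratios from them. The helper names (`parse_flagstat`, `summarize`) are made up for illustration and are not part of samtools or HTSeq. It only does arithmetic on the numbers pasted above. One possible reading of the result, which the post itself does not confirm, is that the TopHat2 file simply contains no unmapped records, while the HISAT2 file keeps them.

```python
import re

# Match only the flagstat categories needed for the comparison.
WANTED = re.compile(
    r"\s*(\d+) \+ \d+ "
    r"(in total|secondary|mapped|paired in sequencing|read1|read2)\b"
)

def parse_flagstat(text):
    """Return {category: QC-passed count} for the categories in WANTED."""
    counts = {}
    for line in text.splitlines():
        m = WANTED.match(line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts

def summarize(counts):
    """Derive a few comparison figures from parsed flagstat counts."""
    primary = counts["in total"] - counts["secondary"]
    return {
        "primary_records": primary,
        "primary_equals_paired": primary == counts["paired in sequencing"],
        "unmapped_records": counts["in total"] - counts["mapped"],
        "mapped_pct": round(100 * counts["mapped"] / counts["in total"], 2),
        "read1_equals_read2": counts["read1"] == counts["read2"],
    }

# flagstat output copied from the post
HISAT2 = """\
76075665 + 0 in total (QC-passed reads + QC-failed reads)
1565341 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
71435955 + 0 mapped (93.90% : N/A)
74510324 + 0 paired in sequencing
37255162 + 0 read1
37255162 + 0 read2
64430312 + 0 properly paired (86.47% : N/A)
67187398 + 0 with itself and mate mapped
2683216 + 0 singletons (3.60% : N/A)
2452092 + 0 with mate mapped to a different chr
2095660 + 0 with mate mapped to a different chr (mapQ>=5)
"""

TOPHAT2 = """\
85046681 + 0 in total (QC-passed reads + QC-failed reads)
18181171 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
85046681 + 0 mapped (100.00% : N/A)
66865510 + 0 paired in sequencing
34237825 + 0 read1
32627685 + 0 read2
16861294 + 0 properly paired (25.22% : N/A)
61704000 + 0 with itself and mate mapped
5161510 + 0 singletons (7.72% : N/A)
4055974 + 0 with mate mapped to a different chr
1899988 + 0 with mate mapped to a different chr (mapQ>=5)
"""

for name, text in (("HISAT2", HISAT2), ("TopHat2", TOPHAT2)):
    print(name, summarize(parse_flagstat(text)))
```

In both files, total minus secondary equals "paired in sequencing". The HISAT2 file has 4639710 unmapped records and equal read1/read2 counts. The TopHat2 file has zero unmapped records and unequal read1/read2 counts.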

    I know this HTSeq-count warning is characteristic of unsorted BAM files, as I have run into that problem in the past. However, I confirmed that I still get the warning with name-sorted files and with HTSeq-count told to expect name-sorted input. I can see that the HISAT2 alignment did not reach 100% mapping, which may explain the warning. Why do the two differ? Both aligners were run with default settings.

    Moreover, I am wondering how and why this warning occurs. As far as I know, HTSeq-count accepts either paired-end or single-end alignments and cannot handle both at the same time; mixing them is what produces this error message:
    Code:
    'pair_alignments' needs a sequence of paired-end alignments
    Yet both the TopHat2 and HISAT2 stats show singletons.
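
To illustrate what "missing mate" can mean, here is a small Python sketch that walks name-grouped records and counts read names for which only one mate is present. This is not HTSeq-count's actual implementation; `count_missing_mates` and the toy records are hypothetical. Only the FLAG bit values come from the SAM specification. It also shows why an unsorted file inflates the count: mates that are not adjacent are never paired up.

```python
from itertools import groupby

# SAM FLAG bits (from the SAM specification)
PAIRED = 0x1
READ1 = 0x40
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800

def count_missing_mates(records):
    """Count runs of same-named records where only one mate appears.

    records: iterable of (read_name, flag) tuples. Mates are only
    matched when they are adjacent, as in a name-sorted file.
    """
    missing = 0
    for _, group in groupby(records, key=lambda r: r[0]):
        seen1 = seen2 = False
        for _, flag in group:
            # Skip unpaired reads and non-primary alignment lines.
            if not flag & PAIRED or flag & (SECONDARY | SUPPLEMENTARY):
                continue
            seen1 = seen1 or bool(flag & READ1)
            seen2 = seen2 or bool(flag & READ2)
        if seen1 != seen2:
            missing += 1
    return missing

# Toy data, invented for illustration.
name_sorted = [
    ("r1", 99), ("r1", 147),   # both mates mapped
    ("r2", 73),                # read1 only; record for its mate is absent
    ("r3", 89), ("r3", 165),   # singleton, but unmapped mate is still listed
]
not_sorted = [("r1", 99), ("r2", 73), ("r1", 147)]

print(count_missing_mates(name_sorted))  # 1
print(count_missing_mates(not_sorted))   # 3
```

In the toy data a singleton counts as missing its mate only when the record for the unmapped mate is absent from the file (`r2`), not when it is still listed (`r3`).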

    TL;DR: Why isn't there 100% mapping in the HISAT2 output alignments when there is 100% mapping in the TopHat2 output alignments, using default settings for each?
    Last edited by ronaldrcutler; 12-24-2016, 08:04 AM.
