  • HTSeq-count: Warning: x reads with missing mate encountered.

    I am running an RNA-Seq pipeline that uses HISAT2 for alignment and HTSeq-count for counting reads in features, and I noticed this warning at the bottom of the log file:
    Code:
    Warning: 284233 reads with missing mate encountered.
    Here are the "samtools flagstat" stats for the BAM file that produced the HTSeq-count warning:
    Code:
    76075665 + 0 in total (QC-passed reads + QC-failed reads)
    1565341 + 0 secondary
    0 + 0 supplementary
    0 + 0 duplicates
    71435955 + 0 mapped (93.90% : N/A)
    74510324 + 0 paired in sequencing
    37255162 + 0 read1
    37255162 + 0 read2
    64430312 + 0 properly paired (86.47% : N/A)
    67187398 + 0 with itself and mate mapped
    2683216 + 0 singletons (3.60% : N/A)
    2452092 + 0 with mate mapped to a different chr
    2095660 + 0 with mate mapped to a different chr (mapQ>=5)
    In my previous RNA-Seq pipeline on the same data, where the only difference was using TopHat2 for alignment, I do not see this warning in the HTSeq-count log files.

    Here are the "samtools flagstat" stats for the TopHat2-aligned BAM file from the same sample:
    Code:
    85046681 + 0 in total (QC-passed reads + QC-failed reads)
    18181171 + 0 secondary
    0 + 0 supplementary
    0 + 0 duplicates
    85046681 + 0 mapped (100.00% : N/A)
    66865510 + 0 paired in sequencing
    34237825 + 0 read1
    32627685 + 0 read2
    16861294 + 0 properly paired (25.22% : N/A)
    61704000 + 0 with itself and mate mapped
    5161510 + 0 singletons (7.72% : N/A)
    4055974 + 0 with mate mapped to a different chr
    1899988 + 0 with mate mapped to a different chr (mapQ>=5)
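
A quick arithmetic check on the two reports above may help frame the question. The sketch below parses the relevant flagstat lines (copied from the listings above) and derives two numbers per file: records that are neither secondary nor supplementary, and records that are not mapped. The helper names `parse_flagstat` and `summarize` are made up for illustration and are not part of samtools. In both files, "in total" minus "secondary" equals "paired in sequencing", and only the HISAT2 file contains unmapped records. Whether that comes from how each aligner writes its output by default is an assumption worth checking against each tool's documentation.

```python
import re

# Relevant lines copied from the two flagstat listings in this post.
HISAT2_FLAGSTAT = """76075665 + 0 in total (QC-passed reads + QC-failed reads)
1565341 + 0 secondary
0 + 0 supplementary
71435955 + 0 mapped (93.90% : N/A)
74510324 + 0 paired in sequencing"""

TOPHAT2_FLAGSTAT = """85046681 + 0 in total (QC-passed reads + QC-failed reads)
18181171 + 0 secondary
0 + 0 supplementary
85046681 + 0 mapped (100.00% : N/A)
66865510 + 0 paired in sequencing"""


def parse_flagstat(text):
    """Return {label: QC-passed count} for each flagstat line (hypothetical helper)."""
    counts = {}
    for line in text.strip().splitlines():
        match = re.match(r"\s*(\d+) \+ (\d+) (.+)", line)
        if not match:
            continue
        # Drop the trailing parenthesised part, e.g. "(93.90% : N/A)".
        label = re.sub(r"\s*\(.*$", "", match.group(3)).strip()
        counts.setdefault(label, int(match.group(1)))
    return counts


def summarize(counts):
    """Derive primary and unmapped record counts from parsed flagstat values."""
    primary = counts["in total"] - counts["secondary"] - counts["supplementary"]
    unmapped = counts["in total"] - counts["mapped"]
    return {
        "primary_records": primary,
        "unmapped_records": unmapped,
        "paired_in_sequencing": counts["paired in sequencing"],
    }


if __name__ == "__main__":
    for name, text in (("HISAT2", HISAT2_FLAGSTAT), ("TopHat2", TOPHAT2_FLAGSTAT)):
        print(name, summarize(parse_flagstat(text)))
```

Because flagstat counts alignment records rather than reads, the "in total" and "mapped" lines are inflated by secondary alignments, so the two percentages are not directly comparable between files.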

    I know this HTSeq-count warning is characteristic of unsorted BAM files, as I have run into that problem in the past. However, I still get the warning with name-sorted files, and I made sure HTSeq-count was told to expect name-sorted input. I can see that the HISAT2 alignment does not have 100% mapping, which may explain the warning. Why do the two aligners differ here? Both were run with default settings.
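
One way to tell "the mate record is truly absent from the file" apart from other causes is to check the SAM records directly. The sketch below is an independent check and not HTSeq-count's own logic. It reads SAM text lines, skips secondary and supplementary records using the standard SAM flag bits, and reports read names for which only one end is present. The function name `names_missing_a_mate` and the toy records are hypothetical.

```python
from collections import defaultdict

# Standard SAM flag bits.
READ1 = 0x40
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def names_missing_a_mate(sam_lines):
    """Return sorted read names whose primary records cover only one end."""
    ends_seen = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # only look at primary records
        if flag & READ1:
            ends_seen[name].add(1)
        if flag & READ2:
            ends_seen[name].add(2)
    return sorted(name for name, ends in ends_seen.items() if ends != {1, 2})


# Toy records for illustration only (not taken from the real BAM files).
toy_records = [
    "@HD\tVN:1.0\tSO:queryname",
    "readA\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "readA\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "readB\t73\tchr1\t500\t60\t50M\t=\t500\t0\t*\t*",
]

if __name__ == "__main__":
    print(names_missing_a_mate(toy_records))
```

On a real file, the lines could come from the text output of "samtools view -h". If this reports no names, the mates are present at the primary-alignment level and the warning probably has another cause.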

    I am also wondering how and why this warning occurs. As far as I know, HTSeq-count needs either all paired-end or all single-end alignments and cannot handle both at the same time; mixing them normally produces this error message:
    Code:
    'pair_alignments' needs a sequence of paired-end alignments
    Yet both the TopHat2 and HISAT2 stats show singletons.

    TL;DR: Why isn't there 100% mapping in the HISAT2 output alignments when there is 100% mapping in the TopHat2 output alignments, using default settings for each?
    Last edited by ronaldrcutler; 12-24-2016, 08:04 AM.
