  • HTSeq-count: Warning: x reads with missing mate encountered.

    While running an RNA-Seq pipeline that uses Hisat2 for alignment and HTSeq-count for counting reads in features, I noticed this warning at the bottom of the log file:
    Code:
    Warning: 284233 reads with missing mate encountered.
    Here are the "samtools flagstat" stats for the bam file that produced the HTSeq-count warning:
    Code:
    76075665 + 0 in total (QC-passed reads + QC-failed reads)
    1565341 + 0 secondary
    0 + 0 supplementary
    0 + 0 duplicates
    71435955 + 0 mapped (93.90% : N/A)
    74510324 + 0 paired in sequencing
    37255162 + 0 read1
    37255162 + 0 read2
    64430312 + 0 properly paired (86.47% : N/A)
    67187398 + 0 with itself and mate mapped
    2683216 + 0 singletons (3.60% : N/A)
    2452092 + 0 with mate mapped to a different chr
    2095660 + 0 with mate mapped to a different chr (mapQ>=5)
    In the previous RNA-Seq pipeline on the same data, where the only difference was Tophat2 for alignment, I do not see this warning in the HTSeq-count log files.

    Here are the "samtools flagstat" stats for the Tophat2-aligned bam file from the same sample as above:
    Code:
    85046681 + 0 in total (QC-passed reads + QC-failed reads)
    18181171 + 0 secondary
    0 + 0 supplementary
    0 + 0 duplicates
    85046681 + 0 mapped (100.00% : N/A)
    66865510 + 0 paired in sequencing
    34237825 + 0 read1
    32627685 + 0 read2
    16861294 + 0 properly paired (25.22% : N/A)
    61704000 + 0 with itself and mate mapped
    5161510 + 0 singletons (7.72% : N/A)
    4055974 + 0 with mate mapped to a different chr
    1899988 + 0 with mate mapped to a different chr (mapQ>=5)
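
To make the two flagstat outputs easier to compare, here is a minimal Python sketch that parses the relevant lines and derives a few numbers from them. The function names `parse_flagstat` and `summarize` are made up for illustration, and the input strings are copied from the stats above (only the lines the sketch needs). It is not part of samtools or HTSeq.

```python
import re

# Counts copied from the two "samtools flagstat" outputs above.
HISAT2_FLAGSTAT = """
76075665 + 0 in total (QC-passed reads + QC-failed reads)
1565341 + 0 secondary
0 + 0 supplementary
71435955 + 0 mapped (93.90% : N/A)
74510324 + 0 paired in sequencing
37255162 + 0 read1
37255162 + 0 read2
"""

TOPHAT2_FLAGSTAT = """
85046681 + 0 in total (QC-passed reads + QC-failed reads)
18181171 + 0 secondary
0 + 0 supplementary
85046681 + 0 mapped (100.00% : N/A)
66865510 + 0 paired in sequencing
34237825 + 0 read1
32627685 + 0 read2
"""


def parse_flagstat(text):
    """Map each flagstat label to its QC-passed count."""
    counts = {}
    for line in text.strip().splitlines():
        match = re.match(r"\s*(\d+) \+ \d+ (.+)", line)
        if match:
            label = match.group(2).strip()
            # Drop a trailing percentage note such as "(93.90% : N/A)".
            label = re.sub(r"\s*\([\d.]+% : .*\)$", "", label)
            counts[label] = int(match.group(1))
    return counts


def summarize(counts):
    """Derive primary record count, mapped percentage and read1/read2 imbalance."""
    total = counts["in total (QC-passed reads + QC-failed reads)"]
    primary = total - counts["secondary"] - counts["supplementary"]
    return {
        "primary_records": primary,
        "mapped_pct": round(100 * counts["mapped"] / total, 2),
        "mate_imbalance": counts["read1"] - counts["read2"],
    }


if __name__ == "__main__":
    for name, text in [("Hisat2", HISAT2_FLAGSTAT), ("Tophat2", TOPHAT2_FLAGSTAT)]:
        print(name, summarize(parse_flagstat(text)))
```

In both files the primary record count (total minus secondary and supplementary) equals the "paired in sequencing" count. The Hisat2 file has equal read1 and read2 counts, while the Tophat2 file has 1610140 more read1 than read2 records.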

    I know this HTSeq-count warning is characteristic of unsorted bam files, as I have run into that problem in the past. However, I confirmed that I still get the warning with name-sorted files and with HTSeq-count set to expect name-sorted input. I can see that the Hisat2 alignment does not have 100% mapping, which may explain the warning. Why do the two alignments differ? Both aligners were run with default settings.
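
One way to check independently how many read names in a name-sorted file have only one mate present is sketched below. This is an illustrative check on SAM text lines (for example from "samtools view"), not HTSeq-count's actual pairing logic, so its count need not match the warning exactly. The function name `count_missing_mates` is hypothetical. The flag bits are the standard SAM ones.

```python
from itertools import groupby

# Standard SAM flag bits.
FLAG_PAIRED = 0x1
FLAG_READ1 = 0x40
FLAG_READ2 = 0x80
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def count_missing_mates(sam_lines):
    """Count read names whose primary paired records include only one mate.

    Assumes the input is name-sorted, so all records for one read name
    are adjacent. Header lines (starting with "@") are skipped.
    """
    records = (
        line.rstrip("\n").split("\t")
        for line in sam_lines
        if line.strip() and not line.startswith("@")
    )
    missing = 0
    for _qname, group in groupby(records, key=lambda fields: fields[0]):
        mates_seen = set()
        for fields in group:
            flag = int(fields[1])
            # Ignore unpaired reads and non-primary alignments.
            if not flag & FLAG_PAIRED:
                continue
            if flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY):
                continue
            if flag & FLAG_READ1:
                mates_seen.add(1)
            if flag & FLAG_READ2:
                mates_seen.add(2)
        if len(mates_seen) == 1:
            missing += 1
    return missing


if __name__ == "__main__":
    # Toy records: readA has both mates, readB has only read1.
    toy = [
        "@HD\tVN:1.0\tSO:queryname",
        "readA\t99\tchr1\t100\t60\t50M\t=\t200\t150\t*\t*",
        "readA\t147\tchr1\t200\t60\t50M\t=\t100\t-150\t*\t*",
        "readB\t73\tchr1\t300\t60\t50M\t=\t300\t0\t*\t*",
    ]
    print(count_missing_mates(toy))
```

If this reports zero on a file that still triggers the warning, the cause is more likely in how the records are ordered or flagged than in mates being physically absent.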

    Moreover, I am wondering how and why this warning occurs, since as far as I know HTSeq-count needs either paired-end or single-end alignments and cannot deal with both at the same time. Mixing the two normally produces this error message instead:
    Code:
    'pair_alignments' needs a sequence of paired-end alignments
    Yet both the Tophat2 and Hisat2 stats show singletons.

    TL;DR: Why isn't there 100% mapping in the Hisat2 output alignments when there is 100% mapping in the Tophat2 output alignments, using default settings for each?
    Last edited by ronaldrcutler; 12-24-2016, 08:04 AM.
