  • Foolproof ways to remove discordant read pairs

    I am tearing my hair out trying to run an application that requires every read to be properly mapped exactly once, in the correct orientation.

    I have applied various GATK filters to the data during indel realignment including:
    # -rf NotPrimaryAlignment \
    # -rf DuplicateRead \
    # -rf UnmappedRead
    I think a lot of that was unnecessary, because some of those filters are probably applied automatically, but better safe than sorry.

    I have also filtered the data using samtools with the following command:

    samtools view -b -q 30 -f 2 -o out.filtered.bam in.bam
    I have also tried it with the flag written in hexadecimal:

    samtools view -b -q 30 -f 0x02 -o out.filtered.bam in.bam
    I have sorted the output by queryname using Picard, so all read pairs should be adjacent.
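As a quick check on the two samtools commands above, here is a minimal Python sketch of how the FLAG value is interpreted. The bit values come from the SAM specification; the helper names `decode_flag` and `passes_view_filter` are hypothetical, and the second one only approximates the per-record test that `samtools view -f 2 -q 30` applies.

```python
# Names for the SAM FLAG bits (values as defined in the SAM specification).
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def passes_view_filter(flag, mapq, require=0x2, min_mapq=30):
    """Hypothetical helper: approximate the per-record test of
    `samtools view -f <require> -q <min_mapq>`.

    A record is kept when every required bit is set and MAPQ is at
    least the threshold. The mate is never consulted.
    """
    return (flag & require) == require and mapq >= min_mapq


# 2 and 0x02 are the same integer, so both commands select the same records.
print(2 == 0x02)
print(decode_flag(99))   # 99 = 1 + 2 + 32 + 64
print(decode_flag(147))  # 147 = 1 + 2 + 16 + 128
```

Because the test is applied to each record independently, one mate of a pair can pass while the other fails, for example when only one of them falls below the MAPQ threshold.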

    Still, the application (custom script from a separate research group) I am running tells me I have reads that are missing their mate.

    I am going nuts over this. I am at the point where I am removing the problematic reads one at a time, by read name, using Picard FilterSamReads.

    Any insights into what monumentally dumb thing I am doing?

  • #2
    The bitwise flag for properly paired alignments (0x2) is set at the alignment step; that's the flag you want to use. But if you apply any other filter afterwards (such as adapter or quality trimming, or a mapping-quality cutoff), the flag may no longer be correct: each record is tested on its own, so a low-quality mate can be removed while the surviving read still carries 0x2.

    So, you can either filter using only the bitwise flag, or use a tool such as BBMap's repair.sh to re-pair the reads after any other filtering.
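To illustrate the point about orphaned mates, here is a minimal Python sketch that works on plain SAM text. It is not how repair.sh is implemented; it only shows the idea of re-pairing by read name after a per-record filter. The records are made up, and `make_record`, `passes_filter`, and `split_pairs_and_orphans` are hypothetical helpers.

```python
from collections import defaultdict


def make_record(qname, flag, mapq):
    """Build a minimal, made-up SAM record (11 tab-separated fields)."""
    return "\t".join(
        [qname, str(flag), "chr1", "100", str(mapq), "50M", "=", "250", "200", "*", "*"]
    )


def passes_filter(line, require=0x2, min_mapq=30):
    """Per-record test: all required FLAG bits set and MAPQ >= threshold."""
    fields = line.split("\t")
    flag, mapq = int(fields[1]), int(fields[4])
    return (flag & require) == require and mapq >= min_mapq


def split_pairs_and_orphans(sam_lines):
    """Group primary records by read name; return (paired, orphans)."""
    mates = defaultdict(dict)
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & 0x900:  # skip secondary (0x100) and supplementary (0x800)
            continue
        if flag & 0x40:
            mates[qname]["read1"] = line
        elif flag & 0x80:
            mates[qname]["read2"] = line
    paired, orphans = [], []
    for recs in mates.values():
        if "read1" in recs and "read2" in recs:
            paired.extend([recs["read1"], recs["read2"]])
        else:
            orphans.extend(recs.values())
    return paired, orphans


records = [
    make_record("pairA", 99, 60),
    make_record("pairA", 147, 60),
    make_record("pairB", 99, 60),
    make_record("pairB", 147, 5),  # low-MAPQ mate, dropped by the filter
]
kept = [r for r in records if passes_filter(r)]
paired, orphans = split_pairs_and_orphans(kept)
print(len(kept), len(paired), len(orphans))
```

In this toy input both pairB records carry the proper-pair bit, yet only read 1 survives the MAPQ cutoff, so it ends up in `orphans` while pairA stays intact.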



    • #3
      I haven't applied any adapter or quality trimming post-alignment, but it is possible that I have removed low-quality mates.

      I will try filtering out discordant mappings immediately after alignment, before all other filtering steps, and see if that clears it up.



      • #4
        You can also map with BBMap, like this:

        bbmap.sh in1=r1.fq in2=r2.fq ref=ref.fa outm=mapped.bam pairedonly ambig=toss

        That should create a BAM file containing only primary alignments for unambiguously mapped, properly paired reads. However, "properly paired" is subjective; some programs impose unusual restrictions, such as requiring that the reads not overlap. I suggest adapter-trimming first, to prevent situations where the plus read appears to start after the minus read because of adapter sequence.
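The situation described above, where the plus-strand read appears to start after its minus-strand mate, can be spotted from the SAM fields alone. The following is a minimal Python sketch under the assumption that POS and PNEXT are the leftmost mapped positions of the read and its mate, as in the SAM specification; the function name and the example records are hypothetical.

```python
def forward_starts_after_reverse_mate(sam_line):
    """Return True when a forward-strand read starts to the right of its
    reverse-strand mate on the same reference sequence."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
    rnext, pnext = fields[6], int(fields[7])

    is_paired = bool(flag & 0x1)
    both_mapped = not (flag & 0x4) and not (flag & 0x8)
    same_ref = rnext == "=" or rnext == rname
    is_forward = not (flag & 0x10)
    mate_is_reverse = bool(flag & 0x20)

    return (
        is_paired
        and both_mapped
        and same_ref
        and is_forward
        and mate_is_reverse
        and pnext < pos
    )


# Made-up records: flag 99 = paired, proper pair, mate reverse, read 1.
normal = "\t".join(["r1", "99", "chr1", "100", "60", "50M", "=", "250", "200", "*", "*"])
suspect = "\t".join(["r2", "99", "chr1", "105", "60", "50M", "=", "100", "45", "*", "*"])

print(forward_starts_after_reverse_mate(normal))
print(forward_starts_after_reverse_mate(suspect))
```

Whether such pairs count as "properly paired" depends on the aligner and on the downstream tool, so this check is only a way to list candidates for closer inspection.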
