Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • kartong
    Junior Member
    • Mar 2014
    • 4

    BWA-MEM non-primary alignments "false proper pairing"

    I was running BWA-MEM (Version: 0.7.7-r441) on my illumina sequencing data with the "-a" option to report all alignments and noticed something really weird with the non-primary alignments. Some of the non-primary alignments were given the flag "Properly paired" when the orientation of the two reads doesn't seem to be consistent with what you expect of a proper pair for a illumina dataset (reads highlighted in blue).

    Read 1
    HWI-ST513_0151:1:1101:4168:2366#0 99 2 89619552 0 76M = 89619801 287 TTCTGCTGATACCAATTTAAATCGCTGCCAATGTTCTGACTTGCCCGGCAAGTGATGGTGACTCCGTCTCCCACAG H@@@@DDD@DDDDEDDDDDDDDDDDBDDDFHDDHDEEDHDDHHHHGHFEHHHHHHHHHGHHHHHHHGHHHHHGGG? NM:i:5 MD:Z:22A5T4C30T6T4 AS:i:51 XS:i:51
    HWI-ST513_0151:1:1101:4168:2366#0 371 2 89901545 0 76M = 89619801 -281783 * * NM:i:5 MD:Z:4A6A30G4A5T22 AS:i:51
    HWI-ST513_0151:1:1101:4168:2366#0 355 2 89567927 0 76M = 89619801 51912 * * NM:i:8 MD:Z:22A1T3T5C6C3T18T6T4 AS:i:36
    HWI-ST513_0151:1:1101:4168:2366#0 371 2 89953111 0 76M = 89619801 -333349 * * NM:i:8 MD:Z:4A6A18A3G6G5A3A1T22 AS:i:36
    HWI-ST513_0151:1:1101:4168:2366#0 371 2 90458425 0 76M = 89619801 -838663 * * NM:i:9 MD:Z:4A6A22G6G0G4A3A0T0A22 AS:i:31

    Read 2
    HWI-ST513_0151:1:1101:4168:2366#0 147 2 89619801 0 38S38M = 89619552 -287 GAGGAGGGAGACTGGGTCATCAGGATGTCACATCTGGCACCTCGGAGCCAGAGTAGCAGGAGCCCCAGGAGCTGAG E;;??2F?DECEBEE@ACA?@.@;==@>A<A:>A<B:A7DD?<GED>FFDFFHHHCHFHHHHHHHFGHHHHHHHHH NM:i:0 MD:Z:38 AS:i:38 XS:i:38 SA:Z:2,89399608,-,42M34S,0,2;
    HWI-ST513_0151:1:1101:4168:2366#0 387 2 89901334 0 38M38H = 89619552 -281783 * * NM:i:0 MD:Z:38 AS:i:38
    HWI-ST513_0151:1:1101:4168:2366#0 387 2 114163992 0 38M38H = 89619552 -24544441 * * NM:i:1 MD:Z:22G15 AS:i:33
    HWI-ST513_0151:1:1101:4168:2366#0 2195 2 89399608 0 42M34H = 89619552 219904 GAGGAGGGAGACTGGGTCATCAGGATGTCACATCTGGCACCT E;;??2F?DECEBEE@ACA?@.@;==@>A<A:>A<B:A7DD? NM:i:2 MD:Z:5T15T20 AS:i:32 XS:i:32 SA:Z:2,89619801,-,38S38M,0,0;


    Specifically, if you look at the second alignment of each read, you'll notice that they have a flag of 371 and 387 indicating that the two reads align on the same strand. However, for an illumina paired end dataset, the expectation is for a proper pair to run in the following manner:
    -------> <-------


    (FYI: I have not noticed such discrepancies with the primary alignments though.)

    Thus, is this a case where BWA-MEM only checks for the mate-pair distance and not the orientation of the reads for the non-primary alignments? (i.e. we cannot trust the "properly-paired" flag for the non-primary alignments??)
  • kmcarr
    Senior Member
    • May 2008
    • 1181

    #2
    Originally posted by kartong View Post
    I was running BWA-MEM (Version: 0.7.7-r441) on my illumina sequencing data with the "-a" option to report all alignments and noticed something really weird with the non-primary alignments. Some of the non-primary alignments were given the flag "Properly paired" when the orientation of the two reads doesn't seem to be consistent with what you expect of a proper pair for a illumina dataset (reads highlighted in blue).

    Read 1
    HWI-ST513_0151:1:1101:4168:2366#0 99 2 89619552 0 76M = 89619801 287 TTCTGCTGATACCAATTTAAATCGCTGCCAATGTTCTGACTTGCCCGGCAAGTGATGGTGACTCCGTCTCCCACAG H@@@@DDD@DDDDEDDDDDDDDDDDBDDDFHDDHDEEDHDDHHHHGHFEHHHHHHHHHGHHHHHHHGHHHHHGGG? NM:i:5 MD:Z:22A5T4C30T6T4 AS:i:51 XS:i:51
    HWI-ST513_0151:1:1101:4168:2366#0 371 2 89901545 0 76M = 89619801 -281783 * * NM:i:5 MD:Z:4A6A30G4A5T22 AS:i:51
    HWI-ST513_0151:1:1101:4168:2366#0 355 2 89567927 0 76M = 89619801 51912 * * NM:i:8 MD:Z:22A1T3T5C6C3T18T6T4 AS:i:36
    HWI-ST513_0151:1:1101:4168:2366#0 371 2 89953111 0 76M = 89619801 -333349 * * NM:i:8 MD:Z:4A6A18A3G6G5A3A1T22 AS:i:36
    HWI-ST513_0151:1:1101:4168:2366#0 371 2 90458425 0 76M = 89619801 -838663 * * NM:i:9 MD:Z:4A6A22G6G0G4A3A0T0A22 AS:i:31

    Read 2
    HWI-ST513_0151:1:1101:4168:2366#0 147 2 89619801 0 38S38M = 89619552 -287 GAGGAGGGAGACTGGGTCATCAGGATGTCACATCTGGCACCTCGGAGCCAGAGTAGCAGGAGCCCCAGGAGCTGAG E;;??2F?DECEBEE@ACA?@.@;==@>A<A:>A<B:A7DD?<GED>FFDFFHHHCHFHHHHHHHFGHHHHHHHHH NM:i:0 MD:Z:38 AS:i:38 XS:i:38 SA:Z:2,89399608,-,42M34S,0,2;
    HWI-ST513_0151:1:1101:4168:2366#0 387 2 89901334 0 38M38H = 89619552 -281783 * * NM:i:0 MD:Z:38 AS:i:38
    HWI-ST513_0151:1:1101:4168:2366#0 387 2 114163992 0 38M38H = 89619552 -24544441 * * NM:i:1 MD:Z:22G15 AS:i:33
    HWI-ST513_0151:1:1101:4168:2366#0 2195 2 89399608 0 42M34H = 89619552 219904 GAGGAGGGAGACTGGGTCATCAGGATGTCACATCTGGCACCT E;;??2F?DECEBEE@ACA?@.@;==@>A<A:>A<B:A7DD? NM:i:2 MD:Z:5T15T20 AS:i:32 XS:i:32 SA:Z:2,89619801,-,38S38M,0,0;


    Specifically, if you look at the second alignment of each read, you'll notice that they have a flag of 371 and 387 indicating that the two reads align on the same strand. However, for an illumina paired end dataset, the expectation is for a proper pair to run in the following manner:
    -------> <-------


    (FYI: I have not noticed such discrepancies with the primary alignments though.)

    Thus, is this a case where BWA-MEM only checks for the mate-pair distance and not the orientation of the reads for the non-primary alignments? (i.e. we cannot trust the "properly-paired" flag for the non-primary alignments??)
    The order of secondary alignments reported is not meant to imply pairing. You have to look at the position of the aligned read and the position of the mate alignment to determine what it is "pairing with". See that all of the various read 1 alignments are being "paired" with the primary alignment of read 2 at position of 89619801, and all of the read 2 alignments are being "paired" with the primary alignment of read 1 at position 89619552.

    Comment

    • kartong
      Junior Member
      • Mar 2014
      • 4

      #3
      Yes, I do understand that the order of the secondary alignments do not imply pairing.

      But if you were to look at this secondary alignment, you'll notice that it has a flag of 371 (read mapped in proper pair + read reverse strand + mate reverse strand) which does not make sense at all for a "proper pair"
      HWI-ST513_0151:1:1101:4168:2366#0 371 2 89901545 0 76M = 89619801 -281783 * * NM:i:5 MD:Z:4A6A30G4A5T22 AS:i:51

      You're right to say that this secondary is paired to the primary alignment. However, if you were to look at the mate alignment "= 89619801", it has a flag of 147 which indicates that it is mapped to the reverse strand as per the secondary alignment I listed above. Thus, this still does not quite make sense.

      Comment

      • WhatsOEver
        Senior Member
        • Apr 2012
        • 215

        #4
        In my bwa version (0.7.8) the manual says for the "a" parameter:

        -a output all alignments for SE or unpaired PE
        As the parameter is not to be used for paired end data, I think some errors are likely to occur

        Comment

        • kartong
          Junior Member
          • Mar 2014
          • 4

          #5
          I've noticed the description when I was using the "-a" parameter. What you said, makes sense. If it was meant to be an "unpaired alignment", I just don't get why the "paired properly" flag was raised. It is probably an error like what you said.

          Comment

          • WhatsOEver
            Senior Member
            • Apr 2012
            • 215

            #6
            Well, it would be necessary to look into the code to find the problem that is causing the error. But I think the developers were aware of it (as "real" PE reads are excluded in the description) and either didn't find a good solution or didn't think it would be required?! (I'm just speculating, of course )

            Comment

            Latest Articles

            Collapse

            ad_right_rmr

            Collapse

            News

            Collapse

            Topics Statistics Last Post
            Started by SEQadmin2, Today, 11:58 AM
            0 responses
            9 views
            0 reactions
            Last Post SEQadmin2  
            Started by SEQadmin2, 06-05-2026, 10:09 AM
            0 responses
            25 views
            0 reactions
            Last Post SEQadmin2  
            Started by SEQadmin2, 06-04-2026, 08:59 AM
            0 responses
            35 views
            0 reactions
            Last Post SEQadmin2  
            Started by SEQadmin2, 06-02-2026, 12:03 PM
            0 responses
            56 views
            0 reactions
            Last Post SEQadmin2  
            Working...