Seqanswers Leaderboard Ad

**kmcarr** · 08-15-2014, 07:37 AM

Originally posted by kartong View Post

I was running BWA-MEM (Version: 0.7.7-r441) on my illumina sequencing data with the "-a" option to report all alignments and noticed something really weird with the non-primary alignments. Some of the non-primary alignments were given the flag "Properly paired" when the orientation of the two reads doesn't seem to be consistent with what you expect of a proper pair for a illumina dataset (reads highlighted in blue).

Read 1
HWI-ST513_0151:1:1101:4168:2366#0 99 2 89619552 0 76M = 89619801 287 TTCTGCTGATACCAATTTAAATCGCTGCCAATGTTCTGACTTGCCCGGCAAGTGATGGTGACTCCGTCTCCCACAG H@@@@DDD@DDDDEDDDDDDDDDDDBDDDFHDDHDEEDHDDHHHHGHFEHHHHHHHHHGHHHHHHHGHHHHHGGG? NM:i:5 MD:Z:22A5T4C30T6T4 AS:i:51 XS:i:51
HWI-ST513_0151:1:1101:4168:2366#0 371 2 89901545 0 76M = 89619801 -281783 * * NM:i:5 MD:Z:4A6A30G4A5T22 AS:i:51
HWI-ST513_0151:1:1101:4168:2366#0 355 2 89567927 0 76M = 89619801 51912 * * NM:i:8 MD:Z:22A1T3T5C6C3T18T6T4 AS:i:36
HWI-ST513_0151:1:1101:4168:2366#0 371 2 89953111 0 76M = 89619801 -333349 * * NM:i:8 MD:Z:4A6A18A3G6G5A3A1T22 AS:i:36
HWI-ST513_0151:1:1101:4168:2366#0 371 2 90458425 0 76M = 89619801 -838663 * * NM:i:9 MD:Z:4A6A22G6G0G4A3A0T0A22 AS:i:31

Read 2
HWI-ST513_0151:1:1101:4168:2366#0 147 2 89619801 0 38S38M = 89619552 -287 GAGGAGGGAGACTGGGTCATCAGGATGTCACATCTGGCACCTCGGAGCCAGAGTAGCAGGAGCCCCAGGAGCTGAG E;;??2F?DECEBEE@ACA?@.@;==@>A<A:>A<B:A7DD?<GED>FFDFFHHHCHFHHHHHHHFGHHHHHHHHH NM:i:0 MD:Z:38 AS:i:38 XS:i:38 SA:Z:2,89399608,-,42M34S,0,2;
HWI-ST513_0151:1:1101:4168:2366#0 387 2 89901334 0 38M38H = 89619552 -281783 * * NM:i:0 MD:Z:38 AS:i:38
HWI-ST513_0151:1:1101:4168:2366#0 387 2 114163992 0 38M38H = 89619552 -24544441 * * NM:i:1 MD:Z:22G15 AS:i:33
HWI-ST513_0151:1:1101:4168:2366#0 2195 2 89399608 0 42M34H = 89619552 219904 GAGGAGGGAGACTGGGTCATCAGGATGTCACATCTGGCACCT E;;??2F?DECEBEE@ACA?@.@;==@>A<A:>A<B:A7DD? NM:i:2 MD:Z:5T15T20 AS:i:32 XS:i:32 SA:Z:2,89619801,-,38S38M,0,0;

Specifically, if you look at the second alignment of each read, you'll notice that they have a flag of 371 and 387 indicating that the two reads align on the same strand. However, for an illumina paired end dataset, the expectation is for a proper pair to run in the following manner:
-------> <-------

(FYI: I have not noticed such discrepancies with the primary alignments though.)

Thus, is this a case where BWA-MEM only checks for the mate-pair distance and not the orientation of the reads for the non-primary alignments? (i.e. we cannot trust the "properly-paired" flag for the non-primary alignments??)

The order of secondary alignments reported is not meant to imply pairing. You have to look at the position of the aligned read and the position of the mate alignment to determine what it is "pairing with". See that all of the various read 1 alignments are being "paired" with the primary alignment of read 2 at position of 89619801, and all of the read 2 alignments are being "paired" with the primary alignment of read 1 at position 89619552.

**kartong** · 08-15-2014, 03:44 PM

Yes, I do understand that the order of the secondary alignments do not imply pairing.

But if you were to look at this secondary alignment, you'll notice that it has a flag of 371 (read mapped in proper pair + read reverse strand + mate reverse strand) which does not make sense at all for a "proper pair"
HWI-ST513_0151:1:1101:4168:2366#0 371 2 89901545 0 76M = 89619801 -281783 * * NM:i:5 MD:Z:4A6A30G4A5T22 AS:i:51

You're right to say that this secondary is paired to the primary alignment. However, if you were to look at the mate alignment "= 89619801", it has a flag of 147 which indicates that it is mapped to the reverse strand as per the secondary alignment I listed above. Thus, this still does not quite make sense.

**WhatsOEver** · 08-18-2014, 06:08 AM

In my bwa version (0.7.8) the manual says for the "a" parameter:

-a output all alignments for SE or unpaired PE

As the parameter is not to be used for paired end data, I think some errors are likely to occur

**kartong** · 08-18-2014, 06:39 PM

I've noticed the description when I was using the "-a" parameter. What you said, makes sense. If it was meant to be an "unpaired alignment", I just don't get why the "paired properly" flag was raised. It is probably an error like what you said.

**WhatsOEver** · 08-18-2014, 11:29 PM

Well, it would be necessary to look into the code to find the problem that is causing the error. But I think the developers were aware of it (as "real" PE reads are excluded in the description) and either didn't find a good solution or didn't think it would be required?! (I'm just speculating, of course

)

Topics	Statistics	Last Post
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 49 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 50 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 43 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM
Evolutionary Dynamics of Centromeres: A Comparative Genomic Analysis by seqadmin Started by seqadmin, 04-04-2024, 09:00 AM	0 responses 55 views 0 likes	Last Post by seqadmin 04-04-2024, 09:00 AM

Seqanswers Leaderboard Ad

Announcement

BWA-MEM non-primary alignments "false proper pairing"

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News