I have been using BWA to map paired-end Illumina reads recently, and I was hoping to get your help with a few questions.
1. What exactly does the term "proper pair" mean in bwa? Does bwa consider the orientation of the mapped pairs? Does it mean only case i below, or is it based only on the insert size?
case i) -----> <------
case ii) -----> ------->
case iii) <------ <------
case iv) <------ -------->
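To pin down the terminology for the four cases, here is a minimal sketch that labels a pair from the two mapping positions and strands. The function name and the labels are made up for illustration; it says nothing about which cases bwa actually accepts as a proper pair, which is the open question.

```python
def classify_orientation(pos1, reverse1, pos2, reverse2):
    """Label a mapped pair by strand, reading from the leftmost read.

    posN     : 1-based leftmost mapping position of read N
    reverseN : True if read N maps to the reverse strand
    """
    # Order the two reads by position so the label reads left to right.
    (_, left_rev), (_, right_rev) = sorted([(pos1, reverse1), (pos2, reverse2)])
    if not left_rev and right_rev:
        return "case i (FR, inward)"
    if not left_rev and not right_rev:
        return "case ii (FF)"
    if left_rev and right_rev:
        return "case iii (RR)"
    return "case iv (RF, outward)"


# Hypothetical coordinates, only to exercise each branch.
print(classify_orientation(100, False, 300, True))   # case i
print(classify_orientation(100, False, 300, False))  # case ii
print(classify_orientation(100, True, 300, True))    # case iii
print(classify_orientation(100, True, 300, False))   # case iv
```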
2. I have a lot of mapping results that look quite odd to me.
These records look as though bwa sampe paired them, but they have different read names, so they can't be a pair.
I321_1_FC30VWBAAXX:7:100:1611:994 83 gi|150002608|ref|NC_009614.1| 1915637 29 75M = 1859241 -56471 TGGCAAATTCCAATTGGGGCTTTTCAATGAATGTTTTTACTTTAAAGAATTCTACTTGTTTTTCTTCCTCAATCT AAALIENKEHOLOKJD>=HXOIUNbda_Sh`hLhah]h]haZhhhhShhhhhhhhhhhhhhhhhhhhhhhhhhhh XT:A:U NM:i:1 SM:i:29 AM:i:29 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:53C21
I321_1_FC30VWBAAXX:7:100:1615:784 163 gi|150002608|ref|NC_009614.1| 1859241 29 20M55S = 1915637 56471 GCTTGTTGTGACTTTGATAGATTGTGACGTGTACGAAAATATGCAAGAGGCGGGGATTGATTCGTCTAGCCCGTT hhhhhhhhhhhehhhhhhhhhhhhSheMhRh`hNhWXhWhehhP][\Sh^Ehh^hNW^[PK_<NJJNGN>AGRCH XT:A:M NM:i:2 SM:i:29 AM:i:29 XM:i:2 XO:i:0 XG:i:0 MD:Z:5C9T4
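To make the oddity concrete, here is a minimal sketch that checks whether two SAM records point at each other through their mate fields (RNEXT/PNEXT) while carrying different read names. The helper names are my own, and the two records are the ones above truncated to their first nine columns.

```python
from collections import namedtuple

# Only the SAM columns needed for the check (columns 1-4 and 7-9).
SamCore = namedtuple("SamCore", "qname flag rname pos rnext pnext tlen")


def parse_core(line):
    """Parse the first nine whitespace-separated SAM columns."""
    f = line.split()
    return SamCore(f[0], int(f[1]), f[2], int(f[3]), f[6], int(f[7]), int(f[8]))


def mates_by_position(a, b):
    """True if each record's mate reference/position matches the other record."""
    same_ref = (
        a.rname == b.rname
        and a.rnext in ("=", b.rname)
        and b.rnext in ("=", a.rname)
    )
    return same_ref and a.pnext == b.pos and b.pnext == a.pos


rec_a = parse_core(
    "I321_1_FC30VWBAAXX:7:100:1611:994 83 gi|150002608|ref|NC_009614.1| "
    "1915637 29 75M = 1859241 -56471"
)
rec_b = parse_core(
    "I321_1_FC30VWBAAXX:7:100:1615:784 163 gi|150002608|ref|NC_009614.1| "
    "1859241 29 20M55S = 1915637 56471"
)

print(mates_by_position(rec_a, rec_b))  # True: the mate fields reference each other
print(rec_a.qname == rec_b.qname)       # False: yet the read names differ
```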
3. When I grep for the read name "I321_1_FC30VWBAAXX:7:100:1611:994", I get:
I321_1_FC30VWBAAXX:7:100:1611:994 129 gi|150002608|ref|NC_009614.1| 1915576 37 75M = 3763132 1847556 TTGATATTCCATAAGAATATTCCTGAGTTCCAATAGAATTCTCCACTTTCTACGAATACTTTGGCAAATTCCAAT hhhhhhhhhhhhhhhhhhhhhedhhhhhh`hh]hhhcZhcXhZR`ThhhPhhU]RQ`YJSX^PPLNPMWSLCIHU XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:75
I321_1_FC30VWBAAXX:7:100:1611:994 83 gi|150002608|ref|NC_009614.1| 1915637 29 75M = 1859241 -56471 TGGCAAATTCCAATTGGGGCTTTTCAATGAATGTTTTTACTTTAAAGAATTCTACTTGTTTTTCTTCCTCAATCT AAALIENKEHOLOKJD>=HXOIUNbda_Sh`hLhah]h]haZhhhhShhhhhhhhhhhhhhhhhhhhhhhhhhhh XT:A:U NM:i:1 SM:i:29 AM:i:29 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:53C21
As I read them, the flag of read 1 (129 --> 0b10000001) says that read 1 is mapped in a proper pair, and the flag of read 2 (83 --> 0b1010011) says the same. But these two records are not paired with each other (I can tell because their inferred insert sizes differ).
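To double-check the flags, here is a minimal sketch that decodes a FLAG value into the bits defined in the SAM specification. The short names in the table are my own shorthand, not official identifiers. Decoded this way, 83 and 163 carry the 0x2 proper-pair bit, while 129 carries only 0x1 and 0x80 (paired, second in pair), so the reading of 129 above may need revisiting.

```python
# FLAG bits from the SAM specification; names are informal shorthand.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


for flag in (129, 83, 163):
    print(flag, bin(flag), decode_flag(flag))
# 129 0b10000001 ['paired', 'second_in_pair']
# 83 0b1010011 ['paired', 'proper_pair', 'reverse', 'first_in_pair']
# 163 0b10100011 ['paired', 'proper_pair', 'mate_reverse', 'second_in_pair']
```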
Can anyone help answer these questions? I am trying to keep only the pairs that are mapped properly. I tried samtools view with the -f 2 option (since 0x02 is the proper-pair bit), but I still get many pairs like the ones described above.
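For reference, -f 2 keeps every record whose FLAG has the 0x2 bit set, whatever the insert size. Here is a minimal sketch of the same test in plain Python, with an optional extra cut on the absolute inferred insert size (TLEN). The function name and the max_abs_tlen parameter are hypothetical, the 1000 bp cutoff is an arbitrary value for illustration, and the records are the three distinct ones above truncated to their first nine columns.

```python
def keep_proper_pairs(sam_lines, max_abs_tlen=None):
    """Keep SAM records with the 0x2 bit set; optionally also cap |TLEN|."""
    kept = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split()
        flag, tlen = int(fields[1]), int(fields[8])
        if not flag & 0x2:
            continue  # same test as `samtools view -f 2`
        if max_abs_tlen is not None and abs(tlen) > max_abs_tlen:
            continue  # extra, stricter filter (an assumption, not bwa's rule)
        kept.append(line)
    return kept


records = [
    "I321_1_FC30VWBAAXX:7:100:1611:994 129 gi|150002608|ref|NC_009614.1| "
    "1915576 37 75M = 3763132 1847556",
    "I321_1_FC30VWBAAXX:7:100:1611:994 83 gi|150002608|ref|NC_009614.1| "
    "1915637 29 75M = 1859241 -56471",
    "I321_1_FC30VWBAAXX:7:100:1615:784 163 gi|150002608|ref|NC_009614.1| "
    "1859241 29 20M55S = 1915637 56471",
]

flag_only = keep_proper_pairs(records)
with_cap = keep_proper_pairs(records, max_abs_tlen=1000)
print(len(flag_only))  # 2: the records with flags 83 and 163 pass the bit test
print(len(with_cap))   # 0: both have |TLEN| = 56471, above the example cutoff
```

This only shows why the flag test alone lets such records through; it does not explain why bwa sampe set the bit on them in the first place.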
Anyone? Thanks!