Hello,
I am using Bowtie2 for paired-end alignments and I noticed a strange behavior: the mapping quality scores differ depending on whether the alignment is done in paired-end mode or with the two mates mapped independently. The differences appear with ambiguous sequences. For example:
- in paired-end mode:
HISEQ:3291TN8ACXX:6:2306:8678:87560 65 ecoliK12 19558 44 7S84M = 3581916 0 CGACAAACCGGCGGGACAGCGAATCTGCAATACGGATTATTCTGCGCTTTTTAGTCCAGCGGTGCGTTAATCGGCAGCTCCCCCAGAGATA FFFBFEGID8?FFB'5?D;=6>B@BBDBBABB:?@898A9BD>9>0&75?A3?B4>@B99)5908<5?AA<<<-5?>@########### AS:i:156 XN:i:0 XM:i:3 XO:i:0 XG:i:0 NM:i:3 MD:Z:7T70A2T2 YT:Z:UP
HISEQ:3291TN8ACXX:6:2306:8678:87560 129 ecoliK12 3581916 44 90M1S = 19558 0 CACGTATTCGGTGAACGCACTATGGCGACGCTGGGGCGTCTTATGAGCCTGCTGTCACCCTTTGACGTGGTGATAAGGATGATGGATGGCC F><<+A?C?+;E;CB?<11?D*:?DB:067@';/;''3;99?>@::>>A55<CC>::4:<1>4>C@######################### AS:i:172 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:75T6C7 YT:Z:UP
- with alignment of the two mates one after the other:
bowtie2-2.1.0/bowtie2 -x /data/genomes_bact/ecoliK12 -p6 --sam-no-hd --sam-no-sq --quiet --local --very-sensitive-local -c CGACAAACCGGCGGGACAGCGAATCTGCAATACGGATTATTCTGCGCTTTTTAGTCCAGCGGTGCGTTAATCGGCAGCTCCCCCAGAGATA
0 0 ecoliK12 19558 42 7S78M6S * 0 0 CGACAAACCGGCGGGACAGCGAATCTGCAATACGGATTATTCTGCGCTTTTTAGTCCAGCGGTGCGTTAATCGGCAGCTCCCCCAGAGATA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:148 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:7T70 YT:Z:UU
> bowtie2-2.1.0/bowtie2 -x /data/genomes_bact/ecoliK12 -p6 --sam-no-hd --sam-no-sq --quiet --local --very-sensitive-local -c CACGTATTCGGTGAACGCACTATGGCGACGCTGGGGCGTCTTATGAGCCTGCTGTCACCCTTTGACGTGGTGATAAGGATGATGGATGGCC
0 0 ecoliK12 3581916 1 90M1S * 0 0 CACGTATTCGGTGAACGCACTATGGCGACGCTGGGGCGTCTTATGAGCCTGCTGTCACCCTTTGACGTGGTGATAAGGATGATGGATGGCC IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:164 XS:i:164 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:75T6C7 YT:Z:UU
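To put the two modes side by side, here is a minimal Python sketch that pulls MAPQ, AS and XS out of SAM records like the ones above. The function name `parse_sam_record` is only an illustrative helper, not part of Bowtie2, and the SEQ and QUAL columns are replaced by `*` to keep the lines short. It assumes whitespace-separated fields, as in the pasted records.

```python
def parse_sam_record(line):
    """Return FLAG, POS, MAPQ, CIGAR and all integer optional tags of one SAM line."""
    fields = line.split()
    record = {
        "flag": int(fields[1]),
        "pos": int(fields[3]),
        "mapq": int(fields[4]),
        "cigar": fields[5],
    }
    # Optional tags start after the 11 mandatory SAM columns.
    for tag in fields[11:]:
        name, typ, value = tag.split(":", 2)
        if typ == "i":
            record[name] = int(value)
    return record


# Second mate from the post; SEQ and QUAL shortened to "*" (assumption for brevity).
paired_mate2 = ("HISEQ:3291TN8ACXX:6:2306:8678:87560 129 ecoliK12 3581916 44 90M1S = 19558 0 * * "
                "AS:i:172 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:75T6C7 YT:Z:UP")
single_mate2 = ("0 0 ecoliK12 3581916 1 90M1S * 0 0 * * "
                "AS:i:164 XS:i:164 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:75T6C7 YT:Z:UU")

for label, line in (("paired", paired_mate2), ("single", single_mate2)):
    rec = parse_sam_record(line)
    # XS is absent when no second-best alignment was reported.
    print(label, "MAPQ:", rec["mapq"], "AS:", rec["AS"], "XS:", rec.get("XS"))
```

For the second mate this shows MAPQ 44 with no XS tag in paired-end mode, against MAPQ 1 with XS equal to AS (164) when it is aligned alone, that is, a second alignment with the same score was found only in the single-read run.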
This is quite disconcerting to me, since I thought the mapping quality was calculated independently of the mate. I know that in paired-end mode Bowtie2 tries to find the mates near each other, but if it cannot, I would expect it to report each mate at its independent position with its true mapping quality score. What do you think?