#1 |
Member
Location: Ljubljana Join Date: Jun 2010
Posts: 11
I am trying to find out how bowtie's MAPQ field is computed.

From the bowtie manual:

"If an alignment is non-repetitive (according to -m, --strata and other options) set the MAPQ (mapping quality) field to this value. See the SAM Spec for details about the MAPQ field. Default: 255."

If I map data with, for example, -m 10, I am allowing up to 10 multiple hits per read. How is the reported SAM MAPQ number connected with the number of multiple hits of an alignment?

If a read has a single hit (from the manual), mapq = 255.
If a read has 5 multiple hits, mapq = ?

The SAM documentation doesn't tell me anything more (only that it is phred-scaled):

"MAPping Quality (phred-scaled posterior probability that the mapping position of this read is incorrect)"

In the end I would like to do something like this:

samtools view -q 100 result.sam
(filter out results with mapq < 100)

Thanks for any help,
Gregor
#2 |
Senior Member
Location: Mexico Join Date: Mar 2011
Posts: 137
Tophat/bowtie don’t report mapping quality values that are as meaningful as BWA's, but there is some information in the mapping quality values tophat reports. Tophat yields 5 distinct values for its mapping quality (you can do a “unique” count on the mapping quality field of any SAM file from tophat to verify this):

255 = unique mapping
3 = maps to 2 locations in the target
2 = maps to 3 locations
1 = maps to 4-9 locations
0 = maps to 10 or more locations

Except for the 255 case, the simple rule that was encoded by the authors is the usual phred quality scale:

MapQ = -10 log10(P)

where P = probability that this mapping is NOT the correct one.

The authors ignore the number of mismatches in this calculation and simply assume that if a read maps to 2 locations then P = 0.5, 3 locations implies P = 2/3, 4 locations => P = 3/4, etc. From this, MapQ = -10 log10(0.5) = 3.01 (rounds to 3); -10 log10(2/3) = 1.76 (rounds to 2); -10 log10(3/4) = 1.25 (rounds to 1), etc. At 10 locations, P = 9/10 and -10 log10(9/10) = 0.46, which rounds to 0.
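For anyone who wants to reproduce the numbers above, here is a minimal Python sketch of the rule as described in this post. The function name `mapq_from_hits` and the `unique_value` parameter are made up for illustration; this is not taken from the tophat or bowtie source code, it only applies MapQ = -10 log10(P) with P = (n - 1) / n for a read that maps to n locations.

```python
import math


def mapq_from_hits(n_hits: int, unique_value: int = 255) -> int:
    """Return the MAPQ implied by the rule described in this post.

    Hypothetical helper: assumes P(wrong) = (n_hits - 1) / n_hits for
    multi-mapped reads and a fixed value (255) for unique reads.
    """
    if n_hits < 1:
        raise ValueError("n_hits must be at least 1")
    if n_hits == 1:
        # Unique mapping: the phred formula is not used, a fixed value is reported.
        return unique_value
    p_wrong = (n_hits - 1) / n_hits
    return round(-10 * math.log10(p_wrong))


# Print the implied MAPQ for a few hit counts.
for n in (1, 2, 3, 4, 9, 10):
    print(n, mapq_from_hits(n))
```

Under these assumptions the loop gives 255, 3, 2, 1, 1 and 0 for 1, 2, 3, 4, 9 and 10 locations, which matches the list above.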
#3 |
Senior Member
Location: Mexico Join Date: Mar 2011
Posts: 137
I think it is safe to say that bowtie does the same.
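One way to check this on your own bowtie output is the "unique" count on the mapping quality field mentioned earlier in the thread. Below is a rough Python sketch; the function name `count_mapq` and the file name `result.sam` in the comment are placeholders, and it only assumes the standard SAM layout where header lines start with "@" and MAPQ is the fifth tab-separated column.

```python
from collections import Counter


def count_mapq(sam_lines):
    """Tally the MAPQ column (5th field) over an iterable of SAM lines."""
    counts = Counter()
    for line in sam_lines:
        # Skip header lines and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete alignment record
        counts[int(fields[4])] += 1
    return counts


# Small in-memory example; for a real file one could instead do:
#   with open("result.sam") as handle:
#       print(count_mapq(handle))
example = [
    "@HD\tVN:1.0\n",
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r3\t0\tchr2\t300\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
]
example_counts = count_mapq(example)
print(sorted(example_counts.items()))
```

If the tally on bowtie output shows the same small set of values as tophat, that supports the statement above; if not, the rule may differ between the two tools or between versions.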