SEQanswers

Old 11-11-2010, 11:55 PM   #1
rgregor
Member
 
Location: Ljubljana

Join Date: Jun 2010
Posts: 11
bowtie SAM MAPQ field

I am trying to find out how bowtie computes the MAPQ field.

From the bowtie manual:

If an alignment is non-repetitive (according to -m, --strata and other options) set the MAPQ (mapping quality) field to this value. See the SAM Spec for details about the MAPQ field. Default: 255.

If I map data with, for example, -m 10, I am allowing up to 10 hits per read. How is the reported SAM MAPQ value related to the number of hits for a read? According to the manual, a read with a single hit gets MAPQ = 255. If a read has 5 hits, MAPQ = ?

The SAM documentation doesn't tell me anything more (only that the value is Phred-scaled):

MAPping Quality (phred-scaled posterior probability that the mapping position of this read is incorrect)

In the end I would like to do something like this:

samtools view -q 100 result.sam (filter out alignments with MAPQ < 100)

Thanks for any help,
Gregor
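
For readers who want the same threshold filter without samtools, here is a minimal Python sketch. It assumes a plain-text, tab-separated SAM file with MAPQ in the fifth column, as the SAM specification defines it. The function name `filter_by_mapq` and the example records are hypothetical. Header lines are passed through, which corresponds to adding `-h` to `samtools view`.

```python
def filter_by_mapq(sam_lines, min_mapq):
    """Yield header lines plus alignment lines whose MAPQ is >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines carry no MAPQ; keep them unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # SAM column 5 (index 4) is MAPQ.
        if int(fields[4]) >= min_mapq:
            yield line


# Made-up records for illustration only.
example = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\n",
]
kept = list(filter_by_mapq(example, 100))
```

With a threshold of 100, only the header and the read with MAPQ 255 survive in this example.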
Old 12-19-2012, 04:01 PM   #2
carmeyeii
Senior Member
 
Location: Mexico

Join Date: Mar 2011
Posts: 137

TopHat/bowtie don't report mapping quality values as meaningful as BWA's, but the values TopHat reports do carry some information. TopHat yields 5 distinct mapping quality values (you can verify this by counting the unique values in the MAPQ field of any SAM file from TopHat):



255 = unique mapping
3 = maps to 2 locations in the target
2 = maps to 3 locations
1 = maps to 4-9 locations
0 = maps to 10 or more locations
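
The "unique count" check mentioned above could be done along these lines. This is only a sketch, assuming a tab-separated SAM file with MAPQ in the fifth column. The function name `count_mapq` and the sample records are made up for illustration.

```python
from collections import Counter


def count_mapq(sam_lines):
    """Return a Counter mapping each MAPQ value to its number of alignments."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        counts[int(fields[4])] += 1  # column 5 (index 4) is MAPQ
    return counts


# Made-up records for illustration only.
sample = [
    "@SQ\tSN:chr1\tLN:1000\n",
    "r1\t0\tchr1\t10\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t20\t3\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r3\t0\tchr1\t30\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
]
mapq_counts = count_mapq(sample)
distinct_values = sorted(mapq_counts)
```

On a real file you would pass an open file handle instead of the list, e.g. `count_mapq(open("result.sam"))`.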



Except for the 255 case, the authors encoded a simple rule based on the usual Phred quality scale:

MapQ = -10 log10(P)

where P is the probability that this mapping is NOT the correct one. The authors ignore the number of mismatches in this calculation and simply assume that a read mapping to n locations has P = (n - 1)/n: 2 locations gives P = 1/2, 3 locations gives P = 2/3, 4 locations gives P = 3/4, and so on.

Rounding to the nearest integer then gives: -10 log10(1/2) = 3.01 (rounds to 3); -10 log10(2/3) = 1.76 (rounds to 2); -10 log10(3/4) = 1.25 (rounds to 1). With 10 or more locations P is at least 0.9, so -10 log10(P) is at most 0.46 (rounds to 0).
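
The rule as described in this post can be written out as a short Python sketch. It only reproduces the arithmetic above and has not been checked against the TopHat or bowtie source code. The function name `mapq_from_hits` and the `unique_value` parameter are hypothetical.

```python
import math


def mapq_from_hits(n_hits, unique_value=255):
    """MAPQ for a read with n_hits equally likely mapping locations.

    Assumes P(wrong) = (n_hits - 1) / n_hits and ignores mismatches,
    as described in the post; unique reads get a fixed value instead.
    """
    if n_hits < 1:
        raise ValueError("n_hits must be at least 1")
    if n_hits == 1:
        return unique_value  # special case, not derived from the formula
    p_wrong = (n_hits - 1) / n_hits
    return round(-10 * math.log10(p_wrong))


table = {n: mapq_from_hits(n) for n in (1, 2, 3, 4, 9, 10)}
# table == {1: 255, 2: 3, 3: 2, 4: 1, 9: 1, 10: 0}
```

The cutoff between 1 and 0 falls between 9 and 10 locations because -10 log10(8/9) is about 0.51 while -10 log10(9/10) is about 0.46.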
Old 12-19-2012, 04:04 PM   #3
carmeyeii
Senior Member
 
Location: Mexico

Join Date: Mar 2011
Posts: 137

I think it is safe to say that bowtie does the same.