SEQanswers

07-23-2009, 08:42 AM   #1
griffon42
Interpretation of columns in MAQ SNP output?

Hi all-

Apologies if this information has been posted elsewhere - I wasn't able to find what I was looking for searching the forums.

I'm using MAQ cns2SNP (and SNPfilter) to call SNPs on short read alignments, mouse genome. Though things have been going well, I'm having a hard time interpreting some of the output.

The MAQ manual states, for cns2SNP output:
"Each line consists of chromosome, position, reference base, consensus base, Phred-like consensus quality, read depth, the average number of hits of reads covering this position, the highest mapping quality of the reads covering the position, the minimum consensus quality in the 3bp flanking regions at each side of the site (6bp in total), the second best call, log likelihood ratio of the second best and the third best call, and the third best call. "

I'm having trouble interpreting "the average number of hits of reads covering this position." Can anyone translate this into something more well-defined?
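
For reference, here is a minimal sketch of how the twelve columns listed in the manual excerpt above could be mapped to named fields in Python. The field names, the numeric types, the assumption that columns are whitespace-separated, and the example line are all illustrative guesses, not taken from the MAQ documentation or from real data.

```python
from typing import NamedTuple


class SnpRecord(NamedTuple):
    """One line of SNP output; field names are illustrative, not MAQ's."""
    chrom: str
    pos: int
    ref_base: str
    cons_base: str
    cons_qual: int        # Phred-like consensus quality
    depth: int            # read depth
    avg_hits: float       # average number of hits of reads covering the site
    max_map_qual: int     # highest mapping quality of the covering reads
    min_flank_qual: int   # minimum consensus quality in the 3 bp flanks
    second_call: str      # second best call
    llr: float            # log likelihood ratio, second vs. third best call
    third_call: str       # third best call


def parse_snp_line(line: str) -> SnpRecord:
    """Split one output line into the twelve fields described in the manual."""
    fields = line.split()
    if len(fields) != 12:
        raise ValueError(f"expected 12 columns, got {len(fields)}")
    return SnpRecord(
        fields[0], int(fields[1]), fields[2], fields[3],
        int(fields[4]), int(fields[5]), float(fields[6]),
        int(fields[7]), int(fields[8]),
        fields[9], float(fields[10]), fields[11],
    )


# Made-up example line, not real MAQ output.
record = parse_snp_line("chr1\t100\tA\tG\t48\t12\t1.00\t63\t45\tR\t20\tA")
print(record.avg_hits, record.cons_qual)
```

With names attached, the column in question here is simply `avg_hits`, the seventh field.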

Also, for "Phred-like consensus quality" - though I realize this is not a true Phred score, what is the best way to think about this? Overall consensus base quality at a given position? "Confidence score" for calling a SNP?
In general, using SNPfilter and some arbitrary filters (discussed on the forums), I've been ignoring called SNPs with a "Phred-like" score of <40. I'd like to know what this actually means and whether it makes good sense.

Thanks for your help!
07-27-2009, 05:57 PM   #2
der_eiskern
re: Griffon interpreting SNPfilter outputs

Quote:
Originally Posted by griffon42 View Post
Hi all-

Apologies if this information has been posted elsewhere - I wasn't able to find what I was looking for searching the forums.

I'm using MAQ cns2SNP (and SNPfilter) to call SNPs on short read alignments, mouse genome. Though things have been going well, I'm having a hard time interpreting some of the output.

The MAQ manual states, for cns2SNP output:
"Each line consists of chromosome, position, reference base, consensus base, Phred-like consensus quality, read depth, the average number of hits of reads covering this position, the highest mapping quality of the reads covering the position, the minimum consensus quality in the 3bp flanking regions at each side of the site (6bp in total), the second best call, log likelihood ratio of the second best and the third best call, and the third best call. "

I'm having trouble interpreting "the average number of hits of reads covering this position." Can anyone translate this into something more well-defined?
...
In general, using SNPfilter and some arbitrary filters (discussed on the forums), I've been ignoring called SNPs with a "Phred-like" score of <40. I'd like to know what this actually means and whether it makes good sense.
What I can gather is that the "average number of hits of reads covering this position" refers to the repetitiveness of the reads mapped to that locus. My impression is that many reads will map to more than one genomic position. I consider anything greater than ~1.1 repetitive and ignore it (if I can). This was suggested by Shen et al.; according to them, 1.1 "represents a conservative cutoff to avoid repeats and alleviate the mapping issues with shorter...reads."
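
A rough sketch of the two filters mentioned in this thread, applied to one line of SNP output: drop sites whose average hit count is above ~1.1, and drop sites whose Phred-like consensus quality is below 40. The function name, the default thresholds as parameters, the zero-based column positions (quality in column 5, average hits in column 7, following the order in the manual excerpt), and the example lines are assumptions for illustration only.

```python
def keep_snp(line: str, max_avg_hits: float = 1.1, min_cons_qual: int = 40) -> bool:
    """Return True if a SNP line passes both illustrative filters."""
    fields = line.split()
    cons_qual = int(fields[4])    # column 5: Phred-like consensus quality
    avg_hits = float(fields[6])   # column 7: average number of hits
    return avg_hits <= max_avg_hits and cons_qual >= min_cons_qual


# Made-up example lines, not real MAQ output.
lines = [
    "chr1\t100\tA\tG\t48\t12\t1.00\t63\t45\tR\t20\tA",  # unique, high quality
    "chr1\t200\tC\tT\t55\t30\t1.50\t63\t40\tY\t15\tC",  # looks repetitive
    "chr1\t300\tG\tA\t30\t9\t1.00\t63\t28\tR\t10\tG",   # low consensus quality
]
kept = [line for line in lines if keep_snp(line)]
print(len(kept))
```

Both cutoffs are rules of thumb, so they are exposed as parameters rather than hard-coded.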

As for Phred-like quality scores, I believe Heng Li explains what they mean in section 4.5 of the first MAQ paper: http://genome.cshlp.org/content/18/11/1851.full.
Briefly (mathematics omitted):
Quote:
Before consensus calling, MAQ first combines mapping quality and base quality. If a read is incorrectly mapped, any sequence differences inferred from the read cannot be reliable. Therefore, the base quality used in SNP calling cannot exceed the mapping quality of the read. MAQ reassigns the quality of each base as the smaller value between the read mapping quality and the raw sequencing base quality.
Hope this helps.
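
To make the quoted passage concrete, here is a small sketch of the capping step it describes (each base quality is replaced by the smaller of the base quality and the read's mapping quality), together with the standard Phred relation Q = -10 log10(P). The function names are made up, and treating MAQ's "Phred-like" consensus quality as a true Phred score is only an approximation.

```python
def effective_base_quality(base_qual: int, map_qual: int) -> int:
    """Cap the base quality at the read's mapping quality, as in the quote."""
    return min(base_qual, map_qual)


def phred_to_error_prob(q: float) -> float:
    """Standard Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


# A confidently sequenced base on a poorly mapped read is only as
# trustworthy as the mapping itself.
print(effective_base_quality(base_qual=35, map_qual=20))

# If the consensus quality behaved like a true Phred score, a cutoff of 40
# would correspond to an error probability of about 1 in 10,000.
print(phred_to_error_prob(40))
```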
07-30-2009, 06:28 PM   #3
griffon42

Thanks. That makes things considerably clearer.
09-13-2010, 02:36 AM   #4
mixter
Allele counts from MAQ snp output

Hello,

I have an additional question, after having read this thread and the documentation.

What I need are the direct read counts for the reference and variant alleles, as VarScan reports, for example, but the MAQ output only gives "read depth".

Is there any way I can get or derive these 2 counts (reference and variant) from MAQ SNP output?

Thanks!