SEQanswers

Old 04-14-2011, 03:56 AM   #1
quicksand21
Junior Member
 
Location: San Francisco, CA

Join Date: May 2010
Posts: 6
Complete Genomics Variant Calls

Hello,

I was wondering if anyone could shed some light on the totalScore column in the VAR files produced by Complete Genomics. Specifically, what do these scores mean? Is there a best practice for thresholding to get high-confidence variants?

Thank you in advance for your advice!
Old 04-22-2011, 05:13 AM   #2
jason.laramie
Junior Member
 
Location: Boston

Join Date: Feb 2011
Posts: 3

Hi,
The totalScore comes from a likelihood ratio test between the most likely hypothesis (e.g. genotype) and the next most likely one, and we express this score in decibels (dB). Bioinformaticists will recognize dB as the basis of the Phred scale: 10 dB means the likelihood ratio is 10:1, 20 dB means 100:1, 30 dB means 1000:1, etc. The variant scores factor in quantity of evidence (read depth), quality of evidence (base call quality values), and mapping probabilities. The score therefore measures our confidence in calling the variant.

We also produce a "refScore" value that is calculated in a similar fashion, but with the numerator of the likelihood ratio set to homozygous reference. In short, the refScore tells you how confident we are that the position is homozygous reference (high scores = high confidence), and if the position is not homozygous reference, the totalScore tells you how confident we are in the genotype we called.
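To make the dB scale concrete, here is a minimal Python sketch of the conversion described above. The function names are made up for illustration and are not part of any Complete Genomics tool; the only thing taken from the post is the relationship 10 dB = 10:1, 20 dB = 100:1, 30 dB = 1000:1.

```python
import math


def db_to_likelihood_ratio(score_db: float) -> float:
    """Convert a score in decibels to a likelihood ratio (10 dB -> 10:1)."""
    return 10 ** (score_db / 10)


def likelihood_ratio_to_db(ratio: float) -> float:
    """Convert a likelihood ratio back to decibels (100:1 -> 20 dB)."""
    return 10 * math.log10(ratio)


if __name__ == "__main__":
    for score in (10, 20, 30):
        print(f"{score} dB -> {db_to_likelihood_ratio(score):.0f}:1")
```

Keep in mind that this is only the ratio between the top two hypotheses, not an error probability.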

Scores for variants are not calibrated on an absolute scale to error rate. A score of 30 dB does not necessarily mean that P(error) = 0.001.

20 dB is presently the minimum score for calling a homozygous variant and 40dB is for a heterozygous variant. Based on empirical testing, these thresholds were chosen to balance call rate and accuracy. We also add another layer of calls in our assembly process: the "no-call". A call can therefore be homozygous ref, something else, or no-call. A no-call results when one hypothesis is not well separated (>20dB) from the other hypotheses, so we cannot be sure what the correct answer is.
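As a rough illustration of those two minimums, here is a small Python sketch. It is not how the Complete Genomics assembler decides calls internally; the function name, the "hom"/"het" labels, and the choice to treat the minimum as inclusive (score >= threshold) are all assumptions made for the example, and it does not model no-calls.

```python
# Minimum totalScore values quoted in the post above, in dB.
MIN_HOM_DB = 20.0  # homozygous variant
MIN_HET_DB = 40.0  # heterozygous variant


def meets_minimum_score(total_score_db: float, zygosity: str) -> bool:
    """Return True if the score reaches the quoted minimum for its zygosity.

    `zygosity` must be "hom" or "het" (hypothetical labels for this sketch).
    The comparison is inclusive, which is an assumption.
    """
    thresholds = {"hom": MIN_HOM_DB, "het": MIN_HET_DB}
    if zygosity not in thresholds:
        raise ValueError(f"unknown zygosity: {zygosity!r}")
    return total_score_db >= thresholds[zygosity]


if __name__ == "__main__":
    print(meets_minimum_score(25.0, "hom"))
    print(meets_minimum_score(25.0, "het"))
```

With these numbers, a 25 dB score would be enough for a homozygous variant but not for a heterozygous one.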

As for best practices: since we have already applied the thresholds mentioned above and generated "no-calls" where the hypotheses are not well separated, most of our customers take the genotype calls "as is" without applying another filter.

Jason Laramie, PhD
Principal Field Application Scientist
Complete Genomics, Inc
Old 10-11-2011, 07:21 AM   #3
karenliu
Junior Member
 
Location: Seattle, WA

Join Date: Oct 2011
Posts: 1

Hi Jason,

A follow-up question to your answer. You said:
"20 dB is presently the minimum score for calling a homozygous variant and 40dB is for a heterozygous variant"
I see that each allele at a diploid locus is called separately. For example, I can have a genotype AN or GN or NN. In other words, no-calls are determined on a per-allele basis. If this is the case, what do "homozygous variant" and "heterozygous variant" mean in your definition above?

Thanks.
Karen Liu
Tags
cgi, variant analysis
