  • Complete Genomics Variant Calls

    Hello,

    I was wondering if anyone could shed some light on the totalScore column in the VAR files produced by Complete Genomics. Specifically, what do these scores mean? Is there a best practice for thresholding to obtain high-confidence variants?

    Thank you in advance for your advice!

  • #2
    Hi,
    The totalScore is a likelihood ratio test between the most likely hypothesis (e.g. genotype) and the next most likely one, expressed in decibels (dB). Bioinformaticians will recognize dB as the basis of the Phred scale: 10 dB means the likelihood ratio is 10:1, 20 dB means 100:1, 30 dB means 1000:1, and so on. The variant scores factor in quantity of evidence (read depth), quality of evidence (base call quality values), and mapping probabilities, so the score measures our confidence in calling the variant. Likewise, we produce a "refScore" value that is calculated in a similar fashion but with the numerator of the likelihood ratio set to the homozygous reference hypothesis. In practice, the refScore tells you how confident we are that the position is homozygous reference (high scores = high confidence); if the position is not homozygous reference, the totalScore tells you how confident we are in the genotype we called.
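
As a rough illustration of the dB scale described above, the conversion between a score and a likelihood ratio can be sketched as follows. This is a minimal sketch, not Complete Genomics code; the function names are hypothetical and chosen only for this example.

```python
import math


def db_to_likelihood_ratio(score_db: float) -> float:
    """Convert a score in decibels to a likelihood ratio (10 dB -> 10:1)."""
    return 10 ** (score_db / 10)


def likelihood_ratio_to_db(ratio: float) -> float:
    """Convert a likelihood ratio back to decibels (100:1 -> 20 dB)."""
    if ratio <= 0:
        raise ValueError("likelihood ratio must be positive")
    return 10 * math.log10(ratio)


if __name__ == "__main__":
    # The three example scores quoted in the post.
    for score in (10, 20, 30):
        print(f"{score} dB -> {db_to_likelihood_ratio(score):g}:1")
```

Each additional 10 dB multiplies the likelihood ratio by 10, which is why the scale reads like Phred quality values.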

    Scores for variants are not calibrated on an absolute scale to error rate. A score of 30 dB does not necessarily indicate that the P(error)=0.001.
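
For reference, the conventional Phred reading that this caveat warns against is P(error) = 10^(-Q/10). The sketch below shows that naive conversion only to make explicit what the variant scores should not be assumed to mean; the function name is hypothetical.

```python
def naive_phred_error(score_db: float) -> float:
    """Naive Phred-style reading of a score: P(error) = 10^(-Q/10).

    Per the post, variant scores are NOT calibrated to error rate,
    so this value is not a reliable error probability for them.
    """
    return 10 ** (-score_db / 10)


if __name__ == "__main__":
    # A 30 dB score would naively read as P(error) = 0.001,
    # which the post says does not necessarily hold.
    print(naive_phred_error(30))
```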

    20 dB is presently the minimum score for calling a homozygous variant and 40 dB is the minimum for a heterozygous variant. These thresholds were chosen, based on empirical testing, to balance call rate and accuracy. We also add another kind of call in our assembly process, the "no-call", so a call can be homozygous reference, something else, or no-call. A no-call results when one hypothesis is not well separated from the other hypotheses (>20 dB), so we cannot be sure what the correct answer is.
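
The minimum scores quoted above can be expressed as a simple check. This is a sketch under stated assumptions: the zygosity labels, the dictionary, and the function name are hypothetical and are not fields of the VAR file format, and the thresholds are the ones given in this post as current at the time of writing.

```python
# Minimum scores (dB) quoted in the post; described there as "presently" in use.
MIN_SCORE_DB = {
    "homozygous": 20,
    "heterozygous": 40,
}


def passes_minimum(zygosity: str, total_score_db: float) -> bool:
    """Return True if the score meets the quoted minimum for this zygosity.

    `zygosity` is a hypothetical label ("homozygous" or "heterozygous"),
    not a value read from a VAR file.
    """
    try:
        threshold = MIN_SCORE_DB[zygosity]
    except KeyError:
        raise ValueError(f"unknown zygosity: {zygosity!r}") from None
    return total_score_db >= threshold


if __name__ == "__main__":
    print(passes_minimum("homozygous", 25))    # meets the 20 dB minimum
    print(passes_minimum("heterozygous", 25))  # below the 40 dB minimum
```

Since the thresholds are already applied during assembly, a check like this is only useful for reasoning about the numbers, or as a starting point if a stricter cutoff is wanted.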

    As for best practices, since we have thresholded these as mentioned above and generated "no-calls" when the information is not well separated for each hypothesis, most of our customers take the genotype calls "as is" without applying another filter.

    Jason Laramie, PhD
    Principal Field Application Scientist
    Complete Genomics, Inc

    • #3
      Hi Jason,

      A follow-up question to your answer. You said:
      "20 dB is presently the minimum score for calling a homozygous variant and 40dB is for a heterozygous variant"
      I see that each allele at a diploid locus is called separately. For example, I can have a genotype AN, GN, or NN. That is, no-calls are determined on a per-allele basis. If this is the case, what do homozygous and heterozygous variant mean in your definition above?

      Thanks.
      Karen Liu
