  • Varscan frequency question

    I've been trying out VarScan for indel calling recently, but I have an issue with the output, or rather with the information in the output. The field named Freq doesn't seem to agree with the different Reads fields in my output. Here are a couple of examples (I've formatted the output so that it is a bit more reader-friendly):

    I'd have thought the frequency would be Reads2/(Reads1+Reads2), i.e. the reads supporting the variant over all the reads. This is not the case for most of my entries. They are all in the right ballpark, but few are spot on (I've included an entry that seems correct, the last one).

    For the first row of my example: 13/14 = 0.9286 != 0.8667
    However, 13/15 does give 0.8667.

    Are the reads wrong? Are the frequencies wrong? Are they calculated with different views on what counts as a supporting read? Something is going on that I can't figure out, and searching the forums hasn't turned up an answer.

    I'm using VarScan v2.3.2, mpileup2indel with a few parameters:
    --min-var-freq 0.001
    --min-avg-qual 30
    --min-reads2 10
    --strand-filter 1
    --p-value 0.9

    I'm sure there's a simple answer. Does anyone have it?
    Cheers
    //Adrian
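The arithmetic in the question above can be checked with a minimal Python sketch. The function names are hypothetical, and Reads1 = 1 is inferred from the 13/14 figure quoted in the post rather than taken from the (missing) output table. It compares the expected Reads2/(Reads1+Reads2) value with the reported Freq and works out which total depth would reproduce the reported value.

```python
def naive_freq(reads1, reads2):
    """Frequency if only the Reads1 and Reads2 columns are counted."""
    return reads2 / (reads1 + reads2)


def implied_depth(reads2, reported_freq):
    """Total read count that would reproduce the reported Freq value."""
    return round(reads2 / reported_freq)


# First example row from the post (Reads1 = 1 is inferred from 13/14).
reads1, reads2, reported = 1, 13, 0.8667

print(round(naive_freq(reads1, reads2), 4))  # 0.9286, not the reported 0.8667
print(implied_depth(reads2, reported))       # 15, one more than Reads1 + Reads2
```

An implied depth larger than Reads1 + Reads2 suggests that the denominator includes reads not shown in those two columns.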

  • #2
    Hi Adrian,
    I'm also new to VarScan. I wonder if these figures are affected by the flag:
    "--min-avg-qual Minimum base quality at a position to count a read [15]"
    The Freq of 94.12% given in your second example can be derived from 32/34, so perhaps there are 34 reads over that position, but only 33 that have base qualities of 15 or above. You could eyeball the pileup and see if this is true, or try using the VarScan readcounts tool with parameters:
    --min-coverage 0
    --min-base-qual 0

    I'd be interested to see how you get on.
    Cheers,
    Graham



    • #3
      Thank you Graham, for your input!

      I ran readcounts with parameters:
      --min-coverage 0
      --min-base-qual 30 (since this is what I ran mpileup2indel with)

      And I think I found the answer to my question.

      The entries with "deviating" frequencies had additional variants, in the example cases one read each. Adding this read to the total gives the same frequency as printed in the output file (13/15 and 32/34). Some of my positions had several additional variants in the pileup file, making the total even more "wrong" when added up from the mpileup2indel output file alone. Now that I understand it, there is no problem anymore, and I can trust these values a bit more.

      So in the end the answer was quite simple, I guess.

      Thanks again
      //Adrian


      Edit: I may have broken the forum boundaries with my huge table...
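The explanation in the post above can be sketched in Python as follows. This only illustrates Adrian's observation that reads supporting other variants are added to the denominator; it is not a confirmed description of VarScan's internals. The function and parameter names (`freq_with_other_alleles`, `other_reads`) are hypothetical, and Reads1 = 1 is inferred from the totals quoted in the thread.

```python
def freq_with_other_alleles(reads1, reads2, other_reads=0):
    """Variant frequency when reads for other variants count toward depth."""
    depth = reads1 + reads2 + other_reads
    return reads2 / depth


def as_percent(freq):
    """Format a fraction as a percentage with two decimals."""
    return f"{freq * 100:.2f}%"


# Both example positions had one read supporting an additional variant.
print(as_percent(freq_with_other_alleles(1, 13, other_reads=1)))  # 86.67%
print(as_percent(freq_with_other_alleles(1, 32, other_reads=1)))  # 94.12%

# With no additional variants the result reduces to Reads2/(Reads1+Reads2).
print(as_percent(freq_with_other_alleles(1, 13)))  # 92.86%
```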



      • #4
        Hello Adrian and Graham,

        Thank you for bringing this up and looking into the issue. Adrian, would you mind sending the raw pileup for the two positions that you mentioned?

        Feel free to use VarScan's support forum if you have other issues or questions.

        Note that correctly counting reads (supporting or refuting) for indels is difficult using the first-pass alignments in a BAM file. Optimally, you would use VarScan to discover the indels, and then use realignment (GATK) or indel haplotype remapping (DINDEL) to obtain more accurate read counts and variant allele frequencies.

        Yours,

        Dan Koboldt



        • #5
          Originally posted by adrianl:
          Thank you Graham, for your input!

          I ran readcounts with parameters:
          --min-coverage 0
          --min-base-qual 30 (since this is what I ran mpileup2indel with)

          And I think I found the answer to my question.

          The entries with "deviating" frequencies had additional variants, in the example cases one read each. Adding this read to the total gives us the same frequency as printed in the output file (13/15 and 32/34). Some of my positions had several additional variants in the pileup file making the total even more "wrong" when added together just from the mpileup2indel output file. At least now that I understand it there is no problem anymore, I can trust these values a bit more.

          So in the end the answer was quite simple, I guess.

          Thanks again
          //Adrian


          Edit: I may have broken the forum boundaries with my huge table...

          Can you tell me how you got this result? With which program and parameters?
