  • alexbmp
    Member
    • Oct 2011
    • 30

    Questions about VarScan

    Hi all. I'm a student working to identify tumor-specific mutations.

    Recently I came across VarScan and SomaticSniper.

    I'm confused about a few aspects of VarScan, and the documentation seems to lag behind the released versions, so I'm asking here!

    I'm using VarScan.v2.2.8.jar somatic when doing my analyses.

    1. What is the "germline p-value" exactly?
    1.1. I see that --somatic-p-value sets the p-value threshold for a statistical test (Fisher's exact test) of whether the "normal variant allele frequency of one locus" is the same as the "tumor variant allele frequency of one locus".

    (Many thanks for the discussion here; it really was a great help!
    http://seqanswers.com/forums/showthr...hlight=varscan )

    So, for example, I thought a contingency table like this could be made (I drew it crudely because I could not put in spaces instead of hyphens and underscores; ugly, isn't it?):
    ------------Normal--Tumor
    Reference___10______2
    Variant_____2______8 --> Fisher's exact p-value calculated.

    ...where the counts are reads mapped to the locus.
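To make the table above concrete, here is a minimal sketch of Fisher's exact test on those four read counts, written in plain Python. It is only an illustration of the test itself, not VarScan's own code; the function name is made up for this example, and whether VarScan uses a one-tailed or two-tailed version is not stated in this thread, so the two-sided form below is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration; not part of VarScan.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of every table that is no more likely than
    # the observed one (small tolerance guards against float ties).
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Table from the post: rows = reference / variant reads,
# columns = normal / tumor.
p_value = fisher_exact_two_sided(10, 2, 2, 8)
print(f"two-sided p = {p_value:.4g}")
```

A small p-value here means the variant allele frequency differs between normal and tumor more than chance would explain, which is the comparison that the --somatic-p-value threshold is applied to.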

    1.2. However, how is a "germline p-value" calculated? D. Koboldt (the developer) himself explained it this way: "the null hypothesis is that the sample is homozygous-reference; under this hypothesis, all reads should support the reference base". I do not fully understand this and would appreciate a comment.


    2. Can VarScan emit output in VCF (variant call format)?
    The main web page says that VarScan now supports VCF output: "VarScan v2.2.8 released with new somatic calling features: Tumor-normal mpileup compatibility and VCF 4.1 output option." (http://varscan.sourceforge.net/index.html)
    However, all I can find in the documentation covers v2.2, not v2.2.8 (unless I'm looking in the wrong place).


    I really hope somebody can lend me a hand.

    Have a great day!!
    Last edited by alexbmp; 02-12-2012, 11:13 PM. Reason: To draw a table and add details
  • dkoboldt
    Member
    • Mar 2009
    • 62

    #2
    Hello,

    The germline P-value in VarScan output is computed when a site has been classified as Germline (the same in both normal and tumor). In this situation, the read counts from normal and tumor are combined into a total number for each allele (reference and variant). This is compared via Fisher's Exact Test to the null hypothesis, a wild-type position, at which all reads should reflect the reference allele.
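One way to read the description above, as a rough sketch: pool the normal and tumor reads per allele, then compare that observed row against a hypothetical row of the same depth in which every read supports the reference. This is an interpretation of the paragraph, not VarScan's source code; the function name, the choice of a one-sided test, and the use of an equal-depth null row with exactly zero variant reads are all assumptions.

```python
from math import comb


def germline_p_value(ref_normal, var_normal, ref_tumor, var_tumor):
    """One-sided Fisher's exact p-value against an all-reference null row.

    Hypothetical helper for illustration; not part of VarScan.
    """
    # Combine normal and tumor read counts for each allele.
    ref_reads = ref_normal + ref_tumor
    var_reads = var_normal + var_tumor
    depth = ref_reads + var_reads

    # 2x2 table being tested:
    #              reference   variant
    #   observed   ref_reads   var_reads
    #   null       depth       0
    # With the margins fixed, the observed table is already the most
    # extreme one (all variant reads sit in the observed row), so the
    # one-sided p-value is a single hypergeometric term.
    return comb(depth, var_reads) / comb(2 * depth, var_reads)


# Hypothetical counts: 5 reference and 5 variant reads in each sample.
print(f"germline p = {germline_p_value(5, 5, 5, 5):.4g}")
```

Under this reading, a site with no variant reads gives p = 1, and the p-value shrinks as the pooled variant count grows.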

    The VCF 4.1 output option is available with VarScan 2.2.8 and above. However, it is currently only enabled for multi-sample calling functions mpileup2snp, mpileup2indel, and mpileup2cns. We are working on making VCF output an option for somatic mutation calling for the next release.

    • blackgore
      Member
      • Sep 2009
      • 20

      #3
      Hi dkoboldt,
      I've just started using VarScan (v2.2.11) and so far am very happy with what it can do. I am formatting my output from mpileup2snp and mpileup2indel as VCF, but, for me at least, the output is in version 4.0 (see header information below), not 4.1 as stated above. Is there an option other than "--output-vcf 1" that sets it to version 4.1?

      ##fileformat=VCFv4.0
      ##source=VarScan2
      ##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
      ##FILTER=<ID=str10,Description="Less than 10% or more than 90% of variant supporting reads on one strand">
      ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
      ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
      ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
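For anyone who wants to check this on their own files, here is a small sketch that pulls the version out of the ##fileformat line. It only inspects header text; the function name is hypothetical and the sample lines are copied from the header above.

```python
import re


def vcf_version(header_lines):
    """Return the VCF version string from a ##fileformat line, or None."""
    for line in header_lines:
        match = re.match(r"##fileformat=VCFv(\d+\.\d+)", line.strip())
        if match:
            return match.group(1)
    return None


# First two header lines from the output quoted in this post.
header = [
    "##fileformat=VCFv4.0",
    "##source=VarScan2",
]
version = vcf_version(header)
print(version)
```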

      • dkoboldt
        Member
        • Mar 2009
        • 62

        #4
        Hello,

        We've just released VarScan v2.3.1 with better VCF compatibility; it should create VCF4.1 output. If it does not, please let me know!

        Yours,

        Dan Koboldt

        • blackgore
          Member
          • Sep 2009
          • 20

          #5
          Hi Dan,
          thanks very much for the announcement, and the new format seems to be working great!
          Gordon

          • FrankiB
            Member
            • Dec 2013
            • 23

            #6
            Hi,

            Is it possible to get vcf output using somatic mutation calling? If yes what should be the option in the command line?

            • Bukowski
              Senior Member
              • Jan 2010
              • 388

              #7
              Originally posted by FrankiB
              Hi,

              Is it possible to get vcf output using somatic mutation calling? If yes what should be the option in the command line?
              Same as for germline:

              --output-vcf 1
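As a rough sketch of where that flag goes, the snippet below assembles a somatic command line without running it. The jar name, pileup file names, and output prefix are placeholders, and the helper function is hypothetical; whether the somatic command honours --output-vcf depends on the VarScan version, since earlier in this thread it was said to be unavailable for somatic calling in 2.2.8.

```python
import shlex


def varscan_somatic_cmd(jar, normal_pileup, tumor_pileup, out_prefix, vcf=True):
    """Build (but do not run) a VarScan somatic command as an argument list."""
    cmd = ["java", "-jar", jar, "somatic", normal_pileup, tumor_pileup, out_prefix]
    if vcf:
        # Same flag as for the germline callers mentioned above.
        cmd += ["--output-vcf", "1"]
    return cmd


# Placeholder file names, for illustration only.
cmd = varscan_somatic_cmd("VarScan.jar", "normal.pileup", "tumor.pileup", "sample1")
print(shlex.join(cmd))
```

Building the command as a list keeps it safe to hand to subprocess.run later without shell quoting problems.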

              • FrankiB
                Member
                • Dec 2013
                • 23

                #8
                Yes I see. Thank you very much
