  • SPRA
    Junior Member
    • Oct 2010
    • 2

    #16
    Samtools mpileup with -B or -E flag for Varscan

    I just ran VarScan on mpileup data generated with either the -B or the -E flag, and there is quite a difference (see below). Should VarScan always be run on mpileup data generated with the -B flag?

    samtools mpileup -d 10000 -S -B -C 50 -P Illumina -f hg19.fa normal.bam > normal.mpileup
    samtools mpileup -d 10000 -S -B -C 50 -P Illumina -f hg19.fa tumor.bam > tumor.mpileup
    java -jar VarScan.v2.2.10.jar somatic normal.mpileup tumor.mpileup tumor_vs_normal
    java -jar VarScan.v2.2.10.jar processSomatic tumor_vs_normal.snp
    48198 VarScan calls processed
    3168 were Somatic (1229 high confidence)
    37882 were Germline
    6937 were LOH

    OR

    samtools mpileup -d 10000 -S -E -C 50 -P Illumina -f hg19.fa normal.bam > normal.baq.mpileup
    samtools mpileup -d 10000 -S -E -C 50 -P Illumina -f hg19.fa tumor.bam > tumor.baq.mpileup
    java -jar VarScan.v2.2.10.jar somatic normal.baq.mpileup tumor.baq.mpileup tumor_vs_normal.baq
    java -jar VarScan.v2.2.10.jar processSomatic tumor_vs_normal.baq.snp
    7593 VarScan calls processed
    517 were Somatic (191 high confidence)
    6406 were Germline
    636 were LOH
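
To put the two runs side by side, here is a minimal Python sketch that computes the fold change per category from the counts posted above. Only the numbers come from the output above; the dictionary keys and the `fold_change` helper are names chosen for this illustration.

```python
# Counts copied from the two processSomatic summaries above.
# Key names and the helper are illustrative, not part of VarScan.
counts_B = {"total": 48198, "somatic": 3168, "somatic_hc": 1229,
            "germline": 37882, "loh": 6937}
counts_E = {"total": 7593, "somatic": 517, "somatic_hc": 191,
            "germline": 6406, "loh": 636}


def fold_change(with_b, with_e):
    """Return how many times more calls the -B run has, per category."""
    return {key: with_b[key] / with_e[key] for key in with_b}


if __name__ == "__main__":
    for category, ratio in fold_change(counts_B, counts_E).items():
        print(f"{category}: {ratio:.2f}x more calls with -B than with -E")
```

With these counts, the -B run has roughly six times as many calls overall as the -E run.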

    • swbarnes2
      Senior Member
      • May 2008
      • 910

      #17
      Right, it's definitely samtools pileup that is causing the problem. If you compare the .sam file and the two pileups (with and without -B) in regions where SNPs are being missed, you can see the problem. pileup -B will faithfully report the base quality scores found in the .sam file. pileup without -B will sometimes report the quality of those bases as terrible, which causes SNP-calling software to ignore them as too low quality.

      Running without -B (that is, leaving BAQ enabled) is supposed to reduce the number of false positives. There might be applications where that trade-off is worth it, but for what it's worth, I'm skeptical. In my work, I'd much rather sift through false positives than miss real mutations.
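
The comparison described above can be sketched in Python. This assumes the six-column pileup layout (chrom, position, ref, depth, bases, qualities) and Phred+33 quality encoding. The function names and the threshold of 15 are illustrative choices, not samtools or VarScan settings taken from this thread.

```python
def parse_pileup_line(line):
    """Split one pileup line into its first six tab-separated fields."""
    chrom, pos, ref, depth, bases, quals = line.rstrip("\n").split("\t")[:6]
    return {"chrom": chrom, "pos": int(pos), "ref": ref,
            "depth": int(depth), "bases": bases, "quals": quals}


def phred_scores(quals, offset=33):
    """Decode an ASCII quality string (Phred+33 assumed by default)."""
    return [ord(ch) - offset for ch in quals]


def low_quality_fraction(quals, threshold=15, offset=33):
    """Fraction of bases whose quality is below the (assumed) threshold."""
    scores = phred_scores(quals, offset)
    if not scores:
        return 0.0
    return sum(score < threshold for score in scores) / len(scores)


def compare_position(line_with_b, line_without_b, threshold=15):
    """Return low-quality fractions for the same site in both pileups."""
    a = parse_pileup_line(line_with_b)
    b = parse_pileup_line(line_without_b)
    if (a["chrom"], a["pos"]) != (b["chrom"], b["pos"]):
        raise ValueError("lines describe different positions")
    return (low_quality_fraction(a["quals"], threshold),
            low_quality_fraction(b["quals"], threshold))
```

A site where the second fraction is much larger than the first is one where the qualities were lowered in the pileup made without -B, which is the pattern to look for at missed SNPs.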

      • bpetersen
        Member
        • Mar 2010
        • 20

        #18
        Hello everyone,
        I'm currently also experiencing problems with VarScan somatic, but they seem different from what has been described here. I've tried using the -B option for samtools pileup, but I still get completely wrong allele counts for my tumor sample at supposedly somatic mutations with very low p-values.
        One example:
        Create samtools pileup files with the command:
        Code:
        samtools pileup -Bf ref.fasta in.bam > out.pileup
        run varscan somatic:
        Code:
        java -Xmx16g -jar /home/sukmb205/software/VarScan.v2.2.11.jar somatic normal.pileup tumor.pileup outfile.var
        output varscan:
        Code:
        chrom	position	ref	var	normal_reads1	normal_reads2	normal_var_freq	normal_gt	tumor_reads1	tumor_reads2	tumor_var_freq	tumor_gt	somatic_status	variant_p_value	somatic_p_value	tumor_reads1_plus	tumor_reads1_minus	tumor_reads2_plus	tumor_reads2_minus
        chr5	88171909	A	T	46	0	0%	A	91	95	51.08%	W	Somatic	1.0	6.837592912705086E-13	0	91	0	95
        Judging by the p-value, this looks like quite a good somatic mutation, but when I look in IGV, NOT ONE read actually supports the T in the tumor sample. Instead, there is a single read at this position that has been aligned with a 308 bp deletion. Could this be the problem?

        pileup tumor:
        Code:
        chr5	88171909	A	63	.,,,,,,,,,,,,,,,,,,,,,,..,,,.$.,.,,,,,,.,,,..,,-308actcctcaaactaccttcccacaaagccatttaagttaaatggtacatttacagactcacctacatgaaggatataacttaaaacatctgcttagacacatacgttctgttcagatataaaaaatgtggcaaaaatttttaaaaatataggaccactatattcttaaaatgtgtgttcttctgtgtgtgtgtgttcattcattcaagagatctttgactgcaattaggtagtcggtcctataaaggcttccttgtgtgacgataatttctaaaagtaaaatgctccagtgaatatttctgctaaataa,,,,,,.,....,,...	/2.*114I,1<(%040":+3/--+?%.-.CE2;/3*0-B8=).0II--7,+,II%I09@*CII
        pileup normal:
        Code:
        chr5	88171909	A	71	...$.,,,,,,,,,,,.,,,,,,,,,.,,,,.,,.,,,,,,..,,,.,.,,,..,,,,,,.,...,....^X.^(.	BE,9=2++H8I31.)89,=I).2II=1I,+/8I51I>*2,7+.1I.I%-*-1I/2A-I/3@:I.5I8II(I
        I'm seeing similar things at all the high-quality somatic mutations that were called. I've tried using mpileup instead of pileup, with option -B, without -B, etc. I just don't know what the problem is!
        I would really appreciate some comments on this, thanks!
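
One observation on the numbers above: the tumor pileup line reports a depth of 63, yet the VarScan row counts 91 + 95 = 186 tumor reads, so those counts cannot all be per-read base calls. Whether the 308-base deletion string is responsible is not confirmed anywhere in this thread. As a way to test the idea, the Python sketch below counts bases in a pileup bases column while skipping indel sequences of any length, and contrasts that with a naive character count. It is not VarScan's parser, and the function names are hypothetical.

```python
import re
from collections import Counter

_INDEL_LENGTH = re.compile(r"\d+")


def count_pileup_bases(bases):
    """Count per-read calls in a pileup bases column, skipping indel text."""
    counts = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # Read start: the next character is a mapping quality, not a base.
            i += 2
            continue
        if ch in "+-":
            match = _INDEL_LENGTH.match(bases, i + 1)
            if match is not None:
                # Skip the length digits and the full inserted/deleted sequence.
                counts["indel"] += 1
                i = match.end() + int(match.group())
                continue
        if ch in ".,":
            counts["ref"] += 1
        elif ch.upper() in "ACGTN":
            counts[ch.upper()] += 1
        elif ch == "*":
            counts["del"] += 1
        # "$" (read end) and any other symbols are ignored.
        i += 1
    return counts


def naive_count(bases, allele):
    """Count every occurrence of a letter, including letters inside indels."""
    return bases.upper().count(allele.upper())
```

Running both functions on the bases column of the tumor line and comparing the T counts would show how many of the apparent T calls come from the deletion sequence rather than from reads.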

        • bpetersen
          Member
          • Mar 2010
          • 20

          #19
          Am I the only one with this issue? I still can't figure out what the problem is. In other datasets VarScan worked fine for me, but if this isn't fixed I'm going to have to use a different tool...

          • david.tamborero
            Member
            • Feb 2011
            • 60

            #20
            Hi!

            I've just landed on this thread after a long time without using somatic calling tools.

            I am wondering if you solved this. I have no idea why it happened.

            Anyway, I've just downloaded VarScan 2, and I've noticed that it incorporates a filter that, among other things, seems to remove SNPs that appear close to indels. Could that help you?
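
The general idea of dropping SNPs that sit close to indels can be sketched as follows. This is an illustrative Python example, not VarScan's filter. The function name, the `(chrom, pos)` tuple inputs, and the 10 bp default window are assumptions made for the sketch.

```python
import bisect


def filter_snps_near_indels(snps, indels, window=10):
    """Keep only SNPs farther than `window` bp from every indel.

    snps and indels are iterables of (chrom, pos) tuples.
    """
    indel_positions = {}
    for chrom, pos in indels:
        indel_positions.setdefault(chrom, []).append(pos)
    for positions in indel_positions.values():
        positions.sort()

    kept = []
    for chrom, pos in snps:
        positions = indel_positions.get(chrom, [])
        # First indel at or after the left edge of the window.
        i = bisect.bisect_left(positions, pos - window)
        if i < len(positions) and positions[i] <= pos + window:
            continue  # an indel falls inside the window, drop this SNP
        kept.append((chrom, pos))
    return kept
```

For the site discussed earlier in the thread, a SNP at chr5:88171909 with a deletion starting at the next base would be dropped by this kind of filter.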

            • stvos
              Junior Member
              • Aug 2011
              • 7

              #21
              Varscan results vs Genomeview

              Dear all,
              I am using VarScan somatic (/VarScan/x86_64/2.3.3).
              I see really big differences between the SNP coverage reported in VarScan's SNP output file and the coverage shown when the .bam file is visualized in GenomeView (similar to IGV).
              Has anyone found out yet what causes this?
