  • Help with VarScan somatic bug report and interpreting mpileup2cns results

    I'm getting the following fatal error while running VarScan somatic.
    The error report is shown below:
    Code:
    Bug report:
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    #  SIGSEGV (0xb) at pc=0x00007f4a3bf04fe8, pid=21559, tid=139956786239248
    #
    # JRE version: 6.0_17-b17
    # Java VM: OpenJDK 64-Bit Server VM (14.0-b16 mixed mode linux-amd64 )
    # Derivative: IcedTea6 1.7.4
    # Distribution: Custom build (Thu Jul 29 16:49:18 EDT 2010)
    # Problematic frame:
    # V  [libjvm.so+0x57dfe8]
    #
    # If you would like to submit a bug report, please include
    # instructions how to reproduce the bug and visit:
    #   http://icedtea.classpath.org/bugzilla
    #
    
    ---------------  T H R E A D  ---------------
    
    Current thread (0x00007f4a34012000):  GCTaskThread [stack: 0x00007f4a3a771000,0x00007f4a3a872000] [id=21561]
    
    siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000
    7fff101ff000-7fff10200000 r-xp 00000000 00:00 0                          [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
    
    VM Arguments:
    java_command: java -jar VarScan.jar somatic normal_tissue.mpileup infected_tissue.mpileup normal_infected_comparison --mpileup 1 --min-var-freq 0.08 --p-value 0.10 --somatic-p-value 0.05 --output-vcf 1
    
    Launcher Type: SUN_STANDARD
    
    Environment Variables:
    PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/edge1987/bin
    LD_LIBRARY_PATH=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64
    SHELL=/bin/bash
    log file:
    Code:
    Min coverage:	8x for Normal, 6x for Tumor
    Min reads2:	2
    Min strands2:	1
    Min var freq:	0.08
    Min freq for hom:	0.75
    Normal purity:	1.0
    Tumor purity:	1.0
    Min avg qual:	15
    P-value thresh:	0.1
    Somatic p-value:	0.05
    Reading input from normal_tissue.mpileup
    Reading mpileup input...
    Parsing Exception on line:
    normal_tissue_seq1_630	286	A	40	^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.	?@CCCCCB@@C<@CC@@?CCCCC@@@@CCCCC<@@C@@<@
    6
    The commands I ran are:
    samtools mpileup -f reference.fasta normal.bam > normal_tissue.mpileup
    samtools mpileup -f reference.fasta infected.bam > infected_tissue.mpileup
    java -jar VarScan.jar somatic normal_tissue.mpileup infected_tissue.mpileup normal_infected_comparison --mpileup 1 --min-var-freq 0.08 --p-value 0.10 --somatic-p-value 0.05 --output-vcf 1

    Apart from that, below is the output I get after running this command:
    samtools mpileup -f reference.fasta normalA.bam infectedA.bam normalB.bam infectedB.bam | java -jar VarScan.jar mpileup2cns --min-var-freq 0.08 --p-value 0.05 --output-vcf 1 >cross-sample.varScan.vcf
    Code:
    ##FORMAT=<ID=ADR,Number=1, Type=Integer,Description=" Depth of variant-supporting bases on reverse strand (reads2minus)">
    #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  Sample1 Sample2 Sample3 Sample4
    normal_tissue_seq1_630     101     .       A       .       .       PASS    ADP=0;WT=0;HET=0;HOM=0;NC=4     GT:GQ:SDP:DP:RD:AD:FREQ:PVAL: RBQ:ABQ:RDF:RDR:ADF:ADR    ./.:.:0 ./.:.:1 ./.:.:0 ./.:.:0
    normal_tissue_seq5_580      532     .       A       .       .       PASS    ADP=1548;WT=4;HET=0;HOM=0;NC=0  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:2147483647:1957:1820:1817:2:0.11%:5E-1:33:23:923:894:0:2    0/0:2147483647:1987:1894:1893:1:0.05%:7.5007E-1:34:17:1189:704:0:1
    normal_tissue_seq10_950      533     .       C       T       .       PASS    ADP=1611;WT=3;HET=1;HOM=0;NC=0  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:303:1969:1843:1820:23:1.25%:1.3987E-6:33:24:880:940:4:19    0/0:2147483647:1981:1916:1908:8:0.42%:1.9421E-2:35:23:1162:746:2:6
    I'm not sure how to interpret the mpileup2cns output.
    Thanks for any advice.
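
As a starting point for reading these records, each sample column can be split on ":" and paired with the keys in the FORMAT column. The following is a minimal sketch that uses the FORMAT string and the first sample column of the seq10_950 record above; the helper name `parse_sample` is made up for illustration, and the reading of FREQ as AD divided by DP is an assumption that happens to match this record.

```python
# FORMAT string and first sample column copied from the seq10_950 record above.
FORMAT = "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR"
SAMPLE = "0/0:303:1969:1843:1820:23:1.25%:1.3987E-6:33:24:880:940:4:19"


def parse_sample(format_str, sample_str):
    """Pair each FORMAT key with the matching value in one sample column.

    Hypothetical helper, not part of VarScan. Short columns such as
    "./.:.:0" simply yield fewer keys because zip() stops at the end
    of the shorter list.
    """
    keys = format_str.split(":")
    values = sample_str.split(":")
    return dict(zip(keys, values))


call = parse_sample(FORMAT, SAMPLE)
dp, rd, ad = int(call["DP"]), int(call["RD"]), int(call["AD"])

# Forward plus reverse strand counts add up to the per-allele counts.
assert int(call["RDF"]) + int(call["RDR"]) == rd
assert int(call["ADF"]) + int(call["ADR"]) == ad

# Assumption: FREQ is AD / DP expressed as a percentage.
freq = 100.0 * ad / dp
print(call["GT"], f"{freq:.2f}%", call["FREQ"])
```

For this record the genotype is 0/0 even though 23 variant-supporting bases were counted, because 23 out of 1843 is well below the --min-var-freq 0.08 threshold given on the command line.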

  • #2
    Which version of Varscan are you using?
    I never noticed this option --mpileup. What is it for?



    • #3
      I used the latest version of VarScan.
      mpileup has now replaced pileup.
      I am able to run VarScan now.
      The error above was caused by my Java version.

      Apart from that, below is one line of the output from running VarScan somatic:
      Code:
      read9786_577      111     .       G       A       .       PASS    DP=951;SS=3;SSC=32;GPV=1E0;SPV=5.8927E-4        GT:GQ:DP:RD:AD:FREQ:DP4 0/1:.:8:4:2:33.33%:3,1,1,1      0/0:.:943:859:4:0.46%:611,248,2,2
      As far as I know, 8 is the read depth, 4 is the number of reference-supporting reads, and 2 is the number of variant-supporting reads.
      I'm wondering why the sum of the reference and variant read counts is less than the total depth.
      Is it because only 6 of the bases pass the base quality filter?
      This pattern appears quite frequently in my data set.

      Also, would you mind sharing a simple example of how to interpret the genotype in the output?
      As far as I know, 0/0 = homozygous reference, 1/1 = homozygous alternate, 0/1 = heterozygous, and ./. = no call.
      But I'm still a bit unsure how to distinguish these cases, especially "1/1".



      • #4
        Edge,

        I'm glad you figured out the Java JRE issue behind that exception. As for your second question, the differences in read depth are because of the minimum base quality requirement. DP reflects the SAMtools depth (no base quality requirement), but RD/AD are VarScan's readcounts (by default, qual>15).
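
The depth accounting described here can be checked against the record from post #3. This is a small sketch, not VarScan code: the helper `depth_breakdown` is a hypothetical name, and the "unaccounted" figure is simply DP minus (RD + AD), which by the explanation above would mostly be bases below the quality cutoff.

```python
# FORMAT string and the two sample columns copied from the record in post #3.
FORMAT = "GT:GQ:DP:RD:AD:FREQ:DP4"
FIRST_SAMPLE = "0/1:.:8:4:2:33.33%:3,1,1,1"
SECOND_SAMPLE = "0/0:.:943:859:4:0.46%:611,248,2,2"


def depth_breakdown(sample_str, format_str=FORMAT):
    """Return DP, RD, AD and the bases counted in DP but not in RD or AD.

    Hypothetical helper for illustration only.
    """
    fields = dict(zip(format_str.split(":"), sample_str.split(":")))
    dp, rd, ad = int(fields["DP"]), int(fields["RD"]), int(fields["AD"])
    # DP4 holds four strand-specific counts; here they sum to RD + AD.
    dp4_total = sum(int(n) for n in fields["DP4"].split(","))
    return {
        "DP": dp,
        "RD": rd,
        "AD": ad,
        "DP4_total": dp4_total,
        "unaccounted": dp - (rd + ad),
    }


for name, column in (("first", FIRST_SAMPLE), ("second", SECOND_SAMPLE)):
    print(name, depth_breakdown(column))
```

In the first sample, 2 of the 8 bases are not counted in RD or AD, and the reported 33.33% matches 2 variant bases out of the 6 that were counted.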

        I'm confused by your question about the genotype... its interpretation is spelled out quite clearly in the VCF specification. In your example:

        Sample 1 is 0/1, or heterozygous-variant, with genotype GA.
        Sample 2 is 0/0, or wildtype, with genotype GG.

        If there were a third sample that was 1/1, its genotype would be AA.
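
The mapping from GT codes to alleles can be written out as a short sketch. It follows the VCF convention that index 0 is the REF allele and index 1 onward are the ALT alleles in order; the function name `decode_genotype` is made up for this example and is not part of VarScan or any VCF library.

```python
def decode_genotype(gt, ref, alts):
    """Translate a VCF GT string such as "0/1" into allele letters.

    Index 0 refers to REF, and 1, 2, ... refer to the ALT alleles in order.
    Returns None for a no-call such as "./.". Hypothetical helper.
    """
    alleles = [ref] + list(alts)
    # "/" separates unphased alleles, "|" separates phased alleles.
    separator = "|" if "|" in gt else "/"
    indices = gt.split(separator)
    if any(index == "." for index in indices):
        return None
    return "".join(alleles[int(index)] for index in indices)


# REF=G and ALT=A, as in the record from post #3.
for gt in ("0/0", "0/1", "1/1", "./."):
    print(gt, decode_genotype(gt, "G", ["A"]))
```

With REF=G and ALT=A this gives GG for 0/0, GA for 0/1, AA for 1/1, and no genotype for ./. .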



        • #5
          Hi Edge,

          Regarding the genotypes: I am using VarScan v2.3.6, and I found several lines in which the genotypes are marked as 1/1 even though both samples match the reference (so I would expect 0/0).

          Here are a few examples:
          chr1 721668 . C . PASS DP=168;SS=0;SSC=0;GPV=1E0;SPV=1E0 GT:GQ:DP:RD:AD:FREQ:DP4 1/1:.:78:78:0:0%:34,44,0,0 1/1:.:90:90:0:0%:38,52,0,0

          REFERENCE: chr1 721687 . C . PASS DP=139;SS=0;SSC=0;GPV=1E0;SPV=1E0 GT:GQ:DP:RD:AD:FREQ:DP4 1/1:.:71:71:0:0%:22,49,0,0 1/1:.:68:67:0:0%:22,45,0,0

          Do you know why these genotypes have been classified as 1/1?
          Thank you in advance,
          Lucia
