  • Help with VarScan somatic bug report and interpreting mpileup2cns results

    I'm getting the following crash while running VarScan somatic.
    The bug report is shown below:
    Code:
    Bug report:
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    #  SIGSEGV (0xb) at pc=0x00007f4a3bf04fe8, pid=21559, tid=139956786239248
    #
    # JRE version: 6.0_17-b17
    # Java VM: OpenJDK 64-Bit Server VM (14.0-b16 mixed mode linux-amd64 )
    # Derivative: IcedTea6 1.7.4
    # Distribution: Custom build (Thu Jul 29 16:49:18 EDT 2010)
    # Problematic frame:
    # V  [libjvm.so+0x57dfe8]
    #
    # If you would like to submit a bug report, please include
    # instructions how to reproduce the bug and visit:
    #   http://icedtea.classpath.org/bugzilla
    #
    
    ---------------  T H R E A D  ---------------
    
    Current thread (0x00007f4a34012000):  GCTaskThread [stack: 0x00007f4a3a771000,0x00007f4a3a872000] [id=21561]
    
    siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000
    7fff101ff000-7fff10200000 r-xp 00000000 00:00 0                          [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
    
    VM Arguments:
    java_command: java -jar VarScan.jar somatic normal_tissue.mpileup infected_tissue.mpileup normal_infected_comparison --mpileup 1 --min-var-freq 0.08 --p-value 0.10 --somatic-p-value 0.05 --output-vcf 1
    
    Launcher Type: SUN_STANDARD
    
    Environment Variables:
    PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/edge1987/bin
    LD_LIBRARY_PATH=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64
    SHELL=/bin/bash
    log file:
    Code:
    Min coverage:	8x for Normal, 6x for Tumor
    Min reads2:	2
    Min strands2:	1
    Min var freq:	0.08
    Min freq for hom:	0.75
    Normal purity:	1.0
    Tumor purity:	1.0
    Min avg qual:	15
    P-value thresh:	0.1
    Somatic p-value:	0.05
    Reading input from normal_tissue.mpileup
    Reading mpileup input...
    Parsing Exception on line:
    normal_tissue_seq1_630	286	A	40	^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.	?@CCCCCB@@C<@CC@@?CCCCC@@@@CCCCC<@@C@@<@
    6
    The commands I ran are:
    samtools mpileup -f reference.fasta normal.bam > normal_tissue.mpileup
    samtools mpileup -f reference.fasta infected.bam > infected_tissue.mpileup
    java -jar VarScan.jar somatic normal_tissue.mpileup infected_tissue.mpileup normal_infected_comparison --mpileup 1 --min-var-freq 0.08 --p-value 0.10 --somatic-p-value 0.05 --output-vcf 1

    Separately, below is the output I get after running this command:
    samtools mpileup -f reference.fasta normalA.bam infectedA.bam normalB.bam infectedB.bam | java -jar VarScan.jar mpileup2cns --min-var-freq 0.08 --p-value 0.05 --output-vcf 1 >cross-sample.varScan.vcf
    Code:
    ##FORMAT=<ID=ADR,Number=1, Type=Integer,Description=" Depth of variant-supporting bases on reverse strand (reads2minus)">
    #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  Sample1 Sample2 Sample3 Sample4
    normal_tissue_seq1_630     101     .       A       .       .       PASS    ADP=0;WT=0;HET=0;HOM=0;NC=4     GT:GQ:SDP:DP:RD:AD:FREQ:PVAL: RBQ:ABQ:RDF:RDR:ADF:ADR    ./.:.:0 ./.:.:1 ./.:.:0 ./.:.:0
    normal_tissue_seq5_580      532     .       A       .       .       PASS    ADP=1548;WT=4;HET=0;HOM=0;NC=0  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:2147483647:1957:1820:1817:2:0.11%:5E-1:33:23:923:894:0:2    0/0:2147483647:1987:1894:1893:1:0.05%:7.5007E-1:34:17:1189:704:0:1
    normal_tissue_seq10_950      533     .       C       T       .       PASS    ADP=1611;WT=3;HET=1;HOM=0;NC=0  GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/0:303:1969:1843:1820:23:1.25%:1.3987E-6:33:24:880:940:4:19    0/0:2147483647:1981:1916:1908:8:0.42%:1.9421E-2:35:23:1162:746:2:6
    I'm not sure how to interpret the mpileup2cns output.
    Thanks for any advice.
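
For readers following along, here is a minimal Python sketch of how one sample column from the mpileup2cns output lines up with the FORMAT keys. It uses the first sample of the C>T record quoted above. The helper name `parse_sample` is hypothetical, and the meaning of each key should be confirmed against the `##FORMAT` header lines of your own VCF rather than taken from this sketch.

```python
# FORMAT string and first sample column of the C>T record quoted above.
FORMAT = "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR"
SAMPLE = "0/0:303:1969:1843:1820:23:1.25%:1.3987E-6:33:24:880:940:4:19"


def parse_sample(format_keys: str, sample: str) -> dict:
    """Pair each colon-separated FORMAT key with its value in the sample column."""
    return dict(zip(format_keys.split(":"), sample.split(":")))


rec = parse_sample(FORMAT, SAMPLE)
rd, ad = int(rec["RD"]), int(rec["AD"])

# Forward- and reverse-strand counts add up to the per-allele totals.
print(int(rec["RDF"]) + int(rec["RDR"]) == rd)
print(int(rec["ADF"]) + int(rec["ADR"]) == ad)

# FREQ matches variant-supporting reads over all counted reads, in percent.
print(rec["GT"], round(100 * ad / (rd + ad), 2), rec["FREQ"])
```

In this record the variant frequency (1.25%) is well below the `--min-var-freq 0.08` given on the command line, which is presumably why this sample is reported as 0/0 even though the site has an ALT allele.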

  • #2
    Which version of VarScan are you using?
    I've never noticed the --mpileup option. What is it for?



    • #3
      I'm using the latest version of VarScan.
      mpileup has now replaced pileup.
      I'm able to run VarScan now.
      The error above was caused by a problem with my Java version.

      Separately, below is one line of the output from VarScan somatic:
      Code:
      read9786_577      111     .       G       A       .       PASS    DP=951;SS=3;SSC=32;GPV=1E0;SPV=5.8927E-4        GT:GQ:DP:RD:AD:FREQ:DP4 0/1:.:8:4:2:33.33%:3,1,1,1      0/0:.:943:859:4:0.46%:611,248,2,2
      As I understand it, 8 is the read depth (DP), 4 is the number of reference-supporting reads (RD), and 2 is the number of variant-supporting reads (AD).
      Why is the sum of the reference and variant read counts less than the total depth?
      Is it because of base quality filtering, so that only 6 bases count as good quality?
      This pattern appears quite frequently in my data set.

      Also, would you mind sharing a simple example of how to interpret the genotypes in the output?
      As I understand it, 0/0 is homozygous reference, 1/1 is homozygous alternate, 0/1 is heterozygous, and ./. is no call.
      But I still find it hard to tell these cases apart, especially "1/1".



      • #4
        Edge,

        I'm glad you figured out the Java JRE issue behind that exception. As for your second question, the differences in read depth are because of the minimum base quality requirement. DP reflects the SAMtools depth (no base quality requirement), but RD/AD are VarScan's readcounts (by default, qual>15).
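
As a rough check of the depth gap described above, the following Python sketch recomputes it from the two sample columns of the record quoted in post #3. The function name `filtered_bases` is hypothetical; the sketch only computes DP - (RD + AD) and does not itself establish why those bases were left out.

```python
# FORMAT string and sample columns from the VarScan somatic record in post #3.
FORMAT = "GT:GQ:DP:RD:AD:FREQ:DP4"
NORMAL = "0/1:.:8:4:2:33.33%:3,1,1,1"
TUMOR = "0/0:.:943:859:4:0.46%:611,248,2,2"


def filtered_bases(format_keys: str, sample: str) -> int:
    """Return DP - (RD + AD): bases counted in DP but in neither RD nor AD."""
    fields = dict(zip(format_keys.split(":"), sample.split(":")))
    counted = int(fields["RD"]) + int(fields["AD"])
    return int(fields["DP"]) - counted


print(filtered_bases(FORMAT, NORMAL))  # 8 - (4 + 2) = 2
print(filtered_bases(FORMAT, TUMOR))   # 943 - (859 + 4) = 80
```

Note that FREQ agrees with the filtered counts rather than with DP: 2 / (4 + 2) is 33.33%, not 2 / 8.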

        I'm confused by your question about the genotype... its interpretation is spelled out quite clearly in the VCF specification. In your example:

        Sample 1 is 0/1, or heterozygous-variant, with genotype GA.
        Sample 2 is 0/0, or wildtype, with genotype GG.

        If there were a third sample that was 1/1, its genotype would be AA.
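
The same mapping can be sketched in a few lines of Python. This is only an illustration of the VCF convention that allele index 0 is REF and index 1 is the first ALT; `decode_genotype` is a hypothetical helper, not part of VarScan or any VCF library.

```python
def decode_genotype(gt: str, ref: str, alt: str) -> str:
    """Translate a VCF GT string such as '0/1' into allele letters."""
    alleles = [ref] + alt.split(",")         # index 0 = REF, 1.. = ALT alleles
    calls = gt.replace("|", "/").split("/")  # treat phased and unphased alike
    if "." in calls:
        return "no call"
    return "".join(alleles[int(i)] for i in calls)


# REF=G and ALT=A, as in the record from post #3.
for gt in ("0/1", "0/0", "1/1", "./."):
    print(gt, decode_genotype(gt, "G", "A"))
```

For the example record this prints GA, GG, AA and "no call" respectively.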



        • #5
          Hi Edge,

          Regarding the genotypes: I am using VarScan v2.3.6, and I found several lines in which the genotypes are marked as 1/1 even though both samples match the reference (and so should be 0/0).

          Here are a few examples:
          chr1 721668 . C . PASS DP=168;SS=0;SSC=0;GPV=1E0;SPV=1E0 GT:GQ:DP:RD:AD:FREQ:DP4 1/1:.:78:78:0:0%:34,44,0,0 1/1:.:90:90:0:0%:38,52,0,0

          REFERENCE: chr1 721687 . C . PASS DP=139;SS=0;SSC=0;GPV=1E0;SPV=1E0 GT:GQ:DP:RD:AD:FREQ:DP4 1/1:.:71:71:0:0%:22,49,0,0 1/1:.:68:67:0:0%:22,45,0,0

          Do you know why these genotypes have been classified as 1/1?
          Thank you in advance,
          Lucia
