  • VarScan copynumber error when filtering with copycaller, plus GC content > 100

    Hi,

    I'm using VarScan.v2.3.2 to do CNV analysis on HiSeq exome data from tumor-normal pairs. When running copynumber, everything appears OK, but when filtering the resulting .copynumber file using copyCaller, I get an error ("Parsing Exception", please see below).

    When running VarScan copynumber like this:
    java -jar /VarScan.v2.3.2.jar copynumber $NOR $TUM $BASENAME

    I got the following output:
    ########
    Normal Pileup: /177_1N.prmdup.realign.recal_sorted.mpileup
    Tumor Pileup: /177_1T.prmdup.realign.recal_sorted.mpileup
    Min coverage: 10
    Min avg qual: 15
    P-value thresh: 0.01
    Not resetting normal file because chrM < chrY
    561343988 positions in tumor
    557785077 positions shared in normal
    38214383 had sufficient coverage for comparison
    482476 raw copynumber segments with size > 10
    474997 good copynumber segments with depth > 10
    ##########

    So we have an error stating "Not resetting normal file because chrM < chrY".

    I saw an answer that dkoboldt had given regarding this message, saying that "This is just a warning printed by VarScan as it's simultaneously parsing normal and tumor files. As long as your output files contain all of the chromosomes that you expect, you can safely ignore it."

    So I double checked and all chromosomes are present in the copynumber file.
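The check described above can be sketched in a few lines of Python. This is only an illustration, not part of VarScan: the function name `chromosomes_in` and the `expected` list (chr1 to chr22 plus chrX, chrY and chrM) are assumptions and should be adapted to the reference in use.

```python
def chromosomes_in(lines):
    """Count segments per chromosome, skipping blank lines and the header row."""
    seen = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "chrom":
            continue
        seen[fields[0]] = seen.get(fields[0], 0) + 1
    return seen


# Hypothetical list of expected chromosomes; adjust to your reference genome.
expected = {f"chr{n}" for n in range(1, 23)} | {"chrX", "chrY", "chrM"}

# A few rows copied from the sample output in this post.
sample = [
    "chrom chr_start chr_stop num_positions normal_depth tumor_depth log2_ratio gc_content",
    "chr1 10010 10109 100 30,4 28,4 -0,097 51,0",
    "chr1 10110 10209 100 23,4 21,4 -0,132 51,0",
]

counts = chromosomes_in(sample)
missing = sorted(expected - set(counts))
print("segments per chromosome:", counts)
print("missing chromosomes:", missing)
```

On a real file, pass an open file handle instead of the `sample` list, e.g. `chromosomes_in(open(path))`.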

    Then I ran copyCaller like this:
    java -jar /VarScan.v2.3.2.jar copyCaller $IN --output-file ${IN}.called

    The output I get from copyCaller is the following:
    #####################
    Min coverage: 20
    Reading input from /177_1T.copynumber
    Parsing Exception on line:
    chr1 10010 10109 100 30,4 28,4 -0,097 51,0
    For input string: "30,4"
    Error parsing input: null
    java.lang.NullPointerException
    at net.sf.varscan.CopyCaller.<init>(CopyCaller.java:293)
    at net.sf.varscan.VarScan.copyCaller(VarScan.java:344)
    at net.sf.varscan.VarScan.main(VarScan.java:173)

    ##################
    I am not sure what is wrong with this line. However, when looking at the output from "copynumber" (see the sample below), I noticed that the GC content sometimes exceeds 100 (please see the attached picture).

    ##################
    chrom chr_start chr_stop num_positions normal_depth tumor_depth log2_ratio gc_content
    chr1 10010 10109 100 30,4 28,4 -0,097 51,0
    chr1 10110 10209 100 23,4 21,4 -0,132 51,0
    chr1 10210 10240 31 14,3 11,9 -0,260 51,6
    chr1 10359 10458 100 20,7 14,1 -0,556 51,0
    chr1 12202 12226 25 10,2 2,0 -2,350 48,0
    chr1 13425 13438 14 10,0 5,9 -0,754 50,0
    chr1 69005 69104 100 22,5 24,1 0,098 40,0
    chr1 69105 69204 100 41,9 38,2 -0,133 77,0
    chr1 69205 69304 100 74,6 70,9 -0,074 127,0
    chr1 69305 69404 100 42,5 45,8 0,108 171,0
    chr1 69405 69504 100 20,0 20,1 0,003 216,0
    chr1 69505 69604 100 26,9 22,2 -0,277 265,0
    chr1 69605 69704 100 66,4 64,2 -0,049 308,0
    chr1 69705 69804 100 86,8 83,1 -0,064 42,0
    chr1 69805 69904 100 73,5 71,1 -0,047 78,0
    chr1 69905 70004 100 55,7 49,0 -0,185 119,0
    chr1 70005 70043 39 22,4 23,9 0,096 25,6
    chr1 367991 368039 49 11,2 5,3 -1,078 51,0
    ################
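A quick way to find every affected segment, rather than spotting them by eye, is to scan the gc_content column for values above 100. The sketch below is a minimal illustration in Python; the helper names `parse_number` and `rows_with_bad_gc` are made up for this example, and it assumes the eight-column layout shown in the header above, with numbers written using either a decimal comma or a decimal point.

```python
def parse_number(text):
    """Parse a number written with a decimal point or a decimal comma."""
    return float(text.replace(",", "."))


def rows_with_bad_gc(lines, limit=100.0):
    """Return (chrom, start, stop, gc) for rows whose gc_content exceeds limit."""
    bad = []
    for line in lines:
        fields = line.split()
        # Skip blank or short lines and the header row.
        if len(fields) < 8 or fields[0] == "chrom":
            continue
        gc = parse_number(fields[7])
        if gc > limit:
            bad.append((fields[0], int(fields[1]), int(fields[2]), gc))
    return bad


# Rows copied from the sample output above.
sample = [
    "chrom chr_start chr_stop num_positions normal_depth tumor_depth log2_ratio gc_content",
    "chr1 69105 69204 100 41,9 38,2 -0,133 77,0",
    "chr1 69205 69304 100 74,6 70,9 -0,074 127,0",
    "chr1 69305 69404 100 42,5 45,8 0,108 171,0",
]

for row in rows_with_bad_gc(sample):
    print(row)
```

Since GC content is a percentage, any row reported here points to a problem in how the value was computed, not to a property of the data.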

    Any feedback would be greatly appreciated!

    Thank you in advance.

  • #2
    Hello,

    Thank you for posting this message. I have seen this issue before and thought it was fixed in v2.3.2. You're encountering a "locale error" caused by the European representation of floating-point (decimal) numbers with a comma (e.g. 3,1415926) rather than a decimal point (3.1415926).

    If it's possible for you to change the locale preference in your Java settings, that's one way to address the problem. Another way would be a global search-and-replace on the output file (perl -pi -e 's/,/./g' output.copynumber).
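For anyone who prefers not to edit the file in place, the same search-and-replace idea can be sketched in Python. This is an illustration only, not VarScan code: the function name `normalize_line` is hypothetical, the file is assumed to be tab-delimited, and only columns 5 to 8 (the numeric ones) are touched. The first few lines show the analogous failure in Python, where a decimal comma is also rejected by the default number parser.

```python
def normalize_line(line):
    """Replace decimal commas with decimal points in columns 5 onward."""
    fields = line.rstrip("\n").split("\t")
    fields[4:] = [value.replace(",", ".") for value in fields[4:]]
    return "\t".join(fields) + "\n"


# Analogous failure: Python's float() does not accept a decimal comma either.
try:
    float("30,4")
except ValueError as err:
    print("cannot parse:", err)

# The line reported in the parsing exception, assumed to be tab-delimited.
line = "chr1\t10010\t10109\t100\t30,4\t28,4\t-0,097\t51,0\n"
fixed = normalize_line(line)
print(fixed, end="")
print(float(fixed.split("\t")[4]))
```

To convert a whole file, write each normalized line to a new file and run copyCaller on that copy, which leaves the original .copynumber output untouched.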

    I will look again in the code to see if I can determine why the locale parsing correction isn't working.


    • #3
      Hello Dan,

      I have a question about VarScan. It is posted at

      Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc

      I would be very grateful if you could answer it.

      Thank you very much.


      • #4
        Thank you again for following up. I'd thought that the locale-parsing issue was fixed, but using your file I was able to reproduce the problem and fix it. VarScan copyCaller should now work for you even with European-style floating-point numbers.

        Also, the GC content values > 100 were traced to a bug in GC counting that has also been fixed.

        These fixes are all in VarScan v2.3.3, which was posted today.


        • #5
          Thanks a thousand!
