  • Varscan v2.3.5 headers missing

    I am running varscan to call variants on a single file set. I have run it twice now, and both times the headers are missing from the output vcf files. All 11 other samples I analyzed with varscan this week have the headers. What might cause the header not to be included?

    I used bwa to align my Illumina fastq files.
    I am using samtools v 1.1.19 for the mpileup.
    My command line argument:
    samtools mpileup -q 1 -C50 -DSf /ref/human_g1k_v37.fasta /myData/S12_sorted.bam | java -jar VarScan.v2.3.5.jar mpileup2snp --min-coverage 8 --min-reads2 2 --min-var-freq 0.01 --min-avg-qual 15 --p-value 0.01 --strand-filter 0 --output-vcf 1 --variants 0 > /myData/S12_snp.vcf

    Here is the header on my .bam file:
    -bash-3.2$ samtools view -H /myData/S12_sorted.bam
    @SQ SN:1 LN:249250621
    @SQ SN:2 LN:243199373
    @SQ SN:3 LN:198022430
    @SQ SN:4 LN:191154276
    @SQ SN:5 LN:180915260
    @SQ SN:6 LN:171115067
    @SQ SN:7 LN:159138663
    @SQ SN:8 LN:146364022
    @SQ SN:9 LN:141213431
    @SQ SN:10 LN:135534747
    @SQ SN:11 LN:135006516
    @SQ SN:12 LN:133851895
    @SQ SN:13 LN:115169878
    @SQ SN:14 LN:107349540
    @SQ SN:15 LN:102531392
    @SQ SN:16 LN:90354753
    @SQ SN:17 LN:81195210
    @SQ SN:18 LN:78077248
    @SQ SN:19 LN:59128983
    @SQ SN:20 LN:63025520
    @SQ SN:21 LN:48129895
    @SQ SN:22 LN:51304566
    @SQ SN:X LN:155270560
    @SQ SN:Y LN:59373566
    @SQ SN:MT LN:16569
    @SQ SN:GL000207.1 LN:4262
    @SQ SN:GL000226.1 LN:15008
    @SQ SN:GL000229.1 LN:19913
    @SQ SN:GL000231.1 LN:27386
    @SQ SN:GL000210.1 LN:27682
    @SQ SN:GL000239.1 LN:33824
    @SQ SN:GL000235.1 LN:34474
    @SQ SN:GL000201.1 LN:36148
    @SQ SN:GL000247.1 LN:36422
    @SQ SN:GL000245.1 LN:36651
    @SQ SN:GL000197.1 LN:37175
    @SQ SN:GL000203.1 LN:37498
    @SQ SN:GL000246.1 LN:38154
    @SQ SN:GL000249.1 LN:38502
    @SQ SN:GL000196.1 LN:38914
    @SQ SN:GL000248.1 LN:39786
    @SQ SN:GL000244.1 LN:39929
    @SQ SN:GL000238.1 LN:39939
    @SQ SN:GL000202.1 LN:40103
    @SQ SN:GL000234.1 LN:40531
    @SQ SN:GL000232.1 LN:40652
    @SQ SN:GL000206.1 LN:41001
    @SQ SN:GL000240.1 LN:41933
    @SQ SN:GL000236.1 LN:41934
    @SQ SN:GL000241.1 LN:42152
    @SQ SN:GL000243.1 LN:43341
    @SQ SN:GL000242.1 LN:43523
    @SQ SN:GL000230.1 LN:43691
    @SQ SN:GL000237.1 LN:45867
    @SQ SN:GL000233.1 LN:45941
    @SQ SN:GL000204.1 LN:81310
    @SQ SN:GL000198.1 LN:90085
    @SQ SN:GL000208.1 LN:92689
    @SQ SN:GL000191.1 LN:106433
    @SQ SN:GL000227.1 LN:128374
    @SQ SN:GL000228.1 LN:129120
    @SQ SN:GL000214.1 LN:137718
    @SQ SN:GL000221.1 LN:155397
    @SQ SN:GL000209.1 LN:159169
    @SQ SN:GL000218.1 LN:161147
    @SQ SN:GL000220.1 LN:161802
    @SQ SN:GL000213.1 LN:164239
    @SQ SN:GL000211.1 LN:166566
    @SQ SN:GL000199.1 LN:169874
    @SQ SN:GL000217.1 LN:172149
    @SQ SN:GL000216.1 LN:172294
    @SQ SN:GL000215.1 LN:172545
    @SQ SN:GL000205.1 LN:174588
    @SQ SN:GL000219.1 LN:179198
    @SQ SN:GL000224.1 LN:179693
    @SQ SN:GL000223.1 LN:180455
    @SQ SN:GL000195.1 LN:182896
    @SQ SN:GL000212.1 LN:186858
    @SQ SN:GL000222.1 LN:186861
    @SQ SN:GL000200.1 LN:187035
    @SQ SN:GL000193.1 LN:189789
    @SQ SN:GL000194.1 LN:191469
    @SQ SN:GL000225.1 LN:211173
    @SQ SN:GL000192.1 LN:547496
    @RG ID:work SM:S12 PL:Illumina PU:S12
    @PG ID:bwa PN:bwa VN:0.5.9-r16

    Thanks for any advice.

  • #2
    Hello, and thanks for posting. Do you mean that the VCF header lines are missing? That's strange behavior and I'm happy to help you investigate.

    Would you mind letting me know what your output file (/myData/S12_snp.vcf) looked like?


    • #3
      Thanks for your help. Here is what the first 10 lines look like:
      1 10061 . T G . PASS ADP=268;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:270:268:250:11:4.12%:4.385E-4:33:29:210:40:4:7
      1 10067 . T G . PASS ADP=270;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:24:274:270:260:8:2.96%:3.7047E-3:33:31:207:53:7:1
      1 10079 . T G . PASS ADP=266;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:268:266:254:11:4.14%:4.3923E-4:32:34:196:58:11:0
      1 10083 . C G . PASS ADP=284;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:71:284:284:261:23:8.1%:7.496E-8:34:32:188:73:23:0
      1 10097 . T G . PASS ADP=236;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:55:240:236:213:18:7.66%:2.7036E-6:30:29:149:64:17:1
      1 10108 . C T . PASS ADP=188;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:49:189:188:170:16:8.51%:1.0897E-5:32:33:111:59:4:12
      1 10109 . A T . PASS ADP=183;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:145:184:183:139:44:24.04%:2.9982E-15:31:30:81:58:29:15
      1 10114 . T G . PASS ADP=172;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:37:173:172:155:12:7.14%:1.9895E-4:31:25:88:67:5:7
      1 10147 . C A . PASS ADP=49;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:39:49:49:37:12:24.49%:1.1358E-4:33:27:12:25:11:1
      1 10177 . A C . PASS ADP=39;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:39:39:20:10:25.64%:3.9851E-4:27:32:4:16:4:6


      • #4
        My apologies, I forgot to disable the smilies in the text:
        1 10061 . T G . PASS ADP=268;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:270:268:250:11:4.12%:4.385E-4:33:29:210:40:4:7
        1 10067 . T G . PASS ADP=270;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:24:274:270:260:8:2.96%:3.7047E-3:33:31:207:53:7:1
        1 10079 . T G . PASS ADP=266;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:268:266:254:11:4.14%:4.3923E-4:32:34:196:58:11:0
        1 10083 . C G . PASS ADP=284;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:71:284:284:261:23:8.1%:7.496E-8:34:32:188:73:23:0
        1 10097 . T G . PASS ADP=236;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:55:240:236:213:18:7.66%:2.7036E-6:30:29:149:64:17:1
        1 10108 . C T . PASS ADP=188;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:49:189:188:170:16:8.51%:1.0897E-5:32:33:111:59:4:12
        1 10109 . A T . PASS ADP=183;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:145:184:183:139:44:24.04%:2.9982E-15:31:30:81:58:29:15
        1 10114 . T G . PASS ADP=172;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:37:173:172:155:12:7.14%:1.9895E-4:31:25:88:67:5:7
        1 10147 . C A . PASS ADP=49;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:39:49:49:37:12:24.49%:1.1358E-4:33:27:12:25:11:1
        1 10177 . A C . PASS ADP=39;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:33:39:39:20:10:25.64%:3.9851E-4:27:32:4:16:4:6
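
For readers trying to interpret these headerless records, here is a minimal sketch that pairs the FORMAT keys with the per-sample values. It assumes whitespace-separated columns as pasted in this thread (a real VCF is tab-separated, which `split()` also handles) and assumes the FORMAT string is `GT:GQ:SDP:DP:...`, since the forum's smiley conversion appears to have eaten a `:D`. The function name `parse_record` is hypothetical and not part of VarScan.

```python
# Sketch only: parse one VarScan-style VCF data line without relying on a header.
# Assumption: columns follow the standard VCF order
# CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE.

def parse_record(line):
    """Split one data line and pair FORMAT keys with the sample values."""
    cols = line.split()
    if len(cols) < 10:
        raise ValueError("expected at least 10 columns, got %d" % len(cols))
    chrom, pos, _id, ref, alt, _qual, flt, info, fmt, sample = cols[:10]
    keys = fmt.split(":")
    values = sample.split(":")
    # A mismatch here is a quick way to spot a garbled FORMAT column.
    if len(keys) != len(values):
        raise ValueError(
            "%d FORMAT keys but %d sample values" % (len(keys), len(values))
        )
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "filter": flt,
        "info": dict(item.split("=", 1) for item in info.split(";")),
        "sample": dict(zip(keys, values)),
    }


record = (
    "1 10061 . T G . PASS ADP=268;WT=0;HET=1;HOM=0;NC=0 "
    "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR "
    "0/1:33:270:268:250:11:4.12%:4.385E-4:33:29:210:40:4:7"
)
parsed = parse_record(record)
print(parsed["sample"]["DP"], parsed["sample"]["FREQ"])  # 268 4.12%
```

With the repaired FORMAT string, the sample's DP value matches ADP in the INFO column, which is a useful sanity check on the key order.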


        • #5
          Should I provide a different type of data in order to troubleshoot? Thanks.


          • #6
            Hi, I am having a similar problem with VarScan 2.3.6. Is there any workaround for this? I can provide sample input/output if you need it.


            • #7
              Has anyone fixed this problem? I am hitting the same issue: the VCF header is missing from some output files but present in others. I don't know why this happens.


              • #8
                I have not resolved this issue yet and am working on it today. I will let you know if I find anything.


                • #9
                  Has anyone else had this issue? I am testing this software again. I've run the same exome sample with three different aligners. All of the bam files have headers, but only 2 of the 6 analyses I've run produced output with headers. I'm still stumped as to why this is occurring. Thanks.


                  • #10
                    Yep. Me too. Exactly the same situation: illumina fastq to bam via bwa; samtools mpileup to vcf via varscan --output-vcf. Resulting vcf has no header.
                    Latest Ubuntu, latest samtools, latest varscan as of March 2015.


                    • #11
                      I still have not resolved this issue. Please keep us posted if you find a solution. Thank you


                      • #12
                        The problem is that --output-vcf 1 doesn't work for pileup2snp; it only works for mpileup2snp. The lack of documentation for this flag, combined with the lack of error checking for command-line arguments, makes this hard to figure out.
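
Since VarScan reportedly does not validate this combination itself, a small pre-flight check plus a post-run header check can catch the problem early. The sketch below is only an illustration: `check_varscan_args`, `has_vcf_header`, and the `NO_VCF_SUBCOMMANDS` set are hypothetical names, and the set contains only the one subcommand named in the post above. It does not explain the cases in this thread where mpileup2snp itself produced headerless output.

```python
# Sketch only: guard against asking for VCF output from a subcommand that
# (per the post above) ignores --output-vcf, and verify the result afterwards.

# Assumption: taken from the post above; extend after checking VarScan's usage text.
NO_VCF_SUBCOMMANDS = {"pileup2snp"}


def check_varscan_args(subcommand, args):
    """Raise ValueError if --output-vcf is combined with an unsupported subcommand."""
    if "--output-vcf" in args and subcommand in NO_VCF_SUBCOMMANDS:
        raise ValueError(
            "--output-vcf is reported not to work with %s; "
            "use samtools mpileup with an mpileup2* subcommand instead" % subcommand
        )


def has_vcf_header(lines):
    """Return True if a '##fileformat=' line and a '#CHROM' line precede the data."""
    saw_fileformat = False
    for line in lines:
        if line.startswith("##fileformat="):
            saw_fileformat = True
        elif line.startswith("#CHROM"):
            return saw_fileformat
        elif line.strip() and not line.startswith("#"):
            return False  # reached a data record before a complete header
    return False


with_header = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample1\n",
    "1\t10061\t.\tT\tG\t.\tPASS\tADP=268\tGT\t0/1\n",
]
headerless = ["1\t10061\t.\tT\tG\t.\tPASS\tADP=268\tGT\t0/1\n"]

print(has_vcf_header(with_header))  # True
print(has_vcf_header(headerless))   # False
```

To check a file on disk, pass the open file handle, for example `with open("/myData/S12_snp.vcf") as fh: has_vcf_header(fh)`.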
