  • GATK UnifiedGenotyper with reducebam error

    I am using GATK UnifiedGenotyper on multiple BAM files from target-enrichment sequencing. I get an error that, although it is a "user error", I can't work out! Any help would be greatly appreciated.

    My BAM files have been reduced with ReduceReads. I have many BAM files, but even when I run the command with just a few of them I get the same error. Here is an example:

    java -Xmx20g -jar GenomeAnalysisTK.jar -T UnifiedGenotyper \
    -R human_g1k_v37.fasta \
    -D dbsnp_131_b37.final.rod \
    -L baitgroupfile.picard \
    -I sample1.reduced.bam \
    -I sample2.reduced.bam \
    -I sample3.reduced.bam \
    -o out.vcf \
    -stand_call_conf 50.0 \
    -stand_emit_conf 10.0 \
    -G Standard \
    -metrics out.metrics

    Here is the error:


    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A USER ERROR has occurred (version 2.4-3-g2a7af43):
    ##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
    ##### ERROR Please do not post this error to the GATK forum
    ##### ERROR
    ##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
    ##### ERROR Visit our website and forum for extensive documentation and answers to
    ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ##### ERROR
    ##### ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:
    ##### ERROR Name FeatureType Documentation
    ##### ERROR BCF2 VariantContext http://www.broadinstitute.org/gatk/g...BCF2Codec.html
    ##### ERROR VCF VariantContext http://www.broadinstitute.org/gatk/g..._VCFCodec.html
    ##### ERROR VCF3 VariantContext http://www.broadinstitute.org/gatk/g...VCF3Codec.html




  • #2
    I think you need to provide the dbSNP file in VCF format, or state which format it is in, for this option:

    -D dbsnp_131_b37.final.rod \



    • #3
      Thanks for your help! Yes, I needed:

      -D:dbsnp,vcf dbsnp_132.b37.vcf



      • #4
        Calling variants from multiple BAM files

        Hello everybody,

        I am pretty new to bioinformatics and still learning. Sorry if my problem is already explained here, but I could not find it. I have 96 FASTQ files from a MiSeq (96 x 2, paired-end reads) and the corresponding 96 BAM files. Each FASTQ file represents one patient (it was an amplicon sequencing workflow for BRCA1 and BRCA2). I would like to use GATK to find SNPs and annotate them all together. I know how to use GATK -T UnifiedGenotyper to call variants, but when I pass in a BAM.list I still get a single VCF file as output, and I cannot assign each VCF to its corresponding BAM :-( So my question is: can I use my BAM list (each sample has a specific name) and get VCF files as output with the same names as my input BAM files? In the end I would have, say, 96 BAMs and 96 corresponding VCF files with the same names, and then use GATK for annotation.

        I hope my question is clear. I am not a programmer, so could you show just an example of the syntax? Or do you have any advice?

        Thank you very much for your time,

        Paul.



        • #5
          Hi Paul,
          If I understand correctly, you are trying to call SNPs from 96 BAM files but get one VCF file with only one individual?
          You should call all your BAM files together to get one VCF, and all 96 individuals should be contained in that one VCF. If you only get genotypes for one person, your BAM file headers may be missing the sample label. Check your BAM files and see whether the read group section has the sample ID.
          Here is a good explanation of the BAM/SAM header



          Hope that helps!
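The read group check suggested above can be sketched roughly as follows. This is only an illustration: `rg_samples` is a made-up helper name, it assumes the header text arrives on stdin (for example from `samtools view -H`), and `patient1.bam` is a placeholder file name.

```shell
#!/usr/bin/env bash
# List the sample (SM) label of every @RG line in a SAM/BAM header read from stdin.
# Prints one label per read group, or "MISSING" when a read group has no SM tag.
# rg_samples is a hypothetical helper name, not part of samtools or GATK.
rg_samples() {
    awk -F'\t' '
        /^@RG/ {
            sm = "MISSING"
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SM:/) sm = substr($i, 4)   # drop the "SM:" prefix
            print sm
        }'
}

# Typical use (assumes samtools is installed; patient1.bam is a placeholder):
#   samtools view -H patient1.bam | rg_samples
```

If every BAM prints the same label, or MISSING, the samples would probably be merged or rejected at the calling step, so that is the first thing worth ruling out.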



          • #6
            Hello mimi lupton,

            First, thank you for the fast response :-)

            OK, that's right: I have 96 BAM files (96 individual patients), and I create a BAM list (each row in my list is path/to/my/vcf/file). When I use GATK to call variants I get just one single VCF file as output, and I don't know how to split it into 96 single VCF files :-( The read group is different in each BAM file.

            And I would like to keep the naming of my files, so let's say I have patient1.BAM, patient2.BAM ... patient96.bam > patient1.vcf, patient2.vcf ... patient96.vcf :-)

            It takes a long time to rename each sample to its original name :-)

            Thank you for help!!

            Paul.
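One possible way to get a VCF named after each BAM is to run the caller once per file instead of once on the whole list. The sketch below is a dry run that only prints the commands; `per_bam_commands` is a hypothetical helper, the flags are the ones already used earlier in this thread, and the reference file name is a placeholder. Note that calling each sample on its own is not the same as the joint calling recommended in post #5, so treat this as an option rather than the recommended route.

```shell
#!/usr/bin/env bash
# Print (dry run) one UnifiedGenotyper command per BAM path read from stdin.
# The output VCF takes the BAM's base name: /data/patient1.bam -> patient1.vcf
# per_bam_commands is a hypothetical helper; $1 is the reference FASTA.
per_bam_commands() {
    local ref="$1" bam name
    while IFS= read -r bam; do
        [ -n "$bam" ] || continue          # skip blank lines in the list
        name=$(basename "$bam")
        name=${name%.[Bb][Aa][Mm]}         # strip a trailing .bam / .BAM
        printf 'java -jar GenomeAnalysisTK.jar -T UnifiedGenotyper -R %s -I %s -o %s.vcf\n' \
            "$ref" "$bam" "$name"
    done
}

# Typical use (BAM.list holds one BAM path per line):
#   per_bam_commands human_g1k_v37.fasta < BAM.list        # inspect the commands
#   per_bam_commands human_g1k_v37.fasta < BAM.list | sh   # run them
```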



            • #7
              OK, right now I have a multi-sample VCF file, but I don't know how to separate it by my input BAMs :-( And I don't know how to annotate my multi-sample VCF file :-( Can somebody please help me?


              Thank you!!



              • #8
                Hi Paul,

                To separate out individuals from your VCF you can use VCFtools



                Annovar is good for annotation.
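As a rough sketch of the VCFtools route, the sample names can be read from the #CHROM header line (columns 10 onward) and one `vcftools --indv` command built per sample. The helper names `vcf_samples` and `split_commands` are made up for this example, and the script only prints the commands so they can be checked before running. It assumes an uncompressed, tab-separated VCF.

```shell
#!/usr/bin/env bash
# Print the sample names of a VCF read from stdin: columns 10 onward of the #CHROM line.
# vcf_samples and split_commands are hypothetical helper names.
vcf_samples() {
    awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}

# Print (dry run) one vcftools command per sample in the given multi-sample VCF.
# Each command would write <sample>.recode.vcf; pipe the output to "sh" to run them.
split_commands() {
    local vcf="$1" sample
    vcf_samples < "$vcf" | while IFS= read -r sample; do
        printf 'vcftools --vcf %s --indv %s --recode --recode-INFO-all --out %s\n' \
            "$vcf" "$sample" "$sample"
    done
}

# Typical use (out.vcf is the multi-sample VCF from UnifiedGenotyper):
#   split_commands out.vcf        # inspect the commands
#   split_commands out.vcf | sh   # run them
```

As far as I know, `--indv` only subsets the genotype columns, so each per-sample file still lists sites where that sample matches the reference. Those would need filtering afterwards if only that patient's variants are wanted.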
