  • Michael.James.Clark
    Senior Member
    • Apr 2009
    • 207

    GATK: -glm DINDEL question

    I'm following this note in the GATK documentation: http://www.broadinstitute.org/gsa/wi...fied_Genotyper

    However, I get no output when I run the UnifiedGenotyper tool with the -glm DINDEL option. If I run it without the -glm DINDEL option, it works fine. There is no error message of any kind. The program just runs to completion and doesn't seem to report anything.

    My command looks like this:

    Code:
    java -jar GenomeAnalysisTK.jar \
    -l INFO \
    -R human_g1k_v37.fasta \
    -D dbsnp_130_b37.rod \
    -T UnifiedGenotyper \
    -baq CALCULATE_AS_NECESSARY \
    -I recal.bam \
    -o indels.raw.vcf \
    -stand_call_conf 50.0 \
    -stand_emit_conf 10.0 \
    -A AlleleBalance \
    -A DepthOfCoverage \
    -A HaplotypeScore \
    -glm DINDEL \
    -nt 8 \
    -L intervals.interval_list
    Has anyone used this successfully? If so, any tips? I know it's still early in its development, but I'd like to use it if possible.
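
For anyone trying to confirm the same "runs to completion but reports nothing" symptom, a minimal sketch of a check is to count the data records in the output VCF, that is, the lines that are not header lines starting with "#". The function name and the sample header lines below are illustrative assumptions, not anything produced by GATK in this thread.

```python
def count_vcf_records(lines):
    """Count data records, skipping blank lines and '#' header lines."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


# Hypothetical header-only VCF content, standing in for an empty call set.
example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]

print(count_vcf_records(example))  # 0, i.e. header only, no calls
```

To use it on a real file, something like `with open("indels.raw.vcf") as fh: count_vcf_records(fh)` should work; a count of zero means the run wrote a header but no calls.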
    Mendelian Disorder: A blogshare of random useful information for general public consumption. [Blog]
    Breakway: A Program to Identify Structural Variations in Genomic Data [Website] [Forum Post]
    Projects: U87MG whole genome sequence [Website] [Paper]
  • RockChalkJayhawk
    Senior Member
    • Mar 2009
    • 192

    #2
    I ran across the same thing and contacted GSA. They said it is a bug and that the feature is still in development, which is why there isn't any documentation yet. I guess we just have to be patient, but I'm keeping an eye on it.

    • NGSfan
      Senior Member
      • Apr 2009
      • 181

      #3
      whoa... indel calling in the unified genotyper... looking forward to this!

      I'm currently using indel Genotyper V2. Has anyone come up with some thresholds using some of these interesting attributes described in their VCF output?


      ##INFO=<ID=AC,Number=2,Type=Integer,Description="# of reads supporting consensus indel/any indel at the site">
      ##INFO=<ID=DP,Number=1,Type=Integer,Description="total coverage at the site">
      ##INFO=<ID=MM,Number=2,Type=Float,Description="average # of mismatches per consensus indel-supporting read/per reference-supporting read">
      ##INFO=<ID=MQ,Number=2,Type=Float,Description="average mapping quality of consensus indel-supporting reads/reference-supporting reads">
      ##INFO=<ID=NQSBQ,Number=2,Type=Float,Description="Within NQS window: average quality of bases from consensus indel-supporting reads/from reference-supporting reads">
      ##INFO=<ID=NQSMM,Number=2,Type=Float,Description="Within NQS window: fraction of mismatching bases in consensus indel-supporting reads/in reference-supporting reads">
      ##INFO=<ID=SC,Number=4,Type=Integer,Description="strandness: counts of forward-/reverse-aligned indel-supporting reads / forward-/reverse-aligned reference supporting reads">


      I know from their regular SNP calling in the Unified Genotyper that one can make use of Strand Bias, Quality by Depth, and other calculated variant characteristics beyond just the usual "read depth" threshold most people use and publish with. In fact, I can clean up a lot of FPs with hard filtering. Their more sophisticated clustering approach doesn't seem to work with my data, though...

      I would like to clean up indels similarly, but there are no guidelines.
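
As a starting point for experimenting, here is a minimal sketch of a hard filter built on the INFO attributes listed above (DP, AC, SC, MM). It only shows the mechanics of pulling the values out of an INFO string. Every threshold in it (`min_depth`, `min_indel_fraction`, `max_indel_mismatch`) is a hypothetical placeholder rather than a recommended or published value, and the two example INFO strings are made up.

```python
def parse_info(info):
    """Turn 'AC=12,14;DP=30' into {'AC': [12.0, 14.0], 'DP': [30.0]}."""
    fields = {}
    for item in info.split(";"):
        if "=" not in item:
            continue  # flag-style entries carry no values
        key, value = item.split("=", 1)
        fields[key] = [float(v) for v in value.split(",")]
    return fields


def passes_hard_filter(info, min_depth=10, min_indel_fraction=0.3,
                       max_indel_mismatch=2.0):
    """Apply placeholder thresholds; tune them against your own data."""
    f = parse_info(info)
    depth = f["DP"][0]                      # total coverage at the site
    consensus_reads = f["AC"][0]            # reads supporting the consensus indel
    fwd_indel, rev_indel = f["SC"][0], f["SC"][1]  # indel reads per strand
    indel_mismatches = f["MM"][0]           # avg mismatches per indel read
    return (depth >= min_depth
            and consensus_reads / depth >= min_indel_fraction
            and fwd_indel > 0 and rev_indel > 0
            and indel_mismatches <= max_indel_mismatch)


# Made-up INFO strings, for illustration only.
good = "AC=12,14;DP=30;MM=0.5,0.2;MQ=58.0,59.1;SC=6,6,8,8"
bad = "AC=3,9;DP=40;MM=3.1,0.4;MQ=20.0,55.0;SC=3,0,18,19"
print(passes_hard_filter(good), passes_hard_filter(bad))
```

The depth check comes first so that a site with zero coverage is rejected before the indel fraction is computed. Requiring indel-supporting reads on both strands is one simple way to use the SC field; whether that is too strict depends on the library and coverage.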

      • Michael.James.Clark
        Senior Member
        • Apr 2009
        • 207

        #4
        Originally posted by RockChalkJayhawk:
        I ran across the same thing and contacted GSA. They said it is a bug and it is still in development, that's why there isn't any documentation yet. I guess we just have to be patient, but I'm keeping an eye on it.
        Alright, thanks for your input. That is unfortunate, but I'll just keep using the regular Dindel program instead.

        As for filters people use, not sure on my end. I'm just dipping my toes into the new ways of analyzing indels. Right now I'm taking a look at the default filters in Dindel.

        • mdepristo
          Junior Member
          • Jan 2011
          • 3

          #5
          The indel calling capabilities of the Unified Genotyper are now working as expected, and we are getting good results now on whole genomes and exomes. The exact approach to filtering indels isn't yet clear, but all of the machinery is well. Please have another go at it if you are still looking for indel calling with the GATK.

          Best,

          Mark DePristo

          • gaffa
            Member
            • Oct 2010
            • 82

            #6
            How does this implementation compare to the original Dindel program?

            • RockChalkJayhawk
              Senior Member
              • Mar 2009
              • 192

              #7
              Thanks Mark, for both your response and your set of tools. You guys are doing some great stuff over there.

              BTW, do you have any figures on the empirical sensitivity/specificity/FDR of the Unified Genotyper?

              • Michael.James.Clark
                Senior Member
                • Apr 2009
                • 207

                #8
                Thanks Mark. I just updated through SVN and I'll give it a shot.

                Question: Any idea how the results compare to the Dindel program from Kees Albers? Wondering if it'll be safe to cross-compare between the two because I have some data that's already been processed by that program that likely isn't going to be rerun through the UnifiedGenotyper DINDEL program.

                (Also, thanks for pushing a release that appears to make the whole thing compatible with hard clipping from Novoalign.)

                • apmatchan
                  Junior Member
                  • Jan 2011
                  • 4

                  #9
                  Getting the Indel option to work

                  Hi there,
                  I'm new to working with GATK, just downloaded it today.
                  I am able to get the SNP genotyping to work correctly but when I add the -glm DINDEL option and try to run the UnifiedGenotyper, I get the following message:
                  __________________________________________________________

                  org.broadinstitute.sting.utils.cmdLine.InvalidArgumentException:
                  Argument with name 'glm' isn't defined.
                  at org.broadinstitute.sting.utils.cmdLine.ParsingEngine.validate(ParsingEngine.java:185)
                  at org.broadinstitute.sting.utils.cmdLine.ParsingEngine.validate(ParsingEngine.java:158)
                  at org.broadinstitute.sting.utils.cmdLine.CommandLineProgram.start(CommandLineProgram.java:175)
                  at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:89)
                  ------------------------------------------------------------------------------------------
                  The following error has occurred:


                  Argument with name 'glm' isn't defined.:

                  __________________________________________________________


                  I am entering the following at the command line:
                  java -jar GenomeAnalysisTK.jar \
                  -R human_g1k_v37.fasta \
                  -T UnifiedGenotyper \
                  -I NA12891.chr21.GATK.Reg.sorted.bam \
                  -o NA12891-Indels.vcf \
                  -glm DINDEL

                  Am I missing an argument?

                  Thanks

                  • NGSfan
                    Senior Member
                    • Apr 2009
                    • 181

                    #10
                    Hmm, perhaps it is only available in the SVN checkout version of GATK?

                    What version number do you have?

                    • apmatchan
                      Junior Member
                      • Jan 2011
                      • 4

                      #11
                      Ah ok, that may well be it. I'm not very up to speed with where to find everything yet.

                      My version is:
                      The Genome Analysis Toolkit (GATK) v1.0.2695, Compiled 2010/01/26 17:53:46
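
When comparing installs, it can help to pull the version out of that banner programmatically. The sketch below assumes the banner always has the shape shown above ("(GATK) v<major>.<minor>.<build>"); the function name is made up, and nothing here encodes which build first accepted -glm, since the thread does not say.

```python
import re

# Matches the "(GATK) v1.0.2695" part of the startup banner.
BANNER = re.compile(r"\(GATK\)\s+v(\d+)\.(\d+)\.(\d+)")


def parse_gatk_version(banner):
    """Return (major, minor, build) as ints, or None if no version is found."""
    match = BANNER.search(banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


banner = "The Genome Analysis Toolkit (GATK) v1.0.2695, Compiled 2010/01/26 17:53:46"
print(parse_gatk_version(banner))  # (1, 0, 2695)
```

Because the result is a tuple of integers, two installs can be compared directly, e.g. `parse_gatk_version(old) < parse_gatk_version(new)`.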

                      • apmatchan
                        Junior Member
                        • Jan 2011
                        • 4

                        #12
                        Ignore previous comment - new version works fine now, thanks!

                        • NGSfan
                          Senior Member
                          • Apr 2009
                          • 181

                          #13
                          Cool. Let us know how you like it compared with the old Indel Genotyper V2. I have my pipeline set to use the Indel Genotyper V2, so I'm waiting to see some filtering methods mature before I make the switch, unless anyone thinks the overall quality of the calls is better?

                          • apmatchan
                            Junior Member
                            • Jan 2011
                            • 4

                            #14
                            Thanks I will. Trying both out at the moment.

                            • Michael.James.Clark
                              Senior Member
                              • Apr 2009
                              • 207

                              #15
                              Originally posted by NGSfan:
                              Cool. Let us know how you like it compared with the old Indel Genotyper V2. I have my pipeline set to use the Indel Genotyper V2, so I'm waiting to see some filtering methods mature before I make the switch, unless anyone thinks the overall quality of the calls is better?
                              I think a lot of us were using Dindel over the old Indel Genotyper V2 because even the GATK group were adamant that it was superior to their own tool (hence their adding it to the UnifiedGenotyper). I personally wouldn't replace part of your pipeline yet, but it might be worth making a module that runs either Dindel or the UnifiedGenotyper with DINDEL on, so you can try it out.

                              I'll try this out later today probably and let people know how it looks.

                              Also yeah, GATK is updated constantly so it's really worth using SVN to keep it updated rather than going by the releases.
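
A rough sketch of the "module that does either" idea: a small dispatcher that returns the command line for whichever caller is selected. The UnifiedGenotyper arguments are only the ones that appear in the commands posted in this thread. The standalone Dindel command is not shown anywhere here, so it is passed in by the caller rather than guessed. All function and parameter names are made up for illustration.

```python
def unified_genotyper_cmd(reference, bam, out_vcf, use_dindel=True,
                          jar="GenomeAnalysisTK.jar"):
    """Build the GATK command using only flags seen in this thread."""
    cmd = ["java", "-jar", jar, "-T", "UnifiedGenotyper",
           "-R", reference, "-I", bam, "-o", out_vcf]
    if use_dindel:
        cmd += ["-glm", "DINDEL"]
    return cmd


def indel_caller_cmd(caller, standalone_dindel_cmd=None, **gatk_args):
    """Pick a caller by name; the Dindel command must be supplied by the user."""
    if caller == "unified_genotyper":
        return unified_genotyper_cmd(**gatk_args)
    if caller == "dindel":
        if standalone_dindel_cmd is None:
            raise ValueError("supply the standalone Dindel command line")
        return list(standalone_dindel_cmd)
    raise ValueError(f"unknown caller: {caller}")


cmd = indel_caller_cmd("unified_genotyper",
                       reference="human_g1k_v37.fasta",
                       bam="recal.bam",
                       out_vcf="indels.raw.vcf")
print(" ".join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)`. Keeping the two callers behind one function makes it easy to run both on the same BAM and compare the calls before committing the pipeline to either.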