  • The N+1 problem in GATK 4

    At our place we use GATK (3-series) for germline SNV/INDEL calling in a clinical setup, and we are now considering how to make the move to the new GATK 4.

    One of the benefits of GATK (and one often emphasized as a selling point by the Broad) is the solution to the N+1 problem: When a new sample arrives, one can run GenotypeGVCFs on that sample together with a huge GVCF catalogue of previous samples, thus improving the accuracy of calling.

    However, with GATK 4 this functionality has changed tremendously. It is now recommended to use a GenomicsDB object instead of a combined GVCF file and use that as input to GenotypeGVCFs. In itself this is not a problem, but GenotypeGVCFs now only accepts one "-V" input. Thus, one cannot use both the large GenomicsDB and the GVCF file from a new sample.

    Our first thought was to add the new GVCF file to the GenomicsDB, but that is not supported by the GenomicsDBImport tool. The only solution appears to be to create a new GenomicsDB object from scratch each time a new sample arrives, but that takes days (if not weeks) of computing and is just not feasible. It all seems very odd.

    Has anybody here found a way of solving the N+1 problem in GATK 4?
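
    A minimal sketch of the rebuild-from-scratch workflow described above, for readers who want to see the two steps side by side. It only assembles the command lines and does not run GATK. The file names, workspace path, interval and helper function names are hypothetical; the tool names and the `-V`, `-L`, `-R`, `-O`, `--genomicsdb-workspace-path` and `gendb://` conventions are those of the GATK 4 command line, but check them against the version you have installed.

```python
from typing import List


def import_command(gvcfs: List[str], workspace: str, interval: str) -> List[str]:
    """Build a GenomicsDBImport command that creates a new workspace from all GVCFs."""
    cmd = [
        "gatk", "GenomicsDBImport",
        "--genomicsdb-workspace-path", workspace,
        "-L", interval,
    ]
    # GenomicsDBImport takes one -V argument per input GVCF.
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd


def genotype_command(reference: str, workspace: str, output: str) -> List[str]:
    """Build a GenotypeGVCFs command; note the single -V input (the workspace)."""
    return [
        "gatk", "GenotypeGVCFs",
        "-R", reference,
        "-V", f"gendb://{workspace}",
        "-O", output,
    ]


def n_plus_one_commands(
    catalogue: List[str],
    new_gvcf: str,
    reference: str,
    workspace: str,
    interval: str,
    output: str,
) -> List[List[str]]:
    """Rebuild the workspace from the whole catalogue plus the new sample, then genotype."""
    return [
        import_command(catalogue + [new_gvcf], workspace, interval),
        genotype_command(reference, workspace, output),
    ]


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    commands = n_plus_one_commands(
        catalogue=["sample1.g.vcf.gz", "sample2.g.vcf.gz"],
        new_gvcf="new_sample.g.vcf.gz",
        reference="reference.fasta",
        workspace="cohort_db",
        interval="chr20",
        output="cohort.vcf.gz",
    )
    for command in commands:
        print(" ".join(command))
```

    The import step lists every GVCF in the catalogue again for each new sample, which is where the cost described above comes from; the genotyping step then sees only the one workspace.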

  • #2
    I assume you have checked the GATK support forums? You can't be the first person to have run into this. You may want to post there to get an official response. If you do, please post the relevant link here so anyone finding this thread in the future will know the answer.



    • #3
      Yes, that was my first attempt, but the discussion seems to have died. I hoped to find more users here.
      Last edited by micknudsen; 05-10-2019, 05:06 AM.



      • #4
        I have posted in the GATK forum (see here), but the discussion seems to have died out. I am trying here hoping to reach more users.



        • #5
          I have posted the question in this thread on the GATK forums, but the discussion has died out. I was hoping to maybe find more users here.



          • #6
            I'd suggest raising an issue on the GATK support forums. One of the other issues I have been encountering with GATK4 is a discordance between the command line options in the online documentation and what the tool actually expects. I've often had to search through their forums for similar reports in order to fix my commands. The rollout of GATK4 has been far from smooth.



            • #7
              There is no discordance in this case. In fact, they have admitted that the situation really is as described (see here), but they haven't suggested a solution. I was wondering if anybody else had figured something out on their own.
