  • The N+1 problem in GATK 4

    At our place we use GATK (3-series) for germline SNV/INDEL calling in a clinical setup, and we are now considering how to make the move to the new GATK 4.

    One of the benefits of GATK (often emphasized as a selling point by the Broad) is its solution to the N+1 problem: when a new sample arrives, one can run GenotypeGVCFs on that sample together with a huge GVCF catalogue of previous samples, thus improving the accuracy of calling.

    However, with GATK 4 this functionality has changed considerably. It is now recommended to use a GenomicsDB object instead of a combined GVCF file and to use that as input to GenotypeGVCFs. In itself this is not a problem, but GenotypeGVCFs now accepts only one "-V" input. Thus, one cannot use both the large GenomicsDB and the GVCF file from a new sample.

    Our first thought was to add the new GVCF file to the GenomicsDB, but that is not supported by the GenomicsDBImport tool. The only solution appears to be to create a new GenomicsDB object from scratch each time a new sample arrives, but that takes days (if not weeks) of computing and is just not feasible. It all seems very odd.
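
    A minimal sketch of the rebuild-from-scratch workaround described above, written in Python so that the two GATK 4 invocations are explicit. It only builds the command lines and does not run GATK. The file names, workspace path, interval, and helper names are hypothetical, and the flags shown (--genomicsdb-workspace-path, -L, -V, -R, -O, and the gendb:// prefix) should be checked against the installed GATK version.

```python
import shlex
from typing import List, Sequence


def genomicsdb_import_cmd(gvcfs: Sequence[str], workspace: str, interval: str) -> List[str]:
    """Build a GenomicsDBImport command that creates a NEW workspace from all GVCFs."""
    cmd = ["gatk", "GenomicsDBImport",
           "--genomicsdb-workspace-path", workspace,
           "-L", interval]
    for gvcf in gvcfs:
        # One -V per per-sample GVCF; the import step accepts many of them.
        cmd += ["-V", gvcf]
    return cmd


def genotype_gvcfs_cmd(reference: str, workspace: str, output: str) -> List[str]:
    """Build a GenotypeGVCFs command that reads the workspace as its single -V input."""
    return ["gatk", "GenotypeGVCFs",
            "-R", reference,
            "-V", f"gendb://{workspace}",
            "-O", output]


def rebuild_and_genotype(previous_gvcfs: Sequence[str], new_gvcf: str, reference: str,
                         workspace: str, interval: str, output: str) -> List[List[str]]:
    """Return both commands for the workaround: re-import everything, then genotype."""
    all_gvcfs = list(previous_gvcfs) + [new_gvcf]
    return [
        genomicsdb_import_cmd(all_gvcfs, workspace, interval),
        genotype_gvcfs_cmd(reference, workspace, output),
    ]


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    commands = rebuild_and_genotype(
        previous_gvcfs=["sample1.g.vcf.gz", "sample2.g.vcf.gz"],
        new_gvcf="sample3.g.vcf.gz",
        reference="reference.fasta",
        workspace="cohort_db",
        interval="chr20",
        output="cohort.vcf.gz",
    )
    for command in commands:
        print(shlex.join(command))
```

    Every new sample repeats the import step over the full catalogue, which is the cost this post objects to.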

    Has anybody here found a way of solving the N+1 problem in GATK 4?

  • #2
    I assume you have checked the GATK support forums? You can't be the first person to have run into this. You may want to post there to get an official response. If you do, please post the relevant link here so that anyone finding this thread in the future will know the answer.

    • #3
      Yes, that was my first attempt, but the discussion seems to have died. I hoped to find more users here.
      Last edited by micknudsen; 05-10-2019, 05:06 AM.

      • #4
        I have posted in the GATK forum (see here), but the discussion seems to have died out. I am trying here hoping to reach more users.

        • #5
          I have posted the question in this thread on the GATK forums, but the discussion has died out. I was hoping to maybe find more users here.

          • #6
            I'd suggest raising an issue on the GATK support forums. One of the other issues I have encountered with GATK4 is a discordance between the command-line options in the online documentation and what the tool actually expects. I've often had to search through their forums for similar reports to fix my commands. The rollout of GATK4 has been far from smooth.

            • #7
              There is no discordance in this case. In fact, they have admitted that the situation really is as described (see here), but they haven't suggested a solution. I was wondering if anybody else had figured something out on their own.
