  • Run Time Performance of Variant (SNP) Calling Tools?

    Hi all,

    I have recently read many papers about SNP calling and its different implementations (GATK, SAMtools, SOAPsnp, etc.). One thing I noticed is that none of them reported run time performance, neither the papers introducing new SNP callers nor the surveys.

    The only things I came across were comments like "xy is comparably slow" (e.g., about SAMtools or GATK), but I cannot find exact run time numbers anywhere.

    Does anyone know why this is the case? Despite the comments I found, are SNP calling tools perhaps fast enough that run time is not worth reporting?

    Or even better, can anyone point me to a resource (papers, websites, forums, etc.) that compares the run time performance of SNP calling tools, or that at least provides ANY run time data?

    Thank you,
    CindyF

  • #2
    Brad Chapman has a blog post with numbers you may be interested in:

    "Scaling for whole genome sequencing": Moving from exome to whole genome sequencing introduces a myriad of scaling and informatics challenges. In addition to the biological component of correctly iden…


    For samtools, calling a single human chr1 at 60X coverage ran overnight on a single 2.3GHz Opteron CPU. I do not have the exact timing. Disabling BAQ will make samtools several times faster at a minor cost in accuracy.



    • #3
      Hi lh3,

      Thanks for the link; I had not come across it before. It is quite interesting to see that run time performance does indeed seem to be a challenge for variant calling.

      Regarding your overnight samtools run: did you call both SNPs and indels? Did the variant calling alone, without preprocessing (base quality score recalibration etc.), really take that long?

      Regards,

      CindyF



      • #4
        Variant calling usually includes both SNP calling and INDEL calling. In the old days, samtools was usually faster than GATK. It still seems faster than GATK as used in Brad's post.

        I don't think variant calling is slow. Brad's post explains this well: variant calling alone takes ~25% of the total CPU time, counting from alignment onward. Alignment and post-processing are both slower than variant calling. That is why samtools, GATK, etc. add steps that are known to slow calling considerably while yielding only marginal gains in accuracy. As long as variant calling is not the bottleneck, we can afford that.
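
        To put the ~25% figure in perspective, here is a minimal sketch of the arithmetic (Amdahl's law), assuming the 25% share from Brad's post holds and the rest of the pipeline is unchanged. The function name and the speedup factors tried are illustrative only and do not come from any tool or benchmark.

```python
def overall_speedup(calling_fraction: float, calling_speedup: float) -> float:
    """Estimate whole-pipeline speedup when only variant calling gets faster.

    calling_fraction: share of total CPU time spent in variant calling
                      (assumed ~0.25, per the figure quoted above).
    calling_speedup:  hypothetical factor by which calling alone is sped up.
    """
    if not 0.0 <= calling_fraction <= 1.0:
        raise ValueError("calling_fraction must be between 0 and 1")
    if calling_speedup <= 0:
        raise ValueError("calling_speedup must be positive")
    # Amdahl's law: the untouched part stays the same, the rest shrinks.
    remaining = (1.0 - calling_fraction) + calling_fraction / calling_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    # Illustrative factors only; these are not measured values.
    for factor in (2, 5, 10):
        gain = overall_speedup(0.25, factor)
        print(f"{factor}x faster calling -> {gain:.2f}x faster pipeline")
```

        Under that assumption, even an infinitely fast caller could not make the whole pipeline more than 1 / 0.75 ≈ 1.33 times faster, which is why the extra accuracy steps are affordable.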
