  • Question about the concordance between GATK and Samtools

    Hi, I'm using GATK and samtools to call SNPs.

    For GATK:
    java -Xmx6g -jar GenomeAnalysisTK.jar -R example.fa -T UnifiedGenotyper -I sampleSort.bam -o GATK.vcf -mbq 30 -glm BOTH
    For samtools:
    samtools mpileup -Q 13 -ugf example.fa sampleSort.bam | bcftools view -bvcg > sample.raw.bcf
    bcftools view sample.raw.bcf | vcfutils.pl varFilter -d 10 -w 5 -D 100 > samtools.vcf

    But I find that few SNPs are shared between GATK.vcf and samtools.vcf: about 50% or less.
    I don't know why. I have tried the default values as well as other parameter settings, but the agreement between GATK and samtools is still far lower than I expected.

    How can I improve the concordance between GATK and samtools?
    What commands do you use, and what should I pay attention to?

    Thanks for your answer.
    sunbert

  • #2
    I recommend you show the details of your analysis. Say what data you analyzed, how you did it, what are the results, and why you think the results are problematic. If you do not give this information, nobody can help you.



    • #3
      At first guess, I'd try doing a much more stringent quality filter, and you might get more concordance. My guess is that a lot of the discrepancies are stupid false positives.
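A minimal sketch of such a stricter filter, assuming a plain-text VCF in which the sixth tab-separated column holds QUAL. The helper name `filter_by_qual` and the threshold of 50 in the usage line are illustrative assumptions, not settings recommended by either tool.

```shell
#!/bin/sh
# filter_by_qual MIN_QUAL FILE.vcf
# Hypothetical helper: keeps every header line plus the records whose
# QUAL value (column 6) is at least MIN_QUAL. Records with a missing
# QUAL (".") are dropped.
filter_by_qual() {
    min_qual=$1
    vcf=$2
    awk -F '\t' -v q="$min_qual" '
        /^#/ { print; next }          # pass header lines through unchanged
        $6 != "." && $6 + 0 >= q      # print records that meet the cutoff
    ' "$vcf"
}
```

Usage would look like `filter_by_qual 50 GATK.vcf > GATK.q50.vcf`, run the same way on samtools.vcf. QUAL scores from the two callers are not necessarily on the same scale, so one cutoff may not be equally strict for both files.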



      • #4
        Originally posted by vdauwera
        I recommend you show the details of your analysis. Say what data you analyzed, how you did it, what are the results, and why you think the results are problematic. If you do not give this information, nobody can help you.
        Okay, thanks for your advice.
        I have a SAM file from a bowtie alignment of whole-genome sequencing reads.
        I would like to find the SNPs in this data for further analysis.
        First, I converted the SAM file to a BAM file. Second, I sorted the BAM file with samtools. Then I used samtools and GATK to call SNPs.
        But I find that the proportion of shared SNPs is too low, about 50% or less. That is to say, GATK calls some SNPs that samtools doesn't, and samtools calls some SNPs that GATK doesn't. That's so strange!
        Something must be wrong somewhere.
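One way to put numbers on the overlap described above is to reduce each VCF to a set of site keys and compare the sets. This is only a sketch using standard POSIX tools: the helper names `vcf_sites` and `compare_calls` are hypothetical, and keying on CHROM, POS, REF and ALT is an assumption that treats the same variant written two different ways (which can happen with indels) as a mismatch.

```shell
#!/bin/sh
# vcf_sites FILE.vcf
# Hypothetical helper: prints one sorted, unique "CHROM:POS:REF:ALT" key
# for every non-header record.
vcf_sites() {
    awk -F '\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' "$1" | LC_ALL=C sort -u
}

# compare_calls FIRST.vcf SECOND.vcf
# Hypothetical helper: prints how many sites are shared and how many are
# specific to each file.
compare_calls() {
    tmp_first=$(mktemp)
    tmp_second=$(mktemp)
    vcf_sites "$1" > "$tmp_first"
    vcf_sites "$2" > "$tmp_second"
    # comm needs sorted input; -12 keeps common lines, -23 and -13 keep
    # lines unique to the first and second file respectively.
    shared=$(LC_ALL=C comm -12 "$tmp_first" "$tmp_second" | wc -l)
    only_first=$(LC_ALL=C comm -23 "$tmp_first" "$tmp_second" | wc -l)
    only_second=$(LC_ALL=C comm -13 "$tmp_first" "$tmp_second" | wc -l)
    rm -f "$tmp_first" "$tmp_second"
    # $(( )) strips the padding some wc implementations add.
    echo "shared=$((shared)) only_first=$((only_first)) only_second=$((only_second))"
}
```

Running `compare_calls GATK.vcf samtools.vcf` would give the three counts. Note that the two commands in the first post are not configured alike (`-mbq 30` for GATK against `-Q 13` for mpileup, and `varFilter -d 10 -D 100` applied only to the samtools calls), so some of the caller-specific sites may simply reflect those different thresholds.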



        • #5
          Originally posted by swbarnes2
          At first guess, I'd try doing a much more stringent quality filter, and you might get more concordance. My guess is that a lot of the discrepancies are stupid false positives.
          Yeah, I agree with you.
          I think there are lots of false positives in the results.
          A more stringent quality filter doesn't seem to help, though; I have already tried that. I may be missing some other parameter or step, but I can't find what to correct. My commands look the same as the ones other people use.
