Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • SamH
    Member
    • Sep 2010
    • 15

    samtools/bcftools missing obvious SNPs?

    Hi, I am having some trouble with what appears to be missed calls for SNPs while using the samtools/bcftools package.
    I am using the latest release samtools 0.1.18 (r982:295), and bcftools 0.1.17-dev (r973:277). My reads are SE Illumina, which have been trimmed of low-quality ends and discarded if they contain low-quality internal bases. Reads are ~55bp long, and coverage is 300-500x per sample.

    I first aligned against a reference genome using bwa and sorted using samtools:
    bwa aln -t 3 ./ST3-index/NC_015433_mod.fna ../03-Trimmed/s_3_trimmed.fastq > s_3.aln.sai
    bwa samse ./ST3-index/NC_015433_mod.fna s_3.aln.sai ../03-Trimmed/s_3_trimmed.fastq | gzip > s_3.sam.gz
    samtools view -uS s_3.sam.gz | samtools sort - s_3

    I then calculated coverage:
    samtools mpileup -BQ0 -d10000000 -f ./ST3-index/NC_015433_mod.fna s_1.bam s_2.bam s_3.bam > SSIM_1-2-3_coverage.mpileup

    As well as unfiltered variants:
    samtools mpileup -uf ./ST3-index/NC_015433_mod.fna s_1.bam s_2.bam s_3.bam | bcftools view -bvcg - > SSIM_1-2-3_variants.bcf
    bcftools view SSIM_1-2-3_variants.bcf > SSIM_1-2-3_variants.vcf

    I then started spot checking some of the more interesting variants using a bam viewer (tablet) when I noticed something odd. One of the variants which showed up in the bam file, wasn't listed at all in the vcf file. I checked this position in the mpileup file, and it appears as if it should be an obvious SNP with very deep coverage. I am hoping that someone can please help me understand why this SNP wouldn't be called by bcftools.

    Relevant mpileup line for the missing SNP (position 569984 in ref) is:
    NC_015433.1_Ss_ST3 569984 A 376 C$C$C$c$c$CCcCCCccCcccCccccCCcccccCccCCcccCCCCCccccccccCcCCCcccCCCCCCCcCCcCCccccCCCccccCccCCCccCCCcCcCCCcCCccCCCCCccccccccCCccccCccccCccccccCCccccCCcCCCCcccccCCCcCCCCCcCCCCCcccCCCcCcCCcCcCcCcCcccCCCccCCccCCcCcCCCCCcCCCCCcccCCccccccccCCCCcCCCcccCCCcccCCcCCCCccCccCCCCCCCccccCCCccCCCcccCCCCcCCCcccCCCCCCcccCCCCccCcCcCCCCCCCCCcCccCCCCCCCcccccCCCCCCCccCCcCCCccCccCCCCCccccCCcCcccc^FC^FC^Fc^FC^Fc hhhghgghgghfhhfhggehdhhhhhfhhhffgfchYhhghhfhhhghghcfgghhhhcghhhggchgfchhhffhggh_ffhhggghffcghghfafhhhhchhhhhhchgghcgghfhhffffa_chdhhghhhhhfhhfgfghhgghhhghhfghfhghhchgfghhhhhghhfhhfghhhgdffgghhghfhhchhhhfgfhgggghfgghhfhghfhfhdhhhhg^ghchhfeghhhhhhhghgggghghhhhgdghhhhhfhhhghhhchhgcgghhgghhcghehhfh]hhgfhhhhhhhhhghhgghgafehcgfhhhhghhhfhhhhhhdghhhhhhhfhhhhfhhhhhhcghgh[ghhgh_hh`hh 505 C$C$C$C$C$C$c$c$C$C$CCcccccctccccCCCCCcccCccccCCcccccCCCcCcccCCCCcccccCCCcccccCCcCccCCCcCccccCCcCcCccCCCCcccccCcCcCccccccccccccccccCCCCCCcCcccCCCccccccCCccCcCCCCCCcccCCCCCCCCcccCCCCCCccCcCccccCcccCccCCCCcccCCCCcccCCCCCcCCcccCCCCCCCcccccCCCCCCCcCccCcccccCCCCCcccCCCCCCCCcCcCcCcccCCCCCCcCcCCCCCCcCcCcCCCccccCCCCCCccCCcCCCCCccCCCCCcCCCCCCccCCCccCCCCCCCccccccCccCCcccCCccccCcccccCccCcccCcCCCcccCCCCCCCcccCcCCccccccCCcccccCccCCCCCccCccccCCCCcccccCCcCCCcCCCCCCCccccccCcCCCcccCcCcccCCCCCCCccccCCCccCccCCCCCcccCc^FC^FC^FC^FC^Fc^Fc^Fc^Fc^Fc^FC^Fc fhhhggghhgghdhhfhhhhhghhcchfhhhfhhhhhhhhhh_hghhhhhhhhhfgaffhhhheghdfdghdhhgggfhfhhhgefhdhf_ghghfh_ghghghcfhhghchhghgggfgchghghhhhfYdfghfagggehafggfdhhhgde_hghhhhhhhhfefhfhhhhhfggahhdhhhhgffhhhhfhfhhhhfhhhhghfahhgghfhhhhhhgehhhedhhfhdehghafghhhhhghhhfhhhhhdhehghghhhhhhgghhhghhhhhfhhhhhhhhhhhhghhhhhhhhhhhhhfhfghhhggghhhhfhghggehaghhgh_hhhhhhchgfhhchhfhhdfhfhhhhhhhbhhghhahhghhhhaghghhhhhhh\hhfhhhhghhhhghhfhhhghhfh_cchhhhghghhffhhhahhgcgehhfhghhhdghhhhhhhfhhehhhhghhchhfhchfhfhhhhhhhfchhhfggdhghgfhhhghhhf 447 C$C$C$C$C$C$c$CCCCCCcccCCCccCCCCCCccccCcCccccCcCCccccCcccCCCCcccCCcCCCCCcCcccCCcccCCCCCCCCCCCcCccccCccCCcccCcCCcCccccCCcccCcCcccCCCccCccCCccccCccCcCcccCCccCCcccccCCCCCcCccCCccccCCCCCcccccCCCcCcCCCCCccCcCCCcCCCccccCCCCCcCcCccCCCCCCccCCCCCCccCcccccCCCCCCCCCCccccCCcCCCCCCCccCCCCCCCCcccCcCCCCCCcCCCCCccccccCccCCCCcccccCCcCcCCCCccCCccCCCccCCccccCCCCCCccCCCCcCCCCCCCCCCcCCccCccCccCCCccccccCcccCCCCCCCCccccCCCCcccCcCcccCCCCCCCCcccccCCCCCCccCcCCCCcc^FC^FC^FC^FC^FC^FC^FC^FC^Fc^FC^Fc^Fc hfhhghhhhfhhfhhhfhghhhhhchegfhaghhhhhhhhhgghfgcfhhhhghghhhhghhgfchfhghhgghdhgfhhhhghggghhhhfhg\gfhcehhghfh^Whfdhggghfgchhhhfhahgfchgfgghgghghhgeehdhdggdfffhhfhghggfgdfghhfhghhhhehfghhhhhehghhhdhhghdgghggdchhhghhhhhhhhhhhhhggghgdhhfghfgahfhghfhhhehhhghfhfhhhhhhcghhhghgghhhhahhhffgdghhghhgghhghghhchhhhhhhhfhgcfhh^fhghhhhhhfhhghfhhfhchhgghghhhhhhffahhg^hgghhhhhfghhggdch\hhhhhcffhghhghhhhhgfhgghdghhhhehhhhhhhhhgffahghhhhfahchgghgfhhghhhfhhhchfghgd

    A nearby SNP which did show up in the vcf file and is very similar:
    NC_015433.1_Ss_ST3 542551 A 237 g$GGGgGGgGgGGGggggGGGGGGGggGGGGGggGgGgGGGGGGgggGgggGGGGGGgggGggGGGGGGGggGgGGgGgGGGGGGgggGGGGggGGGGgGGGggGGggGggGGggggGggggGggggGGGGGgggGgGGGgggGgGGGGGggGGGgggGGGGgggGgGgGggGggggGgGgggGGGGGGggGGGggGgGGGGgGGGGGgGGgGgGGGGgGGGGGgGGGGGGGggGggg hhfhhfghhhfdghhghcfgfffbhhfgchhhhghhhhfhhhahgfdhhdfhfeffgggfhghhhhghhfhfhhhhfhggf]gfhghhhhhgh_fdhhfgghaggagghhfghhhhdhghgghhhghgggchhhggggghhgddhgafgahhhgfhgfggfghhfhfhfhhchghhg\fhhhcffgcgghfdchgchggfghhfhdhhgghfhhfhghfhhffhfhghgfghhdhhh 270 G$G$G$g$g$GGGGGGgGgGGGGGGggGGGgggGGgggGGGgGgGgGGGGGGgGGgGGggGGGGGGggggGGggGGGGggGGGGGGgGgGGggggggGGgGGGGGGgggGGgGggGgGggggGggGgGGGGggGgggGGGggggGGGGGGgGGGGGGGGGgGggggGGGGGgGGGGGggGggGGGGGGgggGgGGGGgGgGGGGGGGggGggggGGGGGGGGgggGgGGGGgGGGGgggggGGGGGGGggGGGGGGGGGGGGGGGGGGGGggGGg dhhhgfgehffhghdffccghh\chhhgfchghfhahhhhgchehgdhcghhfghgeffdhhfhhhgghhhhfhhcfhdfhhfffcchghhhfghhhfhffhehdchhcgdhhhhhgghhcgfhfhhhcghhhffhhhhfgggghhghdeggggchchhhhhfggf\egfcgghghhhhffffhhhfh\dfdhfhfgffgfahg\hhhhdfcc_affhfhgfhffhhfhhhhhhhgdffhdffhhghhhghhghhfafghchfhhfgfXh 285 G$G$G$g$g$GGggGGGgggggggGgGggggGggggGGgggggggggGGGggGGgGgGGggGGGggGGGGGGggGGgggGGGGGGggGggGGGGGGGGgGggGgGGGgGGGGgGgGgGGGgggggggGggggGGGGGGgggGggGGGggGGGGGGgggggggggGggGggGGGGGGGGGGggggGGGGggGgGggGGgGGggGggggGGGGGggGGGGGGGGGGggGGGGGgggGGGGGGgggGGGGggggGGGGggggGgGgggGGGGgGGGGGGGGGgGGGGGGGGgg gchhffhgh^gfhghhhhgdecghhfdhhhhdfhhffhhhhgbfhhfehhfhc^hhfffhhcfdhchhhaahfhggghfdghfhegfhfdefffehhghghYhdgahhfhhhfffhhhhdhhfhhchcgffffhhggfhhhdgcdfgfgfhhghahghhghcahdgghhfffgdfhhhhefgfhhghchhhfgfchhghhggfgdffhhfdffffffafhhgfffchhhfafddfhfhcgcfhhhfffhfchggfgfhhhhhfhhhhhhbhhhhhhhhf\fhdfh


    And the two lines from the vcf file which should be flanking the missing SNP:

    NC_015433.1_Ss_ST3 543182 0 T G 999 0 DP=1575;VDB=0.0742;AF1=1;AC1=6;DP4=1,0,597,717;MQ=37;FQ=-260;PV4=0.45,7.8e-131,0.46,0.4 GT:PL:GQ 1/1:244,255,0:99 1/1:235,255,0:99 1/1:217,255,0:99 2 2 2
    NC_015433.1_Ss_ST3 570005 0 C T 999 0 DP=1403;VDB=0.0710;AF1=0.6667;AC1=4;DP4=239,177,498,334;MQ=37;FQ=999;PV4=0.43,0,0.24,1 GT:PL:GQ 0/0:0,255,255:99 1/1:255,255,0:99 1/1:255,255,0:99 0 2 2

    Although I have only discovered one of these cases, I am concerned that I might be missing other variants as well.

    Thanks for taking a look.

    Sam
  • colindaven
    Senior Member
    • Oct 2008
    • 417

    #2
    Try running Samtools mpileup with the BAQ option turned off (-B) and checking the same position. The SNP may have been masked.

    Comment

    • rand
      Junior Member
      • Jul 2009
      • 4

      #3
      Thanks, it helped me.

      Comment

      • SamH
        Member
        • Sep 2010
        • 15

        #4
        Yeah, adding -B made that SNP appear:

        NC_015433.1_Ss_ST3 569984 . A C 999 . DP=1328;VDB=0.0735;AF1=1;AC1=6;DP4=0,0,657,577;MQ=37;FQ=-288 GT:PL:GQ 1/1:255,255,0:99 1/1:255,255,0:99 1/1:255,255,0:99


        thanks for the tip!

        Sam

        Comment

        Latest Articles

        Collapse

        ad_right_rmr

        Collapse

        News

        Collapse

        Topics Statistics Last Post
        Started by SEQadmin2, 06-05-2026, 10:09 AM
        0 responses
        10 views
        0 reactions
        Last Post SEQadmin2  
        Started by SEQadmin2, 06-04-2026, 08:59 AM
        0 responses
        22 views
        0 reactions
        Last Post SEQadmin2  
        Started by SEQadmin2, 06-02-2026, 12:03 PM
        0 responses
        28 views
        0 reactions
        Last Post SEQadmin2  
        Started by SEQadmin2, 06-02-2026, 11:40 AM
        0 responses
        22 views
        0 reactions
        Last Post SEQadmin2  
        Working...