The VCF output I get from running samtools mpileup and bcftools does not report any dbSNP IDs. Here are the commands I ran against my BAM file; they are nearly identical to the example provided in the mpileup documentation.
samtools mpileup -ugf /data/genomes/hg19/hg19.fa aln.bam > aln_varRaw.bcf
bcftools view -cvg aln_varRaw.bcf | vcfutils.pl varFilter -D100 > aln.vcf
And here are the first 10 lines of the output vcf file.
chr1 16571 . G A 3.02 . AF1=1.000;AFE=0.375;DP4=0,0,0,1;MQ=30 PL:GT:GQ 30,3,0:1/1:41
chr1 74817 . G A 3.02 . AF1=1.000;AFE=0.375;DP4=0,0,0,1;MQ=30 PL:GT:GQ 30,3,0:1/1:41
chr1 118617 . T C 3.02 . AF1=1.000;AFE=0.375;DP4=0,0,1,0;MQ=30 PL:GT:GQ 30,3,0:1/1:41
chr1 231504 . G A 23.8 . AF1=1.000;AFE=0.829;DP4=0,0,2,0;MQ=30 PL:GT:GQ 55,6,0:1/1:49
chr1 232960 . C A 11.1 . AF1=1.000;AFE=0.768;DP4=0,0,0,2;MQ=30 PL:GT:GQ 42,6,0:1/1:49
chr1 233473 . C G 4.13 . AF1=0.500;AFE=0.307;DP4=1,1,2,0;MQ=30;PV4=1,0.0048,1,0.23 PL:GT:GQ 32,0,48:0/1:35
chr1 235976 . C A 52 . AF1=1.000;AFE=0.899;DP4=0,0,2,1;MQ=30 PL:GT:GQ 84,9,0:1/1:63
chr1 726481 . T G 13.9 . AF1=1.000;AFE=0.799;DP4=0,0,2,0;MQ=30 PL:GT:GQ 45,6,0:1/1:49
chr1 726939 . G C 14.9 . AF1=1.000;AFE=0.806;DP4=0,0,1,1;MQ=30 PL:GT:GQ 46,6,0:1/1:49
chr1 726944 . C G 21.8 . AF1=1.000;AFE=0.827;DP4=0,0,1,1;MQ=30 PL:GT:GQ 53,6,0:1/1:49
I realize that these 10 calls may not have a match in dbSNP, but NONE of my calls has an associated rsID. The rsID should be listed in the 3rd column (ID), which contains only "." in my output.
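To check this across the whole file rather than just the first 10 lines, a quick count of records whose ID column is "." can help. This is only a rough sketch in Python: the function name count_missing_ids is made up for illustration, and it assumes a plain-text (uncompressed) VCF such as aln.vcf.

```python
def count_missing_ids(lines):
    """Return (total_records, records_without_id) for an iterable of VCF lines.

    Hypothetical helper for illustration; not part of samtools or bcftools.
    """
    total = 0
    missing = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines and VCF header lines, which start with '#'.
        if not line or line.startswith("#"):
            continue
        # Only the first three columns matter here: CHROM, POS, ID.
        # Splitting on any whitespace also copes with pasted, space-separated output.
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # malformed record, ignore it
        total += 1
        # A "." in the ID column means no identifier was assigned.
        if fields[2] == ".":
            missing += 1
    return total, missing


# Small demo on two records copied from the output above.
demo = [
    "chr1\t16571\t.\tG\tA\t3.02\t.\tAF1=1.000;AFE=0.375;DP4=0,0,0,1;MQ=30\tPL:GT:GQ\t30,3,0:1/1:41",
    "chr1\t74817\t.\tG\tA\t3.02\t.\tAF1=1.000;AFE=0.375;DP4=0,0,0,1;MQ=30\tPL:GT:GQ\t30,3,0:1/1:41",
]
print(count_missing_ids(demo))  # (2, 2)
```

On the real file, the same function can be given an open file handle, for example `with open("aln.vcf") as fh: print(count_missing_ids(fh))`. If the two numbers are equal, no record carries an ID.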
I looked for specific parameters for setting up a connection to dbSNP but I cannot seem to find that information. Seems like I'm missing something.
Thanks.
Dan