Old 08-17-2011, 01:33 PM   #1
swbarnes2
GATK UnifiedGenotyper calling way too many SNPs in vcf

Alignments were done with bwa, paired end, converted to .bam and sorted with samtools.

The following command works as expected. It makes a vcf file that is 10 kb, which is right.

Quote:
java -jar ../../../../GATK/1.1.23/GenomeAnalysisTK.jar -T UnifiedGenotyper -R genome3.fasta -I sort_filtered.bam -o GATK/regular.vcf -L Staphylococcus:164665-2689091 -dcov 100
The following command, which differs only in the start of the -L interval (164663 instead of 164665), makes a vcf file that is 57.5 kb.

Quote:
java -jar ../../../../GATK/1.1.23/GenomeAnalysisTK.jar -T UnifiedGenotyper -R genome3.fasta -I sort_filtered.bam -o GATK/regular.vcf -L Staphylococcus:164663-2689091 -dcov 100
At base 2256558, something gets off by one base, and from there on everything is out of sync, so it calls another 300,000 SNPs. This doesn't happen with the first command, even though it covers almost exactly the same region. The fasta looks fine. Has anyone else observed this?
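One way to pin down where the calls go out of sync is to compare the REF column of the vcf against the fasta at each called position. The script below is only a rough sketch of that check, not part of GATK: the function names (read_fasta, ref_mismatches) are made up for illustration, and it assumes a plain tab-delimited vcf and a fasta small enough to hold in memory. If the calls really are shifted by one base, mismatches should start appearing around the position where the shift begins.

```python
import sys


def read_fasta(lines):
    """Parse FASTA lines into a dict of {sequence_name: sequence}."""
    seqs = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Use the first whitespace-delimited token as the contig name
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def ref_mismatches(vcf_lines, seqs):
    """Yield (chrom, pos, vcf_ref, fasta_ref) where the vcf REF disagrees with the fasta."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        # vcf POS is 1-based; Python slicing is 0-based
        fasta_ref = seqs[chrom][pos - 1:pos - 1 + len(ref)]
        if fasta_ref.upper() != ref.upper():
            yield chrom, pos, ref, fasta_ref


if __name__ == "__main__":
    # Hypothetical usage: python check_ref.py genome3.fasta GATK/regular.vcf
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as fa, open(sys.argv[2]) as vcf:
            seqs = read_fasta(fa)
            for chrom, pos, ref, fasta_ref in ref_mismatches(vcf, seqs):
                print(f"{chrom}\t{pos}\tVCF REF={ref}\tFASTA={fasta_ref}")
```

Running this on the vcf from each command and comparing the first reported position would show whether the REF alleles themselves are shifted relative to the fasta, or whether only the genotype calls differ.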