SEQanswers

02-10-2015, 08:34 AM   #1
sfh838t
Member
Location: Mountain Grove, MO, USA
Join Date: Apr 2014
Posts: 29

more samtools SNP calling questions

I have used the samtools/bcftools/vcfutils pipeline to do variant calling.
Step 1:
samtools mpileup -uf reference.fa align-file1.bam align-file2.bam | bcftools view -bvcg - > unfiltered-variants.bcf
Step 2:
bcftools view unfiltered-variants.bcf | vcfutils.pl varFilter -D100 > filtered-variants.vcf

This gives me about 30 variants in a format I can open in a spreadsheet.

However, I would also like to see the unfiltered variant list, so it needs to be in VCF format. I should be able to do this by changing the bcftools arguments in step 1 to -vcg and the output file name to *.vcf. This works up to a point: I get partial output that matches the filtered variants, but it crashes at a certain line with a sync error saying that the number of fields does not match.
Is there another way to simply convert the BCF file into VCF so that I can see all variants?
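
One way to track down a "number of fields does not match" error is to find which line of the partial VCF has the wrong column count. The following is a minimal sketch using awk, not a samtools feature. The function name check_vcf_columns and the sample records are made up for illustration, and it assumes a tab-delimited VCF with a #CHROM header line.

```shell
# Hypothetical helper: report VCF data lines whose column count differs
# from the #CHROM header line. Reads a VCF on stdin.
check_vcf_columns() {
  awk -F'\t' '
    /^##/     { next }                  # skip meta-information lines
    /^#CHROM/ { expected = NF; next }   # the header sets the expected column count
    expected && NF != expected {
      printf "line %d: %d fields, expected %d\n", NR, NF, expected
    }
  '
}

# Tiny made-up VCF in which the second record is truncated.
make_sample_vcf() {
  printf '##fileformat=VCFv4.1\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t50\t.\tDP=10\n'
  printf 'chr1\t200\t.\tC\tT\t40\n'
}

report=$(make_sample_vcf | check_vcf_columns)
echo "$report"
```

On a real file the call would be something like: check_vcf_columns < unfiltered-variants.vcf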

My other concern is that just about ALL variants found are indels. I have used IGV to look at the aligned contigs and I clearly see a number of SNPs.
I am trying to decipher the arguments; it seems to me that the -c switch in bcftools should list SNPs?
I have looked at a number of pages and read other posts, some of which are similar but not quite the answer I need.
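
To put a number on the "almost all indels" impression, the records in a VCF can be tallied by type. This is a rough sketch that assumes the samtools/bcftools convention of flagging indel records with INDEL in the INFO column (column 8) and that the file contains variant sites only (the -v option). The function name count_variant_types and the sample records are hypothetical.

```shell
# Hypothetical helper: count indel vs. non-indel records in a VCF on stdin.
# Assumes indel records carry the INDEL flag in the INFO column (column 8).
count_variant_types() {
  awk -F'\t' '
    /^#/ { next }                               # skip header lines
    $8 ~ /(^|;)INDEL(;|$)/ { indels++; next }   # INFO contains the INDEL flag
    { snps++ }                                  # everything else is counted as a SNP
    END { printf "SNPs=%d indels=%d\n", snps, indels }
  '
}

# Made-up records: two SNPs and one indel.
make_sample_records() {
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t50\t.\tDP=10\n'
  printf 'chr1\t150\t.\tAT\tA\t60\t.\tINDEL;DP=12\n'
  printf 'chr1\t200\t.\tC\tT\t40\t.\tDP=8\n'
}

counts=$(make_sample_records | count_variant_types)
echo "$counts"
```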

02-10-2015, 10:54 AM   #2
westerman
Rick Westerman
Location: Purdue University, Indiana, USA
Join Date: Jun 2008
Posts: 1,104

1) It looks like you are using a pre-1.0 version of samtools/bcftools. I don't blame you, since samtools v1 changed a lot and I am still getting used to it. That said, if you want to see more SNPs using the old method, you could use the flat prior via the switch '-P flat'.

2) GATK gives you better SNP calls.
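
For reference, here is a sketch of where '-P flat' would go in the old-style (pre-1.0) pipeline from the first post. The input file names come from that post; the output name is hypothetical. The snippet only builds and prints the command so it can be reviewed before running it.

```shell
# Input names are taken from the first post; adjust them to your data.
ref=reference.fa
out=unfiltered-variants-flat.vcf   # hypothetical output file name

# Pre-1.0 samtools/bcftools syntax. -P flat selects the flat prior in
# place of the default full prior; without -b the output is text VCF.
cmd="samtools mpileup -uf $ref align-file1.bam align-file2.bam | bcftools view -vcg -P flat - > $out"

# Print the command for review. To execute it: eval "$cmd"
echo "$cmd"
```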

02-10-2015, 11:16 AM   #3
sfh838t
Member
Location: Mountain Grove, MO, USA
Join Date: Apr 2014
Posts: 29

Thanks Rick. I have no training and I do not use any of these assorted tools often, so it is always new for me. I also dread the installation process of a program/tool more than anything! I guess the new version came out between the last time I was fiddling with this and now. I will see if I can find/work with GATK.
Not sure what you mean by 'flat'?

02-10-2015, 11:28 AM   #4
westerman
Rick Westerman
Location: Purdue University, Indiana, USA
Join Date: Jun 2008
Posts: 1,104

Since I am not a statistician, I would stumble trying to explain a 'prior', so I refer you to Wikipedia:

https://en.wikipedia.org/wiki/Bayesian_probability

Also quoting from Heng Li on the Samtools list from 2010:

Quote:
With any Bayesian SNP callers, you need a prior. Full is the standard Wright-Fisher infinite site prior. Flat is a flat prior and cond2 is the prior distribution conditional on hets discovered in two chromosomes, assuming Wright-Fisher.

As mentioned above, the old bcftools had three preset priors. The new bcftools, using the 'call' command, takes a numeric prior instead, as per:

Quote:
-P, --prior <float> mutation rate (use bigger for greater sensitivity)
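
To make the effect of the prior concrete, here is a toy Bayes calculation. It is a simplified two-hypothesis illustration (variant vs. reference), not the model bcftools actually uses, and the likelihood values are made up. With identical evidence, a small prior leaves the variant unlikely while a flat 50/50 prior makes it a near-certain call, which is why a flatter or larger prior reports more SNPs.

```shell
# posterior PRIOR L_VAR L_REF
# Posterior probability of the variant hypothesis under Bayes' rule:
#   prior * L_var / (prior * L_var + (1 - prior) * L_ref)
posterior() {
  LC_ALL=C awk -v p="$1" -v lv="$2" -v lr="$3" \
    'BEGIN { printf "%.4f\n", (p * lv) / (p * lv + (1 - p) * lr) }'
}

# Same made-up evidence in both cases: the data are 100 times more
# likely under the variant hypothesis than under the reference.
small=$(posterior 0.001 0.01 0.0001)   # small prior probability of a variant
flat=$(posterior 0.5 0.01 0.0001)      # flat prior: both hypotheses equally likely

echo "prior=0.001 -> posterior=$small"
echo "prior=0.5   -> posterior=$flat"
```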

Tags
bcf/vcf, bcftools, snp calling
