SEQanswers

08-01-2011, 12:41 PM   #1
combiochem
Member
 
Location: Los Angeles

Join Date: Jul 2009
Posts: 11
samtools/mpileup heterozygous SNPs calling

I'm trying to get heterozygous SNPs from Illumina DNA sequencing data.
I've used the samtools/mpileup pipeline and have some questions about the options.

There are many posts related to the BAQ calculation. As is known, BAQ is computed by default, and if the -B option is used to disable it, more SNPs can be detected (at the cost of specificity). I've tested this with my samples and the difference is roughly 2x to 3x (-uf vs. -Buf; -Euf didn't make a big difference compared to -uf).
Does anyone have experience with the -B, -E and -I options?
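
A comparison like the one above can be made by counting SNP records in the VCF text produced by each run. The following is only a minimal sketch: it assumes the caller marks indel records with an `INDEL` flag in the INFO column, and the helper names and the toy records are hypothetical, not real results.

```python
def make_record(chrom, pos, ref, alt, info, genotype):
    """Build one tab-separated VCF data line (toy data for illustration only)."""
    return "\t".join(
        [chrom, str(pos), ".", ref, alt, "50", ".", info, "GT", genotype]
    )


def count_snp_records(vcf_text):
    """Count non-header VCF records that are not flagged as indels."""
    count = 0
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        # INFO is the 8th column; flags such as INDEL carry no "=value" part.
        info_keys = {item.split("=")[0] for item in fields[7].split(";")}
        if "INDEL" not in info_keys:
            count += 1
    return count


# Hypothetical call sets standing in for runs made with -uf and -Buf.
DEFAULT_CALLS = "\n".join([
    "##fileformat=VCFv4.1",
    make_record("chr1", 100, "A", "G", "DP=20", "0/1"),
    make_record("chr1", 250, "AT", "A", "INDEL;DP=18", "0/1"),
])
NO_BAQ_CALLS = "\n".join([
    "##fileformat=VCFv4.1",
    make_record("chr1", 100, "A", "G", "DP=20", "0/1"),
    make_record("chr1", 248, "C", "T", "DP=17", "0/1"),
    make_record("chr1", 250, "AT", "A", "INDEL;DP=18", "0/1"),
    make_record("chr1", 900, "G", "A", "DP=22", "1/1"),
])

default_count = count_snp_records(DEFAULT_CALLS)
no_baq_count = count_snp_records(NO_BAQ_CALLS)
print(f"default: {default_count}, -B: {no_baq_count}, "
      f"ratio: {no_baq_count / default_count:.1f}x")
```

With real data the two strings would be replaced by the contents of the VCF files from each run; the ratio alone says nothing about which extra calls are true positives.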

Actually, we are not interested in INDELs, so I thought the -I option could be used to skip INDEL calling.
But when I compared the heterozygous SNP results with and without -I, many of the heterozygous SNPs detected with -I turn out to be INDELs in the results without -I. So is it better to ignore those SNPs? In other words, should I "not" use the -I option?

I really want to know the appropriate options for calling heterozygous SNPs.
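
One way to look at the -I question is to list the heterozygous SNPs from the -I run that sit close to an indel called in the run without -I. This is a rough sketch under stated assumptions: indel records carry an `INDEL` flag in INFO, the genotype is in a `GT` FORMAT key, and the 5 bp `window` is an arbitrary illustrative value that ignores indel length. The function names and toy records are hypothetical.

```python
def parse_records(vcf_text):
    """Yield (chrom, pos, is_indel, genotype) for each VCF data line."""
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        is_indel = "INDEL" in f[7].split(";")
        genotype = None
        if len(f) > 9:
            keys = f[8].split(":")
            if "GT" in keys:
                genotype = f[9].split(":")[keys.index("GT")]
        yield f[0], int(f[1]), is_indel, genotype


def is_het(genotype):
    """True for a called diploid genotype with two different alleles."""
    if genotype is None:
        return False
    alleles = genotype.replace("|", "/").split("/")
    return len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]


def het_snps_near_indels(skip_indel_vcf, with_indel_vcf, window=5):
    """Het SNPs from the -I run lying within `window` bp of an indel call."""
    indels = [(c, p) for c, p, indel, _ in parse_records(with_indel_vcf) if indel]
    flagged = []
    for chrom, pos, indel, genotype in parse_records(skip_indel_vcf):
        if indel or not is_het(genotype):
            continue
        if any(chrom == ic and abs(pos - ip) <= window for ic, ip in indels):
            flagged.append((chrom, pos))
    return flagged


def rec(chrom, pos, ref, alt, info, gt):
    """Toy VCF line; values are made up for illustration."""
    return "\t".join([chrom, str(pos), ".", ref, alt, "50", ".", info, "GT", gt])


# Hypothetical outputs: one run with -I (no indel calls), one without.
WITH_I = "\n".join([
    rec("chr1", 100, "A", "G", "DP=20", "0/1"),
    rec("chr1", 500, "C", "T", "DP=25", "0/1"),
])
WITHOUT_I = "\n".join([
    rec("chr1", 98, "GAA", "G", "INDEL;DP=19", "0/1"),
    rec("chr1", 500, "C", "T", "DP=25", "0/1"),
])

print(het_snps_near_indels(WITH_I, WITHOUT_I))
```

Flagged positions are candidates for manual inspection or a stricter filter; whether to drop them outright depends on how much specificity matters for the project.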
08-01-2011, 01:15 PM   #2
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

My tiny bit of experience:

A Sanger-verified heterozygous SNP disappeared when I omitted -B, and reappeared when I put it back on. I've had other projects where I didn't Sanger-verify, but where multiple related projects all had SNPs in the same gene, a highly likely candidate, that were virtually uncallable without the -B option.

I wouldn't filter out indels, unless you were using an aligner that you know can't handle them.
08-01-2011, 01:20 PM   #3
combiochem
Member
 
Location: Los Angeles

Join Date: Jul 2009
Posts: 11

Quote:
Originally Posted by swbarnes2
My tiny bit of experience:

A Sanger-verified heterozygous SNP disappeared when I omitted -B, and reappeared when I put it back on. I've had other projects where I didn't Sanger-verify, but where multiple related projects all had SNPs in the same gene, a highly likely candidate, that were virtually uncallable without the -B option.

I wouldn't filter out indels, unless you were using an aligner that you know can't handle them.
Thanks for sharing the tips. Actually, the reads were aligned with BWA, so it's better to consider INDELs. Have you used the '-E' option (extended BAQ computation) too?
08-01-2011, 01:47 PM   #4
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

Quote:
Originally Posted by combiochem
Thanks for sharing the tips. Actually, the reads were aligned with BWA, so it's better to consider INDELs. Have you used the '-E' option (extended BAQ computation) too?
The author of samtools suggested it in a thread where I mentioned my problem with not using -B, but I haven't tried applying it to those samples yet.

For my projects, false positives are not that big a problem. I'm looking for candidate phenotype-causing SNPs most of the time, so Sanger-checking a modest number of false positives is not a big deal. But I don't want to miss the real deal because the software was overzealous in trying to help me, so it's safer for me to turn BAQ off entirely.

But it's good to know that in your tests, -E worked about as well as -B.
08-02-2011, 07:05 AM   #5
sijungyun
Junior Member
 
Location: Bethesda, Maryland

Join Date: Jul 2011
Posts: 1

In my case, using '-B' increased the number of SNP calls from Illumina sequencing data by 2 to 5 times.

Last edited by sijungyun; 08-03-2011 at 03:47 PM.