SEQanswers

Old 08-11-2011, 08:20 AM   #1
shuang
Senior Member
 
Location: IL

Join Date: Jul 2011
Posts: 100
SNP base calling

My SNP data are from Sanger sequencing. Multiple samples cover varied regions, not necessarily the same fragments. I performed the alignment with bwasw and the pileup with samtools.

What would be the differences between
1. piling up multiple samples all together
2. piling up one sample at a time?


Also, would the QUAL score, DP, and AC be affected dramatically?
Old 08-11-2011, 09:18 AM   #2
volks
Member
 
Location: hd.de

Join Date: Jun 2010
Posts: 81

mpileup doesn't give you any output at positions where there is no coverage. So if you want to compare different samples, it might be more convenient to generate the pileup for all of them together.
Old 08-11-2011, 10:15 AM   #3
shuang
Senior Member
 
Location: IL

Join Date: Jul 2011
Posts: 100

I actually do not need to know the position when a sample doesn't cover that base.

Other than that, is there anything else that would make a difference?
Old 08-11-2011, 12:18 PM   #4
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

If a SNP is called in one sample and not in another, it helps to look at the other sample to determine whether it really is wild type (wt) or whether its coverage was just too low to make the same SNP call. Running mpileup on the samples together helps with that. Unfortunately, what's really helpful is the DP4 values, and mpileup combines them across all samples, which can make it harder to assess the likelihood for each sample. Yes, you get a GQ, but the coverage is helpful as well.
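
As a rough illustration of what DP4 holds: in samtools/bcftools VCF output it is an INFO tag with four counts of high-quality bases (ref-forward, ref-reverse, alt-forward, alt-reverse). The sketch below only parses that tag and computes the fraction of bases supporting the alternate allele. The INFO string and the function names are made up for the example, and no particular cutoff on the fraction is implied.

```python
def parse_dp4(info):
    """Return (ref_fwd, ref_rev, alt_fwd, alt_rev) from a VCF INFO string, or None."""
    for field in info.split(";"):
        if field.startswith("DP4="):
            return tuple(int(x) for x in field[4:].split(","))
    return None


def alt_fraction(dp4):
    """Fraction of high-quality bases that support the alternate allele."""
    ref = dp4[0] + dp4[1]
    alt = dp4[2] + dp4[3]
    total = ref + alt
    # With zero high-quality bases there is nothing to compute.
    return alt / total if total else None


# Hypothetical INFO string, invented for illustration only.
info = "DP=30;DP4=10,8,6,6;MQ=60"
dp4 = parse_dp4(info)
print(dp4, alt_fraction(dp4))
```

Because the pooled DP4 is summed over all samples, a low alternate fraction here can hide one sample that is entirely alternate among several that are entirely reference.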
Old 08-11-2011, 12:27 PM   #5
shuang
Senior Member
 
Location: IL

Join Date: Jul 2011
Posts: 100

I also notice that the QUAL score tends to be much lower in a single-sample analysis than in a multi-sample analysis. Why is that?
Old 08-12-2011, 06:49 AM   #6
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693

If you want to compare between samples, always pool the samples together (i.e. generate the mpileup across all samples). Mpileup skips sites where there is no coverage across ALL samples. On the other hand, swbarnes2 has a point that DP4 is combined. If you need that information, you can pool the samples first to find the sites you are interested in and then run a single-sample pileup to get DP4.
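
The two-step idea (pool first to find sites, then look up per-sample DP4) could be sketched roughly as below. This assumes you already have the pooled calls and a single-sample run as tab-separated VCF text, and that the single-sample output contains a record at each site of interest. The records and function names are invented for the example.

```python
def variant_sites(vcf_lines):
    """Collect (chrom, pos) for every non-header record in a VCF."""
    sites = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos = line.split("\t")[:2]
        sites.add((chrom, int(pos)))
    return sites


def dp4_at_sites(vcf_lines, sites):
    """Map (chrom, pos) -> DP4 tuple for single-sample records that fall on `sites`."""
    out = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        key = (cols[0], int(cols[1]))
        if key not in sites:
            continue
        # Column 8 (index 7) is INFO; pull out the DP4 tag if present.
        for field in cols[7].split(";"):
            if field.startswith("DP4="):
                out[key] = tuple(int(x) for x in field[4:].split(","))
    return out


# Hypothetical records, made up for illustration.
pooled = [
    "##fileformat=VCFv4.1",
    "chr1\t100\t.\tA\tG\t50\t.\tDP=40;DP4=20,10,5,5",
    "chr1\t250\t.\tC\tT\t30\t.\tDP=12;DP4=4,4,2,2",
]
sample_a = [
    "chr1\t100\t.\tA\tG\t20\t.\tDP=10;DP4=0,0,5,5",
    "chr1\t180\t.\tG\tA\t9\t.\tDP=3;DP4=2,0,1,0",
]

sites = variant_sites(pooled)
print(dp4_at_sites(sample_a, sites))
```

A pooled site that is missing from the returned dictionary (chr1:250 in this made-up data) is one where the single-sample run produced no record, which is worth checking for coverage before reading it as reference.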
Old 08-18-2011, 07:39 AM   #7
shuang
Senior Member
 
Location: IL

Join Date: Jul 2011
Posts: 100

When I tried to pile up multiple samples together, even a sample whose sequence/read did not cover the SNP base was shown as het (1/0). That confused our conclusions.

How do I avoid that? Or how do I tell whether a het call is real or just reflects no coverage?

Also, how do I use the DP4 value?
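
One possible way to tell the two cases apart is to look at the per-sample fields next to each genotype instead of the genotype alone. The sketch below rests on an assumption that should be checked against your own output: that a sample with no reads at a site shows all-zero PL values (or a per-sample DP of 0, if depth was requested). The VCF line, the sample names, and the function name are invented for the example.

```python
def sample_calls(vcf_line, sample_names):
    """Return {sample: (GT, 'no data' or 'supported')} for one multi-sample VCF record."""
    cols = vcf_line.rstrip("\n").split("\t")
    keys = cols[8].split(":")  # FORMAT column, e.g. GT:PL:GQ
    calls = {}
    for name, column in zip(sample_names, cols[9:]):
        values = dict(zip(keys, column.split(":")))
        pl = [int(x) for x in values.get("PL", "").split(",") if x.isdigit()]
        dp = values.get("DP")
        # Assumption: all-zero PL or DP == 0 means the genotype has no read support.
        no_data = (dp == "0") or (bool(pl) and all(v == 0 for v in pl))
        calls[name] = (values.get("GT"), "no data" if no_data else "supported")
    return calls


# Hypothetical record with two samples; the second has uninformative likelihoods.
line = "chr1\t100\t.\tA\tG\t50\t.\tDP=20;DP4=8,7,3,2\tGT:PL:GQ\t0/1:60,0,80:60\t0/1:0,0,0:3"
print(sample_calls(line, ["s1", "s2"]))
```

In this made-up record both samples carry the same het genotype, but only the first has likelihoods that distinguish between genotypes; a very low GQ on the second points the same way.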
Old 10-24-2011, 12:50 PM   #8
aslihan
Member
 
Location: USA

Join Date: Jun 2011
Posts: 23
How to use DP4 values?

How do I use DP4 values?