SEQanswers

11-06-2011, 12:55 PM   #1
dkenned1
Junior Member
 
Location: Chicago

Join Date: Dec 2010
Posts: 3
Need help identifying SNPs and allele frequencies

Hi everyone,

Long-time reader, first-time poster looking for advice.

I have 100 bp single-end (SE) Illumina data (~500x coverage) and a short (~150 kb) reference genome. My difficulties stem from the fact that my DNA isn't from a single individual; it is a pool of an unknown (but large) number of individuals. My goal is to identify SNPs and accurately quantify allele frequencies at these sites.
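To make the goal concrete, here is a minimal sketch of how an allele frequency could be estimated at one site of a pooled sample, simply as the fraction of reads supporting each base. The function name `allele_frequencies`, the `min_count` error threshold, and the toy pileup column are all assumptions for illustration, not anything from the actual data or a particular SNP caller.

```python
from collections import Counter

VALID_BASES = set("ACGT")

def allele_frequencies(bases, min_count=2):
    """Estimate allele frequencies at one site from pooled read bases.

    bases: iterable of single-character base calls covering the site
           (hypothetical input, e.g. parsed from a pileup column).
    min_count: alleles seen fewer than this many times are treated as
               likely sequencing errors and dropped (assumed threshold).
    """
    counts = Counter(b.upper() for b in bases if b.upper() in VALID_BASES)
    kept = {b: n for b, n in counts.items() if n >= min_count}
    depth = sum(kept.values())
    if depth == 0:
        return {}
    return {b: n / depth for b, n in kept.items()}

# Toy column at 500x: 450 reads with A, 49 with G, and a single T.
column = "A" * 450 + "G" * 49 + "T"
freqs = allele_frequencies(column)
for base, freq in sorted(freqs.items()):
    print(base, round(freq, 4))
```

With a pool, a fixed count threshold like this is only a crude stand-in for a real error model, and it does nothing about reference bias in the mapping itself.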

Currently, I'm mapping my reads back to the reference genome using Bowtie. However, mapping to a reference is almost certainly biasing my allele frequencies in favor of the reference alleles. Does anyone have a suggestion for alternative methods that eliminate or correct for this bias? I've considered de novo assembly (e.g. with Velvet), but I've been told that pooled DNA causes problems for Velvet.

I also have strong evidence of reads mis-mapping in some regions. I tried throwing out reads that map to multiple regions, but that didn't seem to solve the problem. Is there a technique for identifying mis-mapped reads, or for identifying problematic regions post hoc?
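One simple post-hoc check, sketched below under assumptions, is to look for windows whose read depth is far from the typical depth, since mis-mapped reads often pile up in repeats or leave other regions depleted. The function name `flag_depth_outliers`, the window size, and the `high`/`low` multipliers are hypothetical choices; the per-position depths could come from something like `samtools depth` output, but here they are just a toy list.

```python
from statistics import median

def flag_depth_outliers(depths, window=100, high=2.0, low=0.5):
    """Return (start, end, mean_depth) for windows with unusual depth.

    depths: per-position read depth along the reference (toy input here).
    window: window size in bp (assumed value).
    high, low: multiples of the median depth beyond which a window is
               flagged (assumed, arbitrary cutoffs).
    """
    if not depths:
        return []
    typical = median(depths)
    flagged = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        mean_depth = sum(chunk) / len(chunk)
        if mean_depth > high * typical or mean_depth < low * typical:
            flagged.append((start, start + len(chunk), mean_depth))
    return flagged

# Toy reference of 2 kb at 500x with a 100 bp pile-up at 1500x.
depths = [500] * 1000 + [1500] * 100 + [500] * 900
regions = flag_depth_outliers(depths)
print(regions)
```

Flagged windows are only candidates for closer inspection; a depth spike could also reflect a genuine copy-number difference in the pool.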

Thanks for any thoughts/suggestions you may have,
Dave

Last edited by dkenned1; 11-06-2011 at 01:28 PM. Reason: posted to wrong section
11-06-2011, 01:17 PM   #2
adaptivegenome
Super Moderator
 
Location: US

Join Date: Nov 2009
Posts: 437

It looks like you have two issues here. One is genotyping SNPs in pooled DNA samples. There are lots of threads on SEQanswers you can look at, but here is one in particular:

http://seqanswers.com/forums/showthread.php?t=15000

The second issue, regarding mapping, is more complex. Without really knowing anything about your project or your data, I would at least suggest trying a couple of different alignment tools to see whether the problem is aligner-independent. For example, BWA and Stampy are both well regarded for accurately mapping reads.
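As a rough illustration of checking whether the problem is aligner-independent, the sketch below compares where two aligners placed the same reads and lists the reads they disagree on. The function name `discordant_reads`, the `tolerance` value, and the dictionaries of read positions are hypothetical; in practice the positions would have to be extracted from each aligner's output first.

```python
def discordant_reads(placements_a, placements_b, tolerance=5):
    """List reads that two aligners placed differently.

    placements_a, placements_b: dicts mapping read name to the mapped
        start position on the reference, or None if unmapped
        (hypothetical inputs parsed from each aligner's output).
    tolerance: position difference in bp still counted as agreement
        (assumed value, to allow for clipping differences).
    """
    shared = placements_a.keys() & placements_b.keys()
    discordant = []
    for name in sorted(shared):
        pos_a, pos_b = placements_a[name], placements_b[name]
        if pos_a is None and pos_b is None:
            continue  # both aligners left the read unmapped
        if pos_a is None or pos_b is None:
            discordant.append(name)  # mapped by only one aligner
        elif abs(pos_a - pos_b) > tolerance:
            discordant.append(name)  # mapped to different places
    return discordant

# Toy placements from two aligners on a ~150 kb reference.
aligner_a = {"r1": 100, "r2": 2500, "r3": 7000, "r4": None}
aligner_b = {"r1": 102, "r2": 91000, "r3": None, "r4": None}
print(discordant_reads(aligner_a, aligner_b))
```

If the discordant reads cluster in the same regions that already look suspicious, that points at the regions rather than at one aligner.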
All times are GMT -8.