Need help identifying SNPs and allele frequencies

dkenned1

Junior Member

Join Date: Dec 2010

Posts: 3
#1

11-06-2011, 12:55 PM

Hi everyone,

Long-time reader, first-time poster looking for advice.

I have 100 bp, SE Illumina data (~500x coverage) and a short (~150 kb) reference genome. My difficulties stem from the fact that my DNA isn't from a single individual. Instead, it is a pool of an unknown number (but we're talking lots) of individuals. My goal is to identify SNPs, and accurately quantify allele frequencies at these sites.
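To make the goal concrete, here is a minimal sketch of the quantification step, assuming the base calls covering one site have already been extracted (for example from a pileup column). The function name `allele_frequencies` and the `min_count` threshold are hypothetical, chosen only for illustration; a real analysis would also need to model sequencing error and base quality.

```python
from collections import Counter

VALID_BASES = set("ACGT")

def allele_frequencies(bases, min_count=2):
    """Estimate allele frequencies at one site from pooled base calls.

    bases: iterable of single-character base calls covering the site.
    min_count: hypothetical cutoff; alleles seen fewer times than this
               are treated as likely sequencing errors and dropped.
    """
    counts = Counter(b.upper() for b in bases if b.upper() in VALID_BASES)
    kept = {base: n for base, n in counts.items() if n >= min_count}
    depth = sum(kept.values())
    if depth == 0:
        return {}
    # Frequency = reads supporting the allele / reads supporting any kept allele
    return {base: n / depth for base, n in kept.items()}

# Made-up column at ~500x depth: 450 A, 49 G, and a single T
column = "A" * 450 + "G" * 49 + "T"
freqs = allele_frequencies(column)
```

With a pool, the frequency at a site is simply the fraction of reads supporting each allele, so any reference bias in mapping feeds straight into these numbers.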

Currently, I'm mapping my reads back to the reference genome using bowtie. But mapping back to a reference is almost certainly biasing my allele frequencies in favor of the reference alleles, since reads carrying non-reference alleles have more mismatches and are less likely to map. Does anyone have a suggestion for alternative methods that eliminate or correct for this bias? I've considered de novo assembly (e.g. velvet), but I've been told that pooled DNA causes problems for velvet.

I also have strong evidence of reads mis-mapping in some regions. I tried throwing out reads that map to multiple regions, but that didn't seem to solve the problem. Is there a technique for identifying mis-mapped reads, or for identifying problematic regions post hoc?
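One simple post-hoc check, sketched here as an assumption rather than an established method, is to flag windows whose mean depth is far from the typical depth, since collapsed repeats or mis-mapped reads often show up as coverage spikes or dips. The function name, window size, and the `low`/`high` multipliers are all hypothetical and would need tuning.

```python
from statistics import median

def flag_depth_outliers(depths, window=100, low=0.5, high=2.0):
    """Return (start, end, mean_depth) for windows with unusual coverage.

    depths: list of per-base read depths along the reference.
    A window is flagged if its mean depth is below low x or above
    high x the median of all window means (hypothetical thresholds).
    """
    means = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        means.append((start, start + len(chunk), sum(chunk) / len(chunk)))
    if not means:
        return []
    typical = median(m for _, _, m in means)
    return [(s, e, m) for s, e, m in means
            if m < low * typical or m > high * typical]

# Made-up depth profile: ~500x everywhere except a 100 bp spike at 1500x
depths = [500] * 300 + [1500] * 100 + [500] * 200
flagged = flag_depth_outliers(depths)
```

Flagged windows are only candidates for closer inspection; unusual depth alone does not prove mis-mapping.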

Thanks for any thoughts/suggestions you may have,
Dave

Last edited by dkenned1; 11-06-2011, 01:28 PM. Reason: posted to wrong section
adaptivegenome

Super Moderator

Join Date: Nov 2009

Posts: 436
#2

11-06-2011, 01:17 PM

It looks like you have two issues here. One is genotyping SNPs in pooled DNA samples. There are lots of threads on SEQanswers you can look at, but here is one in particular:

models and softwares for SNP and indel detections - SEQanswers

http://seqanswers.com/forums/showthread.php?t=15000

The second issue, regarding mapping, is more complex. Without really knowing anything about your project or your data, I would at least suggest you try a couple of different alignment tools to see if the problem is aligner-independent. For example, BWA and STAMPY are both well regarded for accurately mapping reads.
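As a rough sketch of how such a comparison could be done, assuming you have already parsed each aligner's output into a dictionary of read ID to (reference name, position), you could list the reads that the two tools place differently. The function name, the dictionary layout, and the `tolerance` parameter are hypothetical; parsing the alignment files is not shown.

```python
def discordant_reads(placements_a, placements_b, tolerance=5):
    """List reads that two aligners place at different locations.

    placements_a, placements_b: dict mapping read ID to a
        (reference_name, position) tuple, one dict per aligner.
    tolerance: hypothetical slack in bp, so small offsets caused by
        soft-clipping or indel placement are not counted as disagreement.
    Only reads mapped by both aligners are compared.
    """
    discordant = []
    for read_id in sorted(placements_a.keys() & placements_b.keys()):
        ref_a, pos_a = placements_a[read_id]
        ref_b, pos_b = placements_b[read_id]
        if ref_a != ref_b or abs(pos_a - pos_b) > tolerance:
            discordant.append(read_id)
    return discordant

# Made-up placements from two aligners
aligner_1 = {"r1": ("ref", 100), "r2": ("ref", 5000), "r3": ("ref", 900)}
aligner_2 = {"r1": ("ref", 102), "r2": ("ref", 72000), "r4": ("ref", 40)}
suspect = discordant_reads(aligner_1, aligner_2)
```

Regions where discordant reads pile up with both tools are worth a closer look; regions that are only a problem with one tool point to an aligner-specific issue.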