04-01-2010, 12:35 PM   #32
A 454 - SSAHA approach

Just to throw in on the conversation, I pooled genomic DNA from 18 individuals, cut it with a 4-base cutter, and sequenced a 15bp size fraction with two full runs of 454 reads (250bp). I assembled the reads with gsAssembler, which produced an average of 20 reads per contig. Then I mapped the individual reads back to the contig consensus sequences using SSAHA2 and used the SSAHA_pipeline to call SNPs. It worked pretty well: I wound up with about 8000 SNPs I could believe in, and the validation rate was about 95%. The predicted allele frequency was strongly correlated (>0.8) with the real allele frequency in the donors. My goal was just basic SNP discovery in a novel species, and it fit the bill.
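For anyone wanting to repeat the allele frequency check, here is a rough Python sketch of one way to do it. It assumes the predicted frequency at a site is simply the fraction of mapped reads carrying the alternative allele, and that the comparison is a plain Pearson correlation; both are assumptions, since the exact calculation is not spelled out above. The function names and the example numbers are made up for illustration.

```python
import math


def pooled_allele_freq(alt_reads, total_reads):
    """Estimate allele frequency in the pool as the fraction of reads
    carrying the alternative allele (assumes equal representation of donors)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up example: (alt reads, total reads) per SNP from the pooled run,
# and the allele frequency obtained by genotyping the donors directly.
read_counts = [(5, 20), (12, 24), (3, 30)]
predicted = [pooled_allele_freq(alt, total) for alt, total in read_counts]
genotyped = [0.22, 0.47, 0.14]
r = pearson_r(predicted, genotyped)
```

With only about 20 reads per contig the per-site estimates are noisy, so the correlation across many SNPs is more meaningful than any single predicted frequency.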

Caveats: beware of minor allele freqs near 0.5, which could arise from alignment of reads from duplicated loci; screen out short tandem repeats, because STR allelic differences in the alignment can cause false positive SNPs; loci with only 4 mapped reads (minimum 2 reads per allele) may be useful, but don't count on them.
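The caveats above could be turned into a simple screening step, roughly like the Python sketch below. This is not the SSAHA pipeline's own filter. The record layout, the function names, and the 0.45 cutoff for "near 0.5" are all hypothetical choices for illustration; only the 4-read / 2-reads-per-allele figures come from the caveats themselves. The STR flag is assumed to come from a separate repeat-finding step.

```python
from dataclasses import dataclass


@dataclass
class CandidateSnp:
    contig: str
    position: int
    ref_reads: int        # reads supporting the consensus allele
    alt_reads: int        # reads supporting the alternative allele
    in_str: bool = False  # True if the site overlaps a short tandem repeat


def minor_allele_freq(snp):
    """Fraction of reads carrying the less common of the two alleles."""
    depth = snp.ref_reads + snp.alt_reads
    return min(snp.ref_reads, snp.alt_reads) / depth if depth else 0.0


def classify(snp, min_reads_per_allele=2, low_depth=4, paralog_maf=0.45):
    """Label a candidate SNP according to the caveats.

    paralog_maf is an assumed threshold for 'near 0.5'; tune it to your data.
    """
    if snp.in_str:
        return "reject_str"             # STR length differences mimic SNPs
    if min(snp.ref_reads, snp.alt_reads) < min_reads_per_allele:
        return "reject_low_support"     # fewer than 2 reads for an allele
    if snp.ref_reads + snp.alt_reads <= low_depth:
        return "flag_low_depth"         # maybe useful, but don't count on it
    if minor_allele_freq(snp) >= paralog_maf:
        return "flag_possible_paralog"  # could be collapsed duplicated loci
    return "keep"


# Made-up candidates, one per outcome.
examples = [
    CandidateSnp("contig1", 101, ref_reads=12, alt_reads=5),
    CandidateSnp("contig2", 57, ref_reads=10, alt_reads=10),
    CandidateSnp("contig3", 220, ref_reads=2, alt_reads=2),
    CandidateSnp("contig4", 18, ref_reads=3, alt_reads=1),
    CandidateSnp("contig5", 64, ref_reads=9, alt_reads=6, in_str=True),
]
labels = [classify(s) for s in examples]
```

The low-depth check runs before the 0.5 check because a 2 + 2 locus always has a minor allele frequency of exactly 0.5, which says nothing about duplication at that depth.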
Boonie