12-22-2010, 04:17 PM   #2
Awesome
Junior Member
 
Location: california

Join Date: Aug 2009
Posts: 7

To do SNP calling, the standard procedure is to map reads to a reference genome. Then you look at your pileup (i.e. the base frequencies and associated quality scores for every position) and find positions where the observed allele frequencies diverge from the reference base. Illumina's CASAVA uses a fancy nearest-neighbor SNP caller, SOAPsnp uses a Bayesian algorithm, and I'm sure there are many, many other methods.
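To make the pileup idea concrete, here is a rough Python sketch of a naive frequency-threshold caller. It is not what CASAVA or SOAPsnp do; the function name `call_snps`, the simplified pileup format (one string of observed bases per position), and the thresholds `min_depth` and `min_alt_freq` are all assumptions made up for illustration. It also ignores quality scores, which a real caller would not.

```python
from collections import Counter

def call_snps(reference, pileup, min_depth=8, min_alt_freq=0.2):
    """Flag positions whose most common non-reference base is frequent enough.

    reference: string of reference bases
    pileup:    list of strings, one per reference position, holding the bases
               observed at that position (hypothetical, simplified format)
    Thresholds are arbitrary illustration values, not recommendations.
    """
    calls = []
    for pos, (ref_base, bases) in enumerate(zip(reference, pileup)):
        depth = len(bases)
        if depth < min_depth:
            continue  # too little coverage to say anything
        counts = Counter(bases)
        alt_counts = {b: n for b, n in counts.items() if b != ref_base}
        if not alt_counts:
            continue  # every read agrees with the reference
        alt_base, alt_n = max(alt_counts.items(), key=lambda kv: kv[1])
        alt_freq = alt_n / depth
        if alt_freq >= min_alt_freq:
            calls.append((pos, ref_base, alt_base, alt_freq))
    return calls
```

Each call is a tuple of (0-based position, reference base, alternate base, alternate allele frequency). A single stray mismatch at a well-covered position stays below the threshold and is not reported.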

Because you don't have a reference sequence, the standard way to call SNPs is to generate one first. You do this by feeding trimmed, high-quality-only reads into a de novo assembler such as Velvet or ABySS.
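As a sketch of what "trimmed, high-quality-only" can mean in practice, the snippet below clips low-quality bases off the 3' end of a read and throws the read away if too little is left. The function name `trim_read` and the cutoffs `min_qual` and `min_len` are hypothetical, and it assumes the quality string has already been decoded into integer Phred scores. Dedicated trimming tools do more than this.

```python
def trim_read(seq, quals, min_qual=20, min_len=30):
    """Trim low-quality bases off the 3' end of a read.

    seq:   read sequence as a string
    quals: list of integer Phred scores, same length as seq
    Returns (trimmed_seq, trimmed_quals), or None if the trimmed read
    is shorter than min_len. Cutoffs are illustration values only.
    """
    end = len(seq)
    # Walk back from the 3' end while the base quality is below the cutoff.
    while end > 0 and quals[end - 1] < min_qual:
        end -= 1
    if end < min_len:
        return None  # read is too short to be useful for assembly
    return seq[:end], quals[:end]
```

Only reads that survive (i.e. do not come back as `None`) would go on to the assembler.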

For SNP calling, contig length isn't really your end goal. Your goal for the assembly should be that a high percentage of your reads actually map back to your de novo genome.
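One way to check that percentage, assuming your mapper writes SAM output, is to count records by their FLAG field. This is only a sketch: `mapped_fraction` is a made-up helper name, and it counts alignment records, so paired-end mates are counted separately. The flag bits themselves (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM format.

```python
def mapped_fraction(sam_lines):
    """Return the fraction of primary SAM records that are mapped.

    sam_lines: any iterable of SAM text lines (e.g. an open file handle).
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        total += 1
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            mapped += 1
    return mapped / total if total else 0.0
```

You would call it as `mapped_fraction(open("reads_vs_contigs.sam"))`, where the file name is just a placeholder. If the number comes out low, that is a hint the assembly is missing a lot of what was sequenced, no matter how long the contigs are.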

It is okay if your de novo genome has thousands of contigs.

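If you are dealing with RNA, then mapping partial reads matters for a minority of SNPs (those close to splice junctions, etc.), because a read that spans a junction will only partly align to the genome. So you might need a tool that can map reads partially, such as the Bowtie/TopHat/Cufflinks pipeline, SOAP, or whatever.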

Good luck.