04-02-2010, 02:23 PM   #3
jmartin
Member
 
Location: St. Louis

Join Date: Dec 2009
Posts: 74

The expected true variation between individuals is < 1%, but the sequencing error we're seeing in the Illumina 100mer reads varies quite a bit. For my human control data I used a collection of 100mer reads from a different project being done here at WashU; I did not simulate errors for that set. They are real 100mer Illumina reads from a different individual (I did not use the Ensembl/NCBI human build for the control).

I am not sure what sequencing error rate I should expect in these Illumina 100mers. I do see some fraction of the reads containing a significant number of ambiguous bases (N); maybe ~1-2% of the reads have > 3 ambiguous bases. I set -n that high so that reads containing Ns could still align to human.
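
For anyone who wants to check the N-rich fraction on their own reads rather than rely on my rough ~1-2% estimate, something like the sketch below should do. It is only an illustration: the function name fraction_n_rich and the two demo reads are made up, and it assumes plain 4-line FASTQ records (no wrapped sequence lines).

```python
import io

def fraction_n_rich(fastq_handle, max_n=3):
    """Count reads with more than max_n ambiguous bases.

    Hypothetical helper; assumes strict 4-line FASTQ records.
    Returns (n_rich_reads, total_reads).
    """
    total = 0
    n_rich = 0
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each record is the sequence
            total += 1
            if line.upper().count("N") > max_n:
                n_rich += 1
    return n_rich, total

# Tiny made-up input: the first read has 4 Ns, the second has none.
demo = io.StringIO(
    "@r1\nACGTNNNNAC\n+\nIIIIIIIIII\n"
    "@r2\nACGTACGTAC\n+\nIIIIIIIIII\n"
)
n_rich, total = fraction_n_rich(demo)
print(f"{n_rich} of {total} reads have more than 3 Ns")
```

On a real file you would pass an open file handle in place of the StringIO demo.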

What is the largest value of -n that bwa-short can safely use in alignments?