 As far as I understand it...your script calculates single colorspace errors, right? Rather than "miscalls" in true basespace?
 I guess it's application dependent. What do you intend to do with SOLiD reads without a reference? I would have thought that with short color space reads there's little you can do but SNP calling against a reference, but I could be wrong. If you're aligning to a reference, any reference, I would have thought it would make sense to calculate the error rate against it.
 You may find this useful, but it's falling into many of the pitfalls of non-SOLiD informatics people.
10-04-2008, 09:33 AM   #5
pmiguel
Senior Member

Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,318

Quote:
 As far as I understand it...your script calculates single colorspace errors, right? Rather than "miscalls" in true basespace?
Yes, except that it estimates the number of errors per read from the quality values assigned by the SOLiD base (color) caller.

For example, if a read is 35 bases and each base has a quality value of 10, that is a 10% chance of error per base, so the estimated number of miscalls would be 3.5 (= 0.1*35). But if each base had a quality value of 20 (a 1% chance of error per base), the estimated number of miscalls for that read would be 0.35 (= 0.01*35).

Of course normal reads will have different quality values for each base. To estimate the number of miscalls, the script just adds up the estimated chance of a miscall for each base.
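To make the arithmetic concrete, here is a minimal Python sketch of that sum. It is not the script discussed in this thread; the function name is made up for illustration, and it assumes the quality values are Phred-scaled, so a quality Q implies an error probability of 10^(-Q/10).

```python
def expected_miscalls(quals):
    """Estimate the number of miscalls in one read.

    quals: iterable of quality values, one per call.
    Assumes Phred scaling: error probability = 10 ** (-Q / 10).
    """
    return sum(10 ** (-q / 10) for q in quals)


# The two uniform-quality examples from the post:
print(expected_miscalls([10] * 35))  # about 3.5
print(expected_miscalls([20] * 35))  # about 0.35

# A read with mixed quality values is handled the same way.
print(expected_miscalls([30, 25, 20, 12, 8]))
```

The result is only an expected count. It is meaningful only to the extent that the quality values are well calibrated.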

The major pitfall here is that I have no idea whether the SOLiD base caller accurately predicts its own error rate. I gather that the SOLiD base caller is tuned on mappable reads (those with 3 errors or fewer). It should be possible to check how it does on reads mapped with up to 6 errors against a reference sequence without a lot of redundant/low complexity segments. But I have not done this.
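A rough sketch of what such a check could look like in Python, assuming you already have, for each call in the mapped reads, its quality value and whether it mismatched the reference. The input format and function name are hypothetical, and Phred scaling is again assumed. Note also that a mismatch against the reference may be a real variant rather than a miscall, so the observed rates are only an upper-bound style estimate.

```python
from collections import defaultdict


def observed_vs_predicted(calls):
    """Compare predicted and observed error rates per quality value.

    calls: iterable of (quality_value, is_mismatch) pairs, one per call
           in reads mapped to a reference (hypothetical input format).
    Returns {quality: (n_calls, predicted_rate, observed_rate)}.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for q, is_mismatch in calls:
        totals[q] += 1
        errors[q] += int(is_mismatch)
    return {
        q: (totals[q], 10 ** (-q / 10), errors[q] / totals[q])
        for q in sorted(totals)
    }


# Toy data, invented purely to show the output shape.
toy = [(10, True)] + [(10, False)] * 9 + [(20, False)] * 100
for q, (n, predicted, observed) in observed_vs_predicted(toy).items():
    print(q, n, predicted, observed)
```

If the caller is well calibrated, the observed rate for each quality value should track the predicted one once there are enough calls in each bin.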

--
Phillip

10-13-2008, 11:41 PM   #6
ECO

Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,358

Quote:
 You may find this useful, but it's falling into many of the pitfalls of non-SOLiD informatics people.
I'd love to hear more on that line of thinking....

