Nature Methods 5, 679-682 (2008)
Published online: 6 July 2008 | doi:10.1038/nmeth.1230
Alta-Cyclic: a self-optimizing base caller for next-generation sequencing
Yaniv Erlich1, 2, Partha P Mitra1, Melissa delaBastide1, W Richard McCombie1 & Gregory J Hannon1, 2
1 Watson School of Biological Sciences, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA.
2 Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA.
Correspondence should be addressed to Gregory J Hannon.
Next-generation sequencing is limited to short read lengths and by high error rates. We systematically analyzed sources of noise in the Illumina Genome Analyzer that contribute to these high error rates and developed a base caller, Alta-Cyclic, that uses machine learning to compensate for noise factors. Alta-Cyclic substantially improved the number of accurate reads for sequencing runs up to 78 bases and reduced systematic biases, facilitating confident identification of sequence variants.