
**HESmith** · 08-13-2015, 04:30 AM

The first four cycles of data are used for cluster calling, and also for establishing metrics (e.g., signal thresholds) for base-calling. Base-calls for the first cycles rely on a standard set of parameters, but those for subsequent cycles are calibrated to the actual data (and are therefore more accurate).

Illumina also corrects the signal for phasing. Each cluster contains ~1000 copies, and imperfect chemistry means that some molecules are +1 or -1 relative to the actual cycle. Base-calling accuracy is improved by correcting the measurement (filtering the signal based on the preceding and subsequent cycles). Phase correction for cycle five is partially dependent upon the lower-quality preceding cycle, so its quality is also lower (either that, or the algorithm doesn't use cycle four for phase correction). This is the same reason why the quality of the last base is always significantly lower: there's no subsequent cycle for phase correction.
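To make the phasing argument concrete, here is a toy sketch in Python. It is not Illumina's RTA algorithm: the single-channel signal, the function names `observe` and `correct`, and the 5% lag/lead rates are all assumptions chosen for illustration. Each cycle's signal is mixed with its neighbours, then corrected by subtracting the measured neighbouring cycles.

```python
def observe(template, n_cycles, lag=0.05, lead=0.05):
    """Toy single-channel signal: each cycle is contaminated by its neighbours."""
    observed = []
    for c in range(n_cycles):
        behind = template[c - 1] if c > 0 else 0.0  # molecules one base behind
        ahead = template[c + 1]                     # molecules one base ahead
        observed.append((1 - lag - lead) * template[c] + lag * behind + lead * ahead)
    return observed


def correct(observed, lag=0.05, lead=0.05):
    """First-order correction using the measured neighbouring cycles (assumed model)."""
    n = len(observed)
    corrected = []
    for c in range(n):
        behind = observed[c - 1] if c > 0 else 0.0
        # The last cycle has no subsequent measurement to subtract.
        ahead = observed[c + 1] if c < n - 1 else 0.0
        corrected.append((observed[c] - lag * behind - lead * ahead) / (1 - lag - lead))
    return corrected


# 1.0 where the template base lights up this channel, 0.0 elsewhere.
# The template is one base longer than the number of cycles sequenced.
template = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0]
n_cycles = 8
corrected = correct(observe(template, n_cycles))
errors = [abs(corrected[c] - template[c]) for c in range(n_cycles)]
print([round(e, 3) for e in errors])
```

In this toy setup the residual error is largest at the final cycle, because the contribution from molecules that ran one base ahead cannot be subtracted without a following measurement.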

**Brian Bushnell** · 08-13-2015, 09:37 AM

In my tests, the low quality values at the beginning of the read are generally erroneous, just an artifact of the software. Here's an example of a HiSeq run:

The first 13bp have artificially lowered quality values. The first base really is inaccurate, but the quality seems to peak by around the 5th base. The lines for "measured" reflect the actual quality as measured by mapping and counting mismatches.
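For anyone wanting to reproduce a "measured" curve like this, a minimal sketch follows. It assumes ungapped read/reference pairs of equal length and treats every mismatch as a sequencing error (no variant filtering). The function names and the cap of 60 are hypothetical, and this is not necessarily how the plot above was generated. The only fixed part is the Phred definition, Q = -10 * log10(error rate).

```python
import math


def phred(mismatches, total, cap=60.0):
    """Empirical Phred score, Q = -10 * log10(error rate), capped when no errors are seen."""
    if total <= 0:
        raise ValueError("total must be positive")
    if mismatches == 0:
        return cap
    return min(cap, -10.0 * math.log10(mismatches / total))


def measured_quality_by_cycle(aligned_pairs):
    """aligned_pairs: iterable of (read, reference) strings, ungapped and equal length."""
    mismatches, totals = [], []
    for read, ref in aligned_pairs:
        for cycle, (r, g) in enumerate(zip(read, ref)):
            if cycle == len(totals):  # first time this cycle is seen
                mismatches.append(0)
                totals.append(0)
            totals[cycle] += 1
            if r != g:
                mismatches[cycle] += 1
    return [phred(m, t) for m, t in zip(mismatches, totals)]


# Two toy reads against the same reference; only the first base of the second read differs.
pairs = [("ACGT", "ACGT"), ("TCGT", "ACGT")]
print(measured_quality_by_cycle(pairs))
```

Comparing these per-cycle values with the mean reported quality at each cycle shows where the reported scores are too conservative or too optimistic.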

Attached Files

Quality.png (41.1 KB, 1787 views)

**HESmith** · 08-13-2015, 09:54 AM

Originally posted by HESmith:

Base-calls for the first cycles rely on a standard set of parameters, but those for subsequent cycles are calibrated to the actual data (and are therefore more accurate).

Brian's analysis is correct (as usual). I should have said that Illumina assigns a conservative (lower) quality score to those early cycles. And you can clearly see the reduced quality of the last cycle in his graph.

**MU Core** · 08-14-2015, 06:02 AM

Illumina released a technote discussing the changes to the quality predictor modes used by RTA. The technote is attached.

Attached Files

RTA_Quality_Predictors_TechNote.pdf (491.6 KB, 490 views)

**NYGen** · 08-14-2015, 08:04 PM

Wow, this was extremely helpful. Thanks very much for the explanation HESmith and Brian, and also MU Core for the technote which I've saved for future reference. Cheers


WHY are the first few bases of Illumina HiSeq reads of lower quality?
