04-29-2015, 10:01 AM   #5
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Hi David,

You can recalibrate the quality scores in several ways. Multiple calibration matrices are generated; by default, 2 are used:

qp: Tracks match and mismatch rates by quality score and position in read (tuples like (37,120)).
qb012: Tracks match and mismatch rates by quality score, current base, and the preceding 2 bases (tuples like (37,A,G,A)).
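
To make the idea of a calibration matrix concrete, here is a minimal Python sketch of a table that counts matches and mismatches per tuple and turns the observed error rate into a Phred-scaled quality. This is only an illustration of the general idea, not BBDuk's actual implementation: the class name, method names, and the pseudocount smoothing are all assumptions made for the example.

```python
import math
from collections import defaultdict


class CalibrationMatrix:
    """Hypothetical match/mismatch counter keyed by a tuple.

    The key could be (quality, position) for a qp-style matrix, or
    (quality, base, base, base) for a qb012-style matrix.
    """

    def __init__(self):
        # key -> [match_count, mismatch_count]
        self.counts = defaultdict(lambda: [0, 0])

    def observe(self, key, is_match):
        """Record one aligned base as a match or a mismatch."""
        self.counts[key][0 if is_match else 1] += 1

    def empirical_quality(self, key, pseudocount=1):
        """Phred-scaled quality from the observed mismatch rate.

        The pseudocount is an assumption of this sketch; it avoids
        log(0) for keys that have no observed mismatches.
        """
        matches, mismatches = self.counts.get(key, (0, 0))
        total = matches + mismatches + 2 * pseudocount
        error_rate = (mismatches + pseudocount) / total
        return -10 * math.log10(error_rate)


# qp-style key: claimed quality 37 at read position 120.
qp = CalibrationMatrix()
for _ in range(999):
    qp.observe((37, 120), True)
qp.observe((37, 120), False)
print(round(qp.empirical_quality((37, 120)), 1))

# qb012-style key, written in the same order as the tuple above.
qb012 = CalibrationMatrix()
qb012.observe((37, "A", "G", "A"), True)
```

In this toy data, bases that claim Q37 at position 120 mismatch about once per thousand observations, so the table reports a quality much lower than 37. A sequence-independent matrix would work the same way, just with neighboring quality scores in the key instead of bases.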

In the current version (and, I believe, in the version you are using) you can enable or disable individual matrices in BBDuk with flags like "loadqb012=f loadqp=f loadq102=t".

That would disable the quality/position matrix and the quality/bases matrix, and enable the matrix that calibrates using the quality score together with the trailing and leading quality scores, which would make the recalibration completely sequence-independent. With extreme-GC organisms, it may be better to disable the qb012 matrix; I just need to do more testing. I'll post a description of how the recalibration works later. The results of the recalibration vary depending on which matrices are used.