The 'CountCovariates' step of the Genome Analysis Toolkit (GATK) base quality recalibration tool uses platform information and, for SOLiD data, reads the color space (CS) tag. With an alignment generated by Bfast, I am getting this error from GATK:
'Unrecognized color space in SOLID read, color = . Unfortunately this bam file can not be recalibrated without full color space information because of potential reference bias.'
As far as I can tell, this occurs because GATK accepts only 0, 1, 2 and 3 as color states, but the CS tag written by Bfast can also include '.'.
An example SOLiD tag from Bfast:
'CS:Z:T033.0330033003300123.22122221212102233223313322131'
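To see which reads are affected, a CS value can be scanned for any color call outside 0-3. This is only a minimal sketch: the class and method names (CsTagCheck, firstInvalidColor) are made up for illustration and are not part of GATK or Bfast, and it assumes the value has already been pulled out of the CS:Z: field, with the primer base as its first character.

```java
public class CsTagCheck {

    /**
     * Returns the index of the first color call that is not 0, 1, 2 or 3,
     * or -1 if every call is valid. Index 0 holds the primer base, so the
     * scan starts at index 1.
     */
    public static int firstInvalidColor(String csValue) {
        for (int i = 1; i < csValue.length(); i++) {
            char c = csValue.charAt(i);
            if (c < '0' || c > '3') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The CS value from the example tag above, without the "CS:Z:" prefix.
        String cs = "T033.0330033003300123.22122221212102233223313322131";
        System.out.println("first invalid color at index " + firstInvalidColor(cs));
    }
}
```

For the example tag this reports the first '.', which sits right after "T033".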
And...from the GATK source:
' private static byte getNextBaseFromColor( SAMRecord read, final byte prevBase, final byte color ) {
    switch(color) {
        case '0':
            return prevBase;
        case '1':
            return performColorOne( prevBase );
        case '2':
            return performColorTwo( prevBase );
        case '3':
            return performColorThree( prevBase );
        default:
            throw new UserException.MalformedBam(read, "Unrecognized color space in SOLID read, color = " + (char)color +
                " Unfortunately this bam file can not be recalibrated without full color space information because of potential reference bias.");
    }
}'
Does anyone have a workaround for this problem?
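In case it helps anyone answering, here is my understanding of why a '.' cannot simply be skipped. In SOLiD two-base encoding each base is derived from the previous base plus the current color, so one missing color leaves every later base in the read undetermined from the colors alone. The sketch below uses the standard color code (0 keeps the base, 1 swaps A/C and G/T, 2 swaps A/G and C/T, 3 swaps A/T and C/G). It is not the GATK implementation; ColorSpaceSketch, nextBase and decode are names I made up for illustration.

```java
public class ColorSpaceSketch {

    // With A=0, C=1, G=2, T=3 the two-base code reduces to an XOR:
    // next base index = previous base index XOR color.
    private static final String BASES = "ACGT";

    /** Decodes one base from the previous base and a color call (0-3). */
    public static char nextBase(char prevBase, char color) {
        int prev = BASES.indexOf(prevBase);
        if (prev < 0) {
            throw new IllegalArgumentException("Unknown base: " + prevBase);
        }
        if (color < '0' || color > '3') {
            // A missing call such as '.' ends up here: there is no way to
            // derive the next base without a valid color.
            throw new IllegalArgumentException("Unrecognized color: " + color);
        }
        return BASES.charAt(prev ^ (color - '0'));
    }

    /** Decodes a CS value (primer base followed by colors) into bases. */
    public static String decode(String csValue) {
        StringBuilder out = new StringBuilder();
        char prev = csValue.charAt(0); // primer base, not part of the read
        for (int i = 1; i < csValue.length(); i++) {
            prev = nextBase(prev, csValue.charAt(i));
            out.append(prev);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("T0123"));
        try {
            decode("T033.0330033003300123.22122221212102233223313322131");
        } catch (IllegalArgumentException e) {
            System.out.println("decoding stopped: " + e.getMessage());
        }
    }
}
```

A tag with only valid colors decodes all the way through, while the Bfast example stops at the first '.'.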
Thanks,
Adam