Old 11-09-2009, 05:30 PM   #8
nilshomer
Nils Homer
 
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285

Quote:
Originally Posted by sgupta View Post
Hi,

I am trying to convert color space sequences generated by an ABI SOLiD sequencer to actual bases using the following color space data "matrix":

AA=0
AC=1
AG=2
AT=3
CC=0
CA=1
CT=2
CG=3
GG=0
GT=1
GA=2
GC=3
TT=0
TG=1
TC=2
TA=3

So, this
>44_35_267_F3
T20220213203000111000122223221121222

gets converted to

>44_35_267_F3
CCTCCTGCTTAAAACACCCCAGAGATCTGTCAGAG
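
The conversion above can be reproduced with a short sketch. It assumes the 16-entry matrix listed in the post, and that the leading T is the known primer base: it is used only to start decoding and is not emitted. The name colors_to_bases is made up for illustration and does not come from any SOLiD tool.

```python
# Dinucleotide -> color table, copied from the matrix in the post.
ENCODE = {
    "AA": "0", "AC": "1", "AG": "2", "AT": "3",
    "CC": "0", "CA": "1", "CT": "2", "CG": "3",
    "GG": "0", "GT": "1", "GA": "2", "GC": "3",
    "TT": "0", "TG": "1", "TC": "2", "TA": "3",
}

# Invert the table: (previous base, color) -> next base.
DECODE = {(pair[0], color): pair[1] for pair, color in ENCODE.items()}


def colors_to_bases(read):
    """Translate a color space read (primer base + colors) into bases.

    Hypothetical helper: assumes the first character is the primer base,
    every other character is one of 0-3, and there are no '.' calls.
    """
    prev = read[0]          # primer base, used as the starting context only
    bases = []
    for color in read[1:]:
        prev = DECODE[(prev, color)]   # each base depends on the one before it
        bases.append(prev)
    return "".join(bases)


# Reproduces the sequence shown in the post for read 44_35_267_F3.
print(colors_to_bases("T20220213203000111000122223221121222"))
```

Because every base is derived from the base before it, the translation itself is mechanical; the sketch does nothing to handle missing calls or sequencing errors.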

I want to do this to be able to use alignment programs that cannot work with ABI color space data. But so far I think I am doing something wrong, because my alignment rates are less than 5% using published data (allowing up to 2 mismatches, mouse genome).

Any insights would be really appreciated.

I may just go ahead and use MAQ to do this in color space, but I am not sure why this does not work the way I am currently trying to do it. I am very new to SOLiD data, so I may be missing some piece of information here.

Thanks in advance.

You can also do the conversion directly on our web server:
http://genome.ucla.edu/bfast-server/. Click the left tab labeled CS2NT/NT2CS and enjoy!

I would recommend aligning in color space, because a single color error causes every base after it to be translated incorrectly. Many great color-space-aware mapping tools exist.
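
To make the error propagation concrete, here is a small sketch. It assumes the usual 2-bit view of the encoding (A=0, C=1, G=2, T=3, with next base = previous base XOR color), which agrees with the matrix quoted above. The helper name decode and the choice of which color to corrupt are arbitrary and only for illustration.

```python
BASES = "ACGT"  # index of each base is its 2-bit code: A=0, C=1, G=2, T=3


def decode(read):
    """Translate primer base + colors into bases via XOR (illustrative helper)."""
    prev = BASES.index(read[0])
    out = []
    for c in read[1:]:
        prev ^= int(c)          # next base = previous base XOR color
        out.append(BASES[prev])
    return "".join(out)


good = "T20220213203000111000122223221121222"

# Simulate one miscalled color: the 11th color (a 3) is read as a 1.
pos = 10                                    # 0-based index into the colors
bad = good[:1 + pos] + "1" + good[2 + pos:]

good_bases = decode(good)
bad_bases = decode(bad)
mismatches = sum(a != b for a, b in zip(good_bases, bad_bases))

print(good_bases)
print(bad_bases)
print(mismatches)   # 25: every base from the error onward is wrong
```

One wrong color out of 35 turns into 25 wrong bases here, far more than a 2-mismatch aligner will tolerate. In color space the same error is a single mismatch.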

Nils