dagarfield 11-09-2010 03:18 PM

Needed: gapped alignment tool to save my sequences from the SMART kit
Hi folks,

I'm looking for some recommendations. I've inherited some 454 sequences of cDNA generated using a SMART kit. Perhaps not surprisingly (sigh...), the sequences are contaminated with SMART kit adapters and giant swaths of As or Ts.

BUT, there is hope. Embedded within these sequences are the actual sequences of transcripts that I would like to align to a reference genome (of a closely related species). In effect, the sequences look like this:


Can anyone recommend an alignment tool that deals well with gaps, such that the good sequence within the bad will align to the reference genome and the bad portion will be dropped? Out of familiarity, I'm leaning towards using BLASTZ (or LASTZ), but there are a whole lot more alignment tools out there than the last time I did this.

Many thanks!


malachig 11-16-2010 01:20 AM

It seems like the next-gen, fast, short-read aligners are generally focused on aligning entire reads and do not do substring alignments. You may be stuck using an old-school aligner. BLAST will certainly do what you are describing. BLAT or Exonerate would also probably be fine. If you are aligning cDNAs to a genome, you could also use splice-aware aligners such as Spidey or Splign. Of course, all of these options are way, way slower. Maybe someone else will suggest other options.
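
To make "substring alignment" concrete, here is a minimal Smith-Waterman local alignment sketch in Python. It is only an illustration of the idea, not how BLAST, BLAT or Exonerate are implemented, and it is far too slow for real 454 data. The function name `local_align`, the scoring values and the toy sequences are all assumptions made up for this example.

```python
def local_align(read, ref, match=2, mismatch=-3, gap=-4):
    """Smith-Waterman local alignment with a linear gap penalty.

    Returns (score, read_start, read_end, ref_start, ref_end), where the
    start/end pairs are Python slice bounds of the best-scoring local hit.
    Scoring values are illustrative assumptions, not tuned parameters.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; scores are floored at 0 so bad flanks are ignored.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # match or mismatch
                score[i - 1][j] + gap,      # gap in the reference
                score[i][j - 1] + gap,      # gap in the read
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score drops to 0.
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if read[i - 1] == ref[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, i, best_pos[0], j, best_pos[1]


# Toy example: a made-up "good" core flanked by poly-A and poly-T junk.
core = "GATTACAGGCTTCA"
read = "A" * 10 + core + "T" * 10
ref = "CCG" + core + "CCGG"
score, rs, re_, fs, fe = local_align(read, ref)
print(read[rs:re_], ref[fs:fe])
```

Because the score is never allowed to go below zero, the poly-A and poly-T flanks contribute nothing and only the embedded core is reported.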

drio 11-16-2010 05:05 AM

As far as I can tell, the 454 analysis tool will take care of the adapter for you. Have you tried it?

dagarfield 12-10-2010 05:58 AM

454 adaptor removal
Thanks for the suggestion. In the end, the remove adaptor and repeat screen options on the 454 software did the trick just fine.
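
For anyone without access to the 454 software, the general idea of end trimming can be sketched in a few lines of Python. This is not the 454 software's algorithm, just a rough illustration: the `ADAPTER` string is a made-up placeholder (not the real SMART adapter sequence), and `trim_read` and `min_run` are hypothetical names chosen for the example. It only handles exact adapter copies at the read ends and homopolymer A/T runs of at least `min_run` bases.

```python
import re

# Hypothetical placeholder, NOT the real SMART kit adapter sequence.
ADAPTER = "ACGTACGTACGT"


def trim_read(seq, adapter=ADAPTER, min_run=10):
    """Strip an exact adapter copy and long A/T runs from both read ends."""
    seq = seq.upper()

    # Remove an exact adapter match at the 5' and 3' ends, if present.
    if seq.startswith(adapter):
        seq = seq[len(adapter):]
    if seq.endswith(adapter):
        seq = seq[:-len(adapter)]

    # Remove leading and trailing runs of A or T at least min_run long.
    seq = re.sub(r"^(A{%d,}|T{%d,})" % (min_run, min_run), "", seq)
    seq = re.sub(r"(A{%d,}|T{%d,})$" % (min_run, min_run), "", seq)
    return seq


dirty = ADAPTER + "T" * 12 + "GATTACAGGC" + "A" * 15 + ADAPTER
print(trim_read(dirty))
```

Real adapter copies in 454 reads often carry sequencing errors, so a production tool would need approximate matching instead of the exact string comparison used here.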

