#1 |
|
Junior Member
Location: MA Join Date: Jun 2009
Posts: 3
|
I have previously been using Solexa for small RNA sequencing but am trying out SOLiD. I just received the data back for my first SOLiD smallRNA sequencing run and am having some difficulty with data analysis.
(1) The run produced 25 nt reads, but my small RNAs are expected to be ~21 nt long, so I assume there is primer sequence at the ends of the reads. What is the best way to filter these reads using the primer sequences and the .quality files? I know that ABI provides the "small RNA analysis pipeline" for this, but I want to filter reads using the primer sequences/qual files and output them in base space, not color space, because some programs I want to use for other analyses require base space. Does anybody have any idea how to do this? Any help would be greatly appreciated. |
|
|
|
|
|
#2 | |
|
Super Moderator
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,279
|
Quote:
|
|
|
|
|
|
|
#3 |
|
Member
Location: USA Join Date: Oct 2008
Posts: 97
|
I didn't think you could even order 25 bp chemistry anymore. Was this done on a version 2 machine?
Your best bet is to find someone with Bioscope so you can output these directly into SAM files. |
|
|
|
|
|
#4 | |
|
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 1,698
|
Quote:
If you absolutely must use a program that does not understand color space, you can use a trick called "double encoding". Double encoding leaves the sequence in color space but uses base letters (a, c, g, t) instead of numbers to indicate the colors. This allows the use of color-space-naive programs, with one caveat: to inter-convert strands in color space, one must reverse rather than reverse-complement. So forward and reverse strands have to be considered separately. (Assembly programs, for example, would create two contigs -- one top strand, the other bottom strand.) For strand-specific data like small RNA data sets, this will be less of an issue. As for clipping adaptor sequence from the end, that would be tricky with 25-base reads. I suppose you could just chop off the last 5 bases. Your best bet really is to use a color-space-aware program to map the reads, like the SOLiD™ System Small RNA Analysis Tool or its Bioscope equivalent, and then convert the reads that align to your reference to base space, if needed. -- Phillip |
|
|
|
|
|
|
#5 |
|
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 685
|
If you do go the double-encoding route (via the encodeFasta.py program provided within the Corona lite package), then make sure that you differentiate your double-encoded files from normal sequence files. ABI recommends prefixing all double-encoded file names with 'de_' and using the '-a' switch to add an annotation to the file.
Also be aware that color space, even double-encoded color space, cannot be reverse complemented in the normal fashion. |
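To see why the normal reverse complement does not apply, here is a small check using the standard SOLiD dinucleotide code, where each color is the XOR of the two-bit codes of adjacent bases (A=0, C=1, G=2, T=3). The helper names and the example sequence are made up for illustration.

```python
# Two-bit base codes; a color is the XOR of the codes of adjacent bases.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_colors(bases: str) -> str:
    """Encode a base-space sequence as its string of color calls."""
    return "".join(str(CODE[a] ^ CODE[b]) for a, b in zip(bases, bases[1:]))


def reverse_complement(bases: str) -> str:
    """Ordinary base-space reverse complement."""
    return bases.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    seq = "ATGGCA"  # arbitrary example sequence
    top = to_colors(seq)
    bottom = to_colors(reverse_complement(seq))
    # Complementing both bases of a pair leaves their color unchanged,
    # so the bottom strand is just the top-strand colors in reverse.
    print(top, bottom, bottom == top[::-1])
```

Complementing flips both bits of each base, which cancels out in the XOR, so only the order of the colors changes between strands.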
|
|
|
|
|
#6 | |
|
Member
Location: Milano, Italy Join Date: Dec 2008
Posts: 25
|
Hi. You can recognize the adapter (or P2) by a string which, if your sequences are 3' SREK sequences, begins with 3302010.
You could easily map {0,1,2,3} to {A,C,G,T}, trim the P2 with a Smith-Waterman (S&W) procedure (check the SREK protocol manual for the sequence in nucleotides), and then convert {A,C,G,T} back to {0,1,2,3}. REMEMBER not to touch the first T, i.e. reads should look like T0011112333, T2233111000, etc. Alternatively, map with SHRiMP against the reference genome or miRBase; the adapter won't align properly. HTH, Alessandro
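A rough sketch of the trimming step, working directly on the color string. It uses exact matching as a stand-in for the Smith-Waterman alignment suggested above, so it will miss adapters that contain color errors. The function name, the `min_overlap` parameter, and the literal use of the quoted `3302010` prefix are assumptions for illustration.

```python
ADAPTER_COLORS = "3302010"  # adapter start as quoted in the post above


def trim_adapter(read: str, adapter: str = ADAPTER_COLORS,
                 min_overlap: int = 3) -> str:
    """Cut the read where the adapter colors start; keep the leading T."""
    primer, colors = read[0], read[1:]
    # Case 1: the whole adapter prefix occurs somewhere in the read.
    pos = colors.find(adapter)
    if pos == -1:
        # Case 2: only the first k adapter colors fit at the 3' end.
        longest = min(len(adapter), len(colors)) - 1
        for k in range(longest, min_overlap - 1, -1):
            if colors.endswith(adapter[:k]):
                pos = len(colors) - k
                break
    if pos == -1:
        return read  # no adapter found, leave the read untouched
    return primer + colors[:pos]


if __name__ == "__main__":
    for r in ("T22331110003302010", "T22331110003302", "T0011112333"):
        print(r, "->", trim_adapter(r))
```

Note that the color at the insert/adapter junction depends on the last base of the insert, so matching the first adapter color literally is a simplification. A tolerant aligner, or mapping with a color-space-aware tool, would be more robust.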
|
|
|
|
|