SEQanswers

Old 01-28-2010, 08:26 AM   #21
mmuratet
Member
 
Location: Huntsville AL

Join Date: Jul 2008
Posts: 13
Default

I am looking at a lot of SOLiD data that we received from collaborators. I don't see any fastq files; all the read and qual data are in separate files. I don't see anything in the SOLiD manuals that indicates that their tools make fastq files. Might I ask: did you make these fastq files yourselves by collating read and qual data? Is there a utility that does this?
Thanks
Mike
Old 01-28-2010, 09:35 PM   #22
lix
Member
 
Location: Beijing

Join Date: Sep 2009
Posts: 17
Default

Hi mmuratet,

I collected the raw datasets from the SRA on the NCBI website. All the raw reads were generated on the ABI SOLiD platform and are in colour space, in a fastq-like format. I just downloaded them and never processed them myself.
You can try searching there.

Best,
lix
Old 01-29-2010, 06:56 AM   #23
mmuratet
Member
 
Location: Huntsville AL

Join Date: Jul 2008
Posts: 13
Default

Thanks for the reply. In the meantime, I found that the bfast suite has a tool, solid2fastq. The ABI manual says that their quality scores are phred values.
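To make the "collating read and qual data" step concrete, here is a minimal Python sketch. It is not the bfast solid2fastq tool and does not claim to match its output. It assumes a .csfasta file and a matching qual file that list the same reads in the same order, with "#" comment lines, ">name" headers, and space-separated integer qualities. The function names are made up for illustration, and clamping negative qualities to 0 is an assumption. Tools differ on whether the leading primer base is kept in the sequence line; this sketch leaves it in.

```python
def read_records(lines):
    """Yield (name, body) pairs from csfasta-style or qual-style lines."""
    name = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith(">"):
            name = line[1:]
        elif name is not None:
            yield name, line
            name = None


def collate(csfasta_lines, qual_lines):
    """Yield FASTQ records (Sanger, Phred+33) built from paired csfasta/qual input."""
    pairs = zip(read_records(csfasta_lines), read_records(qual_lines))
    for (name, seq), (qname, quals) in pairs:
        if name != qname:
            raise ValueError(f"read order differs: {name} vs {qname}")
        # Assumption: clamp negative values to 0 and cap at 93 so every
        # score maps to a printable character.
        scores = [min(max(int(q), 0), 93) for q in quals.split()]
        qual_string = "".join(chr(q + 33) for q in scores)
        yield f"@{name}\n{seq}\n+\n{qual_string}\n"


if __name__ == "__main__":
    csfasta = ["# example", ">1_2_3_F3", "T0123"]
    qual = [">1_2_3_F3", "10 20 30 -1"]
    for record in collate(csfasta, qual):
        print(record, end="")
```

For real data, pass open file handles instead of the small lists used here.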
Old 02-18-2010, 04:51 AM   #24
subhashree
Junior Member
 
Location: India

Join Date: Feb 2010
Posts: 3
Default

I am trying to analyse an SRA file from SOLiD through Galaxy. The file is recognised by Galaxy as a FASTQ file, but the groomer does not accept it for conversion to Sanger or other formats. However, the same pipeline works fine for GA-II data. Can you help?
Old 02-18-2010, 06:31 AM   #25
nekrut
Member
 
Location: Penn State

Join Date: Apr 2009
Posts: 22
Default Converting solid to fastq in Galaxy

Galaxy has a new tool called solid2fastq that converts fragment and mate-pair runs into fastq files that can be mapped by bowtie. The tool takes care of the "orphaned" mates and makes sure that, for a mate-pair run, the resulting fastq files have exactly the same number of reads. A video explaining how to use this for fragment runs is here:

http://screencast.g2.bx.psu.edu/gala..._end/flow.html

and for mate pairs it is here:

http://screencast.g2.bx.psu.edu/gala...pair/flow.html

These can also be accessed from the Galaxy site (http://usegalaxy.org) as quickies 8 and 9.

Let us (galaxy-user@bx.psu.edu) know if you have issues.
Old 02-20-2010, 10:11 PM   #26
subhashree
Junior Member
 
Location: India

Join Date: Feb 2010
Posts: 3
Default RNA-Seq

I am trying to retrieve data for RNA-Seq experiments, preferably human. I have tried the UCSC browser and EMBL, but I am not able to find the right link. Can anyone suggest a database for this, or any other link?
Old 11-12-2011, 10:24 AM   #27
pepperoni
Member
 
Location: US

Join Date: Oct 2011
Posts: 59
Exclamation

Quote:
Originally Posted by BENM View Post
Hi, pliang

samt's question is "Convert SOLiD fastq to Illumina fastq", and Illumina FASTQ differs from standard (Sanger) FASTQ in its quality format.

The syntax of Solexa/Illumina read format is almost identical to the FASTQ format, but the qualities are scaled differently. Given a character $sq, the following Perl code gives the Phred quality $Q:

$Q = 10 * log(1 + 10 ** ((ord($sq) - 64) / 10.0)) / log(10);
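
For readers who prefer Python, here is a minimal sketch of the same formula. The function name solexa_char_to_phred is made up for illustration, and the sketch assumes the offset of 64 used in the Perl line above.

```python
import math


def solexa_char_to_phred(sq: str) -> float:
    """Convert one Solexa FASTQ quality character to an unrounded Phred quality."""
    solexa_q = ord(sq) - 64  # Solexa qualities are stored as chr(Q + 64)
    return 10 * math.log10(1 + 10 ** (solexa_q / 10.0))


if __name__ == "__main__":
    for ch in ";@h":
        print(ch, round(solexa_char_to_phred(ch), 2))
```

At high qualities the two scales nearly coincide (Solexa 40 is about Phred 40), while at the low end they diverge (Solexa 0 is about Phred 3).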

The ASCII characters in Solexa FASTQ mean:
Code:
CHAR	DEC	QUALITY
A	65	1
B	66	2
C	67	3
D	68	4
E	69	5
F	70	6
G	71	7
H	72	8
I	73	9
J	74	10
K	75	11
L	76	12
M	77	13
N	78	14
O	79	15
P	80	16
Q	81	17
R	82	18
S	83	19
T	84	20
U	85	21
V	86	22
W	87	23
X	88	24
Y	89	25
Z	90	26
[	91	27
\	92	28
]	93	29
^	94	30
_	95	31
`	96	32
a	97	33
b	98	34
c	99	35
d	100	36
e	101	37
f	102	38
g	103	39
h	104	40
;	59	-5
<	60	-4
=	61	-3
>	62	-2
?	63	-1
@	64	0
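In contrast, standard (Sanger) FASTQ encodes a Phred quality Q as the character chr(Q + 33). The table below lists, for each Solexa quality from -64 to 64, the Sanger character it converts to; DEC is the Solexa quality plus 64, i.e. the index into the conversion table built further down:
Code:
SANGER_CHAR	DEC	SOLEXA_QUALITY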
!       0       -64
!       1       -63
!       2       -62
!       3       -61
!       4       -60
!       5       -59
!       6       -58
!       7       -57
!       8       -56
!       9       -55
!       10      -54
!       11      -53
!       12      -52
!       13      -51
!       14      -50
!       15      -49
!       16      -48
!       17      -47
!       18      -46
!       19      -45
!       20      -44
!       21      -43
!       22      -42
!       23      -41
!       24      -40
!       25      -39
!       26      -38
!       27      -37
!       28      -36
!       29      -35
!       30      -34
!       31      -33
!       32      -32
!       33      -31
!       34      -30
!       35      -29
!       36      -28
!       37      -27
!       38      -26
!       39      -25
!       40      -24
!       41      -23
!       42      -22
!       43      -21
!       44      -20
!       45      -19
!       46      -18
!       47      -17
!       48      -16
!       49      -15
!       50      -14
!       51      -13
!       52      -12
!       53      -11
!       54      -10
"       55      -9
"       56      -8
"       57      -7
"       58      -6
"       59      -5
"       60      -4
#       61      -3
#       62      -2
$       63      -1
$       64      0
%       65      1
%       66      2
&       67      3
&       68      4
'       69      5
(       70      6
)       71      7
*       72      8
+       73      9
+       74      10
,       75      11
-       76      12
.       77      13
/       78      14
0       79      15
1       80      16
2       81      17
3       82      18
4       83      19
5       84      20
6       85      21
7       86      22
8       87      23
9       88      24
:       89      25
;       90      26
<       91      27
=       92      28
>       93      29
?       94      30
@       95      31
A       96      32
B       97      33
C       98      34
D       99      35
E       100     36
F       101     37
G       102     38
H       103     39
I       104     40
J       105     41
K       106     42
L       107     43
M       108     44
N       109     45
O       110     46
P       111     47
Q       112     48
R       113     49
S       114     50
T       115     51
U       116     52
V       117     53
W       118     54
X       119     55
Y       120     56
Z       121     57
[       122     58
\       123     59
]       124     60
^       125     61
_       126     62
`       127     63
a       128     64
So it is easy to convert Solexa->Sanger quality: you just need to build a conversion table in a Perl script, like this:
Code:
# Solexa->Sanger quality conversion table
my @conv_table;
for (-64..64) {
    # index = Solexa quality + 64; value = Sanger character for the rounded Phred quality
    $conv_table[$_+64] = chr(int(33 + 10*log(1+10**($_/10.0))/log(10)+.499));
}
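
The same table can be built in Python and applied to a whole quality string. This is a sketch only: the names SOLEXA_TO_SANGER and solexa_to_sanger are made up for illustration, and it keeps the rounding constant (.499) and the -64..64 range from the Perl loop above.

```python
import math

# Solexa->Sanger conversion table; index = Solexa quality + 64
SOLEXA_TO_SANGER = [
    chr(int(33 + 10 * math.log10(1 + 10 ** (q / 10.0)) + 0.499))
    for q in range(-64, 65)
]


def solexa_to_sanger(quality_string: str) -> str:
    """Re-encode a Solexa (offset 64) quality string as Sanger (Phred+33)."""
    out = []
    for ch in quality_string:
        solexa_q = ord(ch) - 64
        if not -64 <= solexa_q <= 64:
            raise ValueError(f"character {ch!r} is outside the table range")
        out.append(SOLEXA_TO_SANGER[solexa_q + 64])
    return "".join(out)


if __name__ == "__main__":
    print(solexa_to_sanger("h@;"))
```

The explicit range check is a design choice: a character outside the table usually means the input was not Solexa-encoded in the first place, so failing loudly is safer than emitting a wrong quality.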

I am trying to write a universal script for converting between the Solexa/Illumina, SOLiD/ABI, 454/Roche and 3730/Sanger formats for different purposes, but I need to know your requirements first; after that, I will share it with you all.

Hope this answers your question.
BTW, I attach SOLiD2std.pl for your question; it is SOLiD2Solexa.pl with a small change.
Hi Pliang, I am using your script SOLiD2std.pl. At the beginning the file looks fine, but then some reads look weird, with missing quality data. Do you know how I can solve that?

Quote:
@373_15_180_F3
CTCATAGCCCTCCGGCAGAATGAACGGACATGTACGACCATAACATAACA
+
?B=@BBB@>?2A=?BA8;;>52>72%>?=>/=:;?<=@9><?B1<@%?85
@373_15_216_F3
TCGAGCGGCCCCCATCTCCTAATAGTTATACGCCGCACATAACATTATCA
+
(
@373_15_605_F3
ACGATCTTGCCGGCACCGCGCCGTATTAGCGCGTATATATAGCGCGCGCG
+

@373_15_663_F3
TTCCTCATGGCCCGGGCGTTGTCCCATGCCGCACAATCGAGACGTCACTC
+
BBAB@BBA;9=BB-B@BA7B<?+6B:@29'BB<%;7B6C<)7?&-?%6+:

thanks
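
Records like the second and third ones in the sample above can be found automatically. The following Python sketch flags reads whose quality line is shorter or longer than the sequence line. It assumes strict four-line FASTQ records (no wrapped sequences), and the function name find_bad_records is made up for illustration.

```python
def find_bad_records(lines):
    """Return names of 4-line FASTQ records whose quality and sequence lengths differ."""
    # Strip only line endings: an empty quality line must stay an empty string.
    lines = [line.rstrip("\r\n") for line in lines]
    bad = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            bad.append(header[1:])
    return bad


if __name__ == "__main__":
    sample = [
        "@r1", "ACGT", "+", "IIII",
        "@r2", "ACGT", "+", "(",
        "@r3", "ACGT", "+", "",
    ]
    print(find_bad_records(sample))
```

Counting lines in fours, rather than searching for "@" at the start of a line, matters here because "@" is also a valid quality character.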
Old 11-12-2011, 10:35 AM   #28
pepperoni
Member
 
Location: US

Join Date: Oct 2011
Posts: 59
Default

Quote:
Originally Posted by pepperoni View Post
Hi Pliang, I am using your script SOLiD2std.pl. At the beginning the file looks fine, but then some reads look weird, with missing quality data. Do you know how I can solve that?



thanks
Is there a way to keep my quality data as it is and only use your script to do the number-to-base translation?
thanks
Old 11-12-2011, 10:37 AM   #29
pepperoni
Member
 
Location: US

Join Date: Oct 2011
Posts: 59
Default

Quote:
Originally Posted by nekrut View Post
Galaxy has a new tool called solid2fastq that converts fragment and mate-pair runs into fastq files that can be mapped by bowtie. The tool takes care of the "orphaned" mates and makes sure that in the case of mate pair run the resulting fastq files have exactly the same number of reads. A video explaining how to use this for fragment runs is here:

http://screencast.g2.bx.psu.edu/gala..._end/flow.html

and for mate pairs it is here:

http://screencast.g2.bx.psu.edu/gala...pair/flow.html

These can also be accessed from the Galaxy site (http://usegalaxy.org) as quickies 8 and 9.

Let us (galaxy-user@bx.psu.edu) know if you have issues.
I have had issues: it does not give me the correct conversion; it gives me a frameshift!
Old 11-12-2011, 12:32 PM   #30
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 838
Default

Just in case you weren't aware, bowtie has changed a bit over the years. It is now able to quite easily handle SOLiD data as colour-space FASTA files and quality files (use options '-C -f' and '-Q' or '--Q1/--Q2' depending on whether it's paired end or not). Note that the colour-space switch changes the default read orientation to '--ff', so you may need to add in a '--fr' option for paired-end matching (I needed to do this for SOLiD4 data).
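
The options mentioned above can be assembled into a command line. The Python sketch below only builds the argument list and does not run bowtie. The index and file names are placeholders, the function name is made up for illustration, and apart from bowtie's -1/-2 arguments for paired files it uses only the flags named in this post. Whether '--fr' is needed depends on your library, so it is a parameter rather than a fixed choice.

```python
def bowtie_colourspace_command(index, reads, qual, reads2=None, qual2=None,
                               orientation="--fr"):
    """Build a bowtie argument list for colour-space FASTA reads plus quality files."""
    cmd = ["bowtie", "-C", "-f"]  # colour-space input, FASTA format
    if reads2 is None:
        # Single-end (fragment) run: one quality file via -Q
        cmd += ["-Q", qual, index, reads]
    else:
        # Paired run: one quality file per mate, explicit mate orientation
        cmd += [orientation, "--Q1", qual, "--Q2", qual2,
                index, "-1", reads, "-2", reads2]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only
    print(" ".join(bowtie_colourspace_command(
        "my_colour_index", "run_F3.csfasta", "run_F3_QV.qual")))
```

To actually run the alignment, the list could be passed to subprocess.run once the placeholder names are replaced with real paths.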

Bowtie2 (which can handle gaps) will handle colour-space input, but it will (in the beta3 version) only record as a match if the base-space conversion is perfect (no SNPs, no sequencer read errors). I assume this will only get better in the future.
Old 03-19-2012, 04:01 PM   #31
srividya22
Junior Member
 
Location: New Jersey

Join Date: Mar 2012
Posts: 4
Default

Hello,

I am using your script SOLiD2Std.pl to convert 1100 genomes data to base space (fastq).

The script doesn't convert characters after a dot.

I want to convert it to fastq format and align it using Stampy.

Does anyone have a script that can do the conversion properly now?
Old 03-20-2012, 09:32 AM   #32
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 838
Default

Quote:
Originally Posted by srividya22 View Post
I am using your script SOLiD2Std.pl to convert 1100 genomes data to base space (fastq).

The script doesn't convert characters after a dot.

I want to convert it to fastq format and align it using Stampy.

Does anyone have a script that can do the conversion properly now?
After a dot, you can't assume anything about the sequence. All subsequent bases should be N in base space. Alignment should always be done in colour-space to get the most information (and least error) from the colour-space sequence.
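
The rule above can be sketched in Python as follows. The transition table is the standard SOLiD two-base colour code (0 keeps the base; 1 swaps A/C and G/T; 2 swaps A/G and C/T; 3 swaps A/T and C/G). The function name decode_colourspace is made up for illustration, and dropping the leading primer base from the output is an assumption about what the caller wants. This is for illustration only; as noted, aligning in colour-space is the better route.

```python
# NEXT_BASE[previous_base][colour] gives the next base
NEXT_BASE = {
    "A": "ACGT",
    "C": "CATG",
    "G": "GTAC",
    "T": "TGCA",
}


def decode_colourspace(read: str) -> str:
    """Translate a colour-space read (primer base + colours) to base space.

    Everything from the first unknown colour (".") onwards becomes N,
    because each base depends on the one before it.
    """
    prev = read[0].upper()
    if prev not in NEXT_BASE:
        prev = None  # unknown primer base: nothing can be decoded
    bases = []
    for colour in read[1:]:
        if prev is None or colour not in "0123":
            prev = None
            bases.append("N")
        else:
            prev = NEXT_BASE[prev][int(colour)]
            bases.append(prev)
    return "".join(bases)


if __name__ == "__main__":
    print(decode_colourspace("T0123"))
    print(decode_colourspace("T01.23"))
```

The same dependency explains why a single miscalled colour corrupts every base after it when a read is translated this way.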
Old 03-26-2012, 07:41 PM   #33
BENM
Member
 
Location: PRC

Join Date: May 2009
Posts: 33
Default

Quote:
Originally Posted by srividya22 View Post
Hello,

I am using your script SOLiD2Std.pl to convert 1100 genomes data to base space (fastq).

The script doesn't convert characters after a dot.

I want to convert it to fastq format and align it using Stampy.

Does anyone have a script that can do the conversion properly now?
I have already fixed some bugs in it; you can try it again.
Old 03-30-2012, 12:28 PM   #34
srividya22
Junior Member
 
Location: New Jersey

Join Date: Mar 2012
Posts: 4
Default downloading SOLid2Std.pl file

Hello BENM,

Where can I get the most recent SOLid2Std.pl? When I googled it, I couldn't locate it. Can you please specify the path?
Old 08-23-2012, 06:29 AM   #35
vz33
Junior Member
 
Location: London

Join Date: Aug 2012
Posts: 2
Default

Which FASTQ format does bowtie2 use: standard (Sanger) FASTQ or Solexa FASTQ? Thank you!
All times are GMT -8.

