I have reads in SOLiD format: they are stored in two files, one in FASTA format containing the color-space reads and another containing the qualities. Here is an example for one read (I combined the read and its qualities from the two files):
>SRR179591.7 2_59_264_F3
T.23.2121.101131333000102.1.0022.0223.0001.0.313002
>SRR179591.7 2_59_264_F3
0 29 27 0 30 33 31 22 0 31 4 30 8 12 30 7 31 18 15 30 24 30 19 18 0 21 0 22 5 26 18 0 5 20 28 10 0 12 4 25 21 0 11 0 8 17 29 11 8 22
I used Cutadapt to trim the reads and remove adapters. Cutadapt always produces a single FASTQ file (instead of two files), and it encodes the qualities as Phred characters instead of fastq-int. The Cutadapt output for the above read is:
@SRR179591.7 2_59_264_F3
T.23.2121.101131333000102.1.0022.0223.0001.0.313002
+
!><!?B@7!@%?)-?(@30?9?43!6!7&;3!&5=+!-%:6!,!)2>,)7
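To make the relationship between the two representations concrete, here is a minimal Python sketch. It assumes the Cutadapt output uses the Phred+33 offset, which is consistent with the example above (quality 0 becomes `!`, 29 becomes `>`). The function name `encode_phred33` is only an illustration and is not part of Cutadapt.

```python
def encode_phred33(quals):
    """Turn integer qualities into a Phred+33 ASCII string."""
    return "".join(chr(q + 33) for q in quals)

# First eight qualities of the example read, as stored in the SOLiD quality file
solid_quals = "0 29 27 0 30 33 31 22"
encoded = encode_phred33(int(q) for q in solid_quals.split())
print(encoded)  # prints !><!?B@7
```

The printed string matches the first eight characters of the quality line in the Cutadapt output, so the two files carry the same values in different encodings.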
Now, when I use TopHat to map this file, it complains and asks whether the file is really FASTQ-int, which suggests that TopHat expects the qualities in the original SOLiD integer format (not as Cutadapt converted them). Here is the exact error message from TopHat:
Error running bowtie:
Too few quality values for read: 11T0=
are you sure this is a FASTQ-int file?
terminate called after throwing an instance of 'int'
So, my question: what is the best solution for this problem?
1) Should I convert the quality values in the Cutadapt output from Phred to fastq-int? Does anyone have a script for this?
2) Is there a specific parameter in TopHat that I can use to tell it that the input is SOLiD data but the qualities are in Phred rather than fastq-int?
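For option 1, a conversion could look roughly like the Python sketch below. It is only a sketch under several assumptions: the qualities are Phred+33, every record occupies exactly four lines, and "fastq-int" means the quality line holds space-separated integers. The function names are hypothetical, and I have not checked whether TopHat/Bowtie expects anything further for color-space reads (for example, how the primer base is handled).

```python
def phred33_to_int_line(qual):
    """Decode a Phred+33 quality string into space-separated integers."""
    return " ".join(str(ord(c) - 33) for c in qual)

def convert_records(lines):
    """Rewrite the 4th line of each 4-line FASTQ record as integer qualities.

    Assumes strict 4-line records (no wrapped sequence or quality lines).
    """
    out = []
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 3:  # quality line of the record
            line = phred33_to_int_line(line)
        out.append(line)
    return out

# Shortened version of the record above, for illustration only
record = ["@SRR179591.7 2_59_264_F3", "T.23.212", "+", "!><!?B@7"]
for line in convert_records(record):
    print(line)
```

For a real file, the same function could be applied to an open file handle, e.g. `convert_records(open("trimmed.fastq"))`, where `trimmed.fastq` is a placeholder name.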