SEQanswers

06-19-2014, 06:28 AM   #1
Helical
Junior Member

Location: US

Join Date: Mar 2013
Posts: 9
Processing SOLiD data from SRA using Tophat

Hello all,

I'm trying to run TopHat on SOLiD data from an SRA file and am running into problems with the FASTQ file formatting.

After running fastq-dump on the SRA file, I get the following format:

Quote:
@SRR1119927.1 solid309_20110721_FRAG_BC_yadegari_1_55_1170 length=50
T000002201013000130000000.01...20...2....2.....2...
+SRR1119927.1 solid309_20110721_FRAG_BC_yadegari_1_55_1170 length=50
!+,0,,/'*&/)&&)2%&+2.0%37!7%!!!1%!!!%!!!!5!!!!!5!!!
Executing Tophat like this:

tophat -C -o output --bowtie1 ColorIndex SRR.fastq

Results in the following error:

Quote:
Error running bowtie:
Too few quality values for read: 2899T33
are you sure this is a FASTQ-int file?
I researched this error and found that I may need to use the --quals option and provide a separate quality file. So I split the FASTQ file into two separate files, one with the reads and one with the qualities:

Quote:
@SRR1119927.1 solid309_20110721_FRAG_BC_yadegari_1_55_1170 length=50
T000002201013000130000000.01...20...2....2.....2...
Quote:
+SRR1119927.1 solid309_20110721_FRAG_BC_yadegari_1_55_1170 length=50
!+,0,,/'*&/)&&)2%&+2.0%37!7%!!!1%!!!%!!!!5!!!!!5!!!
And ran:

tophat -C --quals -o output --bowtie1 ColorIndex SRR.fastq SRR_qual.fastq

That generates the following error:

Quote:
Error encountered parsing file SRR.fastq:
Premature end of file (missing quality values for SRR1119927.1 solid309_20110721_FRAG_BC_yadegari_1_55_1170 length=50)
I can't find any information on how to format the base and quality files when they are separated so that TopHat can read them. Is this my problem, or is it something else?

<EDIT>

I reformatted the two split files as FASTA:

Quote:
>SRR1119927.1 solid309_20110721_FRAG_BC_yadegari_1_55_1170 length=50
T000002201013000130000000.01...20...2....2.....2...
Quote:
>SRR1119927.1 solid309_20110721_FRAG_BC_yadegari_1_55_1170 length=50
!+,0,,/'*&/)&&)2%&+2.0%37!7%!!!1%!!!%!!!!5!!!!!5!!!
But now get the following error:

Quote:
Error running 'prep_reads'
Error: beginning of quality values record not found! (!'/,<&.&&*'%1*%.2(%&20%'&!')!!!%&!!!1!!!!1!!!!!%!!!)

Last edited by Helical; 06-19-2014 at 06:43 AM.
06-19-2014, 06:54 AM   #2
mastal
Senior Member

Location: uk

Join Date: Mar 2009
Posts: 667

TopHat is probably expecting the data to be in two files, .csfasta and .qual.

I think there should be a command, 'abi-dump', to use instead of fastq-dump, that will produce the file formats that you need.
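
To illustrate the two-file layout mentioned here, below is a rough sketch that derives it from a FASTQ record like the one quoted in the first post. The function name fastq_to_cs and the file names are hypothetical. Two things are assumed rather than confirmed in this thread: the qualities are Phred+33 encoded, and the first quality character belongs to the primer base (the leading T) and can be dropped. The "FASTQ-int" wording in the bowtie error suggests integer quality values are expected in color mode, which is why the sketch writes them as space-separated integers. Running abi-dump is likely the more reliable route; this only shows the shape of the files.

```shell
#!/bin/sh
# fastq_to_cs IN.fastq OUT.csfasta OUT.qual
# Sketch only: split a colorspace FASTQ (as written by fastq-dump) into a
# CSFASTA-style reads file and a quality file of space-separated integers.
# Assumptions (not confirmed in the thread): Phred+33 encoding, and the first
# quality character pairs with the primer base, so it is skipped.
fastq_to_cs() {
    in="$1"; reads="$2"; quals="$3"
    awk -v reads="$reads" -v quals="$quals" '
        # Lookup table from printable ASCII character to its code point.
        BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i }
        # Header line: keep only the read ID, without the leading "@".
        NR % 4 == 1 { name = substr($1, 2) }
        # Sequence line: primer base followed by the color calls.
        NR % 4 == 2 { print ">" name > reads; print $0 > reads }
        # Quality line: convert each character after the first to an integer.
        NR % 4 == 0 {
            line = ""
            for (i = 2; i <= length($0); i++) {
                q = ord[substr($0, i, 1)] - 33
                line = line (i > 2 ? " " : "") q
            }
            print ">" name > quals; print line > quals
        }
    ' "$in"
}
```

It would be called as `fastq_to_cs SRR.fastq SRR.csfasta SRR.qual`, after which both output files carry the same ">" header for each read.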
07-14-2014, 09:23 AM   #3
mbblack
Senior Member

Location: Research Triangle Park, NC

Join Date: Aug 2009
Posts: 245

Did you use fastq-dump or abi-dump to generate your original files? If the SRA submission was in color space, then you should use "abi-dump", NOT fastq-dump, with the SRA toolkit. The abi-dump command will give you matched csfasta/csqual files.
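
A minimal sketch of that workflow, reusing the TopHat options from the first post. The SRA file path is hypothetical, and the output names (`_F3.csfasta` and `_F3_QV.qual`) are an assumption about how abi-dump names files for a fragment run, so check the directory listing after dumping. The `run` helper is illustrative only: it prints each command by default and executes it when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: dump colorspace reads with abi-dump, then align with TopHat.
# By default the commands are only printed; set DRY_RUN=0 to execute them.

SRA_FILE="SRR1119927.sra"   # hypothetical path to the downloaded SRA file
PREFIX="SRR1119927"         # accession used as the output file prefix

# Assumed output names for a fragment (F3) run; verify with ls after dumping.
READS="${PREFIX}_F3.csfasta"
QUALS="${PREFIX}_F3_QV.qual"

# Print the command in dry-run mode, otherwise run it as given.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run abi-dump "$SRA_FILE"
# Same options as in the first post; reads file first, then the quality file.
run tophat -C --quals --bowtie1 -o output ColorIndex "$READS" "$QUALS"
```

Printing first makes it easy to confirm the file names before committing to a long alignment run.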
__________________
Michael Black, Ph.D.
ScitoVation LLC. RTP, N.C.