**SongLi** · 12-22-2010, 08:09 AM

A little update on this issue:

I wrote a script that trims off the first quality value of each read in my fastq file. With that change, tophat runs smoothly through the whole analysis.

I am still not sure that's the correct way of solving this problem.
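
For reference, a trimming step like the one described above could be sketched roughly as follows. This is not the script that was actually used; the function name `trim_first_quality` is made up for illustration, and it assumes strict 4-line FASTQ records (header, sequence, `+`, quality) with no line wrapping.

```python
def trim_first_quality(lines):
    """Yield FASTQ lines with the first quality character of each record removed.

    Assumption: every record is exactly 4 lines (header, sequence, '+', quality).
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 3:
            # Fourth line of each record is the quality string: drop its first value.
            line = line[1:]
        yield line


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python trim_qual.py < in.fastq > out.fastq
    for out_line in trim_first_quality(sys.stdin):
        print(out_line)
```

Applied to the record from the original post, this leaves the 36-character sequence untouched and shortens the quality string to 35 characters.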

Thanks,

Originally posted by SongLi:

Hi All,

I have trouble using tophat with files downloaded from SRA.

My command is:

./tophat -C -p 5 -o ./825tophat ../bowtie-0.12.7/indexes/ath_gmc_colspace_110510 ~/SRR039825.fastq

Error encountered parsing file /home/SRR039825.fastq:
Length mismatch between sequence and quality strings for SRR039825.1 923_6_55 (36 vs 36).

The sequence is here:
@SRR039825.1 923_6_55
T00310202021210203103230203233012210
+
!;>1<998495<<3$4.40%/87-101*&3,8#%'#

I dug into the code and found that the problem is at lines 931-934 of tophat.py, where the sequence is required to be 1 character longer than the quality string.

Why is this and how can I fix it?

Thanks,

Song Li

**xinwu** · 12-27-2010, 01:08 AM

Originally posted by SongLi:

A little update on this issue:

I wrote a script that trims off the first quality value of each read in my fastq file. With that change, tophat runs smoothly through the whole analysis.

I am still not sure that's the correct way of solving this problem.

Thanks,

This is due to the format used by NCBI. NCBI transforms all the data from different platforms to a standard FASTQ format.
TopHat uses Bowtie for read mapping, and for color-space data it expects csfasta and qual files. A sequence in a csfasta file has an additional leading 'T' adapter base compared with the matching entry in the qual file, so TopHat expects the sequence to be one character longer than the quality string. Just tell bowtie you are using fastq format rather than fasta.
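
The length rule described here can be checked before running TopHat. The snippet below is only a rough sketch, not TopHat's own code: the function name `find_length_mismatches` is hypothetical, and it assumes unwrapped 4-line FASTQ records.

```python
def find_length_mismatches(lines):
    """Return (read_name, seq_len, qual_len) for each record that breaks the
    color-space rule len(sequence) == len(quality) + 1.

    Assumption: every record is exactly 4 lines (header, sequence, '+', quality).
    """
    lines = [line.rstrip("\n") for line in lines]
    bad = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, _, qual = lines[i:i + 4]
        # The sequence carries one extra leading base, so it should be one longer.
        if len(seq) != len(qual) + 1:
            bad.append((header[1:], len(seq), len(qual)))
    return bad
```

For the SRR039825.1 read quoted in the first post, the sequence and quality string are both 36 characters long, so that record would be reported as a mismatch.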

**ngsbioinfo** · 12-28-2010, 10:27 AM

hi all,

I am new to the NGS analysis field. I am working on RNA-Seq data, aiming to identify all novel junctions and transcripts. I would appreciate it if anyone could help me with using tophat and cufflinks for this.

Thanks in advance


TopHat color space
