**goudurix** · 07-26-2012, 10:13 AM

This happened to me once when I used -C with an Illumina dataset. Are you sure your reads are in color space?
Cheers

**HSV-1** · 07-26-2012, 06:13 PM

Originally posted by goudurix

> This happened to me once when I used -C with an Illumina dataset. Are you sure your reads are in color space?
> Cheers

The data is from ABI SOLiD. When I open the data file there are no ACGTs, only digits (1, 2, 3, ...).
So the reads are in color space.
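For anyone who wants to check this without eyeballing the file, here is a minimal Python 3 sketch. It assumes a color-space read is an optional primer base followed by the colors 0-3, with "." for a missing call. The names `looks_like_color_space` and `first_reads` are made up for illustration and are not part of tophat or bowtie.

```python
import re

# Assumption: a SOLiD read is an optional primer base followed by colors 0-3,
# with "." marking a missing color call.
COLOR_READ = re.compile(r"^[ACGTacgt]?[0-3.]+$")


def looks_like_color_space(seq_line):
    """Return True if a sequence line looks like a color-space read."""
    return bool(COLOR_READ.match(seq_line.strip()))


def first_reads(path, n=100):
    """Yield the sequence line of up to the first n FASTQ records (hypothetical helper)."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i // 4 >= n:
                break
            if i % 4 == 1:  # second line of each 4-line record
                yield line


print(looks_like_color_space("T0120321"))  # True
print(looks_like_color_space("ACGTACGT"))  # False
```

On a real file, something like `all(looks_like_color_space(s) for s in first_reads("reads.csfastq"))` would test the first 100 reads. The file name here is only a placeholder.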

**sonia.bao** · 08-01-2012, 12:05 AM

Got the same error here when feeding tophat2 with csfastq files as the input.

I checked the csfastq file, and found nothing wrong with it. No truncated reads or qual values. Then I tried bowtie to align the reads to the reference genome using the csfastq file as the input - bowtie finished without any error, and over 90% of the reads were mapped.

Try feeding tophat2 with csfasta+qual files as the input instead of csfastq. I tried that and tophat2 ran through successfully.

**HSV-1** · 08-01-2012, 12:09 AM

Originally posted by sonia.bao

> Got the same error here when feeding tophat2 with csfastq files as the input.
>
> I checked the csfastq file, and found nothing wrong with it. No truncated reads or qual values. Then I tried bowtie to align the reads to the reference genome using the csfastq file as the input - bowtie finished without any error, and over 90% of the reads were mapped.
>
> Try feeding tophat2 with csfasta+qual files as the input instead of csfastq. I tried that and tophat2 ran through successfully.

Thanks for your reply. How do I get the csfasta and qual files from the same csfastq file?

**sonia.bao** · 08-01-2012, 01:00 AM

Try this Python script. It takes a color-space .fastq file as input and writes two files, .csfasta and .QV.qual.

(It was not written by me. Someone wrote this script and shared it on this board (much appreciated!!!). If anybody knows who the author is, please let me know and I'll update it.)

csfastq2solid.py

Code:

import sys

fq = sys.argv[1]  # path to the color-space FASTQ file

# Output names reuse the input name up to the ".fastq" suffix
base = fq.split(".fastq")[0]
quals = open(base + ".QV.qual", "w")
seq = open(base + ".csfasta", "w")

for i, line in enumerate(open(fq)):
    mod = i % 4  # each FASTQ record spans 4 lines
    if mod == 0:  # read name
        assert line[0] == "@"
        quals.write(">" + line[1:])
        seq.write(">" + line[1:])
    elif mod == 1:  # color-space sequence
        seq.write(line)
    elif mod == 3:  # qualities: Phred+33 characters to space-separated integers
        print(" ".join(str(ord(q) - 33) for q in line.rstrip("\r\n")), file=quals)

seq.close(); quals.close()
print("wrote %s, %s" % (quals.name, seq.name), file=sys.stderr)
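To see what the conversion produces on a single record, here is a small Python 3 sketch of the same logic as a function that works in memory. The function name `csfastq_to_solid` and the sample read are made up for illustration. It assumes Phred+33 quality characters, as the script above does.

```python
import io


def csfastq_to_solid(fastq_lines):
    """Convert csfastq lines to (csfasta_text, qual_text). Hypothetical helper."""
    seq_out, qual_out = [], []
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\r\n")
        mod = i % 4  # each FASTQ record spans 4 lines
        if mod == 0:  # read name
            if not line.startswith("@"):
                raise ValueError("line %d: expected a read name starting with '@'" % (i + 1))
            seq_out.append(">" + line[1:])
            qual_out.append(">" + line[1:])
        elif mod == 1:  # color-space sequence, copied unchanged
            seq_out.append(line)
        elif mod == 3:  # Phred+33 characters to space-separated integers
            qual_out.append(" ".join(str(ord(q) - 33) for q in line))
    return "\n".join(seq_out) + "\n", "\n".join(qual_out) + "\n"


# One made-up record: primer base T followed by four colors
record = io.StringIO("@read1\nT0123\n+\n!+5?\n")
csfasta, qual = csfastq_to_solid(record)
print(csfasta, end="")
print(qual, end="")
```

This prints the .csfasta content (`>read1` then `T0123`) followed by the .QV.qual content (`>read1` then `0 10 20 30`), so each quality character becomes one integer per color call.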

Bowtie error when mapping ABI RNA-seq data with Tophat
