**gringer** · 03-02-2012, 10:02 AM

Given that you're asking for a "tested program", I presume that the Galaxy tool will be good enough:

Galaxy

http://main.g2.bx.psu.edu/tool_runner?tool_id=cshl_fastq_quality_converter

**lpn** · 03-02-2012, 10:18 AM

What about something that I can run on the command line?

**gringer** · 03-02-2012, 12:31 PM

Originally posted by lpn:

What about something that I can run on the command line?

Why do you need a Phred+64 offset and/or the command line?

Most programs should work with Phred+33 (e.g. append '-Q 33' to the fastx command line).

Also, Biopython can read files as one format and write them as another:

Introduction to SeqIO · Biopython

http://biopython.org/wiki/SeqIO

Here's a quick python conversion script, derived from an example on that page:

Code:

#!/usr/bin/python
from Bio import SeqIO
SeqIO.convert("input.fastq", "fastq-sanger", "output.fastq", "fastq-illumina")

The hashbang isn't strictly needed, and the import is obvious, but it seemed too small with just a single line of code.
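For anyone curious what that conversion amounts to, here is a minimal pure-Python sketch of the offset shift, not Biopython's actual implementation. The function names are made up for illustration, it assumes unwrapped four-line FASTQ records, and it caps scores at Q62 so the output stays within printable ASCII (that cap is an assumption of this sketch).

```python
def sanger_to_illumina64(qual):
    """Re-encode one Phred+33 quality string as Phred+64."""
    out = []
    for ch in qual:
        q = ord(ch) - 33  # decode the Phred score from its Sanger character
        if q < 0:
            raise ValueError("not a valid Phred+33 character: %r" % ch)
        q = min(q, 62)  # assumption: cap so that 64 + q <= 126 ('~')
        out.append(chr(q + 64))  # re-encode with the +64 offset
    return "".join(out)


def convert_fastq_lines(lines):
    """Yield FASTQ lines with every fourth line (the qualities) re-encoded.

    Assumes each record is exactly four lines (no wrapped sequences).
    """
    for i, line in enumerate(lines):
        if i % 4 == 3:
            yield sanger_to_illumina64(line.rstrip("\n")) + "\n"
        else:
            yield line
```

Something like `sys.stdout.writelines(convert_fastq_lines(sys.stdin))` would turn it into a command-line filter, though the Biopython one-liner above is the better-tested route.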

**Jon_Keats** · 03-02-2012, 12:36 PM

Gringer, thanks for the "-Q 33" heads-up for FastX. You solved a major headache of mine. Now if only the Illumina mate-pairs we made were not 88% duplicates... As Homer would say, D'oh...

**maubp** · 03-02-2012, 02:26 PM

If you are still looking for a command line tool for the job, EMBOSS seqret can do this (and the reverse).

**lpn** · 03-02-2012, 11:24 PM

Originally posted by gringer:

Why do you need a Phred+64 offset and/or the command line?

Because tophat doesn't seem to handle Phred+33 (CASAVA 1.8) well, but works with Phred+64 (CASAVA 1.5+).

Originally posted by gringer:

Most programs should work with Phred+33 (e.g. append '-Q 33' to the fastx command line).

Also, Biopython can read files as one format and write them as another:

Introduction to SeqIO · Biopython

http://biopython.org/wiki/SeqIO

Here's a quick python conversion script, derived from an example on that page:

Code:

#!/usr/bin/python
from Bio import SeqIO
SeqIO.convert("input.fastq", "fastq-sanger", "output.fastq", "fastq-illumina")

The hashbang isn't strictly needed, and the import is obvious, but it seemed too small with just a single line of code.

Thanks a lot!

**gringer** · 03-03-2012, 01:14 AM

Originally posted by lpn:

Because tophat doesn't seem to handle Phred+33 (CASAVA 1.8) well, but works with Phred+64 (CASAVA 1.5+).

This is interesting and doesn't match my experience with tophat on recent Illumina runs. Do you have any "--solexa1.3-quals" options on your command line? Removing that should stop bowtie from using Phred+64, and go back to the default Phred+33.

There's also the Bowtie "--phred33-quals" option, which I guess you could add to tophat's bowtie call to force this:

Code:

nano $(which tophat)

**Jon_Keats** · 03-03-2012, 05:47 AM

As gringer said, if you DO NOT specify a qual flag it will work fine. You are not the only person with HiSeq data, which finally encodes the quality values in the standard Sanger format that nearly all programs expect by default. For most people today, the flag is just for processing legacy datasets.
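If it is unclear which encoding a file uses, a rough check is to look at the lowest quality character in a sample of reads. The sketch below is a heuristic only, not something tophat or bowtie does; the function name and thresholds are assumptions for illustration. Characters below ';' (ASCII 59) can only come from Phred+33, while a sample whose lowest character is '@' (ASCII 64) or above is probably Phred+64, although a small sample of very high quality Phred+33 reads could look the same.

```python
def guess_phred_offset(quality_strings):
    """Guess the quality offset from an iterable of FASTQ quality strings.

    Returns 33, 64, or None when the sample is empty or ambiguous.
    This is a heuristic; a larger sample gives a safer answer.
    """
    lowest = min(
        (ord(ch) for qual in quality_strings for ch in qual),
        default=None,
    )
    if lowest is None:
        return None  # nothing to inspect
    if lowest < 59:
        return 33  # too low for any +64-style encoding
    if lowest >= 64:
        return 64  # likely Phred+64, assuming the sample includes low scores
    return None  # 59-63: old Solexa range or high-quality Phred+33


if __name__ == "__main__":
    # Hypothetical quality lines, as they might appear in CASAVA 1.8 output
    print(guess_phred_offset(["@@CFFFFDHHHHH", "#1=DDFFF"]))
```

Feeding it the quality lines from the first few thousand reads should be enough to tell whether any "--solexa1.3-quals" style flag is appropriate.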

**lpn** · 03-03-2012, 07:47 AM

Originally posted by gringer:

This is interesting and doesn't match my experience with tophat on recent Illumina runs. Do you have any "--solexa1.3-quals" options on your command line? Removing that should stop bowtie from using Phred+64, and go back to the default Phred+33.

There's also the Bowtie "--phred33-quals" option, which I guess you could add to tophat's bowtie call to force this:

Code:

nano $(which tophat)

That works, but subsequent analysis produces strange results.

**gringer** · 03-03-2012, 11:54 AM

Originally posted by lpn:

That works, but subsequent analysis produces strange results.

What works? I suggested two options (not counting the python code). One was to remove --solexa1.3-quals from the tophat command line, and the other was to modify the bowtie parameters. I was deliberately vague about the second option because you need to know what you're doing before you do it (e.g. change the bowtie options everywhere bowtie is called, and change the tophat code that expects Phred+64 output).

converting CASAVA 1.8 qual scores