SEQanswers

Old 03-02-2012, 08:28 AM   #1
lpn
Member
 
Location: west coast

Join Date: May 2011
Posts: 17
converting CASAVA 1.8 qual scores

Does anybody have a tested program to convert Illumina CASAVA 1.8 qual scores (Phred+33) to the previous version Illumina 1.5+ (Phred+64)?

Last edited by lpn; 03-02-2012 at 09:19 AM.
Old 03-02-2012, 09:02 AM   #2
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 843

Given that you're asking for a "tested program", I presume that the Galaxy tool will be good enough:

http://main.g2.bx.psu.edu/tool_runne...lity_converter
Old 03-02-2012, 09:18 AM   #3
lpn
Member
 
Location: west coast

Join Date: May 2011
Posts: 17

What about something that I can run on the command line?
Old 03-02-2012, 11:31 AM   #4
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 843

Quote:
What about something that I can run on the command line?
Why do you need a Phred+64 offset and/or the command line?

Most programs should work with Phred+33 (e.g. append '-Q 33' to the fastx command line).

Also, Biopython can read files as one format and write them as another:

http://biopython.org/wiki/SeqIO

Here's a quick python conversion script, derived from an example on that page:

Code:
#!/usr/bin/python
# Read Phred+33 ("fastq-sanger") and write Phred+64 ("fastq-illumina").
from Bio import SeqIO
SeqIO.convert("input.fastq", "fastq-sanger", "output.fastq", "fastq-illumina")
The hashbang isn't strictly needed, and the import is obvious, but it seemed too small with just a single line of code.
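
For anyone who can't install Biopython, the same offset shift can be sketched in plain Python. This is only a rough sketch, assuming strict four-line FASTQ records (no wrapped sequence or quality lines); the function names are made up for illustration and are not part of any library. Scores above 62 are capped, because Phred+64 cannot go past '~' (ASCII 126).

```python
import io


def sanger_to_illumina_qual(qual):
    """Re-encode one quality string from Phred+33 to Phred+64."""
    shifted = []
    for ch in qual:
        score = ord(ch) - 33              # decode the Phred score
        if score < 0:
            raise ValueError("not a Phred+33 quality character: %r" % ch)
        score = min(score, 62)            # Phred+64 tops out at '~' (ASCII 126)
        shifted.append(chr(score + 64))   # encode with the +64 offset
    return "".join(shifted)


def convert_fastq(in_handle, out_handle):
    """Copy four-line FASTQ records, re-encoding the quality line of each.

    Assumes every record is exactly four lines long. Returns the record count.
    """
    count = 0
    while True:
        header = in_handle.readline()
        if not header:
            break
        seq = in_handle.readline()
        plus = in_handle.readline()
        qual = in_handle.readline().rstrip("\r\n")
        out_handle.write(header)
        out_handle.write(seq)
        out_handle.write(plus)
        out_handle.write(sanger_to_illumina_qual(qual) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demo; for real files, pass handles from open() instead.
    demo_in = io.StringIO("@read1\nACGT\n+\nII5!\n")
    demo_out = io.StringIO()
    convert_fastq(demo_in, demo_out)
    print(demo_out.getvalue())
```

The Biopython one-liner is still the better choice if you have it, since it validates the records as it goes.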
Old 03-02-2012, 11:36 AM   #5
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279

Gringer, thanks for the "-Q 33" heads-up for FastX. You solved a major headache of mine. Now if only the Illumina mate-pairs we made were not 88% duplicates... As Homer would say, D'oh...
Old 03-02-2012, 01:26 PM   #6
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

If you are still looking for a command line tool for the job, EMBOSS seqret can do this (and the reverse).
Old 03-02-2012, 10:24 PM   #7
lpn
Member
 
Location: west coast

Join Date: May 2011
Posts: 17

Quote:
Originally Posted by gringer View Post
Why do you need a Phred+64 offset and/or the command line?
Because tophat doesn't seem to handle Phred+33 (CASAVA 1.8) well, but works with Phred+64 (CASAVA 1.5+).

Quote:
Originally Posted by gringer View Post
Most programs should work with Phred+33 (e.g. append '-Q 33' to the fastx command line).



Also, Biopython can read files as one format and write them as another:

http://biopython.org/wiki/SeqIO

Here's a quick python conversion script, derived from an example on that page:

Code:
#!/usr/bin/python
from Bio import SeqIO
SeqIO.convert("input.fastq", "fastq-sanger", "output.fastq", "fastq-illumina")
The hashbang isn't strictly needed, and the import is obvious, but it seemed too small with just a single line of code.
Thanks a lot!
Old 03-03-2012, 12:14 AM   #8
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 843

Quote:
Originally Posted by lpn View Post
Because tophat doesn't seem to handle Phred+33 (CASAVA 1.8) well, but works with Phred+64 (CASAVA 1.5+).
This is interesting and doesn't match my experience with tophat on recent Illumina runs. Do you have any "--solexa1.3-quals" options on your command line? Removing that should stop bowtie from using Phred+64, and go back to the default Phred+33.

There's also the Bowtie "--phred33-quals" option, which I guess you could add to tophat's bowtie call to force this:
Code:
nano $(which tophat)
Old 03-03-2012, 04:47 AM   #9
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279

As gringer said, if you DO NOT specify a qual flag it will work fine. You are not the only person with HiSeq data, which finally encodes the quality values in the standard Sanger format that nearly all programs expect by default. For most people today, the flag is just for processing legacy datasets.
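
If you are not sure which encoding a given file uses, a rough check is to look at the range of quality characters in it. The sketch below is a heuristic only, not a definitive test: it assumes four-line FASTQ records and assumes that Phred+33 data never goes above 'J' (Q41), which is my understanding of CASAVA 1.8 output but may not hold for other sources. The function names are hypothetical.

```python
import io


def quality_lines(handle):
    """Yield the fourth line of each four-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 3:
            yield line.rstrip("\r\n")


def guess_quality_offset(qual_strings):
    """Guess the ASCII offset from an iterable of quality strings.

    Returns 33, 64, or None when the characters seen fit either encoding
    (or when there is no data at all).
    """
    codes = [ord(ch) for qual in qual_strings for ch in qual]
    if not codes:
        return None
    if min(codes) < 59:
        # Below ';' no +64 variant (Phred or Solexa) can produce the character.
        return 33
    if max(codes) > 74:
        # Above 'J' (Q41 in Phred+33); assumed here to mean a +64 offset.
        return 64
    return None


if __name__ == "__main__":
    # In-memory demo; for a real file, pass the handle from open() instead.
    demo = io.StringIO("@read1\nACGT\n+\nII5!\n")
    print(guess_quality_offset(quality_lines(demo)))
```

A result of None just means the reads seen so far are ambiguous, so it is worth scanning more of the file before deciding on a flag.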
Old 03-03-2012, 06:47 AM   #10
lpn
Member
 
Location: west coast

Join Date: May 2011
Posts: 17

Quote:
Originally Posted by gringer View Post
This is interesting and doesn't match my experience with tophat on recent Illumina runs. Do you have any "--solexa1.3-quals" options on your command line? Removing that should stop bowtie from using Phred+64, and go back to the default Phred+33.

There's also the Bowtie "--phred33-quals" option, which I guess you could add to tophat's bowtie call to force this:
Code:
nano $(which tophat)
That works, but subsequent analysis produces strange results.

Last edited by lpn; 03-03-2012 at 06:53 AM.
Old 03-03-2012, 10:54 AM   #11
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 843

Quote:
Originally Posted by lpn View Post
That works, but subsequent analysis produces strange results.
What works? I suggested two options (not counting the python code). One was to remove --solexa1.3-quals from the tophat command line, and the other was to modify the bowtie parameters. I was deliberately vague about the second option because you need to know what you're doing before you do it (e.g. change the bowtie options everywhere bowtie is called, and change the tophat code that expects Phred+64 output).

Last edited by gringer; 03-03-2012 at 10:57 AM.

All times are GMT -8.