SEQanswers

Old 11-30-2010, 11:31 PM   #1
angelpie
Junior Member
 
Location: Japan

Join Date: Nov 2010
Posts: 8
converting_fastq_file

I have Illumina v1.5+ FASTQ files.

I learned that FASTQ files come in four types: Sanger, Illumina v1.0, v1.3 and v1.5.
I also learned that many programs require Sanger-type FASTQ files.

Methods for converting Illumina v1.0 (Solexa) to Sanger turn up occasionally.
However, I can't find an actual procedure for converting Illumina v1.5 to Sanger.

Could you please help me?
Old 12-01-2010, 01:04 AM   #2
nicolallias
Member
 
Location: France

Join Date: Jan 2010
Posts: 23

Hi,
The only difference between those formats is how the quality line is encoded. If you are familiar with Perl, try the following:
http://seqanswers.com/forums/showthread.php?t=5192
$q_line =~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;

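For input whose quality characters are all '@' or above, the same shift (Illumina offset 64 to Sanger offset 33, i.e. subtract 31) could be written in Python as: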
q_line = "".join([chr(ord(i)-31) for i in q_line])
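
A slightly fuller sketch of the same idea, assuming Illumina 1.3+ input (ASCII offset 64) and Sanger output (offset 33). The function name illumina_to_sanger_quality is made up for illustration. It mirrors the Perl tr above, including clamping anything below '@' to '!', which the one-line Python version does not do.

```python
# Translation table mirroring the Perl tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/:
# characters from '@' (0x40) upward shift down by 31 (64 - 33), and anything
# below '@' is clamped to '!' (Sanger Q0).
_ILLUMINA_TO_SANGER = {code: max(33, code - 31) for code in range(256)}


def illumina_to_sanger_quality(q_line):
    """Convert one Illumina 1.3+ quality string to Sanger encoding."""
    return q_line.translate(_ILLUMINA_TO_SANGER)


if __name__ == "__main__":
    # 'h' is Q40 in Illumina 1.3+, which is 'I' in Sanger encoding.
    print(illumina_to_sanger_quality("hhhB@"))  # prints III#!
```

The table-based form avoids a ValueError from chr() on a negative number if a stray low character appears in the input.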

Or do you prefer an awk line ?
Or a full script ready-to-use ?
Old 12-01-2010, 01:17 AM   #3
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Quote:
Originally Posted by angelpie View Post
Could you please help me?
See http://en.wikipedia.org/wiki/FASTQ_format and:

Cock et al (2009) The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research, http://dx.doi.org/10.1093/nar/gkp1137

From the point of view of conversion, FASTQ files from Illumina 1.5 are basically the same as those from Illumina 1.3 and 1.4, except for the meaning of some low quality scores; see:

http://seqanswers.com/forums/showpos...91&postcount=3
http://news.open-bio.org/news/2010/0...q2-trim-fastq/
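
To make the 1.3 versus 1.5 point concrete: both use ASCII offset 64, so the character arithmetic is identical. What changes in 1.5+ is that Q2 (the character 'B') is used as a read segment quality control indicator rather than as an ordinary score. A minimal sketch, with hypothetical helper names, that decodes the scores and measures the trailing run of 'B':

```python
ILLUMINA_OFFSET = 64  # Illumina 1.3+ and 1.5+ share this ASCII offset
SANGER_OFFSET = 33


def phred_scores(q_line, offset=ILLUMINA_OFFSET):
    """Decode a quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in q_line]


def trailing_q2_length(q_line):
    """Count the run of 'B' (Q2) characters at the end of an Illumina 1.5+ read."""
    return len(q_line) - len(q_line.rstrip("B"))
```

Whether to trim that trailing run before or after conversion is a separate decision; the conversion itself does not depend on it.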

Last edited by maubp; 12-01-2010 at 01:17 AM. Reason: making DOI into a link
Old 12-01-2010, 02:46 AM   #4
angelpie
Junior Member
 
Location: Japan

Join Date: Nov 2010
Posts: 8

Thank you, nicolallias and maubp, for your quick replies.

Although I think I know the formulae for these formats,
I don't know how to convert between them
because I am just a user of existing scripts/programs.

Can I use procedures for illumina v1.3 to convert illumina v1.5+ files?

I tried to use the Perl script in the referenced thread.
However, I found errors.

Quote:
Or a full script ready-to-use ?
If possible, please teach me.

Last edited by angelpie; 12-01-2010 at 02:48 AM.
Old 12-01-2010, 02:58 AM   #5
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Quote:
Originally Posted by angelpie View Post
Can I use procedures for illumina v1.3 to convert illumina v1.5+ files?
Yes.


Quote:
Originally Posted by angelpie View Post
I am just a user of existing scripts/programs.
Try EMBOSS seqret if you want a command line tool for converting file formats. Use fastq-illumina as the input format, fastq-sanger as the output format.

If you are happier with Python, Perl, Java, or Ruby then try Biopython, BioPerl, BioJava or BioRuby for existing libraries for reading, writing and converting FASTQ files (see the paper I linked to before).
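
For readers who want to see what such a conversion involves, here is a standard-library-only sketch. It assumes strictly four lines per record (no wrapped sequence or quality lines) and Illumina 1.3+/1.5+ input; the function name convert_fastq is hypothetical. The libraries named above handle more edge cases and remain the more robust route.

```python
import io


def convert_fastq(in_handle, out_handle):
    """Copy 4-line FASTQ records, re-encoding line 4 from offset 64 to offset 33.

    Returns the number of records written.
    """
    # Shift by 31 (64 - 33); clamp anything below '@' to '!' (Sanger Q0).
    table = {code: max(33, code - 31) for code in range(256)}
    count = 0
    while True:
        header = in_handle.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = in_handle.readline()
        plus = in_handle.readline()
        # Strip the line ending first so it is not touched by the translation.
        qual = in_handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("not a 4-line FASTQ record: %r" % header)
        out_handle.write(header)
        out_handle.write(seq)
        out_handle.write(plus)
        out_handle.write(qual.translate(table) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    src = io.StringIO("@read1\nACGT\n+\nhhhB\n")
    dst = io.StringIO()
    convert_fastq(src, dst)
    print(dst.getvalue(), end="")
```

On real data, the two handles would be files opened for reading and writing instead of the in-memory StringIO objects used here.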

Quote:
Originally Posted by angelpie View Post
I tried to use the Perl script in the referenced thread.
However, I found errors.
What errors?
Old 12-01-2010, 03:42 AM   #6
angelpie
Junior Member
 
Location: Japan

Join Date: Nov 2010
Posts: 8

The error messages said:
"Use of uninitialized value $(variables) in concatenation (.) or string at .....".
Old 12-01-2010, 04:42 AM   #7
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

I don't know enough Perl to help you, but I don't think nicolallias' example was standalone; it was more of a hint for someone familiar with Perl.

Do you have EMBOSS installed? The EMBOSS tool seqret is an easy way to do this at the command line.
Old 12-01-2010, 05:02 AM   #8
nicolallias
Member
 
Location: France

Join Date: Jan 2010
Posts: 23

Quote:
Originally Posted by maubp View Post
it was more of a hint for someone familiar with Perl
Exactly, and using existing tools is your best option.
angelpie: if you wish to learn more, you really should visit the Wikipedia page on the FASTQ format.
Old 03-01-2011, 06:51 AM   #9
epigen
Senior Member
 
Location: Germany

Join Date: May 2010
Posts: 101
convert Illumina scores to Phred in a BAM file

If you already have a BAM file, you can transform the scores in it as follows:

samtools view -h Illumina_score.bam | perl -lane '$"="\t"; if (/^@/) {print;} else {$F[10]=~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;print "@F"}' | samtools view -Sbh - > Phred_score.bam

Thanks nicolallias for providing the very efficient trick. It saved us a lot of fastq file transformations and we did not have to run all the BWA alignments again.
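
For anyone more comfortable with Python than Perl, the filter stage of that pipeline could be sketched as follows. This is an illustration under assumptions, not a drop-in replacement: the function names are hypothetical, and unlike the Perl version it splits on tabs only and leaves a QUAL of '*' (no qualities stored) untouched.

```python
# Shift by 31 (64 - 33); clamp anything below '@' to '!' (Sanger Q0).
_TABLE = {code: max(33, code - 31) for code in range(256)}


def convert_sam_line(line):
    """Re-encode the QUAL column (11th field) of one SAM line from offset 64 to 33."""
    if line.startswith("@"):
        return line  # header lines pass through unchanged
    fields = line.rstrip("\n").split("\t")
    if len(fields) > 10 and fields[10] != "*":  # '*' means no qualities stored
        fields[10] = fields[10].translate(_TABLE)
    return "\t".join(fields) + "\n"


def convert_sam_stream(lines):
    """Yield converted lines from any iterable of SAM text lines."""
    for line in lines:
        yield convert_sam_line(line)
```

Saved as a script that ends with sys.stdout.writelines(convert_sam_stream(sys.stdin)), after importing sys, it would sit between the two samtools view calls in place of the Perl filter.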
All times are GMT -8.

