SEQanswers

Old 05-21-2010, 06:08 AM   #1
zouzou
Junior Member
 
Location: usa

Join Date: May 2010
Posts: 1
Convert illumina v1.5 fastq to sanger fastq

Hi everybody!

I am very new to next-generation sequencing. I downloaded BWA and SAMtools to analyse data from an Illumina GA II. I saw that BWA needs the reads in .fastq format as input. I have data in qseq.txt format.
I saw that .txt and .fastq can be the same thing, but that there are several variants of .fastq. I read that BWA needs sanger-fastq, and I think I have illumina v1.5-fastq.
Do you know of any software that converts illumina v1.5 fastq to sanger-fastq? Or do you know the code to do this?
Thanks!
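
One way to check which variant a file is in is to look at the lowest character that appears on the quality lines. The following is a rough Python sketch, not taken from any tool mentioned in this thread; the function names and thresholds are assumptions for illustration, based on the usual ASCII ranges (Sanger is Phred+33, Illumina 1.3+ is Phred+64, older Solexa files start at ';'). It is only a heuristic: a Sanger file that happens to contain nothing but high scores would be misreported.

```python
def guess_quality_encoding(quality_strings):
    """Guess the FASTQ quality encoding from an iterable of quality strings.

    Heuristic only: returns 'sanger', 'solexa', or 'illumina-1.3+'.
    """
    lowest = min(ord(ch) for line in quality_strings for ch in line)
    if lowest < ord(";"):       # below 59: only possible with the +33 offset
        return "sanger"
    if lowest < ord("@"):       # 59-63: negative Solexa scores (offset 64)
        return "solexa"
    return "illumina-1.3+"      # 64 and above: consistent with Phred+64


def iter_quality_lines(fastq_lines):
    """Yield the quality line of each record, assuming four lines per record."""
    for index, line in enumerate(fastq_lines):
        if index % 4 == 3:
            yield line.rstrip("\n")


# Small in-memory example instead of a real file.
example = ["@read1\n", "ACGT\n", "+\n", "hhhB\n"]
print(guess_quality_encoding(iter_quality_lines(example)))
```

In practice one would pass an open file handle to `iter_quality_lines` and perhaps only inspect the first few thousand records.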
Old 05-21-2010, 08:19 AM   #2
rmdavies
Member
 
Location: Great Britain

Join Date: Dec 2009
Posts: 13

See http://seqanswers.com/forums/showthread.php?t=5192 for a short Perl script that converts .qseq.txt to a sanger-fastq file. The quality value conversion is actually done by this line:
Code:
$q_line =~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;
It's fairly easy to convert this into a quick-and-dirty perl script that will do the same thing for a fastq file:

Code:
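#!/usr/bin/perl
# Convert Illumina (Phred+64) FASTQ to Sanger (Phred+33) FASTQ.
# Reads from the files named on the command line (or stdin), writes to stdout.

use strict;
use warnings;

my $count = 0;
while (<>) {
    chomp;
    # Every fourth line of a record is the quality line.
    # Characters 0x40-0xff are shifted down by 31; anything lower becomes '!'.
    if ($count++ % 4 == 3) { tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/; }
    print "$_\n";
}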
N.B.: The script above assumes that the sequence and quality values in the fastq file are on single lines. This is not necessarily true, but you can usually get away with it for short read data. You should check the output carefully, to make sure that it is doing what you want. It should be fairly obvious if it gets out of synchronization, or if you run it on a sanger-fastq file by mistake.
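
For readers who prefer Python, the same idea can be sketched as follows. This is an illustrative sketch, not the script from the linked thread: the function names `illumina_to_sanger` and `convert_fastq_lines` are made up for this example, and it assumes Phred+64 input with exactly four lines per record. Unlike the Perl `tr` above, which silently maps any character below `@` to `!`, this version raises an error, so a loss of synchronization or a Sanger file fed in by mistake will often be caught early.

```python
def illumina_to_sanger(quality):
    """Re-encode one Phred+64 quality string as Phred+33."""
    converted = []
    for ch in quality:
        score = ord(ch) - 64  # Illumina 1.3+/1.5+ offset
        if score < 0:
            raise ValueError(f"{ch!r} is not a valid Phred+64 character")
        converted.append(chr(score + 33))  # Sanger offset
    return "".join(converted)


def convert_fastq_lines(lines):
    """Yield converted lines, checking the record layout as it goes."""
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        position = index % 4
        if position == 0 and not line.startswith("@"):
            raise ValueError(f"line {index + 1}: expected '@' header line")
        if position == 2 and not line.startswith("+"):
            raise ValueError(f"line {index + 1}: expected '+' separator line")
        if position == 3:
            line = illumina_to_sanger(line)
        yield line


# Small in-memory example instead of a real file.
record = ["@read1\n", "ACGT\n", "+\n", "hhfB\n"]
print("\n".join(convert_fastq_lines(record)))
```

The header and separator checks are what catch multi-line FASTQ records, since the fixed four-line assumption then breaks on the first wrapped record.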
Old 05-31-2010, 05:47 AM   #3
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,540

Quote:
Originally Posted by zouzou
Do you know of any software that converts illumina v1.5 fastq to sanger-fastq? Or do you know the code to do this?
Thanks!
You can use several existing tools to do the conversion from Illumina FASTQ to Sanger FASTQ, including EMBOSS seqret, Biopython, BioPerl, BioJava, BioRuby etc.
http://dx.doi.org/10.1093/nar/gkp1137

Note that in Illumina FASTQ files from recent pipelines, some of the low quality scores have a special meaning:
http://seqanswers.com/forums/showthread.php?p=17491
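
For context: in Illumina pipeline 1.5 and later, the Phred+64 character `B` (score 2) is generally documented as a read segment quality control indicator, so a trailing run of `B`s flags the end of a read as unreliable rather than giving a real error probability. A minimal Python sketch of trimming such a tail is shown here; the function name `trim_trailing_b` is hypothetical, and it assumes the qualities are still in Phred+64 (after conversion to Sanger the same score is written as `#`).

```python
def trim_trailing_b(sequence, quality, marker="B"):
    """Drop the trailing run of `marker` quality characters from a read.

    Returns the trimmed (sequence, quality) pair; both keep equal length.
    """
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    keep = len(quality.rstrip(marker))
    return sequence[:keep], quality[:keep]


print(trim_trailing_b("ACGTACGT", "hhhhfBBB"))
```

Whether to trim, mask, or simply keep these bases depends on the downstream tool, so this is one option rather than a recommendation.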

Last edited by maubp; 05-31-2010 at 06:55 AM. Reason: adding missing last two words of my sentence.
Old 05-31-2010, 06:16 AM   #4
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

Quote:
Originally Posted by zouzou
Hi everybody!

I am very new to next-generation sequencing. I downloaded BWA and SAMtools to analyse data from an Illumina GA II. I saw that BWA needs the reads in .fastq format as input. I have data in qseq.txt format.
I saw that .txt and .fastq can be the same thing, but that there are several variants of .fastq. I read that BWA needs sanger-fastq, and I think I have illumina v1.5-fastq.
Do you know of any software that converts illumina v1.5 fastq to sanger-fastq? Or do you know the code to do this?
Thanks!
Also, bfast comes with a perl script to perform the conversion. It's under scripts (ill2fastq.pl).
__________________
-drd
Old 06-25-2010, 10:35 AM   #5
dawe
Senior Member
 
Location: 45°30'25.22"N / 9°15'53.00"E

Join Date: Apr 2009
Posts: 258

Quote:
Originally Posted by zouzou
Hi everybody!

Do you know of any software that converts illumina v1.5 fastq to sanger-fastq? Or do you know the code to do this?
Thanks!
You could try patching the latest bwa version with the appropriate patch listed here (it is the first one). It adds a '-I' option to the 'bwa aln' command so that one can use Illumina (pipeline 1.3+ or 1.5+) fastq files and trim them as if they were in Sanger scale. The output in the SAM file is in Sanger scale as well.

d
Old 06-29-2010, 01:18 PM   #6
ntremblay
Member
 
Location: Montreal, Quebec, Canada

Join Date: Dec 2009
Posts: 27

Hey, Galaxy has a tool called FASTQ Groomer under the "NGS: QC and manipulation" menu.
You can convert between various quality formats (Sanger, Solexa, Illumina 1.3 and above, colorspace Sanger).

I think you can also download the script directly from the website ...

NT
Old 10-15-2010, 05:35 AM   #7
zeam
Member
 
Location: USA

Join Date: Oct 2010
Posts: 35
Questions on '-I' option

Quote:
Originally Posted by dawe
You could try patching the latest bwa version with the appropriate patch listed here (it is the first one). It adds a '-I' option to the 'bwa aln' command so that one can use Illumina (pipeline 1.3+ or 1.5+) fastq files and trim them as if they were in Sanger scale. The output in the SAM file is in Sanger scale as well.

d
I used patch to update my bwa, following your directions, but I don't know how to use the "-I" option. I browsed your patch file and saw "-I Input files are in Illumina quality scale." However, when I type bwa aln after applying your patch, I expected to see the "-I" option in the usage message, but it isn't there.
Could you explain? Suppose I want to trim at Sanger quality 15: how should I set -q INT after applying your patch? Should I set it to 15 or not?
I really appreciate your posts, and sorry for bothering you.

[bwa-0.5.8a]$ bwa aln

Usage:   bwa aln [options] <prefix> <in.fq>

Options: -n NUM   max #diff (int) or missing prob under 0.02 err rate (float) [0.04]
         -o INT   maximum number or fraction of gap opens [1]
         -e INT   maximum number of gap extensions, -1 for disabling long gaps [-1]
         -i INT   do not put an indel within INT bp towards the ends [5]
         -d INT   maximum occurrences for extending a long deletion [10]
         -l INT   seed length [32]
         -k INT   maximum differences in the seed [2]
         -m INT   maximum entries in the queue [2000000]
         -t INT   number of threads [1]
         -M INT   mismatch penalty [3]
         -O INT   gap open penalty [11]
         -E INT   gap extension penalty [4]
         -R INT   stop searching when there are >INT equally best hits [30]
         -q INT   quality threshold for read trimming down to 35bp [0]
         -c       input sequences are in the color space
         -L       log-scaled gap penalty for long deletions
         -N       non-iterative mode: search for all n-difference hits (slooow)
         -f FILE  file to write output to instead of stdout
Old 10-15-2010, 05:59 AM   #8
dawe
Senior Member
 
Location: 45°30'25.22"N / 9°15'53.00"E

Join Date: Apr 2009
Posts: 258

It appears you haven't applied the patch (or you haven't installed the patched binary).

d
Old 10-15-2010, 04:12 PM   #9
zeam
Member
 
Location: USA

Join Date: Oct 2010
Posts: 35
Questions on BWA patch

Quote:
Originally Posted by dawe
It appears you haven't applied the patch (or you haven't installed the patched binary).

d
I'm sorry, I don't understand your reply. Could you give me some explicit directions? Thanks very much!

I followed the directions:
cd bwa-source-directory
patch -p1 < patch.file
make
Old 10-15-2010, 10:19 PM   #10
dawe
Senior Member
 
Location: 45°30'25.22"N / 9°15'53.00"E

Join Date: Apr 2009
Posts: 258

Quote:
Originally Posted by zeam
I'm sorry, I don't understand your reply. Could you give me some explicit directions? Thanks very much!

I followed the directions:
cd bwa-source-directory
patch -p1 < patch.file
make
Were you able to apply the patch successfully? If so, try running
Code:
./bwa aln
and see if the -I option appears. If it does, replace the installed binary with this one, i.e.

Code:
sudo install bwa `which bwa`
d
Old 12-28-2010, 03:23 PM   #11
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279
BWA Illumina Quality Patch

Hi dawe,

I just tried to apply your SVN v50 patch to the current svn download, which lists version 50, and the patch fails.

Code:
$ patch -p1 < bwa-svn-r50_illumina-qual.patch 
missing header for unified diff at line 5 of patch
can't find file to patch at input line 5
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|Index: bwape.c
|===================================================================
|--- bwape.c	(revision 50)
|+++ bwape.c	(working copy)
--------------------------
File to patch:
Steps:
1) svn download of current bio-bwa subversion (version 50)

Code:
svn co https://bio-bwa.svn.sourceforge.net/svnroot/bio-bwa bio-bwa
....
bunch of stuff
....
Checked out revision 50.
2) cd bio-bwa/trunk/bwa
3) make
4) copied patch to current directory
5) attempted to patch as noted above

I tried the archived bwa-0.5.8 patch and that applied perfectly.

Any suggestions?

PS - thanks for this patch and the previous maq ill2sanger patch; they are life savers.
Old 12-28-2010, 08:56 PM   #12
dawe
Senior Member
 
Location: 45°30'25.22"N / 9°15'53.00"E

Join Date: Apr 2009
Posts: 258

My bad, sorry. Anyway, as the 'patch' error message suggests, you should use a different strip level:

Code:
$ patch -p0 < /path/to/patch
That should work.

HTH
D
Old 12-28-2010, 09:09 PM   #13
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279

Thanks, that worked perfectly.
Old 05-25-2011, 11:44 AM   #14
nsl
Member
 
Location: CA

Join Date: Jan 2011
Posts: 28

I am new to NGS and bioinformatics. I just got my data and am trying out Galaxy. I am trying to use FASTQ Groomer to convert it to fastq-sanger. I have 8 GB of data; does anyone know roughly how long this process should take? It has been running for about 3.5 hours, and I don't know whether to quit and run it again. Am I being impatient?

Sorry for the novice/inexperienced question

Thanks
nsl
Old 05-26-2011, 12:30 AM   #15
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,540

It will depend on which Galaxy installation you are using (e.g. the main http://usegalaxy.org Penn State one), and how busy it is with other people's work. If you asked on the Galaxy mailing list you'd probably get a better answer.
Old 05-26-2011, 11:58 AM   #16
nsl
Member
 
Location: CA

Join Date: Jan 2011
Posts: 28

Thank you.
Old 06-17-2011, 10:37 AM   #17
jiltysequence
Junior Member
 
Location: Rockies

Join Date: Jun 2011
Posts: 6

Quote:
Originally Posted by maubp
It will depend on which Galaxy installation you are using (e.g. the main http://usegalaxy.org Penn State one), and how busy it is with other people's work. If you asked on the Galaxy mailing list you'd probably get a better answer.
Has the Galaxy server ever been known to go down? I am having some trouble accessing the download at the moment. There are no errors, just such a long wait that I am giving up. Thirty minutes is too long when you are on a deadline. I have been running some simulations to test different compositions of structural foams and plastics. I have been considering using them in a scientific construction class for university. Now I just need to find out who else is into manufacturing these products. Have you guys ever heard of www.dekalbplastics.com?

Last edited by jiltysequence; 06-23-2011 at 10:20 AM.
Old 06-17-2011, 10:41 AM   #18
sunsnow86
Member
 
Location: organge CA

Join Date: Jul 2010
Posts: 17
Convert sanger fastQ into illumina fastQ

Does anyone know of a script that can convert Sanger FASTQ (Phred+33) to Illumina FASTQ (Phred+64)?
Old 06-17-2011, 01:07 PM   #19
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,540

Quote:
Originally Posted by sunsnow86
Does anyone know of a script that can convert Sanger FASTQ (Phred+33) to Illumina FASTQ (Phred+64)?
Did you read all of this thread? E.g. my earlier post said:

Quote:
Originally Posted by maubp
You can use several existing tools to do the conversion from Illumina FASTQ to Sanger FASTQ, including EMBOSS seqret, Biopython, BioPerl, BioJava, BioRuby etc.
http://dx.doi.org/10.1093/nar/gkp1137
These tools can also do the reverse conversion.

You can also do this in Galaxy
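
For illustration, the reverse direction (Phred+33 to Phred+64) can be sketched in a few lines of Python. This is a minimal sketch and not how any of the tools named above implement it; the function name `sanger_to_illumina` is made up for this example. One assumption to note: Sanger scores can go up to 93, while Phred+64 only has room for scores up to 62 within printable ASCII, so this sketch caps higher scores at 62.

```python
MAX_ILLUMINA_SCORE = 62  # chr(62 + 64) is '~', the last printable ASCII character


def sanger_to_illumina(quality):
    """Re-encode one Phred+33 quality string as Phred+64, capping at 62."""
    converted = []
    for ch in quality:
        score = ord(ch) - 33  # Sanger offset
        if score < 0:
            raise ValueError(f"{ch!r} is not a valid Phred+33 character")
        converted.append(chr(min(score, MAX_ILLUMINA_SCORE) + 64))
    return "".join(converted)


def convert_fastq_lines(lines):
    """Yield lines of a four-line-per-record FASTQ with qualities re-encoded."""
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        yield sanger_to_illumina(line) if index % 4 == 3 else line


# Small in-memory example instead of a real file.
record = ["@read1\n", "ACGT\n", "+\n", "IIG#\n"]
print("\n".join(convert_fastq_lines(record)))
```

Capping is lossy for very high scores, which matters little for typical short-read data but is worth knowing about for assembled or consensus sequences.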
Old 06-21-2011, 02:08 PM   #20
Kotoro
Member
 
Location: Farmington CT

Join Date: May 2011
Posts: 31

I can't seem to get seqret to read files at all. It keeps saying "Died: seqret terminated: Bad value for '-sequence' and no prompt".

Tags
bwa, fastq, illumina v1.5, sanger

All times are GMT -8.