  • zouzou
    Junior Member
    • May 2010
    • 1

    Convert illumina v1.5 fastq to sanger fastq

    Hi everybody !

    I am very new to next-generation sequencing. I downloaded BWA and SAMtools to analyse data from an Illumina GA II. I saw that BWA needs .fastq format as input for the reads. I have data in qseq.txt format.
    I saw that .txt and .fastq can be the same thing, but there are variants of .fastq. I read that BWA needs sanger-fastq and I think I have illumina v1.5-fastq.
    Do you know if there is software to convert illumina v1.5 fastq to sanger-fastq? Or do you know the code to do this ?
    Thanks !
  • rmdavies
    Member
    • Dec 2009
    • 13

    #2
    See https://www.seqanswers.com/node/4344 for a short Perl script that converts .qseq.txt to a sanger-fastq file. The quality value conversion is actually done by this line:
    Code:
    $q_line =~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;
    It's fairly easy to convert this into a quick-and-dirty perl script that will do the same thing for a fastq file:

    Code:
    #!/usr/bin/perl
    
    use strict;
    use warnings;
    
    my $count = 0;
    while (<>) {
        chomp;
        if ($count++ % 4 == 3) { tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/; }
        print "$_\n";
    }
    N.B.: The script above assumes that the sequence and quality values in the fastq file are on single lines. This is not necessarily true, but you can usually get away with it for short read data. You should check the output carefully, to make sure that it is doing what you want. It should be fairly obvious if it gets out of synchronization, or if you run it on a sanger-fastq file by mistake.
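For readers who prefer Python, the same quality shift can be sketched as below. This is only an illustration, not rmdavies's script: the function names are made up here, and it makes the same assumptions as the Perl version (Illumina 1.3+/1.5 qualities at ASCII offset 64, exactly four lines per record).

```python
ILLUMINA_OFFSET = 64  # Illumina 1.3+/1.5 FASTQ: Phred score + 64
SANGER_OFFSET = 33    # Sanger FASTQ: Phred score + 33


def illumina_to_sanger(qual):
    """Shift a quality string from offset 64 to offset 33.

    Characters below '@' (not valid Illumina 1.3+ scores) are clamped
    to '!', which mirrors the second half of the Perl tr/// expression.
    """
    shift = ILLUMINA_OFFSET - SANGER_OFFSET  # 31
    return "".join(chr(max(ord(c) - shift, SANGER_OFFSET)) for c in qual)


def convert_fastq(lines):
    """Yield FASTQ lines with every fourth line (the qualities) converted.

    Assumes sequence and quality each sit on a single line.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 3:
            line = illumina_to_sanger(line)
        yield line


record = ["@read1\n", "ACGT\n", "+\n", "hhhB\n"]
for out in convert_fastq(record):
    print(out)
```

To run it on a real file, pass an open file handle to convert_fastq() instead of the example list. As with the Perl script, check the output by eye before trusting it.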


    • maubp
      Peter (Biopython etc)
      • Jul 2009
      • 1544

      #3
      Originally posted by zouzou
      Do you know if there is software to convert illumina v1.5 fastq to sanger-fastq? Or do you know the code to do this ?
      Thanks !
      You can use several existing tools to do the conversion from Illumina FASTQ to Sanger FASTQ, including EMBOSS seqret, Biopython, BioPerl, BioJava, BioRuby etc.


      Note that in Illumina FASTQ files from recent pipelines, some of the low quality scores have special meaning.
      Last edited by maubp; 05-31-2010, 06:55 AM. Reason: adding missing last two words of my sentence.


      • drio
        Senior Member
        • Oct 2008
        • 323

        #4
        Originally posted by zouzou
        Hi everybody !

        I am very new to next-generation sequencing. I downloaded BWA and SAMtools to analyse data from an Illumina GA II. I saw that BWA needs .fastq format as input for the reads. I have data in qseq.txt format.
        I saw that .txt and .fastq can be the same thing, but there are variants of .fastq. I read that BWA needs sanger-fastq and I think I have illumina v1.5-fastq.
        Do you know if there is software to convert illumina v1.5 fastq to sanger-fastq? Or do you know the code to do this ?
        Thanks !
        Also, bfast comes with a perl script to perform the conversion. It's under scripts (ill2fastq.pl).
        -drd
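Since the original question starts from qseq.txt files, here is a rough Python sketch of that conversion. It is not the bfast script or the one linked above. It assumes the usual 11-column tab-separated qseq layout (machine, run, lane, tile, x, y, index, read number, sequence, quality, filter flag), with '.' standing for an uncalled base and qualities at offset 64. The read-name format and the function name are just choices made for this example.

```python
def qseq_to_sanger_fastq(line):
    """Turn one qseq.txt line into a four-line Sanger FASTQ record.

    Assumed columns: machine, run, lane, tile, x, y, index,
    read number, sequence, quality, filter flag.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError("expected 11 tab-separated fields, got %d" % len(fields))
    machine, run, lane, tile, x, y, index, read_no, seq, qual, _passed = fields

    # Read name layout is a convention chosen for this sketch.
    name = "%s_%s:%s:%s:%s:%s#%s/%s" % (machine, run, lane, tile, x, y, index, read_no)
    seq = seq.replace(".", "N")  # qseq marks uncalled bases with '.'
    # Offset 64 -> offset 33, clamping anything lower to '!'.
    qual = "".join(chr(max(ord(c) - 31, 33)) for c in qual)
    return "@%s\n%s\n+\n%s\n" % (name, seq, qual)


example = "M1\t7\t1\t1\t100\t200\t0\t1\tAC.T\thhBB\t1\n"
print(qseq_to_sanger_fastq(example), end="")
```

The filter flag in the last column is ignored here. Whether to drop reads that failed the filter is a separate decision.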


        • dawe
          Senior Member
          • Apr 2009
          • 258

          #5
          Originally posted by zouzou
          Hi everybody !

          Do you know if there is software to convert illumina v1.5 fastq to sanger-fastq? Or do you know the code to do this ?
          Thanks !
          You may try to patch the latest bwa version with the appropriate patch listed
          here. It is the first one. It adds a '-I' option to 'bwa aln' so that one can use Illumina (pipeline 1.3+ or 1.5+) fastq files and trim them as if they were in Sanger scale. Output in the SAM file is in Sanger scale as well.

          d


          • ntremblay
            Member
            • Dec 2009
            • 31

            #6
            Hey, Galaxy has a tool called FASTQ Groomer under the "NGS: QC and manipulation" menu.
            You can convert between various quality formats (sanger, solexa, Illumina 1.3 and above, colorspace sanger).

            I think you can also download the script directly from the website ...

            NT
            Nicolas Tremblay
            Graduate Student

            Cardiovascular Genetics - Andelfinger Lab
            CHU Ste-Justine Research Center
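One detail behind the format list above: the older "solexa" variant is not just a different ASCII offset. It stores Solexa-scaled scores (at offset 64), which have to be mapped onto the Phred scale with Q_phred = 10 * log10(10^(Q_solexa / 10) + 1). The Python sketch below illustrates that mapping. It is not the FASTQ Groomer code, and the function names are made up for this example.

```python
import math


def solexa_to_phred(q_solexa):
    """Map a Solexa-scaled quality score onto the Phred scale."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def solexa_char_to_sanger_char(c):
    """Convert one solexa-fastq quality character (offset 64, Solexa
    scale) to a sanger-fastq character (offset 33, Phred scale)."""
    q_phred = round(solexa_to_phred(ord(c) - 64))
    return chr(q_phred + 33)


for q in (-5, 0, 10, 40):
    print(q, round(solexa_to_phred(q), 2))
```

The two scales nearly coincide for high scores and only diverge at the low end, so the difference mostly matters for poor-quality bases. Illumina 1.3+ files already use Phred scores and only need the offset shift.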


            • zeam
              Member
              • Oct 2010
              • 43

              #7
              Questions on '-I' option

              Originally posted by dawe
              You may try to patch the latest bwa version with the appropriate patch listed
              here. It is the first one. It adds a '-I' option to 'bwa aln' so that one can use Illumina (pipeline 1.3+ or 1.5+) fastq files and trim them as if they were in Sanger scale. Output in the SAM file is in Sanger scale as well.

              d
              I used patch to update my bwa, following your directions, but I don't know how to use "-I". I browsed your patch file and saw " -I Input files are in Illumina quality scale." However, when I type bwa aln after applying your patch, I expected to see the "-I" option, but I didn't.
              So, can you give me some explanation? Suppose I want to use Sanger quality 15: how should I set -q INT after applying your patch? Should I set 15 or not?
              I really appreciate your posts, and sorry for bothering you.

              [bioinformatics@localhost bwa-0.5.8a]$ bwa aln

              Usage: bwa aln [options] <prefix> <in.fq>

              Options: -n NUM max #diff (int) or missing prob under 0.02 err rate (float) [0.04]
                       -o INT maximum number or fraction of gap opens [1]
                       -e INT maximum number of gap extensions, -1 for disabling long gaps [-1]
                       -i INT do not put an indel within INT bp towards the ends [5]
                       -d INT maximum occurrences for extending a long deletion [10]
                       -l INT seed length [32]
                       -k INT maximum differences in the seed [2]
                       -m INT maximum entries in the queue [2000000]
                       -t INT number of threads [1]
                       -M INT mismatch penalty [3]
                       -O INT gap open penalty [11]
                       -E INT gap extension penalty [4]
                       -R INT stop searching when there are >INT equally best hits [30]
                       -q INT quality threshold for read trimming down to 35bp [0]
                       -c input sequences are in the color space
                       -L log-scaled gap penalty for long deletions
                       -N non-iterative mode: search for all n-difference hits (slooow)
                       -f FILE file to write output to instead of stdout
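On the -q question, a little arithmetic shows why the scale matters. The Python sketch below is not bwa code; it just decodes one quality character under both offsets. Going by dawe's description that the patch lets bwa trim as if the input were in Sanger scale, -q 15 would mean Sanger quality 15 when -I is in effect, but treat that as a reading of this thread rather than a statement about the patch internals.

```python
def phred(char, offset):
    """Decode one FASTQ quality character under the given ASCII offset."""
    return ord(char) - offset


# A base with quality 15, written the Illumina 1.3+/1.5 way (offset 64).
illumina_q15 = chr(15 + 64)

print(illumina_q15)             # the character stored in the file
print(phred(illumina_q15, 64))  # decoded as Illumina 1.3+: 15
print(phred(illumina_q15, 33))  # misread as Sanger: 46
```

So if an unconverted Illumina file is read as Sanger, every base looks 31 points better than it is, and a threshold of 15 would hardly trim anything.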


              • dawe
                Senior Member
                • Apr 2009
                • 258

                #8
                It appears you haven't applied the patch (or you haven't installed the patched binary).

                d


                • zeam
                  Member
                  • Oct 2010
                  • 43

                  #9
                  Questions on BWA patch

                  Originally posted by dawe
                  It appears you haven't applied the patch (or you haven't installed the patched binary).

                  d
                  I'm sorry, I don't understand your reply. Could you give me some explicit directions? Thanks very much!

                  I followed the directions:
                  cd bwa-source-directory
                  patch -p1 < patch.file
                  make


                  • dawe
                    Senior Member
                    • Apr 2009
                    • 258

                    #10
                    Originally posted by zeam
                    I'm sorry, I don't understand your reply. Could you give me some explicit directions? Thanks very much!

                    I followed the directions:
                    cd bwa-source-directory
                    patch -p1 < patch.file
                    make
                    Could you successfully apply the patch? If yes, well, try to issue
                    Code:
                    ./bwa aln
                    and see if the -I option appears. If yes, replace the installed binary with this one, i.e.

                    Code:
                    sudo install bwa `which bwa`
                    d


                    • Jon_Keats
                      Senior Member
                      • Mar 2010
                      • 279

                      #11
                      BWA Illumina Quality Patch

                      Hi dawe,

                      I just tried to apply your SVN v50 patch to the current svn download, which lists version 50, and the patch fails.

                      Code:
                      $ patch -p1 < bwa-svn-r50_illumina-qual.patch 
                      missing header for unified diff at line 5 of patch
                      can't find file to patch at input line 5
                      Perhaps you used the wrong -p or --strip option?
                      The text leading up to this was:
                      --------------------------
                      |Index: bwape.c
                      |===================================================================
                      |--- bwape.c	(revision 50)
                      |+++ bwape.c	(working copy)
                      --------------------------
                      File to patch:
                      Steps:
                      1) svn download of current bio-bwa subversion (version 50)

                      Code:
                      svn co https://bio-bwa.svn.sourceforge.net/svnroot/bio-bwa bio-bwa
                      ....
                      bunch of stuff
                      ....
                      Checked out revision 50.
                      2) cd bio-bwa/trunk/bwa
                      3) make
                      4) copied patch to current directory
                      5) attempted to patch as noted above

                      I tried the archived bwa-0.5.8 patch and that applied perfectly

                      Any suggestions?

                      PS - thanks for this patch and the previous maq ill2sanger patch they are life savers.


                      • dawe
                        Senior Member
                        • Apr 2009
                        • 258

                        #12
                        My bad, sorry. Anyway, as suggested by 'patch' error, you should use a different strip:

                        Code:
                        $ patch -p0 < /path/to/patch
                        That should work.

                        HTH
                        D
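To see why the strip level matters: patch -pN drops N leading path components from the file names in the patch header before looking for the file. The header quoted above names plain "bwape.c", so there is nothing for -p1 to strip. The Python sketch below is a simplified model of that rule, not the actual patch implementation, and the second path is only a hypothetical example of a header that includes a directory.

```python
def strip_components(path, n):
    """Simplified model of 'patch -pN': drop the first n slash-separated
    components of a header file name. Returns None when the name has too
    few components, i.e. the case where patch cannot find the file."""
    parts = path.split("/")
    if n >= len(parts):
        return None
    return "/".join(parts[n:])


print(strip_components("bwape.c", 0))            # usable as-is
print(strip_components("bwape.c", 1))            # nothing left to strip
print(strip_components("bwa-0.5.8/bwape.c", 1))  # hypothetical header with a directory
```

A patch whose header names carry a directory prefix would need -p1 when run from inside the source directory, which may be why the same command worked with the 0.5.8 patch.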


                        • Jon_Keats
                          Senior Member
                          • Mar 2010
                          • 279

                          #13
                          Thanks, that worked perfectly


                          • nsl
                            Member
                            • Jan 2011
                            • 28

                            #14
                            I am new to NGS and bioinformatics. I just got my data and am trying out Galaxy. I am trying to use FASTQ Groomer to convert it to fastq-sanger. I have 8 GB of data; does anyone know roughly how long this should take? It has been running for about 3.5 hours and I don't know whether to quit and run it again. Am I being impatient?

                            Sorry for the novice/inexperienced question

                            Thanks
                            nsl


                            • maubp
                              Peter (Biopython etc)
                              • Jul 2009
                              • 1544

                              #15
                              It will depend on which Galaxy installation you are using (e.g. the main http://usegalaxy.org Penn State one), and how busy it is with other people's work. If you asked on the Galaxy mailing list you'd probably get a better answer.
