
  • converting_fastq_file

    I have Illumina v1.5+ FASTQ files.

    I learned that FASTQ files come in four types: Sanger, Illumina v1.0, v1.3, and v1.5.
    I also learned that many programs require Sanger-type FASTQ files.

    Methods for converting Illumina v1.0 (Solexa) to Sanger can be found here and there.
    However, I can't find an actual procedure for converting Illumina v1.5 to Sanger.

    Could you please help me?

  • #2
    Hi,
    The only difference between those formats is how the quality scores are encoded. If you are familiar with Perl, try the following:
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc

    $q_line =~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;

    Which could be written in Python as
    q_line = "".join([chr(ord(i)-31) for i in q_line])

    Or would you prefer an awk one-liner?
    Or a full, ready-to-use script?
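As a rough sketch of what such a full script could look like in Python: the function names below are illustrative and not taken from any existing tool, and the code assumes the input is Phred+64 (Illumina 1.3+/1.5+) with exactly four lines per record and no wrapped sequence or quality lines. Unlike the Perl `tr` above, which maps any character below `\x40` to `!`, this version raises an error so that a wrongly guessed input format is noticed.

```python
OFFSET_SHIFT = 64 - 33  # Illumina 1.3+/1.5+ is Phred+64, Sanger is Phred+33


def illumina_to_sanger_quality(q_line):
    """Shift one Phred+64 quality string down to Phred+33."""
    converted = []
    for ch in q_line:
        code = ord(ch) - OFFSET_SHIFT
        if code < 33:
            # A character below '@' cannot occur in Phred+64 data.
            raise ValueError(f"quality character {ch!r} is below the Phred+64 range")
        converted.append(chr(code))
    return "".join(converted)


def convert_fastq(in_handle, out_handle):
    """Copy a 4-line-per-record FASTQ stream, re-encoding only the quality line."""
    for index, line in enumerate(in_handle):
        text = line.rstrip("\r\n")
        if index % 4 == 3:  # the fourth line of each record holds the qualities
            text = illumina_to_sanger_quality(text)
        out_handle.write(text + "\n")
```

For example, `illumina_to_sanger_quality("hhB@")` gives `II#!`. To convert a file, open the input and output with `open()` and pass both handles to `convert_fastq`.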



    • #3
      Originally posted by angelpie:
      Could you please help me?
      See http://en.wikipedia.org/wiki/FASTQ_format and:

      Cock et al (2009) The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research, http://dx.doi.org/10.1093/nar/gkp1137

      From the point of view of conversion, FASTQ files from Illumina 1.5 are basically the same as those from Illumina 1.3 and 1.4, except for the meaning of some low quality scores, see:

      Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
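To illustrate why the same conversion applies, here is a small Python sketch (the helper name `phred_scores` and the sample string are made up for this example). It assumes Phred+64 input. In Illumina 1.5+ the character `B` (score 2) is used as a read-segment quality indicator rather than an ordinary quality value, but it is shifted exactly like every other character and becomes `#` in Sanger encoding.

```python
def phred_scores(q_line, offset):
    """Decode a quality string into integer Phred scores for a given ASCII offset."""
    return [ord(ch) - offset for ch in q_line]


illumina_15 = "hhhBBB"  # made-up Illumina 1.5+ quality string
sanger = "".join(chr(ord(ch) - 31) for ch in illumina_15)

# The scores are unchanged by the conversion; only the ASCII offset differs.
assert phred_scores(illumina_15, 64) == phred_scores(sanger, 33)
```

Whether a downstream tool treats score 2 specially is a separate question that the conversion itself does not address.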

      Last edited by maubp; 12-01-2010, 02:17 AM. Reason: making DOI into a link



      • #4
        Thank you, nicolallias and maubp, for your quick replies.

        Although I think I know the formulae for these formats,
        I don't know how to convert between them
        because I am just a user of existing scripts/programs.

        Can I use the procedures for Illumina v1.3 to convert Illumina v1.5+ files?

        I tried to use the Perl script in the referred thread.
        However, I got errors.

        Regarding the offer of "a full script ready-to-use":
        if possible, please share it.
        Last edited by angelpie; 12-01-2010, 03:48 AM.



        • #5
          Originally posted by angelpie:
          Can I use procedures for illumina v1.3 to convert illumina v1.5+ files?
          Yes.


          Originally posted by angelpie:
          I am just a user of existing scripts/programs.
          Try EMBOSS seqret if you want a command line tool for converting file formats. Use fastq-illumina as the input format, fastq-sanger as the output format.

          If you are happier with Python, Perl, Java, or Ruby then try Biopython, BioPerl, BioJava or BioRuby for existing libraries for reading, writing and converting FASTQ files (see the paper I linked to before).

          Originally posted by angelpie:
          I tried to use the Perl script in the referred thread.
          However, I got errors.
          What errors?



          • #6
            The error messages said:
            "Use of uninitialized value $(variables) in concatenation (.) or string at .....".



            • #7
              I don't know enough Perl to help you - but I don't think nicolallias' example was standalone, it was more of a hint for someone familiar with Perl.

              Do you have EMBOSS installed? The EMBOSS tool seqret is an easy way to do this at the command line.



              • #8
                Originally posted by maubp:
                it was more of a hint for someone familiar with Perl
                Exactly, and using already written tools is your best option.
                angelpie: if you wish to learn more, you really should visit the Wikipedia page about the FASTQ format.



                • #9
                  Convert Illumina (Phred+64) scores to Sanger (Phred+33) in a BAM file

                  If you already have a BAM file, you can transform the scores in it as follows:

                  samtools view -h Illumina_score.bam | perl -lane '$"="\t"; if (/^@/) {print;} else {$F[10]=~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;print "@F"}' | samtools view -Sbh - > Phred_score.bam
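For readers who prefer Python to the Perl one-liner, the per-line logic can be sketched as follows. The function name `convert_sam_line` is hypothetical; the code assumes it is fed SAM text (for example, the output of `samtools view -h`) with the 11 mandatory tab-separated columns, and that the qualities are Phred+64. Like the Perl `tr`, it maps anything below `@` to `!`. One deliberate difference: a QUAL field of `*` (qualities not stored) is left untouched here, whereas the `tr` would turn it into `!`.

```python
SHIFT = 64 - 33  # Phred+64 (Illumina 1.3+/1.5+) to Phred+33 (Sanger)


def convert_sam_line(line):
    """Re-encode the QUAL column (11th field) of one SAM line; pass headers through."""
    if line.startswith("@"):
        return line  # header lines are copied unchanged
    fields = line.rstrip("\n").split("\t")
    qual = fields[10]
    if qual != "*":  # '*' means qualities are absent, so there is nothing to shift
        # Clamp at '!' (ASCII 33) for characters below the Phred+64 range.
        fields[10] = "".join(chr(max(ord(ch) - SHIFT, 33)) for ch in qual)
    return "\t".join(fields) + "\n"
```

Applied to every line read from standard input and written to standard output, this could sit between the two `samtools view` calls in place of the `perl -lane` step.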

                  Thanks nicolallias for providing the very efficient trick. It saved us a lot of fastq file transformations and we did not have to run all the BWA alignments again.
