  • kwebb
    Member
    • Jul 2008
    • 21

    What does Illumina raw data look like?

    Hi

    I'm trying to work through some of the various assembler programs before actually collecting my own Illumina data. I've found some test datasets here:



    but I'm not sure if the file formats are the same as raw data from the Genome Analyzer.

    The files are s_4_seq.txt and s_4_prb.txt and the first few lines look like this:
    s_4_seq.txt
    4 1 56 910 AACTTACAATTGAAAATATAAACTCAT
    4 1 64 716 AAGATGATTATATGTCTTCCTTTTCGA
    4 1 890 894 TCAAACCAATCAGACCTATGTTTCATA

    s_4_prb.txt
    40 -40 -40 -40 40 -40 -40 -40 -40 40 -40 -40 -40 -40 -40 40 -40 -40 -40 40 40 -40 -40 -40 -40 40 -40 -40 40 -40 -40 -40 40 -40 -40 -40 -40 -40 -40 40

    So my questions are
    1. Is this the raw data format from the machine?
    2. How do I get these files into fastq format? The maq converter and sanger perl scripts previously mentioned do not seem to work.

    Thank you!
  • kwebb
    Member
    • Jul 2008
    • 21

    #2
    Update

    I've managed to convert my data using the solexa2fasta.pl script. However the tool included with Maq, sol2sanger, does not work with my data. Can someone please explain?

    Thank you!

    Comment

    • zee
      NGS specialist
      • Apr 2008
      • 249

      #3
      There's a really good tool in the maq package (latest release) called fq_all2std.pl

      See below:


      Usage:   fq_all2std.pl <command> <in.txt>

      Command: scarf2std      Convert SCARF format to the standard/Sanger FASTQ
               fqint2std      Convert FASTQ-int format to the standard/Sanger FASTQ
               sol2std        Convert Solexa/Illumina FASTQ to the standard FASTQ
               fa2std         Convert FASTA to the standard FASTQ
               seqprb2std     Convert .seq and .prb files to the standard FASTQ
               fq2fa          Convert various FASTQ-like format to FASTA
               export2sol     Convert Solexa export format to Solexa FASTQ
               export2std     Convert Solexa export format to Sanger FASTQ
               csfa2std       Convert AB SOLiD read format to Sanger FASTQ
               instruction    Explanation to different format
               example        Show examples of various formats

      Note:    Read/quality sequences MUST be presented in one line.
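
For anyone wondering what a seqprb2std-style conversion involves, here is a minimal Python sketch. It is not the Maq script itself. It assumes each prb.txt line holds four Solexa-scaled scores per cycle in A, C, G, T order, that the highest of the four is taken as the quality of the called base, and that Solexa scores map to Phred scores via Q = 10*log10(10^(S/10) + 1). The function names are made up for illustration.

```python
import math


def solexa_to_phred(score):
    """Convert a Solexa-scaled score to a rounded Phred score."""
    return int(round(10 * math.log10(10 ** (score / 10.0) + 1)))


def seqprb_to_fastq(seq_line, prb_line):
    """Build one Sanger FASTQ record from one seq.txt line and its prb.txt line."""
    lane, tile, x, y, read = seq_line.split()
    scores = [int(v) for v in prb_line.split()]
    if len(scores) != 4 * len(read):
        raise ValueError("expected four scores per base")
    quals = []
    for i in range(len(read)):
        # Assumption: the best of the four scores is the called base's quality.
        best = max(scores[4 * i:4 * i + 4])
        quals.append(chr(solexa_to_phred(best) + 33))  # Sanger offset is 33
    return "@%s:%s:%s:%s\n%s\n+\n%s\n" % (lane, tile, x, y, read, "".join(quals))


# Example with a read truncated to 10 bases to match the prb excerpt posted above.
prb = ("40 -40 -40 -40 40 -40 -40 -40 -40 40 -40 -40 -40 -40 -40 40 "
       "-40 -40 -40 40 40 -40 -40 -40 -40 40 -40 -40 40 -40 -40 -40 "
       "40 -40 -40 -40 -40 -40 -40 40")
print(seqprb_to_fastq("4 1 56 910 AACTTACAAT", prb), end="")
```

Real prb lines cover the full read length, so the seq and prb lines must be paired line by line from the two files.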

      Comment

        • kwebb
          Member
          • Jul 2008
          • 21

          #5
          Great tool!

          Thanks for the info!

          Comment

          • hannat
            Member
            • Jan 2009
            • 16

            #6
            I have similar data, seq.txt and prb.txt, just like you:
            seq.txt
            ........................................................................
            6 1 914 893 GCTACTGCCGTGACCTCATTTCTCTTA
            6 1 898 905 GAAAAAGAGAAAGTTTAGGAGATCGAT
            .....................................................................................
            prob.txt
            .....................................................................................
            -30 -30 30 -30 -30 30 -30 -30 -30 -30 -30 30 30 -30 -30 -30 -30 30 -30 -30 -30 -30 -30 30 -30 -30 30 -30 -30 30 -30 -30 -30 30 -30 -30 -30 -30 30 -30 -30 -30 -30 30 -30 -30 30 -30 30 -30 -30 -30 -30 30 -30 -30 -30 30 -30 -30 -30 -30 -30...
            .............................................................................

            But when I run
            fq_all2std.pl seqprb2std seq.txt prb.txt
            the output looks like this:
            ...........................................
            @6:1:914:893
            GCTACTGCCGTGACCTCATTTCTCTTA
            +
            ???????????????????????????
            ..................................................

            There is another problem too: I got lots of warning messages like this:
            Argument "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..." isn't numeric in numeric gt (>) at /usr/local/bin/fq_all2std.pl line 152, <$fhq> line 6609.
            ....................................................................................................................................................................................................... line ......


            I wonder whether this kind of warning is happening to others too and, if so, what you think the problem is.
            I am now checking my prb.txt; I guess it contains some lines that were not accepted.
            Last edited by hannat; 01-26-2009, 04:00 AM.
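
The \0\0\0... in that warning suggests the prb file may contain NUL bytes around line 6609, although that is only a guess from the message. A minimal Python sketch for locating such lines, assuming every valid prb line consists of whitespace-separated integers in groups of four; find_bad_prb_lines is a hypothetical helper name:

```python
def find_bad_prb_lines(path, limit=10):
    """Return (line_number, first_bytes) for lines that are not groups of four integers."""
    bad = []
    with open(path, "rb") as fh:  # binary mode so NUL bytes are read unchanged
        for lineno, raw in enumerate(fh, start=1):
            tokens = raw.decode("latin-1").split()
            ok = len(tokens) > 0 and len(tokens) % 4 == 0
            if ok:
                try:
                    for tok in tokens:
                        int(tok)  # raises ValueError on non-numeric tokens
                except ValueError:
                    ok = False
            if not ok:
                bad.append((lineno, raw[:20]))
                if len(bad) >= limit:
                    break  # stop early; a damaged file may have many bad lines
    return bad


# Example call, assuming prb.txt is in the current directory:
# for lineno, head in find_bad_prb_lines("prb.txt"):
#     print(lineno, head)
```

If the reported line numbers match the ones in the warnings, the file itself is probably damaged and may need to be copied again from the source.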

            Comment

            • alig
              Member
              • Sep 2008
              • 44

              #7
              Hi,

              Re : There's a really good tool in the maq package (latest release) called fq_all2std.pl

              I tried to use

              fq2fa Convert various FASTQ-like format to FASTA

              to convert my Illumina seq data from fastq to fasta, as I want the quality in fasta format to run Mosaik's GigaBayes program.

              But the Maq perl script fq_all2std.pl fq2fa <in.txt> command

              just seemed to print the results to the screen & not place them in a fasta file.

              Am I doing something really silly here?

              Only I've got a 1.8 Gb illumina seq text file so this process takes a while & I need it in a file, not printed to the screen

              thanks alig

              Comment

              • lparsons
                Member
                • Nov 2008
                • 28

                #8
                You simply need to redirect the standard output (which is printing to your screen) to a file:

                fq_all2std.pl fq2fa in.txt > out.fasta

                See http://www.december.com/unix/tutor/redirect.html for more info.
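
The same redirection can be done from a Python wrapper, which may be handy when the conversion is one step in a larger script. This is only a sketch: run_to_file is a made-up helper name, and it assumes fq_all2std.pl is on your PATH.

```python
import subprocess


def run_to_file(cmd, out_path):
    """Run cmd and write its standard output to out_path instead of the screen."""
    with open(out_path, "w") as out:
        # check=True raises CalledProcessError if the command exits with an error.
        subprocess.run(cmd, stdout=out, check=True)


# Example call, assuming fq_all2std.pl is on PATH and in.txt exists:
# run_to_file(["fq_all2std.pl", "fq2fa", "in.txt"], "out.fasta")
```

The output goes straight to the file as it is produced, so a 1.8 Gb input is not held in memory by the wrapper.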

                Comment

                • alig
                  Member
                  • Sep 2008
                  • 44

                  #9
                  convert fastq to fasta

                  To lparsons,

                  Thank you. Yes I realised that later after I'd sent my post.

                  Also in case anyone else is looking to separate a fastq file into seq.fasta & qual.fasta files you actually need the other command within Maq

                  fq_all2std.pl std2qual <out.prefix> <in.fastq>

                  Thanks again

                  alig
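
For reference, splitting a FASTQ file into sequence and quality files can be sketched in a few lines of Python. This does not reproduce the output of the Maq script. It assumes four-line records with no blank lines, Sanger-encoded qualities (offset 33), and a .qual file holding space-separated integers; split_fastq is a hypothetical name.

```python
def split_fastq(fastq_path, prefix, offset=33):
    """Write prefix.fasta (bases) and prefix.qual (Phred integers) from a FASTQ file."""
    with open(fastq_path) as fq, \
         open(prefix + ".fasta", "w") as fa, \
         open(prefix + ".qual", "w") as qual:
        while True:
            header = fq.readline().rstrip("\n")
            if not header:
                break  # end of file (or a blank line, which this sketch does not handle)
            seq = fq.readline().rstrip("\n")
            fq.readline()  # skip the "+" separator line
            qstr = fq.readline().rstrip("\n")
            name = header[1:]  # drop the leading "@"
            fa.write(">%s\n%s\n" % (name, seq))
            scores = " ".join(str(ord(c) - offset) for c in qstr)
            qual.write(">%s\n%s\n" % (name, scores))


# Example call, assuming in.fastq exists:
# split_fastq("in.fastq", "out")  # writes out.fasta and out.qual
```

Records are processed one at a time, so memory use stays small even for large files.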

                  Comment

                  • subram28@msu.edu
                    Junior Member
                    • Jul 2009
                    • 1

                    #10
                    Bowtie

                    Has anybody used Bowtie for mapping?

                    Comment

                    • spadejac
                      Junior Member
                      • Sep 2009
                      • 4

                      #11
                      Bowtie for alignment

                      Originally posted by subram28@msu.edu
                      Has anybody used Bowtie for mapping?
                      Oh yeah! We have. And that is the best that I've come across in my career for alignment of short reads. Just too fast - Great for expression data.

                      Spade

                      Comment

                      • der_eiskern
                        Member
                        • Jul 2009
                        • 46

                        #12
                        Originally posted by spadejac
                        Oh yeah! We have. And that is the best that I've come across in my career for alignment of short reads. Just too fast - Great for expression data.

                        Spade
                        I heard Bowtie is great for mapping chromatin IPs and RNA back to a reference, but isn't as good as MAQ for finding SNPs. Is this accurate?

                        Comment

                        • lhw_genome
                          Junior Member
                          • Nov 2009
                          • 3

                          #13
                          Hi everyone,
                          I am a new user of BWA and would greatly appreciate any help!
                          I have paired-end Solexa data (in two files s_2_1.export.txt ; s_2_2_export.txt) presented in the following format (SCARF ASCII with mapping information)

                          HWI-EAS433 16 3 11 255 71 0 2 TGAAAGGGAATATCTTCATATAAAATCTAGACAAAAGCATTCTCAGAATC abbb``b_`aaab_bb``babaa_`a^b_a__aaa`aa`aa`_`aa[^a_
                          chr9.fa 66572916 F G32G3A10G1 33 0 chr7.fa 61087451 R Y

                          Now I would like to convert the Solexa export file to FASTQ format so that I can use BWA. I tried the fq_all2std.pl export2std command, but it doesn't work. I also tried the scarf2std command; it converted my file, but the output was not in FASTQ format: other information (the Eland mapping positions) was also included in the output file.

                          I don't have any experience writing Perl or other scripts.
                          Could you please help me?
                          Many thanks!
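
A minimal Python sketch of pulling a FASTQ record out of one export line, in case it helps. The assumptions are loud ones: the file is tab-delimited; the first ten columns are machine, run, lane, tile, x, y, index, read number, sequence and quality; and the quality string is Phred-scaled with an offset of 64, so subtracting 31 from each character gives Sanger encoding. Older pipeline versions wrote Solexa-scaled qualities, which would need a log conversion instead. export_line_to_fastq is a made-up name.

```python
def export_line_to_fastq(line, qual_shift=31):
    """Turn one tab-delimited export line into a Sanger FASTQ record."""
    fields = line.rstrip("\n").split("\t")
    # Only the first ten columns are used; the mapping columns are ignored.
    machine, run, lane, tile, x, y, index, read_no, seq, qual = fields[:10]
    name = "%s_%s:%s:%s:%s:%s/%s" % (machine, run, lane, tile, x, y, read_no)
    # Assumption: offset-64 Phred qualities; shifting by 31 gives offset 33.
    sanger = "".join(chr(ord(c) - qual_shift) for c in qual)
    return "@%s\n%s\n+\n%s\n" % (name, seq, sanger)


# Example call, assuming s_2_1.export.txt exists:
# with open("s_2_1.export.txt") as src, open("s_2_1.fastq", "w") as dst:
#     for line in src:
#         dst.write(export_line_to_fastq(line))
```

For paired-end data, run it once per export file so that the two FASTQ files keep their reads in the same order.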

                          Comment

                          • federica torri
                            Junior Member
                            • Nov 2009
                            • 2

                            #14
                            script format converter

                            Originally posted by alig
                            To lparsons,

                            Thank you. Yes I realised that later after I'd sent my post.

                            Also in case anyone else is looking to separate a fastq file into seq.fasta & qual.fasta files you actually need the other command within Maq

                            fq_all2std.pl std2qual <out.prefix> <in.fastq>

                            Thanks again

                            alig
                            Hi,

                            I checked my maq 0.7.1 installation for this script but I didn't find it. Do you know if it is still available, or did you find it as a supplementary maq script? Thanks

                            Comment

                            • niazi84@hotmail.com
                              Member
                              • Jan 2010
                              • 25

                              #15
                              @hannat..

                              Did you find a solution to your problem? I have the same kind of problem with my data.

                              Thanks
                              ~Adnan~

                              Comment
