  • #31
    Originally posted by Xi Wang:
    You can use the script below (name it qseq2fastq.pl and replace the former one):

    Code:
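    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    # Convert Illumina qseq (tab-separated, one read per line) to FASTQ.
    while (<>) {
    	chomp;
    	# Columns: 0 machine, 1 run, 2 lane, 3 tile, 4 x, 5 y, 6 index, 7 read, 8 bases, 9 qualities
    	my @parts = split /\t/;
    	# Title line: @machine:lane:tile:x:y#index/read
    	print "@","$parts[0]:$parts[2]:$parts[3]:$parts[4]:$parts[5]#$parts[6]/$parts[7]\n";
    	print "$parts[8]\n";
    	# The '+' line repeats the title (optional in FASTQ)
    	print "+","$parts[0]:$parts[2]:$parts[3]:$parts[4]:$parts[5]#$parts[6]/$parts[7]\n";
    	print "$parts[9]\n";
    }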
    Greetings Xi Wang,

    I have tried to use this script to convert from minimal fastq format to one in which the read name is listed before the base qualities. Here is my command line:

    $ perl qseq2fastq.pl sequence.fastq > test.fastq

    However at each attempt, I get an empty output file and the "use of uninitialized value in concatenation (.) or string" message in the terminal. Please excuse my ignorance as I have only very limited knowledge of perl scripts. I would appreciate it very much if you could explain what I am doing wrong and give me step-by-step instructions on how to run this script.

    Many thanks!


    • #32
      Originally posted by labrat73:
      Greetings Xi Wang,

      I have tried to use this script to convert from minimal fastq format to one in which the read name is listed before the base qualities. Here is my command line:

      $ perl qseq2fastq.pl sequence.fastq > test.fastq

      However at each attempt, I get an empty output file and the "use of uninitialized value in concatenation (.) or string" message in the terminal. Please excuse my ignorance as I have only very limited knowledge of perl scripts. I would appreciate it very much if you could explain what I am doing wrong and give me step-by-step instructions on how to run this script.

      Many thanks!
      You are trying to convert fastq to fastq, which is not what the script is for. The script above converts qseq format to fastq.
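
The "uninitialized value" warnings follow from this: the script splits each line on tabs and expects the columns of a qseq file, while a FASTQ line has no tabs, so every field after the first is undefined. A minimal Python sketch of a format check, for illustration only; the function name `guess_format` and the assumption of 11 tab-separated columns per qseq line are mine, not part of the script above.

```python
def guess_format(first_line):
    """Guess whether a file is qseq or FASTQ from its first line.

    Assumption: a qseq line has 11 tab-separated columns, and a FASTQ
    record starts with an '@' title line.
    """
    line = first_line.rstrip("\r\n")
    if len(line.split("\t")) == 11:
        return "qseq"
    if line.startswith("@"):
        return "fastq"
    return "unknown"


# A FASTQ title line contains no tabs, so it is not mistaken for qseq.
print(guess_format("@SRR101483.1 SCS_0014:6:1:1063:16736/1\n"))  # prints: fastq
```

If this reports "fastq", the qseq2fastq.pl script is the wrong tool for the file.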


      • #33
        Originally posted by sklages:
        You are trying to convert fastq to fastq, which is not what the script is for. The script above converts qseq format to fastq.
        sklages-

        thanks so much for your reply. i'm a bit confused because my file has the fastq extension and it looks like this:

        @SRR101483.1 SCS_0014:6:1:1063:16736/1
        GCGTAGGCTCTATCCCTAGAATGCAAAGGTGGTTCAACATACACAGATCAATAAATGTGATTCAC
        +
        DDDBDCC=D-5AA<B--CAAC5?A5@CC-=AA>>5CC:5=?:A5AC:C?D:C:>5?==@A@

        when i try to run it, though, i keep getting an error. i compared it to other files that i've run and that's when i noticed that in other files, the title name appears again after the "+", immediately before the base qualities. i'm trying to convert or edit this file so that it looks like this:

        @SRR101483.1 SCS_0014:6:1:1063:16736/1
        GCGTAGGCTCTATCCCTAGAATGCAAAGGTGGTTCAACATACACAGATCAATAAATGTGATTCAC
        +SRR101483.1 SCS_0014:6:1:1063:16736/1
        DDDBDCC=D-5AA<B--CAAC5?A5@CC-=AA>>5CC:5=?:A5AC:C?D:C:>5?==@A@

        i hope this makes sense and appreciate any advice you could offer.

        best-

        labrat73


        • #34
          Use [ code ] and [ /code ] tags to prevent the forum from messing up the display of examples.

          Your file is already in FASTQ format, just without the redundant, optional repeated identifier on the plus lines. You don't need to make that change.

          As sklages said earlier, the script this thread is about converts from the Illumina qseq format into FASTQ.
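
If a downstream tool nevertheless rejects a bare '+' line, the identifier can be copied onto it. A minimal Python sketch, assuming strict four-line records (no wrapped sequence or quality lines); the function name `repeat_header_on_plus` is illustrative and not from any existing tool.

```python
import sys


def repeat_header_on_plus(lines):
    """Yield FASTQ lines with the '@' title copied onto each '+' line.

    Assumes every record is exactly four lines: title, bases, '+', qualities.
    """
    title = ""
    for index, raw in enumerate(lines):
        line = raw.rstrip("\r\n")
        position = index % 4
        if position == 0:
            if not line.startswith("@"):
                raise ValueError(f"line {index + 1}: expected '@' title line")
            title = line[1:]
            yield line
        elif position == 2:
            if not line.startswith("+"):
                raise ValueError(f"line {index + 1}: expected '+' line")
            yield "+" + title
        else:
            yield line


# Run only when called as: python script.py input.fastq output.fastq
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        for out_line in repeat_header_on_plus(src):
            dst.write(out_line + "\n")
```

The bases and qualities pass through unchanged; only the third line of each record is rewritten.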


          • #35
            fastq validator

            has anyone tried using this to test?

            I have a very similar problem here where my .txt is in this format
            where there is no line break after the '+'... however this is still in fastq format because the '+' line is optional... however some people here were still getting errors in the format i have posted below

            has anyone used http://genome.sph.umich.edu/wiki/FastQValidator ?


            @HWI-ST604_0134:4:1101:1391:1882#0/1
            NATAGTGCTTTAGCATCATATCTAAGGCTGTTCGTCCTACATTGTTGAGGAAACAACTATGACCTCCCTTGGGTCGGTTGCTATGCAA AGCAATGCTAACA
            +HWI-ST604_0134:4:1101:1391:1882#0/1
            BUXRMZ[Z[[cccccccccccccccccccccccccccccc\cccccccccc_cccUYcccccccaccUYccccc_ccc__a\cac\_V __^X^^^\^^[^\
            @HWI-ST604_0134:4:1101:1493:1886#0/1
            NTAGATAATGATGCCACTGTTACAACTCTGTGCTTTGGGGTACCTAACAAGTCTCCCTCAGTGCCTCTCTGATTTGTAGCTAGTCAAT AGAATGAATAAAG
            +HWI-ST604_0134:4:1101:1493:1886#0/1
            BUXYX[[Z[[cccccc_cccccccc_ccccccccccc\ccZ____ccc_ccccccccccc[____ccccc_[cc_c_ccc_c_c_cc_ \_BBBBBBBBBBB


            • #36
              Originally posted by arcolombo698:
              has anyone tried using this to test?

              I have a very similar problem here where my .txt is in this format
              where there is no line break after the '+'... however this is still in fastq format because the '+' line is optional... however some people here were still getting errors in the format i have posted below

              has anyone used http://genome.sph.umich.edu/wiki/FastQValidator ?


              @HWI-ST604_0134:4:1101:1391:1882#0/1
              NATAGTGCTTTAGCATCATATCTAAGGCTGTTCGTCCTACATTGTTGAGGAAACAACTATGACCTCCCTTGGGTCGGTTGCTATGCAA AGCAATGCTAACA
              +HWI-ST604_0134:4:1101:1391:1882#0/1
              BUXRMZ[Z[[cccccccccccccccccccccccccccccc\cccccccccc_cccUYcccccccaccUYccccc_ccc__a\cac\_V __^X^^^\^^[^\
              @HWI-ST604_0134:4:1101:1493:1886#0/1
              NTAGATAATGATGCCACTGTTACAACTCTGTGCTTTGGGGTACCTAACAAGTCTCCCTCAGTGCCTCTCTGATTTGTAGCTAGTCAAT AGAATGAATAAAG
              +HWI-ST604_0134:4:1101:1493:1886#0/1
              BUXYX[[Z[[cccccc_cccccccc_ccccccccccc\ccZ____ccc_ccccccccccc[____ccccc_[cc_c_ccc_c_c_cc_ \_BBBBBBBBBBB
              I don't get it. There is a "linebreak" (newline) after your '+' line. So this is normal fastq format.

              Btw, the '+' line is *not* optional, its content is! There must always be at least the '+' sign as header for the quality line. But it is optional to write any information after that (in the same line).
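
That rule can be written down as a small check. A minimal Python sketch, assuming unwrapped four-line records; the function name `check_record` and the wording of the messages are illustrative, and this is not how FastQValidator itself works.

```python
def check_record(title, seq, plus, qual):
    """Return a list of problems found in one four-line FASTQ record.

    Assumes unwrapped records: one sequence line and one quality line,
    all passed in without their trailing newlines.
    """
    problems = []
    if not title.startswith("@"):
        problems.append("title line does not start with '@'")
    if not plus.startswith("+"):
        # The '+' itself is mandatory.
        problems.append("third line does not start with '+'")
    elif len(plus) > 1 and plus[1:] != title[1:]:
        # Text after '+' is optional, but if present it should repeat the title.
        problems.append("text after '+' does not match the title")
    if len(seq) != len(qual):
        problems.append("sequence and quality lengths differ")
    return problems


print(check_record("@r1", "ACGT", "+", "IIII"))    # bare '+' is accepted
print(check_record("@r1", "ACGT", "+r1", "IIII"))  # repeated title is accepted
print(check_record("@r1", "ACGT", "", "IIII"))     # missing '+' is reported
```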


              • #37
                The problem I see is that the bases and qualities both have spaces in them, but otherwise it looks fine.


                • #38
                  Originally posted by Brian Bushnell:
                  The problem I see is that the bases and qualities both have spaces in them, but otherwise it looks fine.
                  You're right, maybe a copy&paste issue ..?
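
To tell whether the spaces are in the file itself or only in the pasted text, the sequence and quality lines can be scanned directly. A minimal Python sketch, assuming four-line records; the function name `find_internal_whitespace` is illustrative.

```python
def find_internal_whitespace(lines):
    """Return (line_number, column) pairs for whitespace inside
    sequence or quality lines.

    Assumes four-line records; line numbers and columns are 1-based.
    Title and '+' lines are skipped because they may contain spaces.
    """
    hits = []
    for number, raw in enumerate(lines, start=1):
        if number % 4 not in (2, 0):  # only lines 2 and 4 of each record
            continue
        text = raw.rstrip("\r\n")
        for column, char in enumerate(text, start=1):
            if char.isspace():
                hits.append((number, column))
    return hits


# Hypothetical record with a space in both the bases and the qualities.
record = ["@r1 extra text\n", "ACG T\n", "+r1 extra text\n", "III I\n"]
print(find_internal_whitespace(record))  # prints: [(2, 4), (4, 4)]
```

An empty list for the original file would point to a copy-and-paste artifact rather than a problem in the data.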
