  • Merge two raw data files.fq.gz into one

    Hi,
    I am new to Perl. I would like to merge two raw data files (.fq.gz) into one .fq.gz file using a Perl script. I ran this command:

    ~/software/Test_perlscripts/mergeGZFastqFiles.pl lane1r2_subset.fq.gz lane1r1_subset.fq.gz 11.r2r1_subset.fq.gz

    After typing the above command and pressing Enter, it says "Permission denied":


    bash: /home/software/Test_perlscripts/mergeGZFastqFiles.pl: Permission denied

    Can anyone advise me why this error message appears and how I can fix it?

  • #2
    Code:
    chmod a+x ~/software/Test_perlscripts/mergeGZFastqFiles.pl
    Also, make sure you have the appropriate shebang at the beginning of the perl file (i.e., something like "#!/usr/bin/env perl").
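
A minimal sketch of checking both points above (execute bit and shebang) before re-running the script. The function name `check_script` is hypothetical and not part of any tool mentioned in this thread; it only tests the execute permission and looks at the first line of the file.

```shell
# check_script is a hypothetical helper (an assumption for illustration).
# It reports whether a script is executable and starts with a shebang line.
check_script() {
    script=$1
    if [ ! -x "$script" ]; then
        echo "not executable: run chmod a+x $script"
        return 1
    fi
    # Read the first line and check that it begins with "#!"
    first_line=$(head -n 1 "$script")
    case $first_line in
        '#!'*) echo "ok: $first_line" ;;
        *)     echo "no shebang on first line"; return 1 ;;
    esac
}
```

It could be called as `check_script ~/software/Test_perlscripts/mergeGZFastqFiles.pl`.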

    • #3
      @shis: Unless you are trying to interleave the reads with the perl script, you could also just "cat" them together to make a single file.
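
For contrast with plain `cat`, here is a rough sketch of what interleaving could look like using only standard shell tools. The function name `interleave_fastq` is hypothetical. It assumes every record is exactly 4 lines, that no line contains a tab character, and that both files hold the same reads in the same order; it is not the Perl script discussed in this thread.

```shell
# interleave_fastq is a hypothetical helper (an assumption for illustration).
# It alternates 4-line records from R1 and R2 and writes gzipped output to stdout.
interleave_fastq() {
    r1=$1
    r2=$2
    work=$(mktemp -d) || return 1
    # paste - - - - folds each 4-line FASTQ record into one tab-separated line
    gzip -dc "$r1" | paste - - - - > "$work/r1.tsv"
    gzip -dc "$r2" | paste - - - - > "$work/r2.tsv"
    # Pair record N of R1 with record N of R2, then unfold tabs back into lines
    paste "$work/r1.tsv" "$work/r2.tsv" | tr '\t' '\n' | gzip -c
    rm -rf "$work"
}
```

With the file names from the first post, usage might be `interleave_fastq lane1r1_subset.fq.gz lane1r2_subset.fq.gz > interleaved.fq.gz`, where `interleaved.fq.gz` is just an example output name.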

      • #4
        I changed the file permission with chmod a+x and ran the command again. Now it says:

        Can't open gzip file lane1r2_subset.fq.gz

        • #5
          Originally posted by shis:
          I changed the file permission with chmod a+x and ran the command again. Now it says:

          Can't open gzip file lane1r2_subset.fq.gz
          That sounds like a read permission error (as long as the file is in the local directory).

          What exactly are you trying to do with the perl script?

          Please post the output of:
          Code:
          $ ls -l *.fq.gz
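
Beyond `ls -l`, a small sketch like the following could narrow down a "Can't open gzip file" message by checking, in order, that the file exists in the current directory, is readable, and is a valid gzip archive. The function name `check_gz` is hypothetical; `gzip -t` is the standard integrity test.

```shell
# check_gz is a hypothetical helper (an assumption for illustration).
# It reports the first problem found with a gzip file, or "ok".
check_gz() {
    f=$1
    [ -e "$f" ] || { echo "missing: $f"; return 1; }
    [ -r "$f" ] || { echo "unreadable: $f"; return 1; }
    # gzip -t decompresses to nowhere and fails on corrupt or non-gzip input
    gzip -t "$f" 2>/dev/null || { echo "not valid gzip: $f"; return 1; }
    echo "ok: $f"
}
```

For the file in question this would be `check_gz lane1r2_subset.fq.gz`, run from the same directory as the merge command.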

          • #6
            @GenoMax: Actually, I would like to merge two reads, read 1 (forward) and read 2 (reverse), of lane 1 and lane 2 using a Perl script. To begin with, I am trying to test the Perl script I have on a subset of lane 1 read 2 and lane 1 read 1.

            • #7
              If you want to merge them based on an overlap (as opposed to just appending one sequence to the other), then you might want to just use FLASH.

              • #8
                How is perl reading in the files? You might use a construct like:

                Code:
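                # List-form pipe open avoids the shell; die reports a failed launch
                open(my $in, "-|", "gunzip", "-c", $ARGV[0]) or die "Can't run gunzip on $ARGV[0]: $!";
                while (my $line = <$in>) {
                ...
                }
                This pipes the output of gunzip into Perl. You can do the same for BAM files, e.g.:

                Code:
                # Same pattern, reading the text output of samtools view
                open(my $in, "-|", "samtools", "view", $bam) or die "Can't run samtools view on $bam: $!";
                while (my $line = <$in>) {
                ...
                }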

                • #9
                  Originally posted by shis:
                  @GenoMax: Actually, I would like to merge two reads, read 1 (forward) and read 2 (reverse), of lane 1 and lane 2 using a Perl script. To begin with, I am trying to test the Perl script I have on a subset of lane 1 read 2 and lane 1 read 1.
                  Perhaps there is something simple that is wrong with your perl script. If you want someone to help you debug the script then you can post it here.

                  We are assuming that the file permissions on the two subset files allow reading by the user account that is running the Perl script.

                  It may just be simpler to use the program Devon suggested.

                  • #10
                    @GenoMax:

                    $ ls -l lane1r2_subset.fq.gz

                    -rw-r--r-- 1 me me 2888777 Apr 14 14:14 lane1r2_subset.fq.gz

                    • #11
                      Originally posted by shis:
                      @GenoMax:

                      $ ls -l lane1r2_subset.fq.gz

                      -rw-r--r-- 1 me me 2888777 Apr 14 14:14 lane1r2_subset.fq.gz
                      Read permission is not the problem. It must be something in your code.

                      • #12
                        zcat *fq.gz | gzip > merged.fq.gz

                        • #13
                          You don't have to gunzip/gzip. You can just concatenate gz files ( http://stackoverflow.com/questions/8005114 )

                          Code:
                          cat f1.gz f2.gz > merged.gz
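
This works because a gzip file may consist of several compressed members back to back, and `gzip -dc` decompresses each in turn. A small self-contained sketch to convince yourself, using throwaway files in a temporary directory; the wrapper name `merge_gz` and the file names are hypothetical.

```shell
# merge_gz is a hypothetical wrapper around plain cat: the first argument is
# the output file, the remaining arguments are the gzip files to join in order.
merge_gz() {
    out=$1
    shift
    cat "$@" > "$out"
}

# Small demonstration with throwaway files
work=$(mktemp -d)
printf 'first\n'  | gzip -c > "$work/f1.gz"
printf 'second\n' | gzip -c > "$work/f2.gz"
merge_gz "$work/merged.gz" "$work/f1.gz" "$work/f2.gz"
# gzip -t checks integrity; gzip -dc then prints the contents of both members
gzip -t "$work/merged.gz" && gzip -dc "$work/merged.gz"
```

The order of the arguments determines the order of the records in the merged file.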

                          • #14
                            Originally posted by crazyhottommy:
                            zcat *fq.gz | gzip > merged.fq.gz
                            @shis (post #6) does not want to concatenate the files but to merge the R1/R2 reads (overlap them).
