  • combining fastq files

    Hi all,
    We did a MiSeq run that gave low cluster density but still produced FASTQ files. We then repeated the run for the same samples and got better cluster density. How can I combine the FASTQ files for each sample so that I have enough reads for the analysis?
    For example, I have the following FASTQ files for sample X:

    run1_X_read1.fastq
    run1_X_read2.fastq

    run2_X_read1.fastq
    run2_X_read2.fastq

    Which command should I use for this, please?

  • #2
    In Unix:

    cat run1_X_read1.fastq run2_X_read1.fastq > R1.run1.run2.fastq

    cat run2_X_read1.fastq run2_X_read1.fastq > R2.run1.run2.fastq


    • #3
      should be:

      cat run1_X_read1.fastq run2_X_read1.fastq > R1.run1.run2.fastq

      cat run1_X_read2.fastq run2_X_read2.fastq > R2.run1.run2.fastq
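A quick sanity check after concatenating, as a minimal sketch: a FASTQ record is normally four lines, so the combined file should hold the sum of the two runs' reads, and R1 and R2 should end up with the same count. The `count_reads` helper is hypothetical, and the tiny toy files only stand in for the real run1/run2 files so that the snippet runs on its own.

```shell
# Hypothetical helper: number of reads = number of lines / 4
# (assumes plain, uncompressed FASTQ with 4 lines per record).
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Toy stand-ins for the real files (1 read in run1, 2 reads in run2).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '@r1/1\nACGT\n+\nIIII\n' > run1_X_read1.fastq
printf '@r1/2\nTGCA\n+\nIIII\n' > run1_X_read2.fastq
printf '@r2/1\nACGTA\n+\nIIIII\n@r3/1\nGGCCA\n+\nIIIII\n' > run2_X_read1.fastq
printf '@r2/2\nTACGT\n+\nIIIII\n@r3/2\nTGGCC\n+\nIIIII\n' > run2_X_read2.fastq

# Same commands as above: read1 with read1, read2 with read2, same run order.
cat run1_X_read1.fastq run2_X_read1.fastq > R1.run1.run2.fastq
cat run1_X_read2.fastq run2_X_read2.fastq > R2.run1.run2.fastq

n1=$(count_reads R1.run1.run2.fastq)
n2=$(count_reads R2.run1.run2.fastq)
echo "R1 reads: $n1, R2 reads: $n2"
if [ "$n1" -ne "$n2" ]; then
    echo "R1 and R2 read counts differ; check the cat commands" >&2
fi
```

Keeping the runs in the same order in both commands matters for paired-end data, since most tools pair reads by their position in the two files.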


      • #4
        You may have checked this already but in case you have not then check the quality profiles for the two runs independently. You want both runs to be of good quality for combined downstream analysis.


        • #5
          Originally posted by mastal
          should be:

          cat run1_X_read1.fastq run2_X_read1.fastq > R1.run1.run2.fastq

          cat run1_X_read2.fastq run2_X_read2.fastq > R2.run1.run2.fastq
          Thanks for the correction.


          • #6
            What is the analysis and the pipeline you plan on using? If it's anything that can make use of read groups (like GATK BQSR), you might want to align the runs separately and merge the BAM files (if you even plan on aligning, that is).
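If the two runs are aligned separately, each run would get its own read group. Below is a minimal sketch of building an `@RG` header line per run. The `make_rg` helper and the choice of ID values are assumptions for illustration, and the `bwa mem` / `samtools merge` lines are left as comments with hypothetical file names (`ref.fa`, `run1.bam`, `run2.bam`) because they need the tools and a reference genome installed.

```shell
# Hypothetical helper: build a SAM read-group line for one run of one sample.
# ID is unique per run; SM is shared so both runs count as the same sample.
# The output contains a literal backslash-t, which is the form bwa mem -R expects.
make_rg() {
    run=$1
    sample=$2
    printf '@RG\\tID:%s_%s\\tSM:%s\\tPL:ILLUMINA' "$run" "$sample" "$sample"
}

rg1=$(make_rg run1 X)
rg2=$(make_rg run2 X)
printf '%s\n' "$rg1"
printf '%s\n' "$rg2"

# Possible use (not run here; file names are placeholders):
# bwa mem -R "$rg1" ref.fa run1_X_read1.fastq run1_X_read2.fastq | samtools sort -o run1.bam -
# bwa mem -R "$rg2" ref.fa run2_X_read1.fastq run2_X_read2.fastq | samtools sort -o run2.bam -
# samtools merge X.merged.bam run1.bam run2.bam
```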


            • #7
              Originally posted by GenoMax
              You may have checked this already but in case you have not then check the quality profiles for the two runs independently. You want both runs to be of good quality for combined downstream analysis.
              I have checked the read quality with FastQC (before trimming low-quality bases) and it looks reasonable. Do you think it is better to trim the low-quality bases (<20) before combining the FASTQ files, or should I combine them first and then carry out the trimming?


              • #8
                Either should be ok. You are only appending the contents of one file to another by doing "cat". You are not changing any individual sequence reads in any way by that operation.
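The same holds if the files are gzip-compressed (`.fastq.gz`): the gzip format allows several compressed members back to back, so `cat` on the `.gz` files gives a valid `.gz` file that decompresses to the concatenated reads. A small sketch with toy files (the file names are made up for the demo):

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Toy compressed inputs, one read each.
printf '@r1/1\nACGT\n+\nIIII\n' | gzip -c > run1_X_read1.fastq.gz
printf '@r2/1\nTTGA\n+\nIIII\n' | gzip -c > run2_X_read1.fastq.gz

# Concatenate the compressed files directly; no need to decompress first.
cat run1_X_read1.fastq.gz run2_X_read1.fastq.gz > R1.run1.run2.fastq.gz

# Decompress to stdout and count lines: 2 reads x 4 lines each.
nlines=$(gzip -dc R1.run1.run2.fastq.gz | wc -l)
echo "lines in combined file: $nlines"
```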


                • #9
                  Originally posted by GenoMax
                  Either should be ok. You are only appending the contents of one file to another by doing "cat". You are not changing any individual sequence reads in any way by that operation.
                  I am not sure whether it is possible to combine a FASTQ file of 250 bp reads with a FASTQ file of 300 bp reads from the same sample. Can this be done?


                  • #10
                    Originally posted by mmmm
                    I am not sure whether it is possible to combine a FASTQ file of 250 bp reads with a FASTQ file of 300 bp reads from the same sample. Can this be done?
                    That is what you will be doing. Using "cat" you are appending contents of file 2 at end of file 1 to produce file 3. That way you are going to end up with file 3 that has reads with 2 different lengths in it.

                    If you want to do any absolute length based trimming then you should do that *before* you "cat" the files together.
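To see the read lengths in a file and, if needed, trim to a common length before concatenating, a rough awk sketch follows. The helper names `length_histogram` and `trim_to_length` are hypothetical, the toy reads are 6 and 8 bases long instead of 250 and 300, and a dedicated trimming tool would be the usual choice for real data.

```shell
# Hypothetical helper: count reads per read length
# (the sequence is line 2 of each 4-line FASTQ record).
length_histogram() {
    awk 'NR % 4 == 2 { n[length($0)]++ } END { for (l in n) print l, n[l] }' "$1" | sort -n
}

# Hypothetical helper: hard-trim sequence and quality lines to at most $1 bases.
trim_to_length() {
    awk -v len="$1" 'NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, len); next } { print }' "$2"
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '@a/1\nACGTAC\n+\nIIIIII\n' > run1_X_read1.fastq       # 6-base read
printf '@b/1\nACGTACGT\n+\nIIIIIIII\n' > run2_X_read1.fastq   # 8-base read

# Trim the longer run first, then concatenate.
trim_to_length 6 run2_X_read1.fastq > run2_X_read1.trimmed.fastq
cat run1_X_read1.fastq run2_X_read1.trimmed.fastq > R1.run1.run2.fastq

# Prints one line per read length: "<length> <number of reads>".
length_histogram R1.run1.run2.fastq
```

Sequence and quality lines are trimmed together so that every record keeps one quality value per base.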
