  • Using trimmomatic on multiple paired-end read files

    I need help writing a for loop that runs Trimmomatic for quality trimming on multiple paired-end fastq files, so that I do not have to launch the executable by hand for every pair.

    The input PE files look like this:

    C1_S1_L001_R1_001.fastq.gz
    C1_S1_L001_R2_001.fastq.gz

    C2_S39_L001_R1_001.fastq.gz
    C2_S39_L001_R2_001.fastq.gz

    T2_S41_L001_R1_001.fastq.gz
    T2_S41_L001_R2_001.fastq.gz

    T6_S45_L001_R1_001.fastq.gz
    T6_S45_L001_R2_001.fastq.gz

    To run trimmomatic for the paired reads corresponding to C1_S1_L001_R1_001.fastq.gz and C1_S1_L001_R2_001.fastq.gz, the following command works:

    java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 C1_S1_L001_R1_001.fastq.gz C1_S1_L001_R2_001.fastq.gz C1_R1_paired.fq.gz C1_R1_unpaired.fq.gz C1_R2_paired.fq.gz C1_R2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35

    The general usage provided by Trimmomatic is:

    java -jar <path to trimmomatic.jar> PE [-threads <threads>] [-phred33 | -phred64] [-trimlog <logFile>] <input 1> <input 2> <paired output 1> <unpaired output 1> <paired output 2> <unpaired output 2> <step 1> ...

    Any help please!
    Thanks!
    Last edited by shashankgupta; 02-14-2017, 04:49 AM.

  • #2
    I would assume the following should work:
    (but obviously untested)

    Code:
    # Strip the trailing "1_001.fastq.gz" / "2_001.fastq.gz" so that each pair yields one prefix ending in "_R"
    for f in $(ls *.fastq.gz | sed 's/[12]_001\.fastq\.gz$//' | sort -u)
    do
    java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 ${f}1_001.fastq.gz ${f}2_001.fastq.gz ${f}1_paired.fq.gz ${f}1_unpaired.fq.gz ${f}2_paired.fq.gz ${f}2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
    done
    You can easily check whether the commands look right by adding an echo statement:
    Code:
    # Same loop, but echo prints each command instead of running it
    for f in $(ls *.fastq.gz | sed 's/[12]_001\.fastq\.gz$//' | sort -u)
    do
    echo java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 ${f}1_001.fastq.gz ${f}2_001.fastq.gz ${f}1_paired.fq.gz ${f}1_unpaired.fq.gz ${f}2_paired.fq.gz ${f}2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
    done
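
A sketch of the same loop that avoids parsing `ls` output, assuming the `_R1_001.fastq.gz` / `_R2_001.fastq.gz` naming from the first post. The function name `trim_pairs` and the `DRY_RUN` and `JAR` variables are illustrative, not part of Trimmomatic, and the output names keep the full sample prefix rather than the shortened `C1_...` names used in the question.

```shell
#!/bin/bash
# Sketch only: trim_pairs, DRY_RUN and JAR are hypothetical names chosen for this example.
JAR="${JAR:-$HOME/Trimmomatic-0.36/trimmomatic-0.36.jar}"

trim_pairs() {
    local dir="${1:-.}" r1 r2 base
    for r1 in "$dir"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue            # the glob matched nothing
        base="${r1%_R1_001.fastq.gz}"       # e.g. ./C1_S1_L001
        r2="${base}_R2_001.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "skipping $r1: mate file $r2 not found" >&2
            continue
        fi
        # If DRY_RUN is set to any non-empty value, the command is printed instead of run.
        ${DRY_RUN:+echo} java -jar "$JAR" PE -phred33 \
            "$r1" "$r2" \
            "${base}_R1_paired.fq.gz" "${base}_R1_unpaired.fq.gz" \
            "${base}_R2_paired.fq.gz" "${base}_R2_unpaired.fq.gz" \
            LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
    done
}
```

Running `DRY_RUN=1 trim_pairs /path/to/fastq` first plays the same role as the echo check above; `trim_pairs /path/to/fastq` then runs the commands for real.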



    • #3
      Originally posted by wdecoster (see #2 above)
      Thank you for helping me.
      But somehow it gives the error shown below:

      TrimmomaticPE: Started with arguments:
      0*_R1_001.fastq.gz 0*_R2_001.fastq.gz 0_R1.trimmed_PE.fastq 0_R1.trimmed_SE.fastq 0_R2.trimmed_PE.fastq 0_R2.trimmed_SE.fastq LEADING:3 TRAILING:3 SLIDINGWINDOW:3:20 MINLEN:30
      Exception in thread "main" java.io.FileNotFoundException: 0*_R1_001.fastq.gz (No such file or directory)
      at java.io.FileInputStream.open0(Native Method)
      at java.io.FileInputStream.open(FileInputStream.java:195)
      at java.io.FileInputStream.<init>(FileInputStream.java:138)
      at org.usadellab.trimmomatic.fastq.FastqParser.parse(FastqParser.java:135)
      at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:264)
      at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:539)
      at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)
      0
      TrimmomaticPE: Started with arguments:
      0*_R1_001.fastq.gz 0*_R2_001.fastq.gz 0_R1.trimmed_PE.fastq 0_R1.trimmed_SE.fastq 0_R2.trimmed_PE.fastq 0_R2.trimmed_SE.fastq LEADING:3 TRAILING:3 SLIDINGWINDOW:3:20 MINLEN:30
      Exception in thread "main" java.io.FileNotFoundException: 0*_R1_001.fastq.gz (No such file or directory)
      at java.io.FileInputStream.open0(Native Method)
      at java.io.FileInputStream.open(FileInputStream.java:195)
      at java.io.FileInputStream.<init>(FileInputStream.java:138)
      at org.usadellab.trimmomatic.fastq.FastqParser.parse(FastqParser.java:135)
      at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:264)
      at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:539)
      at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)
      0




      As I understand it, the script is unable to find the files. So, to simplify things, I renamed all the files, and they now look like this:


      C1_R1.fastq
      C1_R2.fastq

      C2_R1.fastq
      C2_R2.fastq

      C3_R1.fastq
      C3_R2.fastq

      T1_R1.fastq
      T1_R2.fastq

      T2_R1.fastq
      T2_R2.fastq

      T3_R1.fastq
      T3_R2.fastq

      The working Trimmomatic command for one pair now looks like this:

      java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 C1_R1.fastq C1_R2.fastq C1_R1_paired.fastq C1_R1_unpaired.fastq C1_R2_paired.fastq C1_R2_unpaired.fastq LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
      Last edited by shashankgupta; 02-15-2017, 01:04 AM.
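
The literal `*` in the error message in #3 suggests that a glob pattern reached Trimmomatic unexpanded. The script that produced the error is not shown in the thread, so this is only one possible explanation. The sketch below illustrates the underlying bash behaviour: with default settings, a pattern that matches no file is passed on as-is. `show_args` and `glob_demo` are hypothetical helpers written for this illustration.

```shell
#!/bin/bash
# Illustration only: show_args prints each argument it receives, wrapped in <>.
show_args() { printf '<%s>\n' "$@"; }

glob_demo() {
    local d
    d=$(mktemp -d)
    (
        cd "$d" || exit 1
        touch C1_S1_L001_R1_001.fastq.gz
        show_args C1*_R1_001.fastq.gz   # matches: the shell substitutes the real file name
        show_args 0*_R1_001.fastq.gz    # no match: the pattern itself is passed (nullglob unset)
    )
    rm -rf "$d"
}

glob_demo
# prints:
# <C1_S1_L001_R1_001.fastq.gz>
# <0*_R1_001.fastq.gz>
```

If that is what happened, the prefix variable in the loop did not correspond to any existing file name, which an echo dry run would reveal before Java is started.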



      • #4
        Thanks, this helped me!



        • #5
          Hello all,

          Does this code work on multiple single-end read files?

          Any help, please?

          Many thanks



          • #6
            #!/bin/bash

            # Trim every single-end file; the output name replaces ".fastq.gz" with "trimmed_minleng50.fq.gz"
            for f1 in /path_to_your_raw_data/*.fastq.gz
            do
            java -jar /path_to_trimmomatic_folder/trimmomatic-0.36.jar SE -phred33 "$f1" "${f1%%.fastq.gz}trimmed_minleng50.fq.gz" ILLUMINACLIP:/path_to_trimmomatic_folder/adapters/TruSeq2-SE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:50
            done


            This is what I used for my SE files. Basically, it replaces .fastq.gz with "trimmed_minleng50.fq.gz" in the name of each trimmed file. You can edit the order and the values of the parameters (minimum length, minimum quality at the start or end of the reads, different adapter files...).
            Last edited by carmarbla; 08-08-2017, 06:37 AM.
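
A small sketch of how the output name in #6 is built. `${f1%%.fastq.gz}` removes the `.fastq.gz` suffix, and the quoted string is appended directly, with no separator. The helper `se_out_name` and the sample name `sampleA` are hypothetical; the optional second argument is an added convenience for inserting a separator and is not in the original script.

```shell
#!/bin/bash
# se_out_name INPUT [SEP]: print the output file name for a single-end input.
# Hypothetical helper; with no SEP it reproduces the naming used in post #6.
se_out_name() {
    printf '%s%s%s\n' "${1%.fastq.gz}" "${2:-}" "trimmed_minleng50.fq.gz"
}

se_out_name /path_to_your_raw_data/sampleA.fastq.gz
# prints: /path_to_your_raw_data/sampleAtrimmed_minleng50.fq.gz
se_out_name /path_to_your_raw_data/sampleA.fastq.gz _
# prints: /path_to_your_raw_data/sampleA_trimmed_minleng50.fq.gz
```

Because the output ends in `.fq.gz` rather than `.fastq.gz`, rerunning the loop in the same directory does not pick up files that were already trimmed.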



            • #7
              Many thanks Carmarbla for your reply!
              Best



              • #8
                Hi All,
                just found these threads. Does the code
                for f in $(ls *.fastq.gz | sed 's/?_001.fastq.gz//' | sort -u)
                do
                java -jar ~/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 ${f}1_001.fastq.gz ${f}1_002.fastq.gz ${f}_R1_paired.fq.gz ${f}1_unpaired.fq.gz ${f}_2_paired.fq.gz ${f}2_unpaired.fq.gz LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35
                done
                need modification prior to use, or will it work for any PE files?
                cheers
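
Whether a loop like this works for other PE files depends mostly on how the files are named, since the pairing relies on the `R1`/`R2` part of the suffix. A minimal sketch for checking that before running anything, assuming each pair differs only in a fixed suffix; `list_pairs` is a hypothetical helper, not a Trimmomatic feature.

```shell
#!/bin/bash
# list_pairs R1_SUFFIX R2_SUFFIX [DIR]
# Prints "r1<TAB>r2" for every complete pair and reports R1 files without a mate on stderr.
list_pairs() {
    local s1="$1" s2="$2" dir="${3:-.}" r1 r2
    for r1 in "$dir"/*"$s1"; do
        [ -e "$r1" ] || continue          # the glob matched nothing
        r2="${r1%"$s1"}$s2"               # swap the R1 suffix for the R2 suffix
        if [ -e "$r2" ]; then
            printf '%s\t%s\n' "$r1" "$r2"
        else
            echo "unpaired: $r1" >&2
        fi
    done
}
```

For the original file names this would be called as `list_pairs _R1_001.fastq.gz _R2_001.fastq.gz`, and for the renamed files in #3 as `list_pairs _R1.fastq _R2.fastq`. If the listed pairs look right, the same suffixes can be used in the trimming loop.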

