
  • Trimmomatic explanation

    Can anybody explain the following command in Trimmomatic?

    java -classpath trimmomatic-0.15.jar org.usadellab.trimmomatic.TrimmomaticPE s_1_1_sequence.txt.gz s_1_2_sequence.txt.gz lane1_forward_paired.fq.gz lane1_forward_unpaired.fq.gz lane1_reverse_paired.fq.gz lane1_reverse_unpaired.fq.gz ILLUMINACLIP:illuminaClipping.fa:2:40:15 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
    Thanks,

  • #2
    All of this is explained in detail on the Trimmomatic website, but in general this will:

    1. Clip Illumina adapters
    2. then trim the leading nucleotides until quality > 3
    3. then trim the trailing nucleotides until quality > 3
    4. then scan with a sliding window of 4 nucleotides and trim once the average quality in the window falls below 15
    5. Remove any remaining sequences that are shorter than 36 nt.

    You should really try looking at their documentation before posting questions here though.
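
For reference, the trailing arguments of that command are the trimming steps, and Trimmomatic applies them in the order given. Here is a minimal Python sketch that splits them into named parameters. The parser and the field labels are hypothetical (my own descriptive names, not Trimmomatic identifiers); the meaning of each position follows the Trimmomatic manual as I understand it.

```python
# Descriptive labels for the colon-separated values of each step.
# These labels are illustrative only; they are not Trimmomatic identifiers.
STEP_FIELDS = {
    "ILLUMINACLIP": ["adapter_fasta", "seed_mismatches",
                     "palindrome_clip_threshold", "simple_clip_threshold"],
    "LEADING": ["min_quality"],
    "TRAILING": ["min_quality"],
    "SLIDINGWINDOW": ["window_size", "required_quality"],
    "MINLEN": ["min_length"],
}


def parse_steps(args):
    """Turn step arguments such as 'LEADING:3' into (name, params) pairs."""
    steps = []
    for arg in args:
        name, *values = arg.split(":")
        fields = STEP_FIELDS[name]
        params = {
            field: (int(value) if value.isdigit() else value)
            for field, value in zip(fields, values)
        }
        steps.append((name, params))
    return steps


command_tail = ("ILLUMINACLIP:illuminaClipping.fa:2:40:15 LEADING:3 "
                "TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36")
steps = parse_steps(command_tail.split())
for name, params in steps:
    print(name, params)
```

So in the original command, `2:40:15` after the adapter file are the seed mismatches, the palindrome clip threshold and the simple clip threshold, in that order.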



    • #3
      Hi,

      I was wondering, for steps 2 and 3, how many nucleotides it would trim from the start or end of the sequence. My intuition tells me it's just one, but I am not sure.

      Originally posted by DunderChief
      All of this is explained in detail on the Trimmomatic website, but in general this will:

      1. Clip Illumina adapters
      2. then trim the leading nucleotides until quality > 3
      3. then trim the trailing nucleotides until quality > 3
      4. then scan with a sliding window of 4 nucleotides and trim once the average quality in the window falls below 15
      5. Remove any remaining sequences that are shorter than 36 nt.

      You should really try looking at their documentation before posting questions here though.



      • #4
        "trim the leading nucleotides until quality > 3"
        It's pretty self-explanatory... It will continue to trim bases that have a quality score lower than 3 until it hits a base where it is 3 or greater.

        This by itself is pretty useless, as I've never seen an Illumina base with a score this low, and you would want to retain bases with a bare minimum of Q20.
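
To make the "how many bases" point concrete, here is a rough Python sketch of what LEADING and TRAILING do as described above. It is not Trimmomatic's code; the function name and the example read are made up for illustration. The number of bases removed is not fixed at one: trimming continues from each end until a base at or above the threshold is reached.

```python
def trim_ends(seq, quals, leading=3, trailing=3):
    """Drop bases below the threshold from both ends of a read.

    `quals` holds already-decoded integer quality scores, one per base.
    """
    start = 0
    # Walk in from the 5' end while bases are below the LEADING threshold.
    while start < len(quals) and quals[start] < leading:
        start += 1
    end = len(quals)
    # Walk in from the 3' end while bases are below the TRAILING threshold.
    while end > start and quals[end - 1] < trailing:
        end -= 1
    return seq[start:end], quals[start:end]


# Made-up read: two low-quality bases at the start, one at the end.
seq = "NNACGTACGTN"
quals = [2, 2, 30, 32, 35, 36, 34, 33, 30, 12, 2]

print(trim_ends(seq, quals))                           # thresholds of 3
print(trim_ends(seq, quals, leading=20, trailing=20))  # thresholds of 20
```

With thresholds of 3 the Q12 base near the 3' end is kept; with thresholds of 20 it is removed as well.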



        • #5
          Trimmomatic explanation

          Originally posted by JackieBadger
          "trim the leading nucleotides until quality > 3"
          It's pretty self-explanatory... It will continue to trim bases that have a quality score lower than 3 until it hits a base where it is 3 or greater.

          This by itself is pretty useless, as I've never seen an Illumina base with a score this low, and you would want to retain bases with a bare minimum of Q20.
          It's useful for removing Ns (which have quality < 3) from the ends of the reads.

          maria
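
A quick way to check this on your own data is to look at the quality scores attached to the N calls. The sketch below is only an illustration: the helper name and the read are invented, and it assumes phred33 encoding with no-calls written as `#` (Q2), which is a common Illumina convention but worth verifying on your own files.

```python
def qualities_at_n(seq, qual, offset=33):
    """Return the decoded quality score of every N base in a read."""
    return [ord(q) - offset for base, q in zip(seq, qual) if base == "N"]


# Invented read with an N at each end, assumed to be phred33 encoded.
seq = "NACGTACGTN"
qual = "#IIIIIIII#"

n_quals = qualities_at_n(seq, qual)
print(n_quals)
```

If every value printed is below 3, LEADING:3 and TRAILING:3 will remove those terminal Ns.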



          • #6
            Thank you JackieBadger and mastal for your explanation.
            I got the idea now.



            • #7
              Originally posted by mastal
              It's useful for removing Ns (which have quality < 3) from the ends of the reads.

              maria
              But you shouldn't be keeping any base that has a quality between Q3 and Q19.
              Wouldn't it be better to trim off the actual "N"s rather than assume they have a quality below Q3?



              • #8
                Originally posted by JackieBadger
                But you shouldn't be keeping any base that has a quality between Q3 and Q19.
                Wouldn't it be better to trim off the actual "N"s rather than assume they have a quality below Q3?
                Yes, I agree, but I don't think you can do that with Trimmomatic.
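
If you did want to trim on the base call itself, it would have to happen outside Trimmomatic. A minimal sketch of the idea in Python follows; the function is hypothetical and works on a single read's sequence and quality strings, so you would still need your own FASTQ reading and writing around it.

```python
def strip_terminal_ns(seq, qual):
    """Remove runs of N from both ends of a read, whatever their quality.

    The quality string is sliced with the same coordinates so that it
    stays aligned with the sequence. Internal Ns are left untouched.
    """
    start = len(seq) - len(seq.lstrip("Nn"))
    end = len(seq.rstrip("Nn"))
    if end <= start:  # the read consisted only of Ns
        return "", ""
    return seq[start:end], qual[start:end]


# Made-up read: Ns at both ends and one in the middle.
print(strip_terminal_ns("NNACGTNACGN", "##IIII#III#"))
```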



                • #9
                  Hi Jackie,

                  I have tried to run Trimmomatic with the option -phred20 to get reads with quality scores of Q20, but it doesn't seem to like it. It runs with -phred33 though. On their website, they say that it only accepts phred scores of 33 or 64. Am I wrong, or is there any way of making it accept the -phred20 option?

                  Thank you
                  Originally posted by JackieBadger
                  "trim the leading nucleotides until quality > 3"
                  It's pretty self-explanatory... It will continue to trim bases that have a quality score lower than 3 until it hits a base where it is 3 or greater.

                  This by itself is pretty useless, as I've never seen an Illumina base with a score this low, and you would want to retain bases with a bare minimum of Q20.



                  • #10
                    Originally posted by modi2020
                    Hi Jackie,

                    I have tried to run Trimmomatic with the option -phred20 to get reads with quality scores of Q20, but it doesn't seem to like it. It runs with -phred33 though. On their website, they say that it only accepts phred scores of 33 or 64. Am I wrong, or is there any way of making it accept the -phred20 option?

                    Thank you
                    You are just a little confused.
                    The -phredXX option refers to the encoding of the quality score information: http://en.wikipedia.org/wiki/FASTQ_format
                    ....
                    Whether you get phred33 or phred64 encoding depends on the sequencer/software used to produce your data.

                    Each encoding covers a range of quality scores and uses different characters to represent particular quality scores (see the wiki).

                    So you choose your quality encoding (-phredXX) and then, separately, choose the minimum quality score you want to enforce (e.g. 20).
                    Last edited by JackieBadger; 03-20-2013, 06:57 PM.
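
To illustrate the difference between the encoding and the score: the 33 or 64 is the offset added to each quality score before it is written as an ASCII character in the FASTQ file. A small Python sketch (the function name is my own) showing that the same Q20 is written as a different character under each encoding:

```python
def decode_qualities(quality_string, offset):
    """Convert a FASTQ quality string to integer scores for a given offset."""
    return [ord(ch) - offset for ch in quality_string]


# '5' is ASCII 53 and 'T' is ASCII 84, so both characters mean Q20,
# but only when decoded with the matching offset.
q33 = decode_qualities("5", 33)
q64 = decode_qualities("T", 64)
print(q33, q64)
```

That is why -phred20 is not a valid option: 20 is a score threshold, which goes into steps such as LEADING, TRAILING or SLIDINGWINDOW, not an encoding.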



                    • #11
                      Thank you for the explanation, Jackie. I get the idea now.
                      I tried to specify the quality after specifying the phred score, but it didn't work. To be specific, I used -phred33 20
                      Is that what you meant?

                      Originally posted by JackieBadger
                      You are just a little confused.
                      The -phredXX option refers to the encoding of the quality score information: http://en.wikipedia.org/wiki/FASTQ_format
                      ....
                      Whether you get phred33 or phred64 encoding depends on the sequencer/software used to produce your data.

                      Each encoding covers a range of quality scores and uses different characters to represent particular quality scores (see the wiki).

                      So you choose your quality encoding (-phredXX) and then, separately, choose the minimum quality score you want to enforce (e.g. 20).



                      • #12
                        Actually I think I got it.
                        The way I did it was with the sliding window option. What I think I did is ask it to go through the sequence in a window of 4 bp and take the average score; if the average score is below 20, the read is trimmed from that window onward, otherwise it keeps moving. If the length of the sequence remaining after trimming is less than 60, the read is removed. I also used the leading and trailing options to drop low-quality leading or trailing bases.
                        My complete command is as follows:

                        java -classpath trimmomatic-0.22.jar org.usadellab.trimmomatic.TrimmomaticPE -phred33 -trimlog trimmlog_log.txt R1.fastq R2.fastq Output_R1.fq unpaired_output1.fq Output_R2.fq unpairedoutput2.fq LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:60
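
For anyone reading later, here is a simplified Python sketch of the SLIDINGWINDOW:4:20 and MINLEN:60 logic. Trimmomatic's documentation describes SLIDINGWINDOW as scanning from the 5' end and clipping the read once the window average falls below the threshold, so the rest of the read is cut rather than individual windows being dropped. This is not Trimmomatic's implementation: the function names are made up, and the exact cut position inside the failing window and the handling of reads shorter than the window are assumptions.

```python
def sliding_window_keep(quals, window_size=4, required_quality=20):
    """Return how many bases to keep from the 5' end of a read."""
    # Comparing sums avoids floating point: mean < q  <=>  sum < q * size.
    required_total = required_quality * window_size
    for start in range(len(quals) - window_size + 1):
        if sum(quals[start:start + window_size]) < required_total:
            return start  # assumption: cut at the start of the failing window
    return len(quals)  # no window failed (or the read is shorter than a window)


def survives_minlen(length, min_length=60):
    """MINLEN drops reads shorter than the given length."""
    return length >= min_length


# Made-up decoded qualities that decay towards the 3' end.
quals = [30, 32, 31, 28, 25, 22, 18, 15, 12, 10]
kept = sliding_window_keep(quals)
print(kept, survives_minlen(kept))
```

In this toy read the window starting at the sixth base is the first with an average below 20, so five bases are kept, and the read would then be discarded by MINLEN:60.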
