  • byou678
    Member
    • Aug 2011
    • 52

    Please help: what are the differences between standard trimming and adaptive trimming?

    Hi All,

    When I run RNA-seq quality trimming with Perl scripts in the terminal, these options appear:

    --type <num> 0=standard trimming, 1=adaptive trimming, 2=windowed adaptive trimming. Default 0

    --qual-threshold <num> quality threshold for trimming, default 20
    --length-threshold <num> length threshold for trimming, default 20
    ... ...

    Could anybody explain the differences between 0=standard trimming, 1=adaptive trimming, and 2=windowed adaptive trimming, and the criteria for setting length-threshold?

    Thanks a lot in advance.
    Last edited by byou678; 08-19-2011, 11:34 AM.
  • westerman
    Rick Westerman
    • Jun 2008
    • 1104

    #2
    Is 'RNAseq' a program? If so (and I cannot find it on the web), what does the program's documentation say? I am sure that we could hazard a guess, but the program itself is your best bet.

    Oh ... I just found what you are probably using. 'Trim.pl' by Nik Joshi. That would have been nice to know. Anyway, yeah, there isn't much documentation to that program, is there? I suspect that you don't read "Perl" and Nik obviously believes that "good code is self-documenting" (e.g., his lack of comments about the basics is appalling although, unfortunately, I've seen worse) so it might take someone to dig into the code to give a definitive answer.


    • westerman
      Rick Westerman
      • Jun 2008
      • 1104

      #3
      For anyone who wants to dig:



      Or you could write to Nik Joshi.


      • byou678
        Member
        • Aug 2011
        • 52

        #4
        Sorry for the confusion. Actually, I am using RNA-seq technology here. The data come from an Illumina Genome Analyzer II. Yes, I use this script: 'Trim.pl' http://wiki.bioinformatics.ucdavis.e...ex.php/Trim.pl

        westerman, Thanks for your nice reply!!!
        Last edited by byou678; 08-19-2011, 11:44 AM.


        • gaffa
          Member
          • Oct 2010
          • 82

          #5
          So from reading the code, "standard trimming" means that it will trim off a defined number of bases (as given by the "length-threshold" flag) from all reads, regardless of quality. In "adaptive trimming" mode it will use the quality scores to assess each read individually, by finding the first position which has a quality below cutoff (as given by the "qual-threshold" flag) and then trimming away this base and all following bases (unless the remaining read is shorter than the length threshold, in which case it will discard the whole read).

          So the adaptive method is slightly more sophisticated than the standard, though it might not always do what you'd want: if a read has a single poor-quality base early on but is otherwise high-quality, this method will throw away the good part of the read (possibly the whole read). The script has a third method which is slightly more sophisticated still, the "windowed adaptive trimming", which tries to combat this problem by running a sliding window over the read and looking at the average quality in this window, rather than at a single base.
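As a rough illustration of the two modes described above, here is a minimal Python sketch. It is not Trim.pl's actual code: the function names are hypothetical, and trimming from the 3' end in standard mode is an assumption, since which end gets trimmed is not stated here.

```python
from typing import List, Optional


def standard_trim(seq: str, n_bases: int) -> str:
    """Remove a fixed number of bases, ignoring quality.

    Assumption: bases are removed from the 3' end (the end of the string).
    """
    if n_bases <= 0:
        return seq
    return seq[:-n_bases] if n_bases < len(seq) else ""


def adaptive_trim(
    seq: str,
    quals: List[int],
    qual_threshold: int = 20,
    length_threshold: int = 20,
) -> Optional[str]:
    """Cut at the first base whose quality is below qual_threshold.

    That base and everything after it are removed. If what remains is
    shorter than length_threshold, the whole read is discarded (None).
    """
    cut = len(seq)
    for i, q in enumerate(quals):
        if q < qual_threshold:
            cut = i
            break
    if cut < length_threshold:
        return None
    return seq[:cut]
```

For example, `adaptive_trim("ACGTACGT", [30, 30, 30, 30, 10, 30, 30, 30], 20, 3)` keeps only `"ACGT"`: the single low-quality base at position 5 costs the three good bases after it, which is the weakness the windowed mode tries to address.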


          • byou678
            Member
            • Aug 2011
            • 52

            #6
            Thanks for the reply

            Hi gaffa, thank you very much for the reply. For "standard trimming", from which end of the reads will the 20 bases (if I use the default number) be trimmed off? And if "standard trimming" ignores quality scores, it is probably not used often, am I right?

            In addition, could you point me to related papers or resources on my question? I need to take a deeper look because this project is really important to me.

            Thanks again! Have a great weekend!




            • byou678
              Member
              • Aug 2011
              • 52

              #7
              Can anybody offer me related papers or resources on my urgent question? Thanks!


              • westerman
                Rick Westerman
                • Jun 2008
                • 1104

                #8
                I doubt if there are any papers. As far as I can tell the terms used and the algorithm used by the program are internal to the program. In other words if the author of the program got his idea from somewhere he did not cite those sources. The ideas behind his code are not that unique and have probably been implemented many times.


                • byou678
                  Member
                  • Aug 2011
                  • 52

                  #9
                  I think the two adaptive trimming modes check the quality scores of the bases from the 5' end to the 3' end, and then trim where the first poor-quality base or window is found. Standard trimming directly trims off the defined number of bases (like 10 or 15) at the 3' end, regardless of whether the quality scores are good or bad (because most modern sequencing technologies produce reads whose quality deteriorates towards the 3' end).

                  Please correct me if I am wrong. Below is a related resource; all other ideas and help will be greatly appreciated!!

                  Most modern sequencing technologies produce reads that have deteriorating quality towards the 3'-end. Incorrectly called bases here negatively impact assemblies, mapping, and downstream bioinformatics analyses.

                  Sickle is a tool that uses sliding windows along with quality and length thresholds to determine when quality is sufficiently low to trim the 3'-end of reads. It will also discard reads based upon the length threshold. It takes the quality values and slides a window across them whose length is 0.1 times the length of the read. If this length is less than 1, then the window is set to be equal to the length of the read. Otherwise, the window slides along the quality values until the average quality in the window drops below the threshold. At that point the algorithm determines where in the window the drop occurs and cuts both the read and quality strings there. However, if the cut point is less than the minimum length threshold, then the read is discarded entirely.
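The windowed procedure in the description above can be sketched in Python roughly as follows. This is only an illustration of the quoted text, not Sickle's or Trim.pl's actual code. The function name is hypothetical, and the rule for "where in the window the drop occurs" is an assumption: the cut is placed at the first base in the failing window whose quality is below the threshold.

```python
from typing import List, Optional, Tuple


def windowed_trim(
    seq: str,
    quals: List[int],
    qual_threshold: int = 20,
    length_threshold: int = 20,
) -> Optional[Tuple[str, List[int]]]:
    """Sliding-window 3' trimming; returns None when the read is discarded."""
    n = len(seq)
    if n == 0:
        return None
    # Window is 0.1 times the read length (integer division by 10);
    # if that is less than 1, the window is the whole read.
    window = n // 10
    if window < 1:
        window = n
    cut = n
    for start in range(n - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < qual_threshold:
            # Assumption: cut at the first below-threshold base in this window.
            cut = next(
                (start + j for j, q in enumerate(win) if q < qual_threshold),
                start,
            )
            break
    if cut < length_threshold:
        return None
    return seq[:cut], quals[:cut]
```

With a 20-base read (window of 2), an isolated base of quality 15 between bases of quality 30 does not trigger a cut, because the window average stays at 22.5. A per-base cutoff of 20 would have cut the read at that base.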

                  Thanks westerman.


