
  • Trim FastQ

    Hi all, I would like to know how I can trim a FASTQ file and eliminate the reads with quality lower than QV30.

  • #2
    Originally posted by nxtgenkid10
    Hi all, I would like to know how I can trim a FASTQ file and eliminate the reads with quality lower than QV30.
    FASTQ quality trimmer will do this:



    If you want a more hand-holding approach, I suppose you could use Galaxy instead.
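
For anyone unsure what "lesser than QV30" means in practice, here is a minimal Python sketch of one possible reading: decode the quality line of each read and drop reads whose mean score is below 30. It assumes Phred+33 encoding (check what your files actually use), and the function names are hypothetical helpers for illustration, not part of any toolkit mentioned in this thread.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes an ASCII offset of 33 (Phred+33); older files may use 64.
    """
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line, offset=33):
    """Average Phred score of a read; 0.0 for an empty quality line."""
    scores = phred_scores(quality_line, offset)
    return sum(scores) / len(scores) if scores else 0.0


def keep_read(quality_line, min_mean_q=30, offset=33):
    """True if the read's mean quality reaches the cutoff."""
    return mean_quality(quality_line, offset) >= min_mean_q
```

Filtering on the mean is only one interpretation; trimming low-quality ends before filtering usually keeps more data than discarding whole reads.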



    • #3
      Hi Gringer,
      I'm writing to ask a few questions about the FASTQ quality trimmer.
      I have installed the FASTX Toolkit on my Linux machine, and I want to use that tool to trim bases with a low quality score (<28) from the 3' and 5' ends.
      I have paired-end data files with 101-base reads; only the R2 file shows a decrease in quality at the 3' and 5' ends (I attached the corresponding quality box plot).
      How should I set the parameters of fastq_quality_trimmer, listed below, to obtain a quality box plot as good as the one for R1 (attached)?

      fastq_quality_trimmer -h
      usage: fastq_quality_trimmer [-h] [-v] [-t N] [-l N] [-z] [-i INFILE] [-o OUTFILE]
      Part of FASTX Toolkit 0.0.13 by A. Gordon ([email protected])

      [-h] = This helpful help screen.
      [-t N] = Quality threshold - nucleotides with lower
               quality will be trimmed (from the end of the sequence).
      [-l N] = Minimum length - sequences shorter than this (after trimming)
               will be discarded. Default = 0 = no minimum length.
      [-z] = Compress output with GZIP.
      [-i INFILE] = FASTQ input file. default is STDIN.
      [-o OUTFILE] = FASTQ output file. default is STDOUT.
      [-v] = Verbose - report number of sequences.
             If [-o] is specified, report will be printed to STDOUT.
             If [-o] is not specified (and output goes to STDOUT),
             report will be printed to STDERR.

      I would appreciate your help.
      Thanks
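
For reference, going only by the help text above, the -t and -l options could be read as the following minimal Python sketch. It assumes Phred+33 quality encoding, and `trim_3prime` is a hypothetical name for illustration, not the toolkit's actual code. Since the help says bases are trimmed "from the end of the sequence", this sketch leaves the 5' end untouched, so a low-quality 5' end would apparently need a separate step.

```python
def trim_3prime(seq, qual, threshold, min_length=0, offset=33):
    """Trim bases with quality below `threshold` from the 3' end.

    `threshold` plays the role of -t and `min_length` the role of -l,
    as described in the help text. Returns (seq, qual) after trimming,
    or None if the trimmed read is shorter than `min_length`.
    Assumption: Phred+33 encoding; a fully trimmed read is returned as
    empty strings when min_length is 0, which may differ from the tool.
    """
    end = len(qual)
    # Walk backwards from the 3' end until a base meets the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < threshold:
        end -= 1
    if end < min_length:
        return None
    return seq[:end], qual[:end]
```

With a threshold of 28, a read such as `ACGTACGT` with qualities `IIIII555` (40 for the first five bases, 20 for the last three) would be cut back to its first five bases.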


      • #4
        You can find a complete list of FASTQ trimmers, compared to each other, in this paper:

        "An Extensive Evaluation of Read Trimming Effects on Illumina NGS Data Analysis"



        • #5
          Originally posted by giorgifm
          You can find a complete list of FASTQ trimmers, compared to each other, in this paper:

          "An Extensive Evaluation of Read Trimming Effects on Illumina NGS Data Analysis"
          That's not complete; BBDuk isn't in it.

          Granted, it was not publicly available when the paper was published, but it DID exist. Here's a comparison of mapping error rates after trimming with various trimmers:



          • #6
            FWIW, I'm now using Trimmomatic (which finally has a published paper out) because it has a pipeline-based interface that I find expressive enough for almost all of my potential uses. In particular, you can change the order of operations for different use cases.



            It has fairly good statistics in the paper giorgifm linked, but you can probably find a better trimmer for each specific case. I notice that Brian Bushnell hasn't included Trimmomatic in his graph (the Trimmomatic paper suggests "Maximum Information" mode for the best statistics) -- any chance of adding that?



            • #7
              I did not include Trimmomatic because its settings are more complex. With most trimmers, you can just give a quality cutoff; Trimmomatic requires multiple parameters, so it's hard to grade objectively and plot on that graph, where the only variable is the quality cutoff.

              That said, I would be happy to add it, if you can give me a typical command line.



              • #8
                Here's a command line I've used:
                Code:
                ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:1:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:28
                I haven't tried optimising it at all (i.e. by adjusting the sliding window quality cutoff), and have yet to experiment with MAXINFO mode. Replacing the sliding window with maximum information would be something like the following:

                Code:
                ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:1:true LEADING:3 TRAILING:3 MAXINFO:50:0.5
                [optimisation of this would involve making sure the minimum length of 50bp is sufficient for a good mapping, then adjusting the score strictness to see how it changes mapping]
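
To make the SLIDINGWINDOW:4:28 step above a bit more concrete, here is a rough Python sketch of the general idea: scan from the 5' end with a 4-base window and cut the read at the first window whose average quality drops below 28. This is a simplified approximation for illustration, not Trimmomatic's actual implementation, and `sliding_window_cut` is a hypothetical name. It works on already-decoded Phred scores.

```python
def sliding_window_cut(quals, window=4, required=28):
    """Return how many bases to keep from the 5' end.

    `quals` is a list of integer Phred scores. The read is cut at the
    start of the first window whose mean quality is below `required`.
    Simplification: reads shorter than the window are kept whole.
    """
    for start in range(len(quals) - window + 1):
        window_scores = quals[start:start + window]
        if sum(window_scores) / window < required:
            return start
    return len(quals)
```

Raising the cutoff or shrinking the window makes the trimming more aggressive, which is the knob I would adjust when optimising against mapping results.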
