
  • Trimming - looking for a complete solution

    Hi, I found a previous discussion which covers a lot of what I'd like to know, but not quite all! I am working with HaloPlex data. Before alignment, I need to remove HaloPlex adapters, and also clip 5bp from both ends of both forward and reverse reads. I should also not be left with any empty or orphan (i.e. unmatched) reads.

    I had previously been trimming adapters with cutadapt, using a separate Perl script to remove the 5bp, re-running cutadapt with a 'fake' adapter sequence to drop zero-length reads, then finally running another script to drop orphans. While this works, it seems tools like Trimmomatic or Trim Galore could achieve the same in a more efficient one-step manner.

    My problem is therefore that neither tool seems to deal with both ends of the reads:

    Trimmomatic has 'CROP: Cut the read to a specified length by removing bases from the end'

    Trim Galore has --clip_R1 <int> and --clip_R2 <int> to remove <int> bp from the 5' end of read 1 and read 2.

    Unless I've misunderstood, this only deals with one end of the reads. The reason I need to clip these bases from both ends is to remove residual bases from the restriction enzyme footprint.

    TIA!
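For anyone landing on this thread, the clip-and-filter logic described in the post above can be sketched in a few lines of Python. This is only an illustration of the intended result, not what cutadapt, Trimmomatic or Trim Galore do internally. The `Read` tuple and the names `clip_both_ends` and `clip_pairs` are made up for the example, and it assumes adapter trimming has already been done and that read 1 and read 2 arrive in matching order.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Read(NamedTuple):
    """Hypothetical minimal FASTQ record (header, bases, quality string)."""
    name: str
    seq: str
    qual: str


def clip_both_ends(read: Read, n: int) -> Read:
    """Remove n bases (and their qualities) from both the 5' and 3' end."""
    # If the read is shorter than 2*n the slice is empty, never negative.
    end = max(n, len(read.seq) - n)
    return Read(read.name, read.seq[n:end], read.qual[n:end])


def clip_pairs(
    pairs: Iterable[Tuple[Read, Read]], n: int = 5, min_len: int = 1
) -> Iterator[Tuple[Read, Read]]:
    """Clip every mate, then drop the whole pair if either mate is too short.

    Dropping pairs together is what prevents empty reads and orphans.
    """
    for r1, r2 in pairs:
        c1, c2 = clip_both_ends(r1, n), clip_both_ends(r2, n)
        if len(c1.seq) >= min_len and len(c2.seq) >= min_len:
            yield c1, c2
```

Because both mates are tested before either is written, an 8bp read that clips down to nothing takes its partner with it, so no separate orphan-removal pass is needed.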

  • #2
    Trimmomatic also has HEADCROP, which removes bases from the 5' end of the reads.



    • #3
      Sorry - there's an error in my OP - HEADCROP is the option I meant to mention. CROP is actually not much use to me, as it's the opposite of what I'd like to do (it specifies the length of sequence to be left behind as opposed to what to remove), so I still have the situation that I can only clip from one end (5').

      Ideally (in the case of Trimmomatic) I'm looking for a 'TAILCROP' option...



      • #4
        Originally posted by girlmonkey:
        Sorry - there's an error in my OP - HEADCROP is the option I meant to mention. CROP is actually not much use to me, as it's the opposite of what I'd like to do (specifying the length of sequence to be left behind as opposed to what to remove), so I still have the situation that I can only clip from one end (5').

        Ideally (in the case of Trimmomatic) I'm looking for a 'TAILCROP' option...

        I guess it depends on the stage of the trimming and adapter removal at which you need to cut the bases from the 3' end. If you can do it as the first step, then CROP would be OK, unless your reads are all different lengths.



        • #5
          Thanks for your reply. The reads are initially all the same length (150bp), but adapter trimming should come first (after which they are all different lengths) before the clipping of 5bp from the ends.
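The difference between the two behaviours discussed above is easy to see with a toy example. The sketch below is not Trimmomatic code: `crop` mimics "cut the read to a specified length" as quoted in the first post, and `tail_clip` is the hypothetical 'TAILCROP' wished for in this thread.

```python
def crop(seq: str, length: int) -> str:
    """Keep at most `length` bases, counted from the 5' end."""
    return seq[:length]


def tail_clip(seq: str, n: int) -> str:
    """Remove n bases from the 3' end, whatever the read length."""
    return seq[:max(0, len(seq) - n)]


raw = "A" * 150          # untrimmed read, all reads the same length
trimmed = "A" * 100      # same read after adapter removal (length now varies)

# On uniform 150bp reads the two operations agree.
assert crop(raw, 145) == tail_clip(raw, 5)

# After adapter trimming they do not: cropping to 145 leaves the 100bp read
# untouched, while the tail clip still removes 5 bases.
assert len(crop(trimmed, 145)) == 100
assert len(tail_clip(trimmed, 5)) == 95
```

So cropping to a fixed length only stands in for a 3' clip when it runs before anything else has changed the read lengths, which is the opposite of the order needed here.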



          • #6
            We have just implemented two new options in Trim Galore (--three_prime_clip_r1 and --three_prime_clip_r2) to clip off any number of bases from the 3' ends of reads after adapter/quality trimming has finished. girlmonkey is just testing the new version; if it works fine, it will find its way into the next release.
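A rough sketch of how a single paired-end call might be assembled from a Python wrapper once these options are available. The file names and the helper `trim_galore_cmd` are invented for the example, and the 3' options are written with a capital R to match `--clip_R1`/`--clip_R2`; that spelling is an assumption on my part, so check `trim_galore --help` for your version. Adapter options are left out here and would need adding for HaloPlex data.

```python
import shlex
from typing import List


def trim_galore_cmd(r1: str, r2: str, clip: int = 5) -> List[str]:
    """Build (but do not run) a Trim Galore call clipping both ends of both mates."""
    n = str(clip)
    return [
        "trim_galore",
        "--paired",                    # process the two files as mates
        "--clip_R1", n,                # 5' clip, read 1
        "--clip_R2", n,                # 5' clip, read 2
        "--three_prime_clip_R1", n,    # 3' clip, read 1 (spelling assumed)
        "--three_prime_clip_R2", n,    # 3' clip, read 2 (spelling assumed)
        r1,
        r2,
    ]


# Print the command line for inspection; pass the list to subprocess.run to execute.
print(shlex.join(trim_galore_cmd("sample_R1.fastq.gz", "sample_R2.fastq.gz")))
```

Building the argument list first and printing it makes it easy to check the flags before running anything on real data.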



            • #7
              PRINSEQ has many options for trimming the 3' end of reads. There is '--trim_right' for trimming a specified length, '--trim_right_p' for trimming a certain percentage, '--trim_ns_right' for trimming poly-N tails, '--trim_qual_right' for trimming by a certain quality threshold, and '--trim_to_len' to specify trimming to a certain length.
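To make two of these 3' operations concrete, here is a simplified Python sketch of poly-N tail trimming and quality-threshold trimming. It is not PRINSEQ's implementation: the function names simply echo the options above, the `min_len` argument is an assumption, and the quality version checks one base at a time rather than using any windowing the real tool may offer.

```python
from typing import List, Tuple


def trim_ns_right(seq: str, min_len: int = 1) -> str:
    """Strip a run of N at the 3' end if it is at least min_len bases long."""
    stripped = seq.rstrip("Nn")
    tail = len(seq) - len(stripped)
    return stripped if tail >= min_len else seq


def trim_qual_right(
    seq: str, quals: List[int], threshold: int
) -> Tuple[str, List[int]]:
    """Walk in from the 3' end, dropping bases whose Phred score is below threshold."""
    end = len(seq)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return seq[:end], quals[:end]


assert trim_ns_right("ACGTNNN", min_len=2) == "ACGT"
assert trim_ns_right("ACGTN", min_len=2) == "ACGTN"   # tail too short, kept
assert trim_qual_right("ACGTAC", [30, 30, 30, 30, 10, 5], 20) == (
    "ACGT",
    [30, 30, 30, 30],
)
```

Both functions stop at the first base that passes, so a good base sitting 3' of a bad stretch protects everything upstream of it.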
