  • ClemBuntu
    Member
    • Dec 2014
    • 37

    Trying to remove nextera transposase sequence using cutadapt and fastqc

    Hello everybody,

    We recently completed a NextSeq 500 run.
    I analyzed it with FastQC and got a failure on the adapter content plot. It seems the Nextera transposase sequence content is too high (see the attached files).

    So I tried to trim my reads with cutadapt. Since my data are paired-end, I used the following commands:

    Python-2.7.9/python ~/cutadapt-1.7.1/bin/cutadapt -q 30 -b CTGTCTCTTATACACATCTGACGCTGCCGACGA --minimum-length 20 --overlap=5 -o tmpl1.1.fastq --paired-output tmpl1.2.fastq myRead_S1_L001_R1_001.fastq myRead_S1_L001_R2_001.fastq

    Python-2.7.9/python ~/cutadapt-1.7.1/bin/cutadapt -b CTGTCTCTTATACACATCTCCGAGCCCACGAGAC --minimum-length 20 -q 30 --overlap=5 -o myReads_S1_L001_R2_001.trimmed.fastq --paired-output myReads_S1_L001_R1_001.trimmed.fastq tmpl1.2.fastq tmpl1.1.fastq

    Then I checked my results like this:

    /FastQC/fastqc myReads_S1_L001_R1_001.trimmed.fastq -t 4 -o FASTQTRY/

    In the end I got the same plots, both for adapter content and for quality!

    How is that possible? How can some reads still have a quality score below 30?
    Last edited by ClemBuntu; 01-21-2015, 08:04 AM.
  • fkrueger
    Senior Member
    • Sep 2009
    • 627

    #2
    Hmm, have you tried repeating the run with -a instead of -b? When we trim Nextera adapters we run Cutadapt with -a CTGTCTCTTATA, which does the job nicely (in fact I did this only an hour ago). If you want to use Trim Galore (a wrapper around Cutadapt), you can use the attached version (v0.3.8) with the option --nextera. Usage is simply:
    Code:
    trim_galore --paired --nextera myRead_S1_L001_R1_001.fastq myRead_S1_L001_R2_001.fastq
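
As a rough illustration of what trimming a 3' adapter means conceptually, here is a minimal Python sketch. It is not how Cutadapt or Trim Galore are implemented: it uses exact matching only (no error tolerance), and the function name, the default `min_overlap` and the example reads are hypothetical.

```python
def trim_3prime_adapter(read, adapter="CTGTCTCTTATA", min_overlap=5):
    """Remove a 3' adapter and everything after it (exact matches only)."""
    # Case 1: the full adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with the start of the adapter.
    # Try the longest partial match first, down to min_overlap bases.
    for k in range(len(adapter) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    # No adapter found: return the read unchanged.
    return read


# Hypothetical reads: full adapter, partial adapter at the end, no adapter.
print(trim_3prime_adapter("ACGTACGTACCTGTCTCTTATACACATCT"))
print(trim_3prime_adapter("ACGTACGTACCTGTCTC"))
print(trim_3prime_adapter("ACGTACGT"))
```

The first two calls return the same 10-base insert; the third read is returned untouched.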

    • ClemBuntu
      Member
      • Dec 2014
      • 37

      #3
      Hi,
      I used the -b option because it handles both 3' and 5' adapters, whereas -a handles only 3' adapters.

      By the way, the adapters were removed fine.
      Cutadapt output:
      === Adapter 1 ===

      Sequence: CTGTCTCTTATACACATCTGACGCTGCCGACGA; Type: variable 5'/3'; Length: 33; Trimmed: 2306073 times.

      === Adapter 1 ===

      Sequence: CTGTCTCTTATACACATCTCCGAGCCCACGAGAC; Type: variable 5'/3'; Length: 34; Trimmed: 2036017 times.
      So the results I got from FastQC are very strange, right? Did I use it incorrectly? Maybe I forgot an option, but that seems odd.

      I also tried trim_galore. To get it working, I changed my .bashrc like this:
      alias cutadapt='/home/myhome/Python-2.7.9/python /home/myhome/cutadapt-1.7.1/bin/cutadapt'
      export PATH=$PATH:/home/myhome/FastQC/

      And I got this error:

      >>> Now performing quality (cutoff 30) and adapter trimming in a single pass for the adapter sequence: 'CTGTCTCTTATA' from file myReads_S1_L001_R1_001.fastq <<<
      Traceback (most recent call last):
      File "/home/myhome/cutadapt-1.7.1/bin//cutadapt", line 9, in ?
      from cutadapt.scripts import cutadapt
      File "/home/myhome/cutadapt-1.7.1/cutadapt/__init__.py", line 9
      except ImportError as e:
      ^
      SyntaxError: invalid syntax
      I got the same error when I tried to run cutadapt with an old Python (v2.4), but the alias in my .bashrc should have fixed that...
      Last edited by ClemBuntu; 01-22-2015, 02:08 AM.


      • fkrueger
        Senior Member
        • Sep 2009
        • 627

        #4
        What do you get when you run:
        Code:
        alias cutadapt='/home/myhome/Python-2.7.9/python /home/myhome/cutadapt-1.7.1/bin/cutadapt'
        and then
        Code:
        cutadapt
        You need to get this command to work, or it won't work within Trim Galore. You could also supply '/home/myhome/Python-2.7.9/python /home/myhome/cutadapt-1.7.1/bin/cutadapt' (or rather a version of it that is working) as the path to Cutadapt in one of the first lines of Trim Galore.
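
One plausible reason the alias had no effect (an assumption; the thread does not confirm it) is that shell aliases are expanded only by the interactive shell. A program that launches `cutadapt` itself resolves the name through PATH instead. The SyntaxError is consistent with the script having been run by Python 2.4, since the `except ... as` form was only added in Python 2.6. The sketch below uses Python's `shutil.which` to show what a PATH lookup finds; the helper name `resolve_on_path` is hypothetical.

```python
import shutil
import sys


def resolve_on_path(program):
    """Return the executable a PATH lookup would find, or None if there is none."""
    # shutil.which searches the directories in PATH; shell aliases are not considered.
    return shutil.which(program)


if __name__ == "__main__":
    print("cutadapt resolves to:", resolve_on_path("cutadapt"))
    print("python resolves to:", resolve_on_path("python"))
    print("this interpreter is version:", sys.version_info[:3])
```

If the lookup returns None, or a path other than the intended one, the alias alone will not help a wrapper script.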


        • ClemBuntu
          Member
          • Dec 2014
          • 37

          #5
          Running the 'cutadapt' alias or the full path gives me the same result.

          Anyway, I changed the Trim Galore source code as you suggested and now it works.

          According to FastQC, the Nextera adapter was removed properly.
          Now for my second question. I ran Trim Galore like this:
          ~/trim_galore --paired -q 30 --nextera myReads_S1_L001_R1_001.fastq myReads_S1_L001_R2_001.fastq -o TrimGaloreTry/
          Then I ran FastQC on the output files and got the quality plots I attached.
          My question is: with the "-q 30" option, all reads should have a Phred score of 30 or higher, but that's not what FastQC shows me. Why?

          (Edit: I also ran cutadapt "manually" and FastQC gave me the same plots.)
          Last edited by ClemBuntu; 01-22-2015, 05:56 AM.


          • fkrueger
            Senior Member
            • Sep 2009
            • 627

            #6
            Glad to hear that you got the adapter trimming sorted. I suppose there are still qualities lower than 30 in the file because Cutadapt doesn't simply truncate a sequence as soon as it hits a base below the threshold; it uses a running-sum algorithm instead:
            Code:
            -q CUTOFF, --quality-cutoff=CUTOFF
                                    Trim low-quality ends from reads before adapter
                                    removal. The algorithm is the same as the one used by
                                    BWA (Subtract CUTOFF from all qualities; compute
                                    partial sums from all indices to the end of the
                                    sequence; cut sequence at the index at which the sum
                                    is minimal) (default: 0)
            So if there is only a single dip in a sequence and all basecalls after it are fine again, the sequence might pass nevertheless. Does that make sense?
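
The help text above can be turned into a small Python sketch. It follows only the wording of the help text (subtract the cutoff, sum from each index to the end, cut where the sum is minimal). It is not Cutadapt's or BWA's actual source, and the function name and the example qualities are made up for illustration.

```python
def quality_trim_index(qualities, cutoff):
    """Return the index at which to cut; bases before that index are kept."""
    running_sum = 0
    min_sum = 0                 # an empty suffix sums to 0, meaning "cut nothing"
    cut_index = len(qualities)
    # Walk from the 3' end towards the 5' end so that running_sum is always
    # the sum of (quality - cutoff) from the current index to the end.
    for i in range(len(qualities) - 1, -1, -1):
        running_sum += qualities[i] - cutoff
        if running_sum < min_sum:
            min_sum = running_sum
            cut_index = i
    return cut_index


# Hypothetical Phred scores: one dip (12) inside the read and a poor 3' end.
quals = [35, 36, 12, 38, 37, 38, 20, 10]
cut = quality_trim_index(quals, cutoff=30)
print(quals[:cut])
```

With these numbers only the last two bases are removed. The dip of 12 stays, because the good bases after it outweigh it in the sum, which is why FastQC can still report qualities below 30 after trimming with -q 30.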


            • ClemBuntu
              Member
              • Dec 2014
              • 37

              #7
              OK, that makes sense, thanks.
              But do you think it's "normal" that the boxplot whiskers go this low, i.e. down to 14 for positions 84-150 bp?
              I have used cutadapt before and the boxplots I got were much better than this one, but this is my first NextSeq run, so maybe the quality is lower than on HiSeq and MiSeq?


              • fkrueger
                Senior Member
                • Sep 2009
                • 627

                #8
                Hmm, hard to tell. But reads that long always show a similar decline towards the 3' end, and I don't think it's much different for MiSeq, to be honest. I would just go ahead with your analysis and see how that goes. You could always come back and perform something more stringent afterwards.


                • apredeus
                  Senior Member
                  • Jul 2012
                  • 151

                  #9
                  Your read qualities are fine. Could be a lot worse. There were some systematic problems with R2 qualities due to NextSeq reagent problems; you can search these forums for more detail.

                  You are getting this much transposase though because your insert size is too small. You are essentially sequencing all the way to the other end's adaptor. You need larger fragments for sure.
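
To make the insert-size point concrete: when a read is longer than the fragment it is sequenced from, the extra cycles run into the adapter on the far side. A minimal Python sketch, assuming the read starts at the first base of the insert; the function name and the example sizes are hypothetical, not taken from this run.

```python
def read_through_bases(read_length, insert_size):
    """Number of bases at the 3' end of a read that lie beyond the insert."""
    return max(0, read_length - insert_size)


# Hypothetical insert sizes for a 150 bp read.
for insert_size in (80, 150, 300):
    extra = read_through_bases(150, insert_size)
    print(f"insert {insert_size} bp, read 150 bp -> {extra} bp of read-through")
```

Only inserts shorter than the read length produce read-through, which is why larger fragments reduce the adapter content that FastQC reports.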

                  And most people don't use read trimming anymore because most modern aligners do read soft-clipping.


                  • sagarutturkar
                    Member
                    • Sep 2010
                    • 61

                    #10
                    Originally posted by fkrueger
                    Hmm, have you tried repeating the run with -a instead of -b? When we trim Nextera adapters we run Cutadapt with -a CTGTCTCTTATA, which does the job nicely (in fact I did this only an hour ago). If you want to use Trim Galore (a wrapper around Cutadapt), you can use the attached version (v0.3.8) with the option --nextera. Usage is simply:
                    Code:
                    trim_galore --paired --nextera myRead_S1_L001_R1_001.fastq myRead_S1_L001_R2_001.fastq
                    I second this comment - TrimGalore successfully removed the Nextera adapters (that could not be removed by cutadapt).
