
  • MiSeq: Trimming, and sequencing primers at the beginning of a read

    I noticed that after I trim my MiSeq reads for adapter contamination (removing the 3' portion of the read), I can still grep the trimmed reads for ACACTCTTTCCCTACACGAC (the sequencing primer/adapter sequence) and find several thousand matches at the 5' end of Read 1 reads. These shouldn't be there, right? What am I missing?

    I used fastq-mcf to trim the 13bp common TruSeq sequence AGATCGGAAGAGC.

    Primer sequences do not appear at the beginning of Read 2 reads. In the sample sheet, I did not request that the MiSeq do any onboard trimming. For library prep, I used NEBNext Ultra, whose adapters, sequencing primers, and indices are the same as the TruSeq ones.

    So, my questions are: 1) Why am I getting primer sequences in Read 1? and 2) Is the 13 bp sequence sufficient for trimming Illumina reads, or should I be doing this differently? The reads are used for de novo assembly and BLAST-based binning, so aggressively removing adapter sequences is important to me.
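The grep check described in this post can be made a little more precise by separating reads that start with the primer from reads that contain it somewhere else. The sketch below is only an illustration: the function name `count_primer_reads` and the three demo reads are made up, and it assumes plain four-line FASTQ records with unwrapped sequence lines.

```python
import io

PRIMER = "ACACTCTTTCCCTACACGAC"  # primer/adapter sequence quoted in the post


def count_primer_reads(handle, primer=PRIMER):
    """Return (reads starting with primer, reads containing it elsewhere, total reads).

    Assumes four-line FASTQ records, so the sequence is line 2 of each record.
    """
    at_start = elsewhere = total = 0
    for i, line in enumerate(handle):
        if i % 4 != 1:  # skip header, '+' separator, and quality lines
            continue
        total += 1
        seq = line.strip().upper()
        if seq.startswith(primer):
            at_start += 1
        elif primer in seq:
            elsewhere += 1
    return at_start, elsewhere, total


# Three made-up reads: primer at the 5' end, primer in the middle, no primer.
demo_seqs = [PRIMER + "GATTACA", "TTGACC" + PRIMER + "AA", "GATTACAGATTACA"]
demo_text = "".join(
    f"@read{n}\n{s}\n+\n{'I' * len(s)}\n" for n, s in enumerate(demo_seqs, 1)
)

print(count_primer_reads(io.StringIO(demo_text)))  # (1, 1, 3)
```

For a real file, the same function could be called on `open("p1r1_TT.fastq")` in place of the in-memory demo.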

  • #2
    The first part could be explained by adapter/primer dimers without any insert.

    As for trimming, give Trimmomatic (http://www.usadellab.org/cms/?page=trimmomatic), cutadapt (http://code.google.com/p/cutadapt/), or Trim Galore (http://www.bioinformatics.babraham.a...s/trim_galore/) a try. A recent comparison of available trimmers: http://www.plosone.org/article/info:...l.pone.0085024.



    • #3
      I like the idea of Trimmomatic, but I can't seem to make it trim the adapters. They still show up after the following:

      Code:
      java -classpath /opt/Trimmomatic-0.32/trimmomatic-0.32.jar org.usadellab.trimmomatic.TrimmomaticPE -threads 8 -trimlog TT.log Pool1_S1_L001_R1_001.fastq Pool1_S1_L001_R2_001.fastq p1r1_TT.fastq p1r1_To.fastq p1r2_TT.fastq p1r2_To.fastq LEADING:3 TRAILING:3 ILLUMINACLIP:adapter_13.fa:2:30:10 SLIDINGWINDOW:4:15 MINLEN:16
      I may have set the parameters badly, but I don't know the best way to set them. My adapter sequence is the 13 bp common Illumina sequence. Is 13 bp not scoring high enough to get trimmed?



      • #4
        Are you using the raw data for trimming? Why not use the TruSeq3 (PE) adapters that Trimmomatic includes (you will find those files in "Trimmomatic-0.30/adapters/") as the ILLUMINACLIP input?
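One way to act on this suggestion is to rebuild the command from post #3 with the bundled adapter file. The sketch below only assembles the argument list. The helper name `trimmomatic_pe_command`, the output file names, and the adapter path under `/opt/Trimmomatic-0.32/adapters/` are assumptions for illustration; adjust them to the local install. ILLUMINACLIP is placed first here because Trimmomatic applies steps in the order given, so adapters are clipped before quality trimming shortens the read.

```python
import shlex


def trimmomatic_pe_command(r1, r2, prefix, adapters, clip="2:30:10",
                           jar="/opt/Trimmomatic-0.32/trimmomatic-0.32.jar",
                           threads=8):
    """Build a TrimmomaticPE argument list (hypothetical helper, paths assumed)."""
    outputs = [
        f"{prefix}r1_paired.fastq", f"{prefix}r1_unpaired.fastq",
        f"{prefix}r2_paired.fastq", f"{prefix}r2_unpaired.fastq",
    ]
    return [
        "java", "-classpath", jar, "org.usadellab.trimmomatic.TrimmomaticPE",
        "-threads", str(threads), r1, r2, *outputs,
        f"ILLUMINACLIP:{adapters}:{clip}",  # adapter clipping first
        "LEADING:3", "TRAILING:3", "SLIDINGWINDOW:4:15", "MINLEN:16",
    ]


cmd = trimmomatic_pe_command(
    "Pool1_S1_L001_R1_001.fastq", "Pool1_S1_L001_R2_001.fastq", "p1",
    adapters="/opt/Trimmomatic-0.32/adapters/TruSeq3-PE.fa",  # assumed location
)
print(shlex.join(cmd))  # paste into a shell, or pass cmd to subprocess.run
```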



        • #5
          @clintp

          I think the thresholds you are using for the ILLUMINACLIP step (2:30:10) are too high for Trimmomatic to recognize a match to a 13-base adapter sequence.

          You need to either lower the values or use a longer adapter sequence.

          See the Trimmomatic web page, particularly the discussion under the heading 'Adapter Fasta', from which I have extracted this quote:

          "The thresholds used are a simplified log-likelihood approach. Each matching base adds just over 0.6, while each mismatch reduces the alignment score by Q/10. Therefore, a perfect match of a 12 base sequence will score just over 7"
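The arithmetic behind the quote can be worked through for the 13-base adapter in question. This is a rough sketch, not Trimmomatic's actual code: it assumes the per-match score is log10(4) (about 0.602, which agrees with "just over 0.6"), and that the last value in `2:30:10` is the simple clip threshold that an ordinary adapter match must reach. The function names are made up for illustration.

```python
import math

# Assumed per-match score; log10(4) is about 0.602, i.e. "just over 0.6".
MATCH_SCORE = math.log10(4)


def perfect_match_score(adapter_length):
    """Score of an adapter alignment with no mismatches."""
    return adapter_length * MATCH_SCORE


def min_length_for_threshold(threshold):
    """Shortest perfect match whose score reaches the threshold."""
    return math.ceil(threshold / MATCH_SCORE)


adapter = "AGATCGGAAGAGC"      # the 13 bp common sequence from the thread
simple_clip_threshold = 10     # last field of ILLUMINACLIP:adapter_13.fa:2:30:10

score = perfect_match_score(len(adapter))
print(round(score, 2))                                   # 7.83
print(score >= simple_clip_threshold)                    # False
print(min_length_for_threshold(simple_clip_threshold))   # 17
```

Under these assumptions even a perfect 13-base match scores about 7.8, so it can never reach a threshold of 10; either the threshold must drop to 7 or so, or the adapter sequence must be at least 17 bases long.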



          • #6
            @mastal
            Yep, understanding the cutoff scores helped a lot (durrr). Somehow I missed that discussion on the trimmomatic page.

            @GenoMax
            Thanks for that reference; it's very useful. It's too bad they didn't include ea-utils/fastq-mcf in that analysis, though.
