
  • finding adapters for trimming

    Hi All,

    I am a total newbie in this field. I would like to know what the community's usual practice is for the adapter trimming step.

    I have 50 bp single-end reads (Sanger / Illumina 1.9 encoding). My primary goal is to align the reads using Bismark and then extract methylation scores using methylKit. The FastQC report showed three overrepresented sequences. I then ran trim_galore with the default settings. trim_galore (which basically uses cutadapt) trimmed the universal adapter, but two overrepresented sequences still remain in the FastQC report.

    I have read many posts about trimming over the last 3-4 days, but I am still confused. My understanding so far is that FastQC reports adapter contamination, but it may not identify the actual adapter sequence.

    1. Is it a must to trim all the overrepresented sequences, or is trimming just the universal adapter fine?
    2. What is the easiest way to find the sequences that need to be trimmed?

    Any help/suggestion is greatly appreciated.

  • #2
    1) The best practice is to trim the actual adapter sequences used in your library.
    2) The best way to find that is to ask the people who made the library.

    But, if you have paired reads, you can also find your adapter sequences with BBMerge like this:

    bbmerge.sh in1=read1.fq in2=read2.fq outa=adapters.fa

    BBDuk includes all standard Illumina adapters in "/resources/adapters.fa". If you do not know which adapters were used, and are unable to find out, I recommend using that as the reference.

    Since you are using single-end reads, it's difficult to determine the adapter sequences empirically in an automated way. So, unless you can get them from the people who made the library, I suggest using that reference.
    Last edited by Brian Bushnell; 11-22-2015, 10:43 PM.
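
The recommendation above (trim single-end reads against the adapters.fa reference shipped with BBDuk) might look roughly like the sketch below. It is only a dry-run helper that prints a command for review rather than running anything. The function name bbduk_trim_cmd and the file names are hypothetical, and the parameter values (ktrim=r k=23 mink=11 hdist=1) are taken from the adapter-trimming example in the BBDuk guide, not from this thread, so treat them as a starting point rather than settings anyone here confirmed.

```shell
# Dry-run helper: prints a BBDuk adapter-trimming command for single-end
# reads instead of running it, so the parameters can be reviewed first.
# ktrim=r trims the matched adapter and everything to its right (3' end);
# k=23, mink=11 and hdist=1 are assumed values borrowed from the BBDuk
# guide's adapter-trimming example, not settings confirmed in this thread.
bbduk_trim_cmd() {
    # $1 = input FASTQ, $2 = output FASTQ, $3 = adapter FASTA
    printf 'bbduk.sh in=%s out=%s ref=%s ktrim=r k=23 mink=11 hdist=1\n' \
        "$1" "$2" "$3"
}

# Hypothetical file names; point the third argument at the adapters.fa
# found in the "resources" directory of your BBMap installation.
bbduk_trim_cmd reads.fq trimmed.fq resources/adapters.fa
```

Printing the command first makes it easy to check the reference path before committing to a run on the full data set.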



    • #3
      Brian - thanks for parsing Illumina's PDF and making the adapters available. It looks like as of Nov 9 2015 Illumina updated their adapter sequence document. Are there any notable changes that aren't present in the BBDuk adapter sequence fasta?

      Oligonucleotide (oligo) sequences of Illumina adapters used in AmpliSeq, Nextera, TruSeq, and TruSight library prep kits.


      • #4
        Ah, thanks for notifying me... I'll look at it.



        • #5
          Thanks a lot for the response Brian.

          I have single-end reads this time. Do you have any suggestions for the overrepresented sequences that do not match any known adapter ("No Hit" as described by FastQC)?




          • #6
            @bluepoison: I suggest you try adapter removal using BBDuk with adapters.fa, and then check whether FastQC still detects overrepresented sequences. If not, everything should be fine! But if it does, you may have a new adapter sequence, so please reply in that case.

            @turnersd: Unfortunately... there are a lot of new adapter indexes in the latest Illumina letter that you linked - dozens. They are for human-specific tests, like autism, cancer, and other possibly genetic disorders. And as always, Illumina makes no effort to indicate which indexes go with which adapters. So, it looks like a huge amount of effort now to make a complete set of Illumina adapter sequences complete with indexes.

            JGI does not do any human sequencing, so none of that is relevant to us. But for everyone else out there - I really hope Illumina, or someone in the community, compiles a full list of the new human-specific adapter sequences. Because there are so many, and I have no way to empirically determine whether the new sequences are correct (since we don't use them), it's not really possible for me to generate them. Illumina would provide the full, indexed adapter-sequences for trimming if they had the slightest concern for their end users, which they unfortunately do not appear to have.

            So far, it's not clear to me which adapters go with the new indexes, or why they even need new indexes for cancer versus autism, etc. It seems like a marketing ploy. But the new indexes probably only affect amplicon sequencing and are irrelevant to randomly-sheared libraries.
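
For the check suggested to bluepoison (do overrepresented sequences remain after trimming?), a quick way to look at the most frequent reads directly is sketched below. This does not replace FastQC, which samples reads and applies its own thresholds. It simply counts exact duplicate sequences, and it assumes an uncompressed FASTQ with four lines per record. The function name top_seqs is hypothetical.

```shell
# top_seqs FASTQ [N]: print the N most frequent read sequences (default 10)
# as "count<TAB>sequence", most frequent first.
# Assumes a plain 4-line-per-record FASTQ (sequence on every 2nd of 4 lines).
top_seqs() {
    awk 'NR % 4 == 2 { count[$0]++ }
         END { for (s in count) print count[s] "\t" s }' "$1" |
        sort -k1,1nr -k2,2 |
        head -n "${2:-10}"
}

# Example call (hypothetical file name):
#   top_seqs trimmed.fq 5
```

Sequences that still dominate the list after trimming can then be compared by hand against the adapter document or searched with BLAST to see whether they are adapters or genuine biological sequence.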
