  • Illumina paired end adapter contamination problem

    Hello everyone!

    I have rna-seq Illumina paired end reads and want to proceed with adapter trimming.
    I am confused about a few points:

    1. Does the 5' end of both the forward and reverse reads start from the first base of the insert? Or could there be some adapter contamination also at 5' end?
    From whatever I have read online, there shouldn't be any adapter present at 5' end. But, the data I am analyzing has around 75 reads (out of 7 million for forward read file) with adapter at 5' end. 75 sequences isn't much, but I want to know what causes this..

    2. For the forward reads, some 3' ends may contain the indexed adapter. Where this indexed adapter occurs within the sequence, I should delete the adapter and everything following it, right? And if the indexed adapter is present at the 5' end, should the whole read be deleted (because there was no insert between the two adapters)?

    3. Do the 5' ends of reverse reads have barcode sequences or any part of the indexed adapter?? I have 12,399 reads (out of 7 million) that have complete or a part of indexed adapter at 5' end, with a few of them within the reads.


    I am new to rna-seq data analysis, and have gone through lots of tutorials and explanations online, but everything seems to be really confusing at this moment.

    My main concern is: where to expect adapters in illumina forward and reverse reads respectively, and what to do upon encountering unexpected adapters.

  • #2
    Originally posted by Gazaldeep
    Hello everyone!

    I have rna-seq Illumina paired end reads and want to proceed with adapter trimming.
    I am confused about a few points:

    1. Does the 5' end of both the forward and reverse reads start from the first base of the insert? Or could there be some adapter contamination also at 5' end?
    From whatever I have read online, there shouldn't be any adapter present at 5' end. But, the data I am analyzing has around 75 reads (out of 7 million for forward read file) with adapter at 5' end. 75 sequences isn't much, but I want to know what causes this..
    There should be no adapter contamination at the 5' end if you are using standard Illumina kits.

    2. For the forward reads, some 3' ends may contain the indexed adapter. Where this indexed adapter occurs within the sequence, I should delete the adapter and everything following it, right? And if the indexed adapter is present at the 5' end, should the whole read be deleted (because there was no insert between the two adapters)?
    Barcode/tag reads are never part of the actual read in Illumina sequencing. If you have tags in your sequence, then something is wrong. Reads with no insert should be taken care of during trimming.

    My main concern is: where to expect adapters in illumina forward and reverse reads respectively, and what to do upon encountering unexpected adapters.
    Use bbduk from the BBMap suite (search for its thread on this forum). It is straightforward to use, and @Brian includes all commercially used adapters in a file that ships with the package. Just point bbduk to that file and scan/trim your data.
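To make the trimming rule discussed above concrete, here is a minimal Python sketch, not how bbduk or any other tool actually works. It assumes an exact match against the common TruSeq adapter prefix `AGATCGGAAGAGC`; the function name `trim_read` and the `min_len` cutoff are hypothetical choices for illustration. Real trimmers also handle mismatches and partial adapters at the very end of a read, which this sketch does not.

```python
# Illustrative only: exact-match adapter handling for a single read.
# ADAPTER is an assumption; substitute the adapter used for your library.
ADAPTER = "AGATCGGAAGAGC"


def trim_read(seq, adapter=ADAPTER, min_len=20):
    """Return the trimmed read, or None if the read should be discarded."""
    pos = seq.find(adapter)
    if pos == -1:
        # No adapter found: keep the read unchanged.
        return seq
    # Drop the adapter and everything 3' of it.
    trimmed = seq[:pos]
    if len(trimmed) < min_len:
        # Adapter at or near the 5' end: little or no insert is left.
        return None
    return trimmed


if __name__ == "__main__":
    insert = "ACGT" * 10
    print(trim_read(insert + ADAPTER + "TTTT"))  # adapter read-through at the 3' end
    print(trim_read(ADAPTER + "ACGTACGT"))       # adapter dimer, no insert
```

Under this rule a read that starts with the adapter has nothing left after trimming and is dropped, which is the "no insert" case mentioned above.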



    • #3
      Thanks for your reply!!

      Originally posted by GenoMax
      There should be no adapter contamination at the 5' end if you are using standard Illumina kits.
      So, the 72 reads with 5' adapter contamination should be deleted, right?

      Originally posted by GenoMax
      Barcode/tag reads are never part of the actual read in Illumina sequencing. If you have tags in your sequence, then something is wrong. Reads with no insert should be taken care of during trimming.
      The paired-end data I am trying to analyze was downloaded from DDBJ.

      After searching online and through your answer, I'm sure that I should delete the reads that have any adapter at 5' end (be it the 5' adapter or 3' adapter), and perform trimming for reads with adapter at 3' end or within the read.

      But I'm actually a bit confused about the Illumina sequencing steps.

      Are the barcodes removed after sorting the reads into different files based on different barcodes?? So the files we get in the end cannot have the barcodes, but may they have the constant part of the indexed adapter (which occurs before/after the barcode) or are the constant parts also removed with the barcodes?
      I want to be clear about the process.



      • #4
        I could just use a tool for trimming, but before that, I want to be clear about what's happening. Maybe I've got it all wrong?



        • #5
          Originally posted by Gazaldeep
          Thanks for your reply!!

          But I'm actually a bit confused about the Illumina sequencing steps.
          Check this video out for clarification: https://www.youtube.com/watch?v=HMyCqWhwB8E

          Are the barcodes removed after sorting the reads into different files based on different barcodes?? So the files we get in the end cannot have the barcodes, but may they have the constant part of the indexed adapter (which occurs before/after the barcode) or are the constant parts also removed with the barcodes?
          I want to be clear about the process.
          Illumina sequencing actually proceeds in four separate steps with dual (2D) barcodes, or three with single (1D) barcodes.

          Code:
          R1 --> R2 (index 1) --> R3 (index 2) --> R4.
          Illumina software keeps track of every cluster over R1 through R4. During base calling (conversion of BCL to FASTQ), the index read sequences are extracted from R2 (and R3) and transferred to the header of the FASTQ record to complete demultiplexing (you thus end up with R1/R2 files).

          It is also possible to write the index reads to their own files, so you end up with 4 files per sample. This is only needed for some applications (e.g. QIIME).
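As a rough illustration of where the index ends up after demultiplexing, the sketch below pulls the index sequence(s) out of a FASTQ header. It assumes the CASAVA 1.8+ header layout (`@instrument:run:flowcell:lane:tile:x:y read:filtered:control:index`); the function name `parse_index` and the example header are made up for illustration. Headers of reads downloaded from public archives may have been rewritten, in which case the index field may simply be absent.

```python
def parse_index(header):
    """Extract index sequence(s) from a CASAVA 1.8+ style FASTQ header line.

    Returns a list with one entry per index read, or an empty list if the
    header carries no index field.
    """
    parts = header.strip().split(" ")
    if len(parts) < 2:
        # No comment field after the read name, so no index information.
        return []
    fields = parts[1].split(":")  # read:filtered:control:index
    if len(fields) < 4 or not fields[3]:
        return []
    # Dual indexes are written as index1+index2.
    return fields[3].split("+")


if __name__ == "__main__":
    # Hypothetical header, for illustration only.
    print(parse_index("@M00001:12:000000000-ABCDE:1:1101:15589:1332 1:N:0:ATCACG+TTAGGC"))
```

The read sequence line itself is untouched by this step: the index lives only in the header, so a barcode found inside the read sequence points to something else, such as adapter read-through.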



          • #6
            Thanks!!! Really helpful!!

            In my reads, the 5' end is contaminated with the 5' adapter in 75 reads. Also, in 12,000 reads out of 7 million, the 5' adapter is present in the reads. What do you suggest? Should I delete those reads, or should I just trim the adapter and the sequence preceding it at the 5' end? I'm using Cutadapt at present, but with any adapter removal tool I will have to specify whether I want to trim these reads and in what way.

            Sorry if my questions are naive!
