  • Illumina paired-end reads. More than 2 adapter sequences.

    Hi,

    I've been recently involved in a project where my task is to analyze double-stranded RNA sequencing data, and I'm relatively new to this field.

    Data:
    - Illumina MiSeq 2x150 reads
    - 2 samples - about 3.5 million paired-end reads per sample
    - Known linker/primer sequences
    - RNA library is supposed to include only dsRNA molecules > 150 nt

    I've started analyzing the data of sample 1 and I've found that a huge proportion of the reads have the following structure (R1 and R2 are mate reads):

    R1-5' => [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2] xxxx 3'
    R2-3' => xxx [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2] 5'

    (Note 1 : mate reads are complementary)
    (Note 2 : xxx portion ~10 nt long)

    Is there any explanation for this?

    I have a very basic understanding of the RNA-Seq process, but not enough to explain why I have more than 2 adapters per read.

    Thanks in advance.
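
A read structure like the one described above can be tabulated by locating every exact adapter occurrence in a read. The following is only a minimal sketch: the adapter and insert sequences are made-up placeholders (the real ones are not given in the thread), and `find_all` and `read_layout` are hypothetical helper names.

```python
def find_all(read: str, motif: str) -> list[int]:
    """Start positions of every exact occurrence of motif in read."""
    starts = []
    i = read.find(motif)
    while i != -1:
        starts.append(i)
        i = read.find(motif, i + 1)
    return starts


def read_layout(read: str, adapters: dict[str, str]) -> list[tuple[str, int, int]]:
    """Describe a read as ordered (label, start, end) segments.

    Anything that is not an exact adapter hit is labelled "other".
    Assumes adapter hits do not overlap each other.
    """
    hits = sorted(
        (start, start + len(seq), name)
        for name, seq in adapters.items()
        for start in find_all(read, seq)
    )
    layout, pos = [], 0
    for start, end, name in hits:
        if start > pos:
            layout.append(("other", pos, start))
        layout.append((name, start, end))
        pos = max(pos, end)
    if pos < len(read):
        layout.append(("other", pos, len(read)))
    return layout


# Placeholder sequences, NOT the real adapters from this project.
ADAPTERS = {"Adapter 1": "AAGCAGTGGT", "Adapter 2": "CTGTCTCTTA"}
toy_read = (
    ADAPTERS["Adapter 1"] + "GATTACAGATTACAGG"
    + ADAPTERS["Adapter 2"] + "TTGACCTTGACCAAGT"
    + ADAPTERS["Adapter 2"] + "ACGT"
)
layout = read_layout(toy_read, ADAPTERS)
for label, start, end in layout:
    print(f"{label:10s} {start:3d}-{end:3d}")
```

Running this over a sample of reads and counting the distinct label orders would show how common each structure is. Exact matching will miss adapter copies that carry sequencing errors.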

  • #2
    Sorry, misalignment of the read structure. Corrected below:

    R1-5' =>     [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2] xxxx 3'
    R2-3' => xxx [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2]      5'

    • #3
      Do you have any idea how the libraries were generated? This certainly does not look like a standard Illumina TruSeq library. It could be that something is going wrong during one of the adapter ligation steps that is causing fragments to concatemerize. It's also strange that adapter 1 is being sequenced first in both reads.

      • #4
        This looks very strange to me. I don't think any of your reads should start with adapter sequence. They can end with adapter sequence if your insert is shorter than your read length. Is your "some sequence 2" a short sequence of indexing nucleotides?

        I can only think that the actual Illumina adapters are outside of your strange adapter reads.

        • #5
          kcchan and microgirl123,

          Thanks a lot for your posts.

          A couple of clarifications:

          "Do you have any idea how the libraries were generated?"
          Hope this helps:
          - RNA fragmentation
          - Double strand cDNA synthesis
          - End repair
          - Linkers/adapters are ligated at the 3' and 5' ends of the double-stranded RNA (I refer to those sequences as adapter 1 and 2)
          - Denaturing
          - PCR amplification (primers are complementary to the linkers)
          - HTS

          "It's also strange that adapter 1 is being sequenced first in both reads"
          The actual sequences are:

          R1 => [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2] xxxx
          R2 => [Adapter 2] [some sequence 2] [Adapter 2] [some sequence 1] [Adapter 1] xxxx
          But, since they are paired end reads, I was trying to show that both sequences were complementary:

          R1-5' =>     [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2] xxxx 3'
          R2-3' => xxx [Adapter 1] [some sequence 1] [Adapter 2] [some sequence 2] [Adapter 2]      5'
          Sorry about the confusion.

          "Is your 'some sequence 2' a short sequence of indexing nucleotides?"
          No, I've already checked for the indexing nucleotides. Anyway, I've taken a closer look at this "some sequence 2", and I think it also contains Adapter 1, with a few mismatches.
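
An adapter copy with a few mismatches can be picked up by sliding the adapter along the read and counting substitutions at each position. This is a minimal sketch that tolerates substitutions only (no indels). The sequences are toy placeholders, and `hamming` and `find_approx` are hypothetical helper names.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_approx(read: str, adapter: str, max_mismatches: int = 2) -> list[tuple[int, int]]:
    """Return (start, mismatches) for every window of the read that matches
    the adapter with at most max_mismatches substitutions."""
    k = len(adapter)
    hits = []
    for start in range(len(read) - k + 1):
        d = hamming(read[start:start + k], adapter)
        if d <= max_mismatches:
            hits.append((start, d))
    return hits


# Toy sequences, NOT the real adapters from this project.
ADAPTER_1 = "ACGTTGCA"
# One exact copy at the start and one copy with a single substitution later on.
toy_read = "ACGTTGCA" + "GGGGG" + "ACGATGCA" + "TTTTT"
hits = find_approx(toy_read, ADAPTER_1, max_mismatches=1)
print(hits)
```

With real adapters of 20 nt or more, a threshold of two or three mismatches is a common starting point, but the right value depends on adapter length and read quality.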

          "I can only think that the actual Illumina adapters are outside of your strange adapter reads."
          Adapter sequences were provided to me by the experimental people, and they are those Adapter 1 and 2 that I show.

          • #6
            Ah, I see what's going on now. This is a directional RNA-Seq library. If I understand correctly, the 3' adapter sequence is showing up multiple times. This is indicative of an adapter that was designed incorrectly. The oligo for the 3' adapter must have a modification at its 3' end to prevent it from ligating to other inserts: either a dideoxy nucleotide or an amino modification. I'm guessing this was not done and the shorter RNA fragments concatemerized during the 5' ligation reaction. The data is still good, but you'll have to do some work to separate the individual inserts in the reads.

            • #7
              Thanks kcchan!

              Any suggestion on how to proceed?

              Since the majority of paired reads (R1/R2) show a high degree of overlap, maybe I could build single reads from them, trim the adapters, and map the resulting short reads to the reference genome.

              • #8
                Yeah, I think the best you can do with those reads is to use them as single-end reads. You're going to have to do a bit of work to get the reads cleaned up, however. Not only will you need to trim off the adapters, but you will also need to separate the sequences between the adapters and treat each one as an individual read.

                • #9
                  You could split the reads on the index sequences (forward and reverse complement) using Perl. You would also need to assign independent read names and (for quality scores) keep track of the positional information, plus discard reads below a certain length. Unless these are supposed to be small RNAs or it's a SAGE type of experiment, the small insert sizes would cause me to question the sample quality.
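
The splitting described above can be sketched in Python rather than Perl. The sketch cuts a read on exact matches to each adapter and its reverse complement, slices the quality string with the same coordinates, gives each piece its own name, and drops pieces below a minimum length. The `Record` type, the `_partN_start-end` naming scheme, and the toy sequences are assumptions made for illustration, and FASTQ parsing is left out.

```python
import re
from typing import Iterator, NamedTuple


class Record(NamedTuple):
    name: str
    seq: str
    qual: str


_COMP = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (upper-case A/C/G/T/N)."""
    return seq.translate(_COMP)[::-1]


def split_record(rec: Record, adapters: list[str], min_len: int = 20) -> Iterator[Record]:
    """Yield the sub-reads lying between adapter matches (either orientation)."""
    motifs = set(adapters) | {revcomp(a) for a in adapters}
    # Longest motifs first so a longer adapter wins over one of its prefixes.
    ordered = sorted(motifs, key=lambda m: (-len(m), m))
    pattern = re.compile("|".join(re.escape(m) for m in ordered))
    # Adapter spans, plus a sentinel so the tail of the read is considered too.
    spans = [m.span() for m in pattern.finditer(rec.seq)]
    spans.append((len(rec.seq), len(rec.seq)))
    pos, part = 0, 0
    for start, end in spans:
        if start - pos >= min_len:
            part += 1
            yield Record(
                f"{rec.name}_part{part}_{pos}-{start}",  # keeps positional info
                rec.seq[pos:start],
                rec.qual[pos:start],  # same slice keeps qualities aligned
            )
        pos = end


# Placeholder adapters and inserts, NOT the real sequences from this project.
A1, A2 = "AAGCAGTGGT", "CTGTCTCTTA"
seq = A1 + "GATTACAGATTACAGG" + A2 + "TTGACCTTGACCAAGT" + revcomp(A2) + "ACGT"
qual = "".join(chr(33 + i % 40) for i in range(len(seq)))
rec = Record("read1", seq, qual)
parts = list(split_record(rec, [A1, A2], min_len=10))
for p in parts:
    print(p.name, p.seq)
```

Exact matching is the main limitation here: adapter copies with sequencing errors would not be cut, so an approximate matcher or an adapter trimmer that tolerates mismatches would be needed for real data.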
