  • Illumina Unique Molecular Identifier Adaptor

    Hi All,

    I want to generate RRBS libraries where we can track each unique molecule with a UMI. I have therefore made modified versions of the TruSeq adaptors that we normally use.

    The regular TruSeq adaptor oligos look like this:
    A1_P5: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC*T
    A1_P7 (AR005): P-GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG

    What we did was to add 8 random nucleotides to the P5 so that it looks like this:
    A1_P5 UMI8: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNN*T

    From what I have heard, I should anneal the two adaptors. The annealing efficiency can be seen here: https://imgur.com/a/70LZU

    What I did now was to try our regular protocol with old adaptors vs new adaptors.

    The libraries look like this: https://imgur.com/a/mHZeF

    Stupidly, I did not make another P5 without the 8-nt UMI as a control.

    The new adaptors seem to form adaptor dimers, but they do not seem to ligate to the library inserts.

    Suggestions of other designs or ways to get it to work would be highly appreciated.

    I have also looked into using the dual-index adaptors and replacing one index with a UMI, but I could not find the sequences. Do any of you have them?

    Best regards
    Emil

  • #2
    Is this a Y-adapter design? If so, the P7 and P5 need to anneal over the last 12 bases of the P5/first 12 bases of the P7, with a 3' "T" overhang to work with many Illumina work flows. When you add 8 N's and a T to the P5, you create a 9 base 3' overhang.
    No chance you can ligate that to anything using a double stranded DNA ligase.

    I suppose the opposite design, where you put the 8 N's at the 5' end of P7, could work. But you would have to anneal the P5 and P7 by their 12 bases of complementarity and then extend the 3' end of the P5 strand using Klenow, for example. But that would give you a blunt adapter. So your inserts would need to be blunt as well, which would allow chimeric inserts and the formation of massive amounts of adapter dimers.

    BTW, Illumina asks for the following notices:
    Oligonucleotide sequences © 2017 Illumina, Inc. All rights reserved.
    and
    Oligonucleotide sequences © 2017 Illumina, Inc. All rights reserved. Derivative works created by Illumina customers are authorized for use with Illumina instruments and products only. All other uses are strictly prohibited.
    to be added to any sequence information derived from their adapter sequences if it is distributed or published outside one's institution.

    The correct position to place a UMI in the P5 index site would be after the TCTACAC. That is:
    AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATCT

    But, because 8 N's can compose, at most, 4^8 = 65,536 (~64K) sequence combinations, they would not be "unique" in the context of a sample producing millions of sequences. And I don't know how far you can push the length of the i5 index. Probably 12, at least. But going beyond 8 bases, you would need to take cycles away from the sequence reads to make up for the reagents used to read a longer UMI.
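To put rough numbers on the UMI diversity point, here is a minimal sketch. It assumes every UMI is drawn uniformly at random and that all molecules are compared against each other on UMI alone; both are simplifications that are not from the thread. The function names are made up for illustration.

```python
def umi_space(k: int) -> int:
    """Number of possible UMI sequences for k random bases (A, C, G, T)."""
    return 4 ** k


def expected_distinct(n_molecules: int, k: int) -> float:
    """Expected number of distinct UMIs seen among n_molecules,
    assuming uniform random draws from the 4**k possibilities."""
    space = umi_space(k)
    return space * (1.0 - (1.0 - 1.0 / space) ** n_molecules)


def collision_fraction(n_molecules: int, k: int) -> float:
    """Fraction of molecules that would be lost if reads were
    collapsed on the UMI alone (1 - distinct / total)."""
    return 1.0 - expected_distinct(n_molecules, k) / n_molecules


if __name__ == "__main__":
    for k in (6, 8, 10, 12):
        frac = collision_fraction(1_000_000, k)
        print(f"{k:2d} N's: {umi_space(k):>10,d} UMIs, "
              f"{frac:.1%} of 1M molecules collide")
```

In practice, deduplication tools usually combine the UMI with the mapping position, so the relevant number of molecules is the number sharing a position rather than the whole library. The sketch only shows why 8 N's on their own cannot label millions of molecules uniquely.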

    --
    Phillip
    Last edited by pmiguel; 10-11-2017, 09:05 AM.



    • #3
      Thank you so much for your answer Phillip.

      1. I am actually not really sure if it is a Y-adapter design. But I am pretty sure that it is. I tried to find information about it but couldn't, but it is the same adapter as used for TruSeq LT.

      2. Thank you for suggesting not to go with blunt inserts.

      3. Thank you for that suggestion. Do you know if any of the current kits from Illumina use dual indexing and also have the Y-adaptor setup that I presumably have? I believe this would be the best setup to run, since I would be sure to have intact annealing at the 12 complementary bases.

      Thank you for your answers
      Kind regards
      Emil



      • #4
        Hi Emil,
        Most TruSeq Illumina kits use the Y-adapters. The common "TruSeq" DNA and RNAseq ones will offer a normal single index kit option (usually going up to 24 indexes) or a "high throughput" one with dual indexes allowing you to multiplex 96 samples in a lane. Well, as long as you don't need them to be "unique dual".

        The parts of the dual index Y-adapters containing the indexes do not anneal! Only the 12 bases of the Y-adapter most proximate to the insert are annealed. The rest of the adapter has non-complementary sequence that won't anneal and hangs off like a forked tail. Get it? "Y" where the double-stranded part is the stem of the "Y" and the single-stranded tails are the top part. It is only during subsequent PCR that this tail region becomes double-stranded.

        This is one of those brilliant solutions to a bunch of amplicon construction issues when using ligation. You need to have a forward and a reverse adapter. But if you just make them both double-stranded, so that T4 DNA ligase will use them as a substrate, you create 2 problems. First, some of your constructs will get forward adapters on both ends or reverse adapters on both ends and won't be suitable templates for clustering. Second, with a double-stranded molecule you have two ends that can be ligated -- possibly to each other.

        So some fiendish genius came up with the idea of annealing single-stranded versions of both the left and right adapters to each other such that only one end was actually annealed. This way each double-stranded insert was guaranteed to have an R adapter on one end and an F adapter on the other end. Actually, each strand of the double-stranded insert would have a single-stranded R on one end and F on the other -- the top strand in one orientation (say, F-insert-R), the bottom strand in the other orientation (R-insert-F).

        --
        Phillip



        • #5
          The sequence and structure of the TruSeq HT adapters are attached.

          You would need to substitute i5 sequences with N to use as UMI.
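As a rough illustration of that substitution, the sketch below builds an i5-side oligo with N's in the index position. The flanking sequences are taken from the UMI-containing P5 given in post #2 of this thread, not from the attachment, and the constant and function names are hypothetical.

```python
# Flanks around the i5 index position, split from the sequence in post #2.
I5_PREFIX = "AATGATACGGCGACCACCGAGATCTACAC"
I5_SUFFIX = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"


def p5_with_umi(umi_len: int = 8) -> str:
    """Return the i5-side adapter oligo with umi_len random bases (N)
    in place of the i5 index."""
    if umi_len < 1:
        raise ValueError("umi_len must be at least 1")
    return I5_PREFIX + "N" * umi_len + I5_SUFFIX


if __name__ == "__main__":
    oligo = p5_with_umi(8)
    print(oligo)
    # The 3' end (12-base stem plus T overhang) is untouched by the
    # substitution, so annealing to the P7 oligo should be unaffected.
    print("3' end:", oligo[-13:])
```

Because the N's sit upstream of the 3' end, the 12-base stem and the T overhang stay exactly as in the regular adapter.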

          Another option you might consider is: http://www.nugen.com/products/ovatio...hyl-seq-system

          It has a 6-base UMI that follows the index, so the index 1 read has to be 12 cycles to use the UMI, or 6 cycles for the index alone. Another advantage is that they have included diversity nucleotides, so libraries can be sequenced with a 1% PhiX spike-in. The conventional protocol requires a higher PhiX fraction (>30%).
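A minimal sketch of how such a 12-cycle index read could be split afterwards, assuming the layout described above (6 index bases first, then 6 UMI bases). The function name and the example read are made up; this is not the vendor's software.

```python
def split_index_read(read: str, index_len: int = 6, umi_len: int = 6):
    """Split an index read into (sample_index, umi).

    Assumes the sample index occupies the first index_len cycles and the
    UMI the following umi_len cycles.
    """
    if len(read) < index_len + umi_len:
        raise ValueError("index read is shorter than index + UMI")
    sample_index = read[:index_len]
    umi = read[index_len:index_len + umi_len]
    return sample_index, umi


if __name__ == "__main__":
    # Hypothetical 12-cycle index read.
    print(split_index_read("ACAGTGTTGCAA"))
```

If only 6 index cycles are run, the UMI is simply never read and the library demultiplexes like a normal single-index one.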


          • #6
            For a way to create TruSeq adapters with UMI at the end see Kennedy SR, et al. Detecting ultralow-frequency mutations by Duplex Sequencing. Nat Protoc. 2014 Nov;9(11):2586-606. doi: 10.1038/nprot.2014.170. https://www.nature.com/nprot/journal....2014.170.html



            • #7
              Thanks again Phillip!

              Indeed, they must be some fiendish geniuses.

              I have already generated the TruSeq DNA LT adapter piece with a 6 nt index. Do you think it would work to anneal the i5 adapter to this adapter, or should I generate new i7 adapters as well?

              Also to nucacidhunter and torben, thanks for the suggestions!

              Cheers
              Emil


              • #8
                Hi Emil,
                I would strongly recommend that you verify this yourself by aligning your p5 and the reverse (in the 3' - 5' direction) of your p7 sequence. You will see the terminal 12 bases on one side are complements of each other with just a 3' "T" overhang provided by the p5 oligo.
                Once you have done that, you will understand how a Y-adapter is structured to function as it does.
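The check can also be done in a few lines of code. This is a minimal sketch using the oligo sequences from the first post. It assumes that "*" marks a phosphorothioate bond and can be ignored for base pairing, and the helper names are made up for illustration.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def clean(oligo: str) -> str:
    """Drop the '*' modification marker and upper-case the sequence."""
    return oligo.replace("*", "").upper()


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def p5_tail_beyond_stem(p5: str, p7: str, stem_len: int = 12):
    """Return the P5 bases 3' of the stem that pairs with the first
    stem_len bases of P7, or None if P5 has no such stem."""
    stem = revcomp(p7[:stem_len])
    pos = p5.rfind(stem)
    if pos < 0:
        return None
    return p5[pos + stem_len:]


P5 = clean("AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC*T")
P7 = clean("GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG")
P5_UMI8 = clean(
    "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNN*T"
)

if __name__ == "__main__":
    print("stem on P5 side:", revcomp(P7[:12]))
    print("regular P5 overhang:", p5_tail_beyond_stem(P5, P7))
    print("UMI8 P5 overhang:  ", p5_tail_beyond_stem(P5_UMI8, P7))
```

For the regular P5 this leaves a single T beyond the stem. For the UMI8 oligo exactly as typed in the first post, which keeps the T ahead of the N's, the single-stranded tail comes out as T + 8 N's + T. Either way it is far too long for ligation with a double-stranded DNA ligase.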
                --
                Phillip



                • #9
                  Thank you for all your help!

                  I have now ordered the adapters and hope they will work!



                  • #10
                    I hope that you have asked for all C residues to be synthesized as mC (which is very expensive) to prevent C-to-U conversion during bisulfite treatment, unless you are using a technique that does not require mC in the adapters.
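To see why this matters, here is a simplified sketch of what bisulfite treatment would do to an unprotected adapter strand. It assumes complete conversion of every unmethylated C, which is read as T after PCR, and it uses the regular P5 sequence from the first post. The function name is hypothetical.

```python
def bisulfite_convert(seq: str) -> str:
    """Simulate full bisulfite conversion of an unmethylated strand:
    every C becomes U, which is read as T after amplification."""
    return seq.upper().replace("C", "T")


# Regular P5 oligo from the first post (modification marker removed).
P5 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"

if __name__ == "__main__":
    converted = bisulfite_convert(P5)
    print("original :", P5)
    print("converted:", converted)
    # Each C in the oligo is a position that would need to be ordered as mC.
    print("C positions to protect:", P5.count("C"))
```

With every C lost, the adapter no longer matches the PCR primers or the flow cell oligos, which is why the C's are ordered as mC when the adapters are ligated before conversion.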



                    • #11
                      Indeed expensive, but yes it is synthesized with mC. Thanks for the heads up.
