  • Rules for making your own index

    We're planning to add a second index to the TruSeq v2 kit because we need more multiplexing than just 24. Are there any rules on making your own index?

    I asked Illumina and they said to make sure that for each cycle A,T,C,G are all represented because the MiSeq has to "focus" or else the cycle is lost. So this means I can't make a universal tag since there will be cycles where all my base reads would only consist of a single nucleotide! Is this accurate?

  • #2
    Not really sure if I am getting what you are saying, but it's best to keep the Hamming distance between every pair of barcodes greater than 1.
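A minimal sketch of that check, assuming equal-length barcodes. The helper names `hamming` and `min_pairwise_distance` are illustrative and not part of any Illumina tool; the example set is the four 6-base indexes quoted in post #3 of this thread.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance over all pairs in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Example set: the four indexes from post #3.
candidates = ["ATGCAT", "TGAACG", "GCTGTC", "AGCTGC"]
print(min_pairwise_distance(candidates))
```

A minimum distance of 1 means a single base-calling error can turn one barcode into another; a minimum of 3 or more leaves room to correct one error per index read.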
    --------------
    Ethan



    • #3
      Yes, I should maximize Hamming distance.
      But for example I have 4 indexes...

      5' ATGCAT
      5' TGAACG
      5' GCTGTC
      5' AGCTGC

      An Illumina representative mentioned that I can't use that index set because the first, second and last cycles do not contain all four bases. So on the flow cell at cycle 1, I'll have signal from A, T and G clusters but none from C, and our machine (MiSeq) will trash that cycle. At least, this is what I understood from our conversation.

      Any thoughts?
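The per-cycle check described by the representative can be sketched as follows. `missing_bases_per_cycle` is an illustrative helper, not an Illumina tool, and it only reports which bases are absent at each cycle; whether the instrument actually discards such a cycle is the open question of this post.

```python
def missing_bases_per_cycle(indexes):
    """Map each 1-based cycle to the sorted list of bases absent at that cycle."""
    missing = {}
    # zip(*indexes) walks the index set column by column, i.e. cycle by cycle.
    for cycle, column in enumerate(zip(*indexes), start=1):
        absent = sorted(set("ACGT") - set(column))
        if absent:
            missing[cycle] = absent
    return missing


index_set = ["ATGCAT", "TGAACG", "GCTGTC", "AGCTGC"]
for cycle, absent in missing_bases_per_cycle(index_set).items():
    print(f"cycle {cycle}: no {', '.join(absent)}")
```

For the four indexes above this flags cycle 1 (no C), cycle 2 (no A) and cycle 6 (no A), which matches the "first, second and last" positions mentioned.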



      • #4
        On the HiSeq, you need balanced nucleotide composition at the beginning of the sequencing read but not in the barcode read; otherwise you could only multiplex in multiples of four. That was one of the drawbacks of putting the barcode at the beginning of the sequencing read. Maybe the MiSeq is more picky about the barcode read; I don't know.
        --------------
        Ethan



        • #5
          Actually, ETHANol's statement is not 100% accurate. On the HiSeq, high cluster densities (900-1000K) have a more deleterious effect on index reads than inserts. We've had several flow cells with good cluster calling (80-90% PF) and high quality scores (mean ~38), yet fewer than 50% of the indices were called accurately. In some cases, pseudotiles at the inflow side (which contain higher cluster densities) have completely dropped out (i.e., no basecalling) during the index read after producing high-quality insert reads. The problem can be mitigated by balancing the ratio of index bases that are excited by the same laser (A/C or G/T).

          If your second index is at the start of read one, then you absolutely have to use all four bases in roughly equal proportions for the first four cycles (which is when cluster calling occurs).



          • #6
            HESmith, Thanks for the correction. I'm curious here. How do you determine that the indices are called incorrectly? How do I go about performing QC on the index read?
            --------------
            Ethan



            • #7
              It's funny, I found this on the internet some time ago and follow it, but Illumina hasn't followed up on it, so I just assumed it wasn't a problem. Apparently, it can be. Which leads one to ask: why is this not mentioned in any of the library preparation manuals?

              I think pmiguel has said that balanced base composition for the index read is important on the HiScan.


              1. Some sequencing experiments require the use of fewer than 12 index sequences in a lane with a high cluster density. In such cases, select indexes carefully to ensure optimum base calling and demultiplexing by having different bases at each cycle of the index read. Illumina recommends the following sets of indexes for low-level pooling experiments.
              Pool of 2 samples:
              • Index #6 GCCAAT • Index #12 CTTGTA

              Pool of 3 samples:
              • Index #4 TGACCA • Index #6 GCCAAT • Index #12 CTTGTA

              Pool of 6 samples:
              • Index #2 CGATGT • Index #4 TGACCA • Index #5 ACAGTG • Index #6 GCCAAT • Index #7 CAGATC • Index #12 CTTGTA
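As a rough check of the quoted pooling recommendation against the A/C vs. G/T grouping described in post #5, the sketch below computes, for each index cycle, the fraction of indexes whose base falls in the A/C group. The function names are illustrative, and the grouping is taken from this thread rather than from Illumina documentation. A cycle with a fraction of 0 or 1 would produce signal in only one of the two groups.

```python
AC_GROUP = set("AC")  # assumption from post #5: A and C share one laser


def ac_fraction_per_cycle(indexes):
    """For each cycle, the fraction of indexes showing A or C at that cycle."""
    return [sum(b in AC_GROUP for b in col) / len(col) for col in zip(*indexes)]


def both_groups_every_cycle(indexes):
    """True if every cycle has at least one A/C base and one G/T base."""
    return all(0.0 < f < 1.0 for f in ac_fraction_per_cycle(indexes))


# The three low-plex pools quoted above.
POOLS = {
    2: ["GCCAAT", "CTTGTA"],
    3: ["TGACCA", "GCCAAT", "CTTGTA"],
    6: ["CGATGT", "TGACCA", "ACAGTG", "GCCAAT", "CAGATC", "CTTGTA"],
}

for size, pool in POOLS.items():
    fractions = [round(f, 2) for f in ac_fraction_per_cycle(pool)]
    print(size, both_groups_every_cycle(pool), fractions)
```

Under these assumptions all three recommended pools have both groups present at every cycle, and the pool of 2 is split exactly 50/50 at each cycle.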
              --------------
              Ethan



              • #8
                Thanks guys. Yes I was planning to introduce a 5' index. Being able to multiplex only at multiples of 4 isn't a problem. Just need to multiplex into the hundreds.

                I think I've read the same post by pmiguel mentioning that index reads should always contain an A/C and a G/T base at each position, which is why I was curious why all bases should be in equal proportion.
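If multiplexing in multiples of four is acceptable, one possible way to get all four bases at every cycle is to derive each group of four indexes from a single seed by shifting every base along A→C→G→T→A. This construction is an assumption added here for illustration; it is not something Illumina or the posters above recommend, and `rotate` and `balanced_quartet` are hypothetical helper names.

```python
BASES = "ACGT"


def rotate(index: str, shift: int) -> str:
    """Replace each base by the one `shift` steps later in A->C->G->T->A."""
    return "".join(BASES[(BASES.index(b) + shift) % 4] for b in index)


def balanced_quartet(seed: str):
    """Four indexes in which every cycle contains A, C, G and T exactly once."""
    return [rotate(seed, s) for s in range(4)]


quartet = balanced_quartet("ATGCAT")
print(quartet)
```

Within a quartet, any two members differ at every position. The sketch does not screen for homopolymer runs, GC content, or the distance between indexes from different quartets, so those would still need checking before ordering oligos.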



                • #9
                  Originally posted by ETHANol:
                  HESmith, Thanks for the correction. I'm curious here. How do you determine that the indices are called incorrectly? How do I go about performing QC on the index read?

                  I examined the frequencies of different indices in the Undetermined directory. The most common were one-base mismatches with the correct indices (we required perfect matches for demultiplexing), but there were nearly as many with two or more mismatches. Also, there were some pseudotiles with all Ns in the index despite high quality insert reads. Note that we observed this problem only at very high cluster densities.

                  You can use SAV or HCS to visualize the Q-scores for the index cycles. They are usually a bit lower than read one; if they're a lot lower, be concerned. A high fraction (>3-4%) of reads in the Undetermined directory is another indication of poor index reads.
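The tally described above can be sketched roughly as follows. Everything here is illustrative: the function names are made up, the example reads are hypothetical, and `index_from_header` assumes CASAVA 1.8-style FASTQ headers that end in the index sequence (e.g. `1:N:0:ATCACG`), which may not match every pipeline version.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count mismatching positions; an N never matches a real base."""
    return sum(x != y for x, y in zip(a, b))


def index_from_header(header: str) -> str:
    """Take the text after the last ':' of a FASTQ header as the index.

    Assumption: CASAVA 1.8-style header ending in e.g. '1:N:0:ATCACG'.
    """
    return header.strip().rsplit(":", 1)[-1]


def classify_undetermined(observed, expected):
    """Tally index reads by their distance to the nearest expected index."""
    tally = Counter()
    for idx in observed:
        if set(idx) == {"N"}:
            tally["all_N"] += 1
            continue
        nearest = min(hamming(idx, e) for e in expected)
        if nearest == 0:
            tally["exact"] += 1
        elif nearest == 1:
            tally["one_mismatch"] += 1
        else:
            tally["two_plus_mismatches"] += 1
    return tally


# Hypothetical index reads pulled from an Undetermined FASTQ file.
expected = ["GCCAAT", "CTTGTA"]
observed = ["GCCAAG", "CTTGTC", "NNNNNN", "GACAAG", "GCCAAG"]
print(classify_undetermined(observed, expected))
```

Dividing the number of undetermined reads by the lane total gives the fraction to compare against the 3-4% rule of thumb mentioned above.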

                  Harold



                  • #10
                    Massively parallel DNA sequencing is capable of sequencing tens of millions of DNA fragments at the same time. However, sequence bias in the initial cycles, which are used to determine the coordinates of individual clusters, causes a loss of fidelity in cluster identification on Illumina Genome Analysers. This can result in a significant reduction in the numbers of clusters that can be analysed. Such low sample diversity is an intrinsic problem of sequencing libraries that are generated by restriction enzyme digestion, such as e4C-seq or reduced-representation libraries. Similarly, this problem can also arise through the combined sequencing of barcoded, multiplexed libraries. We describe a procedure to defer the mapping of cluster coordinates until low-diversity sequences have been passed. This simple procedure can recover substantial amounts of next generation sequencing data that would otherwise be lost.


                    With a 5' index you have the invariant T required for adapter ligation in all libraries. I guess it doesn't cause too much of a problem because people use this strategy, but it is something to think about nonetheless. Has this caused problems for anyone?
                    --------------
                    Ethan



                    • #11
                      Thanks Harold!
                      --------------
                      Ethan



                      • #12
                        Originally posted by kentk:
                        Thanks guys. Yes I was planning to introduce a 5' index. Being able to multiplex only at multiples of 4 isn't a problem. Just need to multiplex into the hundreds.

                        I think I've read the same post by pmiguel mentioning that index reads should always contain an A/C and a G/T base at each position, which is why I was curious why all bases should be in equal proportion.

                        There's a distinction between the Illumina index read (which is separate) and barcodes that are incorporated at the start of your insert. In addition to cluster calling, I believe that the measured signal intensities for the first four cycles are used to calibrate values that are utilized for the remainder of the run (e.g., signal-to-noise), which would obviously affect the data if the bases are not equally represented in those cycles.

                        For the index read, A/C vs. G/T is usually sufficient to discriminate between a small number of barcodes.
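One way to picture this, as a sketch only: collapse each index to the group its bases fall in (A/C vs. G/T) and check that the collapsed patterns are still unique. The labels `X` and `Y` are arbitrary placeholders, the helper names are illustrative, and real base calling works on intensities rather than on this simplification. The example uses the six TruSeq indexes quoted in post #7.

```python
def channel_signature(index: str) -> str:
    """Collapse an index to its groups: 'X' for A/C, 'Y' for G/T."""
    return "".join("X" if base in "AC" else "Y" for base in index)


def distinguishable_by_channel(indexes) -> bool:
    """True if no two indexes collapse to the same X/Y pattern."""
    signatures = [channel_signature(i) for i in indexes]
    return len(set(signatures)) == len(signatures)


pool_of_six = ["CGATGT", "TGACCA", "ACAGTG", "GCCAAT", "CAGATC", "CTTGTA"]
for index in pool_of_six:
    print(index, channel_signature(index))
print(distinguishable_by_channel(pool_of_six))
```

With larger index sets the 2^6 = 64 possible patterns for a 6-base index fill up quickly, which is consistent with the caveat that this works for a small number of barcodes.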

                        Harold




                        • #13
                          Originally posted by ETHANol:
                          http://www.plosone.org/article/info%...l.pone.0016607
                          With a 5' index you have the invariant T required for adapter ligation in all libraries. I guess it doesn't cause too much of a problem because people use this strategy, but it is something to think about nonetheless. Has this caused problems for anyone?

                          Article looks interesting. I'll have to read it through first. Thanks again, ETHANol.

                          You mean the T for the T-A ligation, right? No, I don't think it's a problem, because that T (or actually its complement, A) anneals to the last base of the sequencing primer, so essentially it's not part of the read.



                          • #14
                            Originally posted by ETHANol:
                            http://www.plosone.org/article/info%...l.pone.0016607

                            With a 5' index you have the invariant T required for adapter ligation in all libraries. I guess it doesn't cause too much of a problem because people use this strategy, but it is something to think about nonetheless. Has this caused problems for anyone?

                            I assume you mean barcodes that are part of the adapter. Unless your index is only three bases long, you should be okay (but I haven't done the experiment). You could also resolve the problem by using indices of different length so the T is phase-shifted, and balance the other nucleotides for that cycle.



                            • #15
                              On a related note: I have a bunch of indexes (the Sanger 96-plex ones) that I'd like to use for low-plex pooling too (4-plex). These indexes are 8 bases long. Is there anything stopping me from reading just the first 6 bases (as per standard Illumina indexing) on the GAIIx/HiSeq, as long as there is A/C vs. G/T balance at all 6 positions? I only want to do this on one lane, so I don't need to read 8 cycles on the other 7 lanes.
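Whatever the instrument allows, one thing that can be checked up front is whether the indexes stay distinct once only the first 6 bases are read. The sketch below uses hypothetical 8-base sequences, NOT the actual Sanger 96-plex indexes, and `truncation_report` is an illustrative helper name.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def truncation_report(indexes, read_length=6):
    """Return (all_unique, min_pairwise_distance) after truncating each index."""
    short = [i[:read_length] for i in indexes]
    all_unique = len(set(short)) == len(short)
    min_dist = min(hamming(a, b) for a, b in combinations(short, 2))
    return all_unique, min_dist


# Hypothetical 8-base indexes for illustration only.
ok_set = ["ACGTACGT", "CATGCATG", "GTACGTAC", "TGCATGCA"]
bad_set = ["ACGTACGT", "ACGTACTA", "GTACGTAC", "TGCATGCA"]  # first two share 6 bases

print(truncation_report(ok_set))
print(truncation_report(bad_set))
```

Two indexes that differ only in bases 7 and 8 become identical after truncation, so the chosen 4-plex subset needs to be re-checked on the 6-base prefixes, including the A/C vs. G/T balance mentioned in the post.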
