  • amplicon sequencing on MiSeq

    I am a new Illumina user. What is the best way to do targeted sequencing (<100 amplicons) on the MiSeq?

  • #2
    How many samples and what size are your amplicons?

    • #3
      To start, just 4 amplicons of 130-200 bp (without adaptors), and 32 samples.

      • #4
        That's a very low complexity library. You'll probably have to spike a lot of PhiX in to get good results.
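A rough way to see why "a lot" of PhiX is needed: the arithmetic below is only a sketch, assuming the amplicon library shows a single base at a given cycle and that PhiX contributes the four bases roughly equally (an illustrative simplification, not a measured value). The function name and defaults are hypothetical.

```python
def dominant_base_fraction(phix_fraction, library_dominant=1.0, phix_per_base=0.25):
    """Fraction of clusters showing the library's dominant base at one cycle.

    Assumptions (illustrative only): the library shows one base in
    `library_dominant` of its clusters, and PhiX shows each base in
    `phix_per_base` of its clusters.
    """
    if not 0.0 <= phix_fraction <= 1.0:
        raise ValueError("phix_fraction must be between 0 and 1")
    return (1.0 - phix_fraction) * library_dominant + phix_fraction * phix_per_base


# Compare a few spike-in levels
for spike in (0.05, 0.25, 0.50):
    share = dominant_base_fraction(spike)
    print(f"{spike:.0%} PhiX: {share:.1%} of clusters show the same base")
```

Under these assumptions, even a 50% spike-in still leaves well over half of the clusters showing the same base at that cycle.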

        • #5
          I've often wondered if you couldn't include some Ns in the primer for low complexity libraries which you could then remove post sequencing along with your target primer. So your fwd primer would look like this

          [P5][Seq Primer]NNNNN[Target Primer]

          In theory the first 5 bases would be random which would give decent cluster identification.

          There's probably some obvious reason why this wouldn't work that I haven't thought of yet.

          Or maybe designing PCRs to both strands would increase the complexity enough?
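The "remove post sequencing" step could look something like the sketch below. It assumes the read starts with a fixed number of random bases followed directly by the target primer; the primer sequence, the function name, and the mismatch tolerance are all made up for illustration.

```python
def trim_random_prefix_and_primer(read, n_random=5, primer="ACGTACGTAC", max_mismatches=1):
    """Split a read into (random_tag, insert) after removing the target primer.

    Returns None when the read is too short or the primer is not found at the
    expected offset. The primer here is a hypothetical placeholder.
    """
    primer_end = n_random + len(primer)
    if len(read) < primer_end:
        return None
    candidate = read[n_random:primer_end]
    # Count position-by-position mismatches against the expected primer
    mismatches = sum(1 for a, b in zip(candidate, primer) if a != b)
    if mismatches > max_mismatches:
        return None
    return read[:n_random], read[primer_end:]


example = "GATTC" + "ACGTACGTAC" + "TTTTGGGG"
print(trim_random_prefix_and_primer(example))
```

Reads that return None could be set aside for inspection rather than silently dropped.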

          • #6
            Originally posted by TonyBrooks
            I've often wondered if you couldn't include some Ns in the primer for low complexity libraries which you could then remove post sequencing along with your target primer. So your fwd primer would look like this

            [P5][Seq Primer]NNNNN[Target Primer]

            In theory the first 5 bases would be random which would give decent cluster identification.

            There's probably some obvious reason why this wouldn't work that I haven't thought of yet.

            Or maybe designing PCRs to both strands would increase the complexity enough?
            A related trick I saw published was to use a variable-length sequence between the Illumina primers and the targeting sequence; this way the targeting sequences aren't all in phase, and the complexity is significantly increased in the eyes of the cluster caller.
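To illustrate the phasing idea, here is a toy sketch (not the published method): it compares the most common base per cycle for identical amplicons with and without staggered spacers. The target and spacer sequences are invented, and equal representation of each spacer is assumed.

```python
from collections import Counter


def max_base_fraction_per_cycle(reads, n_cycles):
    """For each cycle, return the share of reads carrying the most common base."""
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads if len(read) > cycle)
        fractions.append(max(counts.values()) / sum(counts.values()))
    return fractions


target = "ACGTTGCAAGCT"            # hypothetical targeting sequence
spacers = ["", "T", "GA", "CAG"]   # hypothetical 0-3 nt staggers

in_phase = [target] * len(spacers)
staggered = [spacer + target for spacer in spacers]

in_phase_fracs = max_base_fraction_per_cycle(in_phase, 8)
staggered_fracs = max_base_fraction_per_cycle(staggered, 8)
print("in phase: ", in_phase_fracs)
print("staggered:", staggered_fracs)
```

In this toy case every in-phase cycle is a single base, while the staggered set never has all reads agreeing in the first eight cycles.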

            • #7
              Do you have the reference for the paper mentioned in your post?

              Thanks!
              Marc

              • #8
                Is it this one?

                Kinde et al., Detection and Quantification of Rare Mutations with Massively Parallel Sequencing. PNAS doi:10.1073 (April 2011)

                • #9
                  Cluster caller?

                  Originally posted by krobison
                  A related trick I saw published was to use a variable-length sequence between the Illumina primers and the targeting sequence; this way the targeting sequences aren't all in phase, and the complexity is significantly increased in the eyes of the cluster caller.
                  Can you educate me more on "cluster caller"? I recently got results from MiSeq Paired-End run (150bp) and I suspect there's a double read problem. Qscores are rather wavy at the beginning.

                  My samples are multiplexed targeted re-sequencing of exomes from 16 genotypes. Could this be due to a low complexity issue discussed on this thread?

                  I also heard rumors that if the first 4 bp of two reads are identical (which is very likely in targeted re-sequencing), they will be assigned to the same cluster. Is this true?

                  • #10
                    Originally posted by ShiveringFire
                    Can you educate me more on "cluster caller"? I recently got results from MiSeq Paired-End run (150bp) and I suspect there's a double read problem. Qscores are rather wavy at the beginning.

                    My samples are multiplexed targeted re-sequencing of exomes from 16 genotypes. Could this be due to a low complexity issue discussed on this thread?

                    I also heard rumors that if the first 4 bp of two reads are identical (which is very likely in targeted re-sequencing), they will be assigned to the same cluster. Is this true?
                    I am thinking about this a lot. Especially with respect to possibly streamlining our operations by dumping our GS-FLX.

                    Our MiSeq is apparently 6 weeks from being delivered, so I can only extrapolate from the performance of our HiScanSQ (which is similar to a HiSeq). There are two issues masquerading as a single "problem".

                    (1) The first is that the actual instrument focusing software goes nuts and misfocuses if you have an empty tile (or set of tiles?) early in the run, probably mainly during the first cycle. For our HiScanSQ this means that if you don't have a fair number of clusters in the G&T channel and the A&C channel, you have a good chance that the instrument will pick the wrong focal plane to image.
                    Doesn't anyone have connections inside Illumina they could point out this blunder to? If focus is good for one channel and bad for the other, why not just use the good focal point?
                    Anyway, I think this is only an issue during the first few cycles (maybe only the first) of a read. Then the focal points seem to be carried along with little (or no) adjustments.
                    The effect? Once you are out of focus, you are hosed for that tile for the run, I think. (Or "pseudo-tile" -- but I'll just presume that everyone reading this realizes that no masonry is involved in this process and leave it at "tile".) This applies no matter how perfect your cluster spacing is.

                    (2) If the instrument did not hose up its focal plane during the first cycle, you can still have diminished results if you have a low complexity of base calls. This is the one everyone thinks about -- two adjacent clusters give the same base calls for 4 bases. If they are very close together, then neither of them may "register" well (if I use the terminology correctly), or at all. This is a little more understandable. Except that, for the HiSeq, Illumina foreclosed the methodology that could circumvent this issue on the GA-IIx, by making the early layers of data processing possible only on the instrument console.

                    So if issue 1 doesn't kill you, issue 2 probably will. Work-arounds are required.

                    --
                    Phillip

                    • #11
                      no more deferred cluster calling

                      Thanks, Phillip,

                      I came across this paper formally defining the problem and a possible solution:
                      Massively parallel DNA sequencing is capable of sequencing tens of millions of DNA fragments at the same time. However, sequence bias in the initial cycles, which are used to determine the coordinates of individual clusters, causes a loss of fidelity in cluster identification on Illumina Genome Analysers. This can result in a significant reduction in the numbers of clusters that can be analysed. Such low sample diversity is an intrinsic problem of sequencing libraries that are generated by restriction enzyme digestion, such as e4C-seq or reduced-representation libraries. Similarly, this problem can also arise through the combined sequencing of barcoded, multiplexed libraries. We describe a procedure to defer the mapping of cluster coordinates until low-diversity sequences have been passed. This simple procedure can recover substantial amounts of next generation sequencing data that would otherwise be lost.


                      Sadly, I was told that the MiSeq doesn't save raw image files, so "deferred cluster calling" is no longer possible. This sounds like getting rid of the evidence for a dirty job...

                      So spiking in a lot of PhiX seems like the only solution for low complexity. Can one spike PhiX even into custom-made libraries?

                      • #12
                        Yes, you just buy the PhiX control from Illumina and spike it in.

                        • #13
                          I am looking for an alternative to spiking in 50% or more PhiX while sequencing low diversity amplicon libraries.

                          I am wondering if using longer barcodes with balanced representation of bases (up to 24 bases) would be sufficient.

                          Additionally, does anyone know if I can specify barcodes of different lengths on a MiSeq sample sheet?

                          • #14
                            I don't think balanced barcodes will work. The index read occurs after read 1 so the instrument will still have issues calling clusters.

                            I am doing some tag sequencing that requires a custom library prep protocol, and I have the same issue with homogeneous nucleotides for the first 10 bases of read 1. Illumina tech support recommended adding a stretch of 12 Ns (not 6) right after the read 1 primer binding site. It seems like that would be fairly straightforward to add to primers for amplicon sequencing.

                            I am pretty sure (but don't quote me on this) that you can specify indices of different lengths on the sample sheet. I think that the 2 index reads use a 6 base and a 7 base index.

                            • #15
                              Originally posted by bbeitzel
                              I don't think balanced barcodes will work. The index read occurs after read 1 so the instrument will still have issues calling clusters.

                              I am doing some tag sequencing that requires a custom library prep protocol, and I have the same issue with homogeneous nucleotides for the first 10 bases of read 1. Illumina tech support recommended adding a stretch of 12 Ns (not 6) right after the read 1 primer binding site. It seems like that would be fairly straightforward to add to primers for amplicon sequencing.
                              Thanks for your answer. I should have clarified that I am not using TruSeq/Nextera indexing reads; my barcodes make up the very first bases of read 1. So it sounds like I could use balanced barcodes in lieu of the stretch of 12 Ns they recommended to you.
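One way to sanity-check whether a set of in-line barcodes is "balanced" is to tabulate base fractions at each barcode position. This is only a sketch: the barcodes are invented, and it assumes equal-length barcodes and samples pooled in equal amounts, which a real run may not satisfy.

```python
from collections import Counter


def barcode_position_balance(barcodes):
    """Return, per barcode position, the fraction of barcodes carrying each base."""
    lengths = {len(barcode) for barcode in barcodes}
    if len(lengths) != 1:
        raise ValueError("this sketch expects barcodes of equal length")
    report = []
    for position in range(lengths.pop()):
        counts = Counter(barcode[position] for barcode in barcodes)
        report.append({base: counts.get(base, 0) / len(barcodes) for base in "ACGT"})
    return report


barcodes = ["ACGT", "CGTA", "GTAC", "TACG"]  # hypothetical in-line barcodes
balance = barcode_position_balance(barcodes)
for position, fractions in enumerate(balance, start=1):
    print(position, fractions)
```

Positions where one base dominates are the cycles most likely to look low-diversity to the instrument.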
