  • Need help debugging a faulty MiSeq run

    Hi all,

    I've just had a faulty MiSeq run and before I re-run my samples, I'd like to make sure the next run is going to be valid, especially since it's a rather expensive technique. It'd be great if you could help me with your experience and advise.

    Background of the samples and the run: The samples constitute gel-purified PCR products of a segment of a gene (270bp), which contains 15 randomized positions. Illumina adapters were 'ligated' to the samples by PCR, one primer containing multiplex barcodes, the other containing random barcodes (NNNN). Sample concentrations were quantified with the KAPA qPCR kit.
    For the sequencing run, 16 pM of sample were loaded with a 1% spike of PhiX and 2x151bp were sequenced. These are the run's primary analysis plots from BaseSpace (secondary analysis hasn't yet been done):



    The run returned ca. 50 reads on one paired end, 104'000 from the reverse. This is especially striking because the same samples have been measured previously in a 2x36bp run with 80% PhiX and 600'000 reads for my samples were obtained.

    The possible errors that I've identified are the following:
    1) I've overloaded the flow cell. I used 16 pM of sample, so I'm at the top end of the recommended amount. In addition, the KAPA qPCR was, in retrospect, not measured in the linear 15-25 cycle range but rather around 10-15 cycles, so I may have underestimated my concentration by up to a factor of 1.8-2. If this is the error, I'm happy to know it and rerun with less sample.
    2) The PhiX tech note says that in the first four cycles it is essential for the machine to have a complex library with an equal representation of the four bases. In addition, they say that for low complexity libraries, more PhiX is recommended. They say up to 40% (seems a waste to me, but if it's necessary can be done).
    3) Looking at the log files, I get at times a 'Piezo' error and 'Position error before settling'. Does this characterize a specific type of failure?
    4) After cloning my samples into a CloneJet vector and sequencing 20 colonies I've realized that many of the Illumina adapters appear truncated. Thus I've now ordered HPLC-purified primers to improve their quality.
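To put numbers on point 1, here is a minimal Python sketch of the arithmetic. It assumes the qPCR under-reported the library by the factor of 1.8-2 estimated above; the function names are made up for illustration and the 8 pM target is only an example value.

```python
def effective_loading_pm(nominal_pm, underestimate_factor):
    """Concentration actually loaded if the quantification under-reported
    the library by `underestimate_factor` (assumed, e.g. 1.8 to 2)."""
    if nominal_pm <= 0 or underestimate_factor <= 0:
        raise ValueError("concentration and factor must be positive")
    return nominal_pm * underestimate_factor


def nominal_pm_for_target(target_pm, underestimate_factor):
    """Nominal (measured) concentration to load so that the true
    concentration ends up at `target_pm`."""
    if target_pm <= 0 or underestimate_factor <= 0:
        raise ValueError("concentration and factor must be positive")
    return target_pm / underestimate_factor


if __name__ == "__main__":
    for factor in (1.8, 2.0):
        loaded = effective_loading_pm(16, factor)
        nominal = nominal_pm_for_target(8, factor)
        print(f"factor {factor}: 16 pM nominal was about {loaded:.1f} pM; "
              f"load {nominal:.1f} pM nominal to reach a true 8 pM")
```

Under that assumption, the nominal 16 pM would correspond to roughly 29-32 pM actually on the flow cell.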

    Did I miss anything here? Are there additional error sources I need to consider? I'm happy for all your inputs.

    Thanks,
    Simon
    Last edited by simon_seq; 08-10-2012, 05:48 AM.

  • #2
    Hi simon_seq,

    We typically aim for 8pM when loading our Miseqs. We've found that loading >=10pM is risky and can often result in fewer clusters passing filter, poor overall Qscores, and in extreme cases, read2 failure (this last bit is a result of the clusters expanding a bit during the PE turnaround).

    It is painful to sacrifice the flowcell real-estate, but if your library has low base diversity per cycle, you're going to need a phiX spike (or some other well balanced library) --anywhere from 30-50%. The miseq uses the first 4 cycles for cluster ID and the first 12 for baseline intensity correction, but even outside of these first cycles, low base diversity has a negative impact on the sequencing quality. (Also beware, the first 12 cycles of read 2 are also used to re-calculate the baseline correction, so it's important to have reasonable diversity there too). This issue is probably more significant on a Miseq than other Illumina instruments (HiSeq or GAII) because it lacks a control lane.
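One way to check base diversity before committing to a run is to tabulate the base composition per cycle from the expected amplicon sequences or from reads of an earlier run. The following is a rough Python sketch, not an Illumina tool: the 12-cycle window follows the description above, while the 60% cut-off for flagging a cycle is an arbitrary assumption chosen for illustration.

```python
from collections import Counter


def base_fractions_per_cycle(reads, n_cycles=12):
    """Return one dict per cycle mapping A/C/G/T/N to its fraction.

    `reads` is a list of sequence strings; reads shorter than the
    current cycle are skipped for that cycle.
    """
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads if len(read) > cycle)
        total = sum(counts.values())
        if total == 0:
            break  # no read is long enough for this cycle
        fractions.append({b: counts.get(b, 0) / total for b in "ACGTN"})
    return fractions


def low_diversity_cycles(fractions, max_fraction=0.6):
    """1-based cycle numbers where a single base exceeds `max_fraction`
    (the threshold is an assumption, not an Illumina specification)."""
    return [i + 1 for i, f in enumerate(fractions)
            if max(f.values()) > max_fraction]


if __name__ == "__main__":
    # Toy reads: a constant primer-like start followed by variable bases.
    reads = ["ACGTAC", "ACGTTG", "ACGAGT", "ACGCCA"]
    fractions = base_fractions_per_cycle(reads, n_cycles=6)
    print(low_diversity_cycles(fractions))
```

For an amplicon library, the constant primer region at the start of each read is where this kind of check would typically flag cycles.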

    Hope this helps.

    -Z



    • #3
      Hi,

      Nick and I have just written a blog post on sequencing low-diversity on the MiSeq, may be of interest to you...



      The piezo error relates to the focussing stage and is likely to be a hardware fault rather than your sample, I'd talk to Illumina about that one.

      Josh
      Last edited by joshquick; 08-10-2012, 12:55 PM.



      • #4
        Hi Simon,

        Looking at your intensity plots, it's quite apparent that your library didn't have enough base heterogeneity. As already noted, the Real Time Analysis program uses the first four nucleotides to determine overall intensity baselines per read. If you have highly homogeneous base composition during these cycles, the system won't be able to correctly determine what a good intensity value is for a polony later in the run. It may seem like a waste to spike in 30-60% phiX, but it's a heck of a lot cheaper than wasting $1K+ on a failed run. If you want to confirm this, and have SAV set up, look at the plots for % base and FWHM. % base is self-explanatory, but FWHM indicates how consistent the size of a polony is. If that graph varies a lot, then it's an indication that the software is having issues distinguishing polonies from one another. Amplicons will typically have some variance in FWHM, but large deviations aren't good.

        I'm also curious as to how you set up your sample sheet for this run. You mention that both adapters have indices, so did you do two index reads or just one? The intensity baselines are established off read 1 and read 2, so having a completely randomized index read wouldn't help anything. Is there a purpose for doing it this way that I'm just not realizing?

        One last thing to note is that during the index sequencing, if you have a large spike of standard phiX then you can get a lot of errors in the index read. We've been using the multiplexing phiX that has the A003 index and have found that it makes the calling of bases during the index read much better.



        • #5
          Well thank you all for your helpful and quick advice!

          After reading your replies and calling Illumina I'm set to spike 40% PhiX for the next run and use 8 pM.

          @zherbert: Agreed, I'll use 8 pM. May I ask, how many reads do you get out this way? What is your quantification method? KAPA claims their method overestimates DNA concentration slightly, so using 10 pM might still be an option - what's your view?

          @Josh: Thanks a lot for the link to your blog entry. I appreciate that you are an Illumina insider. But frankly, the tweaking of phasing parameters seemed a bit cryptic to me, because I do not understand precisely how the reported values will affect the phasing. Has this worked universally for you, always with the same params on different machines? Since I'd like to be conservative in the next run, I think I'll still use a higher PhiX spike - would you agree with that?
          Regarding the piezo error, I've read in other seqanswers posts that this is a common one - does it require fixing, or will our reads not be affected?

          @McNelson: Thanks, I definitely agree that wasting money is worse than losing some reads. Still trying to get access to a Windows machine for SAV. Regarding the indexing and the sample sheets, I'm open to ideas. In fact, my sample sheet is almost blank, because I intend to do the downstream analysis 'manually' (without Illumina tools). This is my reasoning:
          - Indexing on only one end saves me four bases on the other.
          - Random barcode on the other indicates if the upstream PCR or the sequencing was biased towards specific bases.
          - Paired-end reads can be combined by their coordinate on the flow cell tile.
          Would indicating indices to the MiSeq improve the read quality?
          So are you suggesting PhiX will mess up the indexing? I haven't heard of the A003 alternative (in fact, googling PhiX A003, your post is the top hit!). Any advice there?

          Thanks,
          Simon
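For the 'manual' pairing of reads by flow cell coordinate mentioned in the list above, a minimal Python sketch could look like this. It assumes CASAVA 1.8-style FASTQ headers (instrument:run:flowcell:lane:tile:x:y, followed by a space and the read descriptor); older header styles would need a different parser. The header strings in the example are invented.

```python
def cluster_key(header):
    """Extract (lane, tile, x, y) from a CASAVA 1.8-style FASTQ header.

    Assumed layout: @instrument:run:flowcell:lane:tile:x:y <read info>
    """
    name = header.lstrip("@").split()[0]
    fields = name.split(":")
    if len(fields) != 7:
        raise ValueError(f"unexpected header layout: {header!r}")
    lane, tile, x, y = fields[3:7]
    return (int(lane), int(tile), int(x), int(y))


def pair_reads(read1_records, read2_records):
    """Pair (header, sequence) records from two reads by cluster position.

    Returns {cluster_key: (read1_seq, read2_seq)} for clusters present
    in both inputs; clusters seen in only one read are dropped.
    """
    read2_by_key = {cluster_key(h): seq for h, seq in read2_records}
    pairs = {}
    for header, seq in read1_records:
        key = cluster_key(header)
        if key in read2_by_key:
            pairs[key] = (seq, read2_by_key[key])
    return pairs


if __name__ == "__main__":
    # Invented headers for illustration only.
    r1 = [("@M00001:1:000000000-A0001:1:1101:15000:1500 1:N:0:1", "ACGT"),
          ("@M00001:1:000000000-A0001:1:1101:16000:1600 1:N:0:1", "TTTT")]
    r2 = [("@M00001:1:000000000-A0001:1:1101:15000:1500 2:N:0:1", "GGCC")]
    print(pair_reads(r1, r2))
```

The dictionary lookup keeps this simple but holds all of read 2 in memory, which is a design shortcut rather than a recommendation for large runs.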



          • #6
            Hi Simon,

            May I ask, how many reads do you get out this way?
            8pM typically clusters for us around 600-800K/mm2 and yields 5-7M total clusters PF. When we spike in a given %phiX we're usually +/-5%...unless there was a sizing problem with our library

            What is your quantification method?
            Since we don't have a qpcr, we just use a qubit. It over-reports slightly, but has been very consistent in our hands. Almost all of our quant problems have been related to sizing.

            Using 10pM is usually fine for diverse libraries (ie small genome resequencing) and we tend to push the loading limits (up to 12pM) for 50cycle runs (ie ChIP-seq) that yield 8-10M reads PF. On the other hand, for low diversity with phiX spike, we shoot for 6-8pM and ideally cluster <800K/mm2.
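As a back-of-the-envelope check on what a phiX spike costs in sample reads, the numbers above can be combined in a short Python sketch. It assumes the phiX share of clusters passing filter equals the spiked fraction (the post notes it is usually within about 5%); the function name is hypothetical.

```python
def expected_sample_reads(clusters_pf, phix_fraction):
    """Clusters passing filter that belong to the sample, assuming phiX
    takes exactly `phix_fraction` of them (an idealisation)."""
    if clusters_pf < 0 or not 0 <= phix_fraction < 1:
        raise ValueError("need clusters_pf >= 0 and 0 <= phix_fraction < 1")
    return clusters_pf * (1 - phix_fraction)


if __name__ == "__main__":
    # 5-7M clusters PF at 8 pM (from the post), with a 40% phiX spike.
    for total in (5_000_000, 7_000_000):
        sample = expected_sample_reads(total, 0.40)
        print(f"{total / 1e6:.0f}M clusters PF -> "
              f"about {sample / 1e6:.1f}M sample reads")
```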

            -Z



            • #7
              Originally posted by simon_seq
              Would indicating indices to the MiSeq improve the read quality?
              Hi Simon, there's no need to tell the MiSeq what indices you're using, just that you have one and how long it is. There's an exception that relates to MiSeq Reporter that may make you want to specify them in the sample sheet. The newest version of MiSeq Reporter (1.3.7 I believe) doesn't give the actual index reads, just the R1 and R2 reads. If you specify each index in the sample sheet, then it will attempt to make the demultiplexed read files for each sample; otherwise it will just give the two R1 and R2 files that you'd get from a non-indexed run. You'll have to use the older version of Reporter (1.1.16 I believe) if you want the index reads as a separate file, which is how we want them. I don't know why Illumina changed the software like that, but it's pretty dumb since a lot of people want to demultiplex themselves.

              So are you suggesting PhiX will mess up the indexing? I haven't heard of the A003 alternative (in fact, googling PhiX A003, your post is the top hit!). Any advice there?
              When we've run amplicons with ~40% standard phiX control, we saw that the index read quality scores were pretty low, and that a fair bit of phiX was being falsely assigned an index that would correspond to one of our samples. phiX itself being included in our samples isn't that big of a problem, but the low quality scores for the index made us concerned about sample cross-contamination. Our reasoning for why this happened is that, because we were clustered at around 700K/mm2, there was a lot of bleed where signal from an indexed sample was assigned to the phiX, which had no index. Since you're telling the MiSeq to sequence an index, it would normally expect most of the reads to have one and will try its best to assign one, even if the signal isn't that great. Using the indexed phiX was recommended to us by our FAS, and it's not something you'll find easily without calling Illumina. It's part of the Multiplexing kit for the GAIIx I believe, and the phiX control carries the TruSeq A003 index (TTAGGC). We've found that using it in amplicon runs greatly improved the quality of the index reads, so we have much higher confidence that we're not getting any sample cross-contamination.

              The other way to get around using that much phiX and/or buying the indexed phiX is to simply add in an indexed TruSeq library. I did this on a 50cycle test run and it worked well.
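For anyone demultiplexing themselves, here is a minimal Python sketch of one way to guard against low-quality index reads being assigned to a sample. It is not how MiSeq Reporter works; the quality threshold, the sample names, and the two non-phiX index sequences are placeholders for illustration. Only the A003 sequence (TTAGGC) comes from the post, and Phred+33 quality encoding is assumed.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_index(index_seq, index_qual, sample_indices,
                 max_mismatches=0, min_phred=20, phred_offset=33):
    """Return the sample name for an index read, or None if unassigned.

    A read is left unassigned when any index base falls below
    `min_phred` (threshold is an assumption) or when it does not match
    exactly one known index within `max_mismatches`.
    """
    if len(index_seq) != len(index_qual):
        raise ValueError("sequence and quality strings differ in length")
    if any(ord(q) - phred_offset < min_phred for q in index_qual):
        return None
    matches = [name for name, idx in sample_indices.items()
               if len(idx) == len(index_seq)
               and hamming(index_seq, idx) <= max_mismatches]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    # Listing the indexed phiX as its own "sample" lets it be filtered out.
    indices = {"phiX_A003": "TTAGGC",   # from the post
               "sample_1": "ATCACG",    # placeholder index
               "sample_2": "CGATGT"}    # placeholder index
    print(assign_index("TTAGGC", "IIIIII", indices))  # high-quality read
    print(assign_index("ATCACG", "II#III", indices))  # one poor base
```

Giving the phiX control its own entry means its reads get a definite assignment instead of being left to match whichever sample index is closest.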
