  • abyss
    Member
    • Jan 2013
    • 17

    MiSeq paired-end read quality much poorer! Why?

    Hi All,
    I am pretty new to sequencing on the MiSeq, so I am still learning the ropes with loading amounts and cluster generation on the flow cell.
    I am currently doing ChIP-Seq experiments and multiplex my libraries using a 7 bp in-line barcode. To generate sequence complexity I multiplex at least 6 experiments together and spike in 10% PhiX, or so I thought.
    I quantified my libraries using KAPA qPCR and estimated the size distribution on an Agilent Bioanalyzer 2100 (DNA High Sensitivity). To be conservative for an initial run, I diluted each library separately to 12 pM (since some had less material than others) and then mixed equal volumes of each to make up the final loading volume. I imagine the total concentration stays the same, although each individual library is diluted in the pool. To 900 µl of this combined library I then added 100 µl of 12.5 pM PhiX as a spike-in.
    I have attached some of the SAV pics and Run stats as a ppt file for reference.
    What I observed was that Read 1 was really good quality and I got great clusters (98.6% greater than Q30) at a cluster density of 373 K/mm2 (lower than I expected). But as soon as the paired-end clusters were formed, the quality (% greater than Q30) dropped quite significantly (by 20%). The % aligned reads for the PhiX library dropped from 18% in Read 1 to ~2% in Read 2. The most bizarre observation was that once the sequencer starts reading the sample after the barcode, Read 1 shows a good, constant AT:GC ratio, as one would expect, but Read 2 somehow has a weird C bias. Has anyone encountered this before? Is the run OK, or should I just use the data from Read 1?
    Please give me your input, as I am thoroughly confused by these Read 2 metrics.
    Thanks.
    Last edited by abyss; 03-18-2013, 06:59 AM.
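The loading arithmetic described in this post can be checked with a short calculation. This is only a sketch: it assumes the six libraries really were at 12 pM each and were pooled in equal volumes, and the function name `pool_concentrations` is illustrative rather than part of any Illumina tool.

```python
def pool_concentrations(n_libraries, library_pm, pool_ul, phix_pm, phix_ul):
    """Return (per-library pM, total library pM, PhiX pM, PhiX molar fraction)
    after mixing an equal-volume library pool with a PhiX spike-in."""
    total_ul = pool_ul + phix_ul
    # Equal volumes of equally concentrated libraries keep the pool at library_pm,
    # so only the PhiX addition dilutes the combined library.
    total_library_pm = library_pm * pool_ul / total_ul
    per_library_pm = total_library_pm / n_libraries
    final_phix_pm = phix_pm * phix_ul / total_ul
    phix_fraction = final_phix_pm / (total_library_pm + final_phix_pm)
    return per_library_pm, total_library_pm, final_phix_pm, phix_fraction


# Values taken from the post: 6 libraries at 12 pM, 900 ul pool, 100 ul of 12.5 pM PhiX.
per_lib, total_lib, phix, frac = pool_concentrations(6, 12.0, 900.0, 12.5, 100.0)
print(f"each library: {per_lib:.2f} pM, all libraries: {total_lib:.2f} pM")
print(f"PhiX: {phix:.2f} pM ({frac:.1%} of loaded molecules)")
```

Under these assumptions the nominal PhiX share is about 10% by molarity. The 18% PhiX alignment reported for Read 1 is higher than that, which might mean the libraries were less concentrated than the qPCR estimate suggested, although other explanations are possible.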
  • danwiththeplan
    Member
    • Sep 2011
    • 72

    #2
    I am not an expert at all but possibly this post contains some useful info?



    It seems to suggest a higher level of PhiX spiking and mentions a problem that sounds related to yours (OK initial quality followed by a rapid drop-off).


    • abyss
      Member
      • Jan 2013
      • 17

      #3
      Thanks for the feedback.
      But I believe the problem arises when the clusters are flipped over for Read 2. I guess if I had run all my cycles as Read 1, my data would have been better.
      I don't know exactly what happens when the clusters are flipped over.
      But maybe my assumption is entirely wrong.


      • microgirl123
        Senior Member
        • Jun 2012
        • 199

        #4
        I'm wondering if you're having problems with not enough base balance in your indices - it looks like your Read 2 cleans up nicely in the middle of the run.


        • GenoMax
          Senior Member
          • Feb 2008
          • 7142

          #5
          Low nucleotide diversity will throw the Q-scores off significantly. As microgirl123 pointed out, they seem to have recovered after some time, but the weird C bias is strange.

          Have you run the self test on the machine to see if everything is ok (as far as valves/flow goes)?
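One way to look at the low-diversity concern and the reported C bias is to tabulate base composition per cycle for each read. The sketch below assumes the reads are already available as plain strings (for example, parsed from a FASTQ file). The function names and the 0.6 threshold are illustrative choices, not Illumina specifications.

```python
from collections import Counter


def per_cycle_base_fractions(reads, n_cycles):
    """For each of the first n_cycles positions, return the fraction of
    A, C, G, T and N calls across all reads that cover that position."""
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads if len(read) > cycle)
        total = sum(counts.values())
        fractions.append(
            {base: (counts[base] / total if total else 0.0) for base in "ACGTN"}
        )
    return fractions


def flag_low_diversity(fractions, max_fraction=0.6):
    """Return 1-based cycle numbers where one base exceeds max_fraction
    (0.6 is an arbitrary illustrative threshold)."""
    return [i + 1 for i, f in enumerate(fractions) if max(f.values()) > max_fraction]


# Toy reads: cycle 1 is balanced, cycles 2 and 3 are dominated by C.
toy_reads = ["ACC", "CCC", "GCC", "TCA"]
fracs = per_cycle_base_fractions(toy_reads, 3)
print(flag_low_diversity(fracs))  # [2, 3]
```

Running the same tabulation separately on Read 1 and Read 2 would show whether the C excess is confined to particular cycles of Read 2 or persists along the whole read.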


          • abyss
            Member
            • Jan 2013
            • 17

            #6
            One thing I forgot to mention is that, of the 6 multiplexed libraries, half were around 200 bp in size and the other half were around 300 bp.
            I'm wondering whether the smaller libraries couldn't flip over properly, which would destroy a lot of the complexity and leave them poorly sequenced as well?


            • microgirl123
              Senior Member
              • Jun 2012
              • 199

              #7
              Do you mean your insert size is 200 or 300 bp or the entire library (with adapters) is 200 or 300 bp? If it is your insert that is 200-300 bp, then it should have had no trouble flipping over.

              I've looked at your run metrics again (I'm not familiar with ChIP-Seq) - are you using Illumina indexed adapters as well as your own individual indices? If so, are the Illumina indexed adapters properly balanced?
              Last edited by microgirl123; 03-18-2013, 10:19 AM.


              • abyss
                Member
                • Jan 2013
                • 17

                #8
                The total size (including adapter sequence) is 200bp or 300bp.
                I don't have Illumina's indexing on these adapters, just my own in-line indices.


                • agent99
                  Member
                  • Jul 2010
                  • 10

                  #9
                  low quality 2nd read from HiSeq too

                  We are seeing the same low quality for the first few bases of the 2nd read in paired end sequencing on the HiSeq. We have seen this from multiple sequencing centers and from at least two different library prep methods. It seems like there is a problem with chemistry on the sequencer.

                  We were told by one sequencing center:
                  "It turned out that there was a NaOH problem…the protocol uses NaOH to strip off the index prior to sequencing the 2nd read. While the NaOH reagent sat on the machine, for some reason, it degraded and proper removal of the index was not achieved. Adding fresh NaOH a day before index2 did the trick. The Qscore looks amazing...."

                  Another sequencing center said:

                  "They [Illumina] have suggested that NaOH is not working well anymore to denature index1 away. If this is not complete, it will continue sequencing 7 dark cycles after index1 and then 8nt from the adapter, which is exactly the adapter sequence that is the top match in my 8mer analysis."

                  I'm attaching an image of what our poor quality scores look like for the 2nd end; the first end looks great. I'd like to hear if anyone else is seeing the same problem and whether you have a similar or different answer from Illumina or your sequencing centers.

                  Thanks!
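The 8-mer analysis mentioned in the second quote is not described in detail in this thread. One plausible way to do something similar is to count the most frequent k-mers at the start of the second-end reads and compare the top hits with known adapter sequences. The sketch below assumes the reads are plain strings. The function name, the `offset` parameter, and the toy sequences are hypothetical.

```python
from collections import Counter


def top_leading_kmers(reads, k=8, offset=0, n=5):
    """Count the k-mers that start at `offset` in each read and
    return the n most common as (kmer, count) pairs."""
    kmers = Counter(
        read[offset:offset + k]
        for read in reads
        if len(read) >= offset + k
    )
    return kmers.most_common(n)


# Toy second-end reads; "ACGTACGT" is an arbitrary placeholder, not a real adapter.
toy_reads = ["ACGTACGTAC", "ACGTACGTTT", "TTGACCGTAA", "ACGTACGTGG"]
print(top_leading_kmers(toy_reads, k=8, n=2))
# [('ACGTACGT', 3), ('TTGACCGT', 1)]
```

If a single 8-mer dominates the start of the second read and matches adapter sequence, that would be consistent with the incomplete-denaturation explanation quoted above. It would not by itself rule out other causes.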

                  • HeinKey
                    Member
                    • May 2009
                    • 21

                    #10
                    primer hybridization?

                    Hello Agent99,
                    If I understand correctly, the poor Read 3 is being attributed to the index (Read 2) still being present on your strands.
                    I can't see how this would happen, since reclustering takes place after Read 2.
                    So 14 cycles of turnaround would not remove the index read? I find this hard to believe.
                    I wonder whether the Read 3 (reverse read) primer hybridized correctly. If it did not, you will get low intensities and lower Q-scores.
                    Could this be causing your problems?

                    Regards,
                    Hein


                    • agent99
                      Member
                      • Jul 2010
                      • 10

                      #11
                      Thanks for the response, Hein.

                      That sounds plausible except that these data are coming from 6-7 different experiments where the libraries were produced and sequenced at 3 different sequencing centers around the country. The library production methods differed between the 3 centers, but the sequencing was all done on HiSeq2000s within a one month period. This sounds more like a systemic problem with chemistry on the sequencer to me, but I could be wrong. I'm not in the lab handling samples - just reporting what I'm seeing on the bioinformatics side and what Illumina has told my colleagues at sequencing centers. Hoping to hear if you have seen the same thing and whether Illumina recommends other solutions to the problem.

                      --Alisha


                      • GenoMax
                        Senior Member
                        • Feb 2008
                        • 7142

                        #12
                        Alisha: You should have started a new thread for your observation since the sequencing times are vastly different on MiSeq and HiSeq.

                        Sticking this observation in a MiSeq thread is sure to get some folks (like me) confused.
