  • Cluster density vs. %PF on HiScanSQ and HiSeq

    New instrument, HiSeq2000, first run. Overshot the recommended cluster densities on a few of the lanes. Unexpectedly the "penalty" for doing so was fairly minor:





    Compared with a HiScanSQ run from last September:


    Anyone know what the difference in performance is caused by?

    Software is a possibility. The HiSeq is running the newest stable software -- 1.5.15, I think. Not sure about the HiScan run -- but it was about 1/2 a year ago.

    Use of a control lane is a possibility. We did not use one on the HiScanSQ run.

    Of course they are different instruments, and different instrument types, although the HiScanSQ uses the same reagents and flowcells (v3) as the HiSeq.

    Anyone know, or have an opinion?

    --
    Phillip

  • #2
    Phillip,

    I can't comment on the differences; I've never worked with a HiScanSQ so am not sufficiently familiar. But with regard to the 'penalty' for over-clustering you should look at other metrics as well, such as % > Q30 bases and median Q score. We have also been having problems recently with very poor quality of index reads, which may be (partially) caused by over-clustering. With error-prone index reads you end up losing a significant fraction of your reads because you can't identify what library they come from.
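As a rough illustration of the two metrics named above, here is a minimal Python sketch that computes the percentage of bases at or above Q30 and the median Q score from FASTQ quality strings. It assumes Phred+33 encoding and uses "at or above" for the threshold; the function names and the example strings are made up for illustration and are not taken from SAV or any Illumina tool.

```python
from statistics import median


def phred_scores(quality_string, offset=33):
    """Decode one FASTQ quality string into integer Phred scores.

    offset=33 is an assumption (Phred+33); older pipelines used 64.
    """
    return [ord(ch) - offset for ch in quality_string]


def summarize_quality(quality_strings, offset=33, threshold=30):
    """Return (percent of bases >= threshold, median Q) over all bases."""
    scores = [q for s in quality_strings for q in phred_scores(s, offset)]
    if not scores:
        raise ValueError("no bases to summarize")
    pct = 100.0 * sum(q >= threshold for q in scores) / len(scores)
    return pct, median(scores)


# Invented quality strings: 'I' decodes to Q40 and '#' to Q2 under Phred+33.
example = ["IIIIIIII", "IIII####"]
pct_q30, med_q = summarize_quality(example)
print(f"%>=Q30: {pct_q30:.1f}, median Q: {med_q}")
```

Running the same summary separately on the sequence read and the index read would show whether the index cycles are degrading faster than the rest of the run.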



    • #3
      There is a fine line that separates runs with lots of data and failed lanes.

      If you try to push things a bit too far (in terms of number of clusters) to get more data then you will have problems with the index reads as noted by kmcarr. That is our observation as well.



      • #4
        Sure, I am perfectly happy with runs at 600 K/mm2 average densities. In fact I thought I might have blown it with a couple of the lanes when I saw their post-4th cycle densities.

        You would not believe the number of QC tests we have been doing prior to loading a lane with a new library. Even so, we see crazy variation in densities sometimes. We just got a couple of MiSeqs and I am hoping we might use them to get dialed in a little tighter prior to a run on the HiSeq. But part of the problem is we sometimes are putting >50 libraries from a dozen researchers on a single flow cell.

        So don't take me as some sort of wild-eyed desperado living life on the edge wherever possible. I'm just hoping Illumina found a way to move the edge a little farther from where I tend to stand.

        As of cycle 30, the Q values appear to be holding. I guess we'll sweat the indexes later. We have some techniques developed in the bad old days (8 months ago) for the HiScanSQ that might be transferable.

        --
        Phillip



        • #5
          Here are the median Q score and %>Q30 plots for my most "overloaded" lane (mean density 946 K clusters/mm2).



          Granted, it might all fall apart later in the run -- but does this look like a normal result? Or was HCS 1.5.15.1 (and associated RTA) a bigger deal than Illumina is letting on?

          If not, do you think having a control lane (~450 K cluster/mm2 phiX) explains why things look as good as they do?

          --
          Phillip
          Last edited by pmiguel; 04-19-2012, 07:55 AM.



          • #6
            Phillip,

            Those look typical even when the lane is overloaded. The quality of the sequence reads may only be slightly lower but the index reads go right in the crapper. That said I don't think a mean density of 946K clusters/mm^2 is horribly overloaded; we've been experiencing densities of 1100-1200K. Isn't the recommended density 800K?



            • #7
              750-850K, yes. Some of the tiles reach 1200 K clusters/mm^2 or slightly above. Maybe the HiSeq is just superior to the HiScanSQ in this regard. Good news for us...

              Actually, it would make sense -- the HiScanSQ scans in 2 channels (A/C and G/T) then presumably does a spectrographic binning to create the 4 channel images. Whereas the HiSeq is capturing each channel with a different camera.

              Just to be clear though, this:



              looks normal to you? That is, PF percentages >80, even with densities above 850?

              Also, do you use a control lane?

              --
              Phillip



              • #8
                Originally posted by pmiguel
                Just to be clear though, this:

                [see image above]

                looks normal to you? That is, PF percentages >80, even with densities above 850?
                Yep, that looks pretty typical to me. We don't see huge drops in %PF until densities start approaching 1100-1200. Even at 900-1000 we can still see 75-85% PF. But I never considered cluster PF a very high bar to get over.

                Also, do you use a control lane?
                No, never. We do spike phiX into each lane but it's only present @ < 0.5%.



                • #9
                  Always nice to have the wind at my back...

                  So if you had a run with all the lanes at 1M clusters/mm^2 what kind of Q30 base yield do you see?

                  --
                  Phillip



                  • #10
                    Haven't run the HiScan either, but these look pretty typical based on our experience with the MiSeq. See poorly labeled graph below.



                    • #11
                      I doubt that someone with 1200+ posts on this forum would be in the "desperado" category

                      I wanted to reinforce kmcarr's point that it is possible to push things quite a bit beyond published specs and still get good data. What is difficult to determine is the fine line between good data and a failed lane. As soon as you cross that limit, things go downhill real fast (especially with index reads).

                      Originally posted by pmiguel

                      So don't take me as some sort of wild-eyed desperado living life on the edge wherever possible. I'm just hoping Illumina found a way to move the edge a little farther from where I tend to stand.

                      --
                      Phillip



                      • #12
                        Originally posted by GenoMax
                        I wanted to reinforce kmcarr's point that it is possible to push things quite a bit beyond published specs and still get good data. What is difficult to determine is the fine line between good data and a failed lane.
                        And to make life even more interesting the line keeps moving!!
                        Last edited by kmcarr; 04-19-2012, 10:29 AM.



                        • #13
                          I found an instructive example from a recent run. Here is the density by lane plot:



                          Density of lanes 1-4 is good and their PF rates are 90-95%. The %PF for lane 6 is still 82% even though its density is 1082K/mm^2.

                          Here are the median Q-scores per cycle (this was only a 50bp SR + Index):




                          The Q-scores for the sequencing read in lane 6 were only slightly degraded by the over-clustering; however, the index read was extremely poor. As a consequence, 58% of the reads from lane 6 (compared to only 1.8% from lane 4) could not be used because the barcodes contained too many errors. If you are not running indexed pools in a lane you can push the density higher, since the read quality won't suffer as much, but if you need to demultiplex, make sure you don't overload.
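To make the mechanism above concrete, here is a minimal Python sketch of barcode assignment with a mismatch tolerance. It is not the logic of any Illumina demultiplexer; the function names, the one-mismatch default, the sample names, and the index reads are all assumptions chosen for illustration. It only shows why an error-prone index read leaves a read unassignable even when its sequence read is fine.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(index_read, barcodes, max_mismatches=1):
    """Return the one sample whose barcode is within max_mismatches, else None.

    Reads matching no barcode, or more than one, are left unassigned.
    """
    hits = [sample for sample, bc in barcodes.items()
            if hamming(index_read, bc) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


def undetermined_fraction(index_reads, barcodes, max_mismatches=1):
    """Fraction of index reads that cannot be assigned to any sample."""
    unassigned = sum(assign_barcode(r, barcodes, max_mismatches) is None
                     for r in index_reads)
    return unassigned / len(index_reads)


# Hypothetical sample sheet and index reads (not from the run discussed here).
barcodes = {"sample_A": "ATCACG", "sample_B": "CGATGT"}
reads = ["ATCACG", "ATCACT", "CGATGT", "NNATGT", "ATNNNG"]

lost = undetermined_fraction(reads, barcodes)
print(f"undetermined: {lost:.0%}")
```

In this toy input, the reads with two or three bad index bases are the ones lost; raising `max_mismatches` recovers some of them only as long as the barcodes in the pool stay far enough apart to remain unambiguous.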



                          • #14
                            Could you tell what the source of your poor index cycles was? Are the beads not in focus for some of the tiles? Also, I generally see tiles at the bottom of the flow cell with 1.5x or so more clusters than tiles at the top of the flow cell.

                            If you do a "scatter plot" from SAV by going to the "imaging" tab and clicking on the "scatter plot" button:




                            You get this really powerful graphing tool. You can add an extra "dimension" to your plots by using the "Labels" tab and choosing another parameter:




                            I also click the "show legend" button.

                            Here is "%Q30" as a function of "Density" with "cycle" encoded with color. The nice thing is that each point is a tile, not an entire lane. So you can dissect density effects a little better.



                            All of which you may already know. But SAV has lots of features. I bet most people are not aware of all of them...

                            Anyway, would be great to see a similar plot of your run above to see where the high density effects start killing your index cycles.

                            --
                            Phillip



                            • #15
                              Good idea Phillip. Here are plots of % ≥ Q30 vs Density with tiles colored by Lane (lane 3 was excluded as that library did not contain an index).

                              This first plot is for the sequence read (cycles 1-51):



                              And here is for the index read (cycles 52-58):



                              These seem to suggest that the sequencing read can tolerate densities > 1000K/mm^2, but for clean index reads, going over 900K/mm^2 is risky.
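The same kind of threshold can be estimated numerically from per-tile values instead of read off a scatter plot. The Python sketch below bins tiles by density and reports the first bin whose mean % ≥ Q30 falls under a chosen cutoff. The tile values are invented for illustration (they are not the data from the plots above), and the bin width, cutoff, and function names are assumptions; per-tile density and % ≥ Q30 would have to be exported from SAV or parsed from the run's metrics first.

```python
from collections import defaultdict


def mean_q30_by_density_bin(tiles, bin_width=100):
    """Group (density_K_per_mm2, pct_q30) tiles into density bins.

    Returns {bin_start: mean pct_q30}, ordered by increasing density.
    """
    bins = defaultdict(list)
    for density, pct_q30 in tiles:
        bin_start = int(density // bin_width) * bin_width
        bins[bin_start].append(pct_q30)
    return {start: sum(v) / len(v) for start, v in sorted(bins.items())}


def first_bin_below(binned, cutoff):
    """Return the lowest density bin whose mean pct_q30 is under cutoff."""
    for start, mean in binned.items():
        if mean < cutoff:
            return start
    return None


# Invented index-read values per tile: (density in K/mm^2, % >= Q30).
tiles = [(650, 95.0), (720, 94.0), (780, 92.0), (850, 90.0),
         (880, 88.0), (930, 70.0), (960, 60.0), (1050, 40.0)]

binned = mean_q30_by_density_bin(tiles)
risky_from = first_bin_below(binned, cutoff=80.0)
print(binned, risky_from)
```

Doing this once for the sequence-read cycles and once for the index-read cycles would give two separate thresholds, which is the comparison the two plots make visually.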
