
  • #16
    Thanks for bringing this up pmiguel. I saw this document a few days ago and was surprised to see that single amplicons got relatively good Q30 values and %PF. I'll be sure to ask our sequencing provider the cluster density and software version for our 3 failed HiSeq runs. Perhaps we are barking up the wrong tree and should consider alternative hypotheses.

    One thing we just came to realise is that a non-optimal combination of indexes was used (AR001, AR002, AR003, AR004) rather than one of the Illumina-endorsed 4-plex combinations. I thought I should mention it in case it could cause 0 clusters to pass filter during the index sequencing and/or data processing stages.

    Comment


    • #17
      Eh? I thought you were using "in-line" indexes.
      There is no combination of 4 Illumina TruSeq indexes that would screw up indexing. Some combinations of 2 can (sometimes) cause issues.
      In any case, index reads are reads 2 (and 3, if dual indexing is used). Pass-filter occurs early in read 1. So index reads can't cause problems with pass-filter.

      --
      Phillip

      Comment


      • #18
        I should clarify an earlier statement by Jean-Rene. We are using 20 different 6 bp in-line barcodes on the P1 end, rather than "indexes". On the P2 end we have the 4 Illumina indexes (AR001 through AR004) for each of our 3 libraries. When I aligned the 4 indexes I did notice that positions 2 and 3 have all G/T or all A/C, which I thought might cause some difficulty after I noticed Illumina's warning:
        Illumina uses a green laser to sequence G/T and a red laser to sequence A/C. At each
        cycle at least one of two nucleotides for each color channel needs to be read to ensure
        proper image registration. It is important to maintain color balance for each base of the
        index read being sequenced, otherwise index read sequencing could fail due to
        registration failure. Follow these low plex pooling guidelines, depending on the TruSeq
        Stranded mRNA Sample Prep kit you are using.
        It is good to more or less eliminate the indices as the reason for our low %PF.
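
The colour-balance rule in the quoted warning can be checked cycle by cycle. The sketch below is only an illustration: the index sequences are the ones commonly listed for TruSeq adapters AR001 to AR004 and should be verified against the kit documentation, and the function name `unbalanced_cycles` is made up for this example.

```python
# TruSeq index sequences as commonly listed for adapters AR001-AR004
# (assumption: verify against your own kit documentation).
INDEXES = {
    "AR001": "ATCACG",
    "AR002": "CGATGT",
    "AR003": "TTAGGC",
    "AR004": "TGACCA",
}

GREEN = set("GT")  # bases read with the green laser, per the quoted warning
RED = set("AC")    # bases read with the red laser


def unbalanced_cycles(index_seqs):
    """Return the 1-based cycles at which only one colour channel is represented."""
    flagged = []
    # zip(*seqs) yields, for each cycle, the base seen in every index.
    for cycle, bases in enumerate(zip(*index_seqs), start=1):
        has_green = any(b in GREEN for b in bases)
        has_red = any(b in RED for b in bases)
        if not (has_green and has_red):
            flagged.append(cycle)
    return flagged


print(unbalanced_cycles(list(INDEXES.values())))  # prints [2, 3]
```

With these four sequences the check flags cycles 2 and 3, which agrees with the alignment described above.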

        We just got back some metrics from our sequencing centre. They added 11 pM of each library on three lanes of two HiSeq 2000 machines using PE100 v3 chemistry (all other lanes from other clients worked fine). Our raw cluster densities were 885k, 897k and 880k with 60k, 50k and 79k passing filter. Therefore, ~6.7%, 5.6%, and 9% of clusters passed filter (this is somewhat better than the 0% PF that was reported to us initially). Only 0.2% of reads passing filter mapped to phiX despite an initial 10% spike-in. Our sequencing centre says that ~850k raw clusters is normal for them for a ddRAD library with this percentage of phiX but we do not know whether these other libraries were produced to incorporate more diversity in basepairs 6-12 bp.
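
As a quick check of the arithmetic, %PF is simply the pass-filter cluster density divided by the raw cluster density. The sketch below uses the rounded densities quoted above (in K clusters/mm^2); because those inputs are rounded, the first lane comes out at about 6.8% rather than 6.7%. The helper name `percent_pf` is hypothetical.

```python
def percent_pf(raw_density_k, pf_density_k):
    """Percentage of clusters passing filter, from densities in K/mm^2."""
    return 100.0 * pf_density_k / raw_density_k


# (raw, pass-filter) densities for the three lanes, as reported above.
lanes = [(885, 60), (897, 50), (880, 79)]

for raw, pf in lanes:
    print(f"raw={raw}K  PF={pf}K  %PF={percent_pf(raw, pf):.1f}")
```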

        We did not mention it before, but our adapter-ligated libraries range from 325-425 bp. We used a Pippin Prep so the size distribution is quite consistent.

        We will have to wait till next week to get the data files to see if our organism's reads make sense and get a better idea of how clustering was distributed across the lane. Thanks again and sorry for the mix-up.

        Comment


        • #19
          Originally posted by ATϟGC
          It is good to more or less eliminate the indices as the reason for our low %PF.

          We just got back some metrics from our sequencing centre. They added 11 pM of each library on three lanes of two HiSeq 2000 machines using PE100 v3 chemistry (all other lanes from other clients worked fine). Our raw cluster densities were 885k, 897k and 880k with 60k, 50k and 79k passing filter. Therefore, ~6.7%, 5.6%, and 9% of clusters passed filter (this is somewhat better than the 0% PF that was reported to us initially). Only 0.2% of reads passing filter mapped to phiX despite an initial 10% spike-in. Our sequencing centre says that ~850k raw clusters is normal for them for a ddRAD library with this percentage of phiX but we do not know whether these other libraries were produced to incorporate more diversity in basepairs 6-12 bp.

          We did not mention it before, but our adapter-ligated libraries range from 325-425 bp. We used a Pippin Prep so the size distribution is quite consistent.

          We will have to wait till next week to get the data files to see if our organism's reads make sense and get a better idea of how clustering was distributed across the lane. Thanks again and sorry for the mix-up.
          The cluster densities mentioned here (850k) are for high-diversity libraries and would be considered over-clustering for your libraries. If you have to use these libraries, a 25% spike-in with clustering around 600k should give good results. But I would suggest trying it in one lane first.

          This is from Illumina publications re PF%: “During the first 25 cycles of Read 1, the chastity filter removes the least reliable clusters from analysis results. Clusters pass filter if no more than 2 base calls have a chastity value below 0.6 in the first 25 cycles. Chastity is the ratio of the brightest base intensity divided by the sum of the brightest and the second brightest base intensities. The percentage of clusters passing filter is represented in analysis reports as %PF.”
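
The rule in that quote can be written out as a small sketch. It follows the wording quoted above (threshold 0.6, at most 2 failing calls, first 25 cycles); the function names and the input layout (one tuple of four non-negative channel intensities per cycle) are assumptions made for illustration, not Illumina's implementation.

```python
def chastity(intensities):
    """Brightest intensity divided by the sum of the two brightest."""
    top, second = sorted(intensities, reverse=True)[:2]
    total = top + second
    return top / total if total > 0 else 0.0


def passes_filter(cycles, threshold=0.6, max_failures=2, n_cycles=25):
    """Apply the quoted rule to the first n_cycles of one cluster.

    cycles: list of 4-tuples of non-negative base intensities, one per cycle.
    """
    failures = sum(1 for c in cycles[:n_cycles] if chastity(c) < threshold)
    return failures <= max_failures


# A cluster with one clearly dominant signal in every cycle passes.
clean = [(100, 5, 3, 2)] * 25
# Three early cycles with two near-equal signals are enough to fail it.
mixed = [(50, 45, 3, 2)] * 3 + clean[:22]

print(passes_filter(clean), passes_filter(mixed))
```

In a low-diversity or over-clustered lane, neighbouring clusters tend to emit the same or overlapping signal, which would push chastity toward 0.5 and fail more clusters under this rule.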

          Comment


          • #20
            Originally posted by nucacidhunter
            The cluster densities mentioned here (850k) are for high-diversity libraries and would be considered over-clustering for your libraries. If you have to use these libraries, a 25% spike-in with clustering around 600k should give good results. But I would suggest trying it in one lane first.

            This is from Illumina publications re PF%: “During the first 25 cycles of Read 1, the chastity filter removes the least reliable clusters from analysis results. Clusters pass filter if no more than 2 base calls have a chastity value below 0.6 in the first 25 cycles. Chastity is the ratio of the brightest base intensity divided by the sum of the brightest and the second brightest base intensities. The percentage of clusters passing filter is represented in analysis reports as %PF.”
            Ah, thank you, I had been trying to find some information on how PF% was determined but never came across that. We were already considering asking the facility to do a trial lane with lower cluster density, so it's good to have numbers to go by. We also realized yesterday that we have access to some samples with diversified barcodes ligated to them (for an unrelated project) that we could spare and mix in with our sequencing samples, so I think we'll use these library constructs instead of using PhiX. Thanks again for your replies everyone!

            Comment


            • #21
              Based on post #18 you are going to receive some data that *should* be for your libraries (once you separate the small amount of phiX reads). I would suggest that you first look at that data to see if everything looks as expected before you run any samples again.

              Comment


              • #22
                Yes I completely agree with you on that point; we were already planning on doing this, but thanks for the reminder.

                Comment


                • #23
                  The University of Oregon sequencing facility runs many lanes of RAD-Seq libraries. You might contact the director Doug Turnbull ([email protected]) and ask him about how they are running lanes these days. I know lower cluster density and higher PhiX are both done.
                  Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com

                  Comment


                  • #24
                    Originally posted by SNPsaurus
                    The University of Oregon sequencing facility runs many lanes of RAD-Seq libraries. You might contact the director Doug Turnbull ([email protected]) and ask him about how they are running lanes these days. I know lower cluster density and higher PhiX are both done.
                    I'll try to contact him, thanks for the info!

                    Comment


                    • #25
                      I just wanted to give an update and see if anybody wants to give their opinion. One of my colleagues talked with our sequencing service provider and they believe that low diversity is the cause of our problem. We have not seen any screenshots from SAV, but they assure us that clustering was normal (i.e. the ~850K clusters/mm^2 was real and not a deflated estimate due to over-clustering).

                      So what I think we are going to try is re-preparing the failed libraries by ligating adapters with in-line barcodes of 4 and 5 base pairs and adding them to our libraries that already have 6 bp barcodes, in order to "offset" the restriction enzyme cut site so that the same bases are not always in the same position. The proportions of the 4, 5 and 6 bp barcodes would be approximately 3:3:16 respectively (this is necessary to ensure evenness of coverage across samples). Our lab has previously used these 4, 5, and 6 bp barcodes together in equal proportions for an sdRAD project that did not have any sequencing problems.

                      I have attached a table with the predicted percentages of each base for the first 12 positions with no phiX, 10% phiX spike-in, and 25% phiX spike-in.

                      As you can see from the table, the first 4 bp are quite balanced but positions 5-11 can have some rather severe bias. I think that we are going to try a 10-25% phiX spike-in and perhaps reduce cluster density down to ~700K/mm^2. We need all the reads we can get to obtain maximum coverage, so we do have a trade-off between ensuring our first trial run works and maximizing read output.
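
For anyone wanting to reproduce this kind of table, here is a rough sketch of the calculation. Everything specific in it is a placeholder: the three barcodes and the cut-site remnant `TGCAG` are hypothetical, a single barcode stands in for each length class, phiX and any sequence beyond the known prefix are treated as contributing 25% of each base, and the function name is made up. Only the 3:3:16 weighting and the spike-in levels come from the post.

```python
BASES = "ACGT"


def base_composition(constructs, phix_fraction=0.0, n_positions=12):
    """Predicted per-position base fractions for a weighted mix of read prefixes.

    constructs: list of (prefix, weight); a prefix is barcode + cut-site remnant.
    Positions past the end of a prefix are assumed to be evenly mixed.
    """
    total_weight = sum(weight for _, weight in constructs)
    rows = []
    for i in range(n_positions):
        frac = dict.fromkeys(BASES, 0.0)
        for prefix, weight in constructs:
            share = weight / total_weight
            if i < len(prefix):
                frac[prefix[i]] += share
            else:
                for b in BASES:
                    frac[b] += share / 4
        # Blend the library with phiX, assumed here to be 25% of each base.
        rows.append({b: (1 - phix_fraction) * frac[b] + phix_fraction / 4
                     for b in BASES})
    return rows


SITE = "TGCAG"  # hypothetical cut-site remnant
constructs = [
    ("ACGT" + SITE, 3),     # hypothetical 4 bp barcode
    ("CATGA" + SITE, 3),    # hypothetical 5 bp barcode
    ("GTCAAC" + SITE, 16),  # hypothetical 6 bp barcode
]

for phix in (0.0, 0.10, 0.25):
    print(f"phiX = {phix:.0%}")
    for pos, row in enumerate(base_composition(constructs, phix), start=1):
        print(pos, " ".join(f"{b}={row[b]:.2f}" for b in BASES))
```

With a 3:3:16 weighting most reads still carry the 6 bp barcode, so the staggering only partly dilutes the cut-site positions; the output makes it easy to see how much each spike-in level helps.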

                      I realise that we have to empirically determine the optimal phiX spike-in ratio and cluster density but if any of you have experience-based opinions with respect to a sample with such an imbalance of nucleotide diversity in the first 12 bp your input would be greatly appreciated and potentially save us from further loss of data and $$$.
                      Attached Files

                      Comment


                      • #26
                        Is there an update on the real data you should have received from the last run? Did that look ok? I assume it was ok otherwise you would not be proceeding with this re-run.

                        All the best for that next run. Hopefully other samples on that flowcell will be normal.

                        Comment


                        • #27
                          Except where over-clustering plays a role, modern versions of HCS (v2.2.38 or later) utilizing HiSeq should not be negatively impacted by low diversity. See this pdf from Illumina's website.

                          If the HiSeq still has problems in this area, then Illumina must withdraw its claim to having solved it!

                          --
                          Phillip

                          Comment


                          • #28
                            Yes, the data did look normal in the sense that the ~5-7% of reads that passed filter were primarily (>99%) ddRAD loci. The sequence quality was poorer than we are used to, particularly in bases 6-12 of the forward and the reverse reads. I have attached screenshots of FastQC plots in case you would like to see them. I ran the reads through process_radtags in Stacks and had to use barcode recovery (-r) to get a good proportion of reads, due to the large number of N's in the first 12 bases of the forward and reverse reads.
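
The number of N's in the first 12 bases can be quantified directly from the FASTQ files. This is a minimal sketch, assuming plain 4-line FASTQ records (no wrapped sequence lines); the function name and the tiny inline example are illustrative only.

```python
def n_fraction_by_position(fastq_lines, n_positions=12):
    """Fraction of reads with an N call at each of the first n_positions bases.

    fastq_lines: any iterable of lines from a 4-line-per-record FASTQ file.
    """
    counts = [0] * n_positions
    n_reads = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # the sequence is the second line of each record
            continue
        n_reads += 1
        seq = line.strip().upper()
        for pos, base in enumerate(seq[:n_positions]):
            if base == "N":
                counts[pos] += 1
    if n_reads == 0:
        return [0.0] * n_positions
    return [c / n_reads for c in counts]


example = [
    "@read1", "ACGTNNTGCAGGAT", "+", "IIIIIIIIIIIIII",
    "@read2", "ACGTANTGCAGGAT", "+", "IIIIIIIIIIIIII",
]
print(n_fraction_by_position(example))
```

For a real file, pass an open file handle (for example one returned by `gzip.open(path, "rt")`) instead of the inline list.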

                            I also realized that I had the wrong strand of the restriction site in my alignment for calculating nucleotide diversity so I have attached the corrected table (now nucleotide diversity is slightly improved across most sites).

                            Phillip:

                            I have seen the table in the Illumina document where they sequenced single amplicons at 557K/mm^2 and ~93% of clusters passed filter. So I wonder if the low-diversity claim is only valid at cluster densities far below the typical target (800-900K with our provider). There exists the possibility that our lanes were over-clustered. We have not seen the SAV outputs ourselves, and the core facility has only told us that the Illumina field application specialist (FAS) recommends 700K/mm^2 with a 25% phiX spike-in; they did not tell us whether the FAS saw the SAV outputs.

                            This is a pretty big gamble for us so thanks for your continued help.
                            Attached Files

                            Comment
