
  • GC Bias on Illumina Platforms

    Does anyone know if Illumina sequencing is still expected to be biased against GC rich sequences?

    In doing some bacterial genome resequencing, it is clear that our data do not give Poisson coverage of the genome: the variance is much higher than the mean (variance/mean of ~35 for some runs). We found that about half of this unexplained dispersion is due to a bias against GC-rich sequences. Local GC content (within 10-20 bp) is the strongest determinant of the differences in sequencing coverage; its influence decreases with distance, and by about 100 bp from the center base GC content matters little, if at all.

    The problem is that I have seen data from other sequencing centers that do not show any GC effect, and have much lower dispersion (variance/mean around 3.5). I would love to get data like this, but can't figure out what is different about our two attempts. Anyone have any insight? Did they change their machines or protocols to avoid this?
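
For anyone wanting to compare runs on the same footing, the dispersion figure discussed above is just the variance of per-base coverage divided by its mean, which would be about 1 if coverage were Poisson. A minimal sketch in Python, assuming per-base depths are already available as a list of integers; the function name `dispersion_index` and the example depths are made up for illustration and are not from any real run:

```python
from statistics import mean, pvariance


def dispersion_index(coverage):
    """Variance-to-mean ratio of per-base coverage (about 1 under a Poisson model)."""
    m = mean(coverage)
    if m == 0:
        raise ValueError("mean coverage is zero; dispersion is undefined")
    return pvariance(coverage) / m


# Hypothetical per-base depths, for illustration only.
even_depths = [9, 10, 11, 10, 9, 11, 10, 10]
skewed_depths = [1, 2, 30, 25, 3, 1, 28, 2]

print(dispersion_index(even_depths))    # well below 1: more even than Poisson
print(dispersion_index(skewed_depths))  # well above 1: overdispersed
```

Values far above 1, like the ones reported in this thread, indicate that something other than random sampling is shaping coverage.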

  • #2
    Are your libraries PCR free?



    • #3
      Yes, as they are made from genomic preps



      • #4
        Let me rephrase. After ligation and prior to clustering, is there an amplification step of the library?



        • #5
          So my understanding was that there wasn't any. However, I am double checking with the technician now, should know tomorrow. Can this introduce a lot of bias?



          • #6
            PS Thanks for the help so far!!!



            • #7
              All the talks/papers I've seen from Broad and Sanger seem to attribute the majority of the bias to the PCR step. Getting around this involves either eliminating PCR altogether or improving the PCR conditions.

              I'll try to dig up a reference if no one else jumps in.



              • #8
                Alright, looks like there is an amplification step! The protocol used was the Illumina TruSeq kit protocol, which I can't get a copy of right now but hopefully will tomorrow.

                I am a bit surprised if it is the amplification, because when I tried to model coverage at a site the most influential predictors were the local GC content. I used bins of 10 bp around the central base (0-10, 10-20, 20-30, 30-40, etc.), and effects declined for more distant GC content, so the number of G or C within 10 bp was more important than that within 20 to 30 bp. Past about the read length, GC didn't seem to matter, which made me think it was the sequencing step, but I suppose the declining effect could also be due to degrading quality.

                I'll try to look into avoiding GC bias in the amplification, this is rather strange! Thanks again for the help, comments from any others welcome!
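
The binned-GC predictors described above could be computed along these lines. This is only a sketch under assumptions: the function name `gc_ring_counts`, the exact bin boundaries (distance 1-10, 11-20, and so on, counted on both sides of the center base) and the toy sequence are illustrative, not the poster's actual code.

```python
def gc_ring_counts(seq, center, bin_width=10, n_bins=4):
    """Count G/C bases in concentric distance bins around position `center`.

    Bin k holds bases whose distance d from the center satisfies
    k * bin_width < d <= (k + 1) * bin_width, on both sides.
    The center base itself is not counted, and positions that fall
    off either end of the sequence are skipped.
    """
    counts = [0] * n_bins
    for k in range(n_bins):
        for d in range(k * bin_width + 1, (k + 1) * bin_width + 1):
            for pos in (center - d, center + d):
                if 0 <= pos < len(seq) and seq[pos] in "GCgc":
                    counts[k] += 1
    return counts


# Toy sequence (hypothetical): GC-rich flanks around a single A.
toy = "GGGGG" + "A" + "CCCCC"
print(gc_ring_counts(toy, center=5, bin_width=2, n_bins=3))
```

Each position's list of counts can then be used as predictors of coverage at that position in whatever regression model is preferred.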



                • #9
                  Here you go, they solve it with modified cycling conditions IIRC:



                  • #10
                    Ah, interesting, also just got this response from Illumina:

                    "GC bias observed when sequencing standard genomes is often the result of high cluster density on the sequencing slide (much higher than the recommended specification). This is due to the differential formation of AT and GC rich clusters. For the hiSeq2000/hiSeq1000/HiScanSQ we have just released new chemistry (Truseq v3 reagents) that will allow higher density whilst reducing the bias toward AT clusters. The new chemistry will also assist in the sequencing of GC or AT rich genomes."

                    So it seems like there are at least two steps that could be improved, going to read the paper and mull this over more, thanks!!



                    • #11
                      I've noticed a further complication in some of my samples. The samples with really good quality DNA have a much greater GC bias than more degraded samples. When I sequence DNA from FFPE blocks there is pretty much no bias at all. I've presumed that it's something to do with fragmentation of the DNA. The bias in high-quality DNA can be reduced by a few freeze/thaw cycles, i.e. by taking it out of the freezer a few times.



                      • #12
                        Thanks for the responses all! It looks like substantial headway can be made on this issue by:

                        1- Using optimal PCR settings per the paper's specifications

                        and

                        2- Not overloading the cluster density.

                        Going to try both. Henry, your suggestion sounds applicable to DNA extraction methods not used for bacteria, but thanks for passing it on! Sounds like a great hint for somebody.

                        Thanks again!



                        • #13
                          Starting quantity

                          Due to the TruSeq not requiring PCR, does anyone know how they have modified that for GC-rich regions? How much starting material does one use, and since each ug now is one prep, do you multiply by the number of ug or can you just treat it as one prep?



                          • #14
                            Originally posted by kwaraska:
                            "Due to the TruSeq not requiring PCR,"
                            Keep in mind that the standard protocol includes a 10-cycle PCR amplification. You have to go "off protocol" to produce an amplification-free library.
                            Originally posted by kwaraska:
                            "Due to the TruSeq not requiring PCR, does anyone know how they have modified that for GC-rich regions?"
                            Not sure what "that" refers to in this context. As detailed up thread, much of the coverage bias results from the PCR "enrichment" step.
                            Originally posted by kwaraska:
                            "How much starting material does one use, and since each ug now is one prep, do you multiply by the number of ug or can you just treat it as one prep?"
                            TruSeq DNA asks for 1 ug per sample, as you state. Did you want to start with more than 1 ug for some reason?

                            --
                            Phillip



                            • #15
                              For GC-rich genomes, in addition to reducing overly high cluster density, we definitely recommend eliminating the PCR step too. There are biases that can be attributed to the polymerase even if you optimize your PCR steps. Reducing the number of cycles helps, but we’ve found eliminating the step completely works the best. I can send you our protocol if you are interested.
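
To see why cutting cycles helps but does not remove the bias, a simple textbook model of PCR can be used: each cycle multiplies a template's copy number by (1 + efficiency). A minimal sketch, where `relative_representation` and the efficiency values are hypothetical and chosen only to illustrate the compounding, not measured for any polymerase or kit:

```python
def relative_representation(eff_a, eff_b, cycles):
    """Ratio of template A to template B after PCR, starting from equal amounts.

    Assumes each cycle multiplies a template's copy number by
    (1 + efficiency), with efficiency between 0 (no amplification)
    and 1 (perfect doubling), held constant across cycles.
    """
    return ((1 + eff_a) / (1 + eff_b)) ** cycles


# Hypothetical per-cycle efficiencies, for illustration only.
gc_rich_eff = 0.80
at_rich_eff = 0.95

for n in (0, 5, 10):
    ratio = relative_representation(gc_rich_eff, at_rich_eff, n)
    print(f"{n:2d} cycles: GC-rich / AT-rich = {ratio:.2f}")
```

With these made-up efficiencies the GC-rich template falls to a bit under half of its starting proportion after 10 cycles, and the ratio is exactly 1 only when there are no cycles at all.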
