  • ndelaney
    Member
    • May 2011
    • 19

    GC Bias on Illumina Platforms

    Does anyone know if Illumina sequencing is still expected to be biased against GC rich sequences?

    In doing some bacterial genome resequencing, it is clear that our data do not give Poisson coverage of the genome: the variance is much higher than the mean (variance/mean of ~35 for some runs). We found that about half of this unexplained dispersion is due to a bias against GC-rich sequences. Local GC content (within 10-20 bp) is the strongest determinant of the differences in sequencing coverage; its influence decreases out to about 100 bp from the center base, beyond which GC content matters little, if at all.

    The problem is that I have seen data from other sequencing centers that do not show any GC effect, and have much lower dispersion (variance/mean around 3.5). I would love to get data like this, but can't figure out what is different about our two attempts. Anyone have any insight? Did they change their machines or protocols to avoid this?
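
    The variance/mean comparison above can be sketched as follows. This is only an illustration, assuming per-base coverage depths are already available as a list of numbers; the function name `dispersion_index` and the example depths are made up for this sketch and are not data from either sequencing center.

```python
import statistics

def dispersion_index(coverage):
    """Variance-to-mean ratio of per-base coverage.

    Under ideal Poisson sampling this ratio is expected to be close to 1;
    values well above 1 indicate overdispersed (uneven) coverage.
    """
    mean = statistics.mean(coverage)
    if mean == 0:
        raise ValueError("mean coverage is zero; ratio is undefined")
    # Sample variance (n - 1 denominator)
    return statistics.variance(coverage) / mean

# Hypothetical depths at six positions, both with mean depth 50
even_cov = [48, 52, 50, 47, 53, 50]
uneven_cov = [5, 120, 40, 10, 95, 30]

print(dispersion_index(even_cov))    # about 0.1
print(dispersion_index(uneven_cov))  # about 44
```

    On a real genome the ratio would be computed over all positions (or per window), so that runs from different centers can be compared on the same scale.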
  • ECO
    --Site Admin--
    • Oct 2007
    • 1360

    #2
    Are your libraries PCR free?


    • ndelaney
      Member
      • May 2011
      • 19

      #3
      Yes, as they are made from genomic preps


      • ECO
        --Site Admin--
        • Oct 2007
        • 1360

        #4
        Let me rephrase. After ligation and prior to clustering, is there an amplification step of the library?


        • ndelaney
          Member
          • May 2011
          • 19

          #5
          So my understanding was that there wasn't any. However, I am double checking with the technician now, should know tomorrow. Can this introduce a lot of bias?


          • ndelaney
            Member
            • May 2011
            • 19

            #6
            PS Thanks for the help so far!!!


            • ECO
              --Site Admin--
              • Oct 2007
              • 1360

              #7
              All the talks/papers I've seen from Broad and Sanger seem to attribute the majority of the bias to the PCR step. Getting around this involves either eliminating PCR altogether or improving the PCR conditions.

              I'll try to dig up a reference if no one else jumps in.


              • ndelaney
                Member
                • May 2011
                • 19

                #8
                Alright, looks like there is an amplification step! The protocol used was the Illumina TruSeq kit protocol, which I can't get a copy of right now but hopefully will tomorrow.

                I would be a bit surprised if it is the amplification, because when I tried to model coverage at a site, the most influential predictors were the local GC content (I used bins of 10 bp around the central base, so 0-10, 10-20, 20-30, 30-40, etc.). Effects declined for more distant GC content, so the number of G or C within 10 bp was more important than that within 20 to 30 bp. Past about the read length, GC didn't seem to matter, which made me think it was the sequencing step, but I suppose the declining effect could be due to degrading quality too.

                I'll try to look into avoiding GC bias in the amplification; this is rather strange! Thanks again for the help, comments from any others welcome!
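
                The binned GC predictors described above could be computed along these lines. This is a rough sketch under assumptions: the post does not say exactly how the bins were built, so here bases on both sides of the central base are pooled by distance, and the function name `gc_by_distance_bin` and its parameters are hypothetical.

```python
def gc_by_distance_bin(seq, center, bin_width=10, n_bins=4):
    """Count G/C bases around `center`, grouped by distance from it.

    Bin 0 covers distances 0..bin_width-1, bin 1 the next bin_width
    distances, and so on. Bases upstream and downstream are pooled
    (an assumption; the original model may have binned differently).
    """
    seq = seq.upper()
    counts = [0] * n_bins
    max_dist = bin_width * n_bins
    # Clip the window at the ends of the sequence
    start = max(0, center - max_dist + 1)
    end = min(len(seq), center + max_dist)
    for pos in range(start, end):
        if seq[pos] in "GC":
            counts[abs(pos - center) // bin_width] += 1
    return counts

# Toy sequence: a 5 bp G run in an otherwise A-only stretch
toy = "A" * 50 + "G" * 5 + "A" * 50
print(gc_by_distance_bin(toy, center=52))  # [5, 0, 0, 0]
```

                Each position's bin counts would then serve as predictors in a regression on coverage at that position, which is how the relative importance of near versus distant GC content can be compared.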


                • ECO
                  --Site Admin--
                  • Oct 2007
                  • 1360

                  #9
                  Here you go, they solve it with modified cycling conditions IIRC:


                  • ndelaney
                    Member
                    • May 2011
                    • 19

                    #10
                    Ah, interesting, also just got this response from Illumina:

                    "GC bias observed when sequencing standard genomes is often the result of high cluster density on the sequencing slide (much higher than the recommended specification). This is due to the differential formation of AT and GC rich clusters. For the HiSeq 2000/HiSeq 1000/HiScanSQ we have just released new chemistry (TruSeq v3 reagents) that will allow higher density whilst reducing the bias toward AT clusters. The new chemistry will also assist in the sequencing of GC or AT rich genomes."

                    So it seems like there are at least two steps that could be improved, going to read the paper and mull this over more, thanks!!


                    • henry.wood
                      Member
                      • Apr 2010
                      • 63

                      #11
                      I've noticed a further complication in some of my samples. The samples with really good quality DNA have a much greater GC bias than more degraded samples. When I sequence DNA from FFPE blocks there is pretty much no bias at all. I've presumed that it's something to do with fragmentation of the DNA. The bias in the high quality DNA can be reduced by putting it through a few freeze/thaw cycles, i.e. taking it out of the freezer a few times.


                      • ndelaney
                        Member
                        • May 2011
                        • 19

                        #12
                        Thanks for the responses all! It looks like substantial headway can be made on this issue by:

                        1- Using optimal PCR settings per the paper's specification

                        and

                        2- Not overloading the cluster density.

                        Going to try both. Henry, your suggestion sounds applicable to DNA extraction methods not used for bacteria, but thanks for passing it on! Sounds like a great hint for somebody.

                        Thanks again!


                        • kwaraska
                          Senior Member
                          • Nov 2008
                          • 131

                          #13
                          Starting quantity

                          Due to the TruSeq not requiring PCR, does anyone know how they have modified that for GC-rich regions? How much starting material does one use, and since each ug is now one prep, do you multiply by the number of ug or can you just treat it as one prep?


                          • pmiguel
                            Senior Member
                            • Aug 2008
                            • 2328

                            #14
                            Originally posted by kwaraska
                            Due to the TruSeq not requiring PCR,
                            Keep in mind that the standard protocol includes a 10-cycle PCR amplification. You have to go "off protocol" to produce an amplification-free library.
                            Originally posted by kwaraska
                            Due to the TruSeq not requiring PCR, does anyone know how they have modified that for GC-rich regions?
                            Not sure what "that" refers to in this context. As detailed upthread, much of the coverage bias results from the PCR "enrichment" step.
                            Originally posted by kwaraska
                            How much starting material does one use, and since each ug is now one prep, do you multiply by the number of ug or can you just treat it as one prep?
                            TruSeq DNA asks for 1 ug per sample, as you state. Did you want to start with more than 1 ug for some reason?

                            --
                            Phillip


                            • Bioo Scientific
                              Registered Vendor
                              • Oct 2009
                              • 99

                              #15
                              For GC rich genomes, in addition to reducing overly high cluster density, we definitely recommend eliminating the PCR step too. There are biases that can be attributed to the polymerase even if you optimize your PCR steps. Reducing the number of cycles helps, but we’ve found eliminating the step completely works the best. I can send you our protocol if you are interested.
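
                              Why fewer cycles would help can be illustrated with a toy model. This is a sketch only, assuming each cycle multiplies a fragment's copy number by (1 + efficiency); the efficiency values below are hypothetical and not measurements from this thread or from any kit, and the function name `relative_representation` is made up for the example.

```python
def relative_representation(eff_a, eff_b, cycles):
    """Ratio of fragment A to fragment B copies after PCR.

    Toy model: both fragments start at equal abundance and each cycle
    multiplies copy number by (1 + efficiency), with efficiency in [0, 1].
    """
    for eff in (eff_a, eff_b):
        if not 0.0 <= eff <= 1.0:
            raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return ((1.0 + eff_a) / (1.0 + eff_b)) ** cycles

# Hypothetical per-cycle efficiencies: 0.95 for a GC-balanced fragment,
# 0.80 for a GC-rich one
for n in (0, 5, 10):
    print(n, round(relative_representation(0.95, 0.80, n), 2))
```

                              With these made-up numbers the GC-balanced fragment ends up over-represented about 1.5-fold after 5 cycles and about 2.2-fold after 10, while with no amplification the ratio stays at 1. The point is only that a small per-cycle difference compounds with cycle number.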
