  • Why does GC bias affect read coverage in RNA-Seq?

    I am reading a paper in which the authors say that GC bias, which affects read coverage in RNA-Seq, can be included in the definition of effective exon length.
    I don't quite understand this, so I have come here to ask for help.
    Thanks a billion in advance!

  • #2
    Which paper? In general, NGS has difficulty sequencing GC-rich regions. Suppose you sequence with a 100 bp read length: your transcriptome will be fragmented randomly. During sequencing, GC-rich fragments are effectively downsampled relative to fragments with lower GC content. I suppose the authors may mean that these regions, being at a disadvantage in sequencing, might be missed, which in turn affects the reconstruction of the gene model.
    Marco



    • #3
      Can anyone explain what GC bias is and what effect it has on the data generated?
      Khawar Sohail



      • #4
        "GC bias" means that I (for example) resequence a genome and notice that all the lowest and highest GC regions of the genome have the lowest sequence coverage. (Meaning few reads map to very low and very high GC regions.)
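One way to look for this pattern in your own data is to bin fixed-size windows by GC fraction and compare mean coverage per bin. The following is only a minimal sketch: the function names, the window size, the number of bins, and the toy genome and coverage values are all illustrative assumptions, not part of any particular tool.

```python
from collections import defaultdict


def gc_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def coverage_by_gc(genome: str, coverage: list, window: int = 100, bins: int = 10) -> dict:
    """Mean per-base coverage of non-overlapping windows, grouped by GC bin.

    `coverage[i]` is assumed to be the read depth at position i of `genome`.
    Bin b covers GC fractions from b/bins up to (b+1)/bins; 100% GC goes in the last bin.
    """
    per_bin = defaultdict(list)
    for start in range(0, len(genome) - window + 1, window):
        gc = gc_fraction(genome[start:start + window])
        bin_index = min(int(gc * bins), bins - 1)
        mean_cov = sum(coverage[start:start + window]) / window
        per_bin[bin_index].append(mean_cov)
    return {b: sum(v) / len(v) for b, v in sorted(per_bin.items())}


# Toy data (made up): low coverage at the GC extremes, high coverage near 50% GC.
genome = "AT" * 50 + "GC" * 50 + "ACGT" * 25
coverage = [2] * 100 + [4] * 100 + [30] * 100
print(coverage_by_gc(genome, coverage))
```

With real data you would take the per-base depth from your alignment instead of the toy list; a dip at both ends of the GC range is the signature described above.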

        That is an empirical result. The major factor that causes it is likely "enrichment PCR" bias. Prior to the step where actual bridge or emulsion PCR is used to create templated "clusters" or "beads" -- the actual templates for sequencing -- most protocols have a step where (bulk or "normal") PCR is used to amplify the library products that can serve as templates for bridge/emulsion PCR. This type of PCR is inherently biasing -- templates that are more easily amplified in one cycle become the templates for the next. So even small differences in replication efficiency build up each cycle.
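The compounding can be illustrated with a little arithmetic. This is a sketch under a simplifying assumption: each cycle multiplies a template's copy number by (1 + efficiency), with a constant efficiency per template. The efficiency values 0.95 and 0.85 are hypothetical, chosen only to show the trend.

```python
def relative_abundance(eff_a: float, eff_b: float, cycles: int) -> float:
    """Ratio of template A to template B after `cycles` rounds of PCR.

    Both templates start at equal abundance, and each cycle multiplies a
    template's copy number by (1 + its per-cycle efficiency).
    """
    return ((1 + eff_a) / (1 + eff_b)) ** cycles


# Hypothetical efficiencies: A amplifies slightly better than B.
for cycles in (0, 5, 10, 15, 20):
    ratio = relative_abundance(0.95, 0.85, cycles)
    print(f"{cycles:2d} cycles: A/B = {ratio:.2f}")
```

Under these assumed numbers, a per-cycle difference of about 5% grows to roughly a 1.7-fold difference in abundance after 10 cycles, which is why cutting the number of enrichment cycles reduces the bias.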

        One solution that appears to remove nearly all of the GC bias is to not do enrichment amplification. However, enrichment amplification is a crutch most of us lean heavily on -- especially Illumina with their TruSeq library protocols, which produce templates that, pre-amplification, cluster at very low efficiency -- so we might, at most, try to minimize the number of enrichment amplification cycles we use. And, for most purposes, this is probably sufficient.

        --
        Phillip



        • #5
          Thank You

          Dear Phillip:
          Thank you very much for your reply; I deeply appreciate it. I was on vacation, so I did not see it earlier. You have explained it quite beautifully, and I have a better understanding of the phenomenon thanks to you.
          Khawar Sohail



          • #6
            Thank you very much. I'm preparing my review presentation and I was confused by this GC-bias problem. It really helps a lot.



            • #7
              But does anyone know what causes GC bias during PCR? Why is there a tendency to amplify GC-rich DNA fragments more or less than others?



              • #8
                Originally posted by salamandra
                But does anyone know what causes GC bias during PCR? Why is there a tendency to amplify GC-rich DNA fragments more or less than others?
                I would speculate as follows: high or low %GC will increase the chances of stable single-stranded structures (e.g. stem-loops) forming. In vivo, polymerases act in concert with a host of accessory proteins while replicating a DNA strand. In a PCR reaction the polymerase is pretty much on its own, so structures forming in the template or product strand could interfere with polymerization.

                This is based on little more than intuition on my part though. But if you think about the chances of randomly encountering a stem-loop (inverted repeat)--seems like it is higher as you approach 100% GC or 0% GC. Right?

                If you see the sequence GCCCGCGC, what is the chance that you will see its reverse complement exactly 5 bases downstream? If you are at 50% GC, the chance would be 1/(4^8), or 1/(2^16): one chance in 65,536. But if that stretch of DNA is 100% GC, the chance of encountering the reverse complement exactly 5 bases downstream rises to 1/(2^8): one in 256.
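The arithmetic can be checked with a short sketch. It assumes a simple model that the post only implies: bases are drawn independently, with G and C each at half the GC fraction and A and T each at half the remainder. The function names are illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written in upper-case A/C/G/T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def match_probability(target: str, gc_fraction: float) -> float:
    """Probability that `target` occurs at one fixed position in random sequence.

    Assumed model: independent bases, P(G) = P(C) = gc_fraction / 2 and
    P(A) = P(T) = (1 - gc_fraction) / 2.
    """
    p_gc = gc_fraction / 2
    p_at = (1 - gc_fraction) / 2
    prob = 1.0
    for base in target:
        prob *= p_gc if base in "GC" else p_at
    return prob


stem = "GCCCGCGC"
partner = reverse_complement(stem)  # the other arm of the would-be stem-loop
for gc in (0.5, 1.0):
    p = match_probability(partner, gc)
    print(f"GC = {gc:.0%}: 1 in {round(1 / p)}")
```

Under this model the same calculation applies at the other extreme: an all-A/T motif in 0% GC sequence also finds its reverse complement with probability 1/(2^8).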

                That's my guess.

                --
                Phillip



                • #9

                  I think Phillip's speculation totally makes sense. It would also explain why there is a bias for AT content too, though a weaker one than for GC content, since G-C pairs form stronger base-pairing.

                  I could also speculate that the nucleotides available in the reaction are always supplied at a ratio of around 0.25 for each base, so high-GC sequences might be harder to amplify because they demand more G and C than A and T, in the same way that an unbiased sequence is harder to duplicate when the nucleotide proportions are unequal.
                  Last edited by cuentaparaforos; 04-26-2019, 01:35 AM.
