SEQanswers

12-08-2011, 10:53 AM   #1
mozart
Junior Member
 
Location: Houston

Join Date: Apr 2011
Posts: 4
Why does GC bias affect read coverage in RNA-Seq?

I am reading a paper in which the authors say that GC bias, which affects read coverage in RNA-Seq, can be included in the definition of effective exon length.
I do not quite understand this, so I came here to ask for help.
Thanks a billion in advance!
12-22-2011, 06:01 AM   #2
marcowanger
Senior Member
 
Location: Hong Kong

Join Date: Dec 2008
Posts: 350

Quote:
Originally Posted by mozart
I am reading a paper in which the authors say that GC bias, which affects read coverage in RNA-Seq, can be included in the definition of effective exon length.
I do not quite understand this, so I came here to ask for help.
Thanks a billion in advance!

Which paper? In general, NGS has difficulty sequencing GC-rich regions. Suppose you use a 100 bp read length; your transcriptome will be fragmented randomly. During sequencing, GC-rich fragments are effectively downsampled relative to fragments with lower GC content. I suppose the authors may mean that these regions, being disadvantaged in sequencing, might be missed, which in turn affects the reconstruction of the gene model.
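
One way to see this kind of downsampling in your own data is to tabulate read counts against the GC fraction of each fragment or window. Below is a minimal Python sketch of that idea. The helper names (`gc_fraction`, `mean_coverage_by_gc`), the number of bins, and the toy sequences and counts are all made up for illustration; they do not come from any paper or real dataset.

```python
from collections import defaultdict

def gc_fraction(seq):
    """Fraction of G or C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def mean_coverage_by_gc(fragments, n_bins=5):
    """Group (sequence, read_count) pairs into equal-width GC bins
    and return the mean read count per occupied bin."""
    totals = defaultdict(lambda: [0, 0])  # bin index -> [sum of counts, n fragments]
    for seq, count in fragments:
        # A GC fraction of exactly 1.0 is placed in the last bin.
        b = min(int(gc_fraction(seq) * n_bins), n_bins - 1)
        totals[b][0] += count
        totals[b][1] += 1
    return {b: s / n for b, (s, n) in sorted(totals.items())}

# Hypothetical toy data: (fragment sequence, number of reads mapped to it).
toy = [
    ("ATATATATAT", 12),
    ("ATGCATGCAT", 30),
    ("GATCGATCGA", 28),
    ("GCGCGCGCGC", 9),
]
print(mean_coverage_by_gc(toy))
```

With real data you would take the sequences from reference windows or exons and the counts from your alignments. If the mean count drops in the highest (or lowest) GC bins, that is the pattern being described here.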
__________________
Marco
04-14-2012, 07:04 AM   #3
Khawar Sohail
Junior Member
 
Location: Pakistan

Join Date: Feb 2012
Posts: 2

Can anyone explain what GC bias is and what effect it has on the data generated?
__________________
Khawar Sohail
04-16-2012, 04:24 AM   #4
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,237

"GC bias" means that I (for example) resequence a genome and notice that all the lowest and highest GC regions of the genome have the lowest sequence coverage. (Meaning few reads map to very low and very high GC regions.)

That is an empirical result. The major factor that causes it is likely "enrichment PCR" bias. Prior to the step where actual bridge or emulsion PCR is used to create templated "clusters" or "beads" -- the actual templates for sequencing -- most protocols have a step where (bulk or "normal") PCR is used to amplify the library products that can serve as templates for bridge/emulsion PCR. This type of PCR is inherently biasing -- templates that are more easily amplified in one cycle become the templates for the next. So even small differences in replication efficiency build up with each cycle.

One solution that appears to remove nearly all of the GC bias is to skip enrichment amplification. However, enrichment amplification is a crutch most of us lean heavily on -- especially Illumina with their TruSeq library protocols, which produce templates that, before amplification, cluster at very low efficiency -- so we might, at most, try to minimize the number of enrichment amp cycles we use. And, for most purposes, this is probably sufficient.
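
To put rough numbers on how a small per-cycle difference builds up, here is a minimal Python sketch. It assumes a simple model in which each cycle multiplies a template by (1 + efficiency), with efficiency between 0 and 1. The function name `relative_abundance` and the efficiencies 0.95 and 0.85 are hypothetical values picked for illustration, not measurements.

```python
def relative_abundance(eff_a, eff_b, cycles):
    """Ratio of template A to template B after `cycles` rounds of PCR,
    starting from equal amounts. Each cycle multiplies a template by
    (1 + efficiency), where efficiency is between 0 and 1."""
    return ((1 + eff_a) / (1 + eff_b)) ** cycles

# Hypothetical efficiencies: a mid-GC template vs. an extreme-GC template.
for n in (4, 8, 12, 18):
    print(n, round(relative_abundance(0.95, 0.85, n), 2))
```

Under this model the ratio grows geometrically with cycle number, which is why trimming a few enrichment cycles can noticeably reduce the skew.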

--
Phillip
04-20-2012, 08:04 AM   #5
Khawar Sohail
Junior Member
 
Location: Pakistan

Join Date: Feb 2012
Posts: 2
Thank You

Dear Phillip:
Thank you very much for your reply; I deeply appreciate it. I was on vacation, so I did not see it earlier. You explained it beautifully, and I have a better understanding of the phenomenon thanks to you.
__________________
Khawar Sohail
07-08-2013, 05:52 PM   #6
Elsie1010
Junior Member
 
Location: Tokyo

Join Date: Jun 2013
Posts: 1

Thank you very much. I'm preparing my review presentation and I was confused by this GC-bias problem. It really helps a lot.
11-10-2017, 02:26 AM   #7
salamandra
Junior Member
 
Location: Europe

Join Date: Nov 2017
Posts: 1

But does anyone know what causes GC bias during PCR? Why is there a tendency to amplify more or less GC-rich DNA fragments?
11-10-2017, 05:29 AM   #8
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,237

Quote:
Originally Posted by salamandra
But does anyone know what causes GC bias during PCR? Why is there a tendency to amplify more or less GC-rich DNA fragments?

I would speculate as follows: High or low %GC will increase the chances of stable single-stranded structures (e.g. stem loops) forming. In vivo, polymerases act in concert with a host of accessory proteins while replicating a DNA strand. In a PCR reaction the polymerase is pretty much on its own, so structures forming in the template or product strand could interfere with polymerization.

This is based on little more than intuition on my part though. But if you think about the chances of randomly encountering a stem-loop (inverted repeat)--seems like it is higher as you approach 100% GC or 0% GC. Right?

If you see the sequence GCCCGCGC, what is the chance that you will see its reverse complement 5 bases downstream? If you are at 50% GC, the chance would be 1/(4^8), or 1/(2^16): one chance in 65,536. But if that stretch of DNA is 100% GC, the chance of encountering the reverse complement exactly 5 bases downstream rises to 1/(2^8): one in 256.
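
The same arithmetic can be checked for any GC fraction with a minimal Python sketch. It assumes bases are drawn independently, with G and C each at probability gc/2 and A and T each at (1 - gc)/2, which is a simplification of real sequence. The function names `reverse_complement` and `match_probability` are only illustrative.

```python
from fractions import Fraction

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def match_probability(target, gc):
    """Chance that a random stretch equals `target` when bases are drawn
    independently: G and C each with probability gc/2, A and T each
    with probability (1 - gc)/2. Returned as an exact fraction."""
    gc = Fraction(gc)
    p = Fraction(1)
    for base in target:
        p *= gc / 2 if base in "GC" else (1 - gc) / 2
    return p

rc = reverse_complement("GCCCGCGC")
print(rc, match_probability(rc, Fraction(1, 2)), match_probability(rc, 1))
```

For GCCCGCGC the reverse complement is GCGCGGGC, and the two probabilities come out as 1/65536 at 50% GC and 1/256 at 100% GC. Passing other GC fractions, or an AT-only target, shows how the chance changes toward either extreme.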

That's my guess.

--
Phillip

Tags
effective exon length