SEQanswers

Old 03-03-2011, 08:39 AM   #1
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106
3' Bias in RNA-Seq

Would anyone happen to know which specific factor in cDNA fragmentation causes the 3' bias? I've read a few papers that mention the bias but don't explain why it occurs. Thank you!

J
Old 03-03-2011, 08:52 AM   #2
ECO
--Site Admin--
 
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,358

Moving to RNA-seq.
Old 03-03-2011, 09:29 AM   #3
irit
Junior Member
 
Location: Pacific Northwest

Join Date: Oct 2010
Posts: 5

If you are doing poly(A) enrichment, that would be the source of the 3' bias. I don't know whether there would be bias without it (say, with ribodepletion or DSN treatment of total RNA).
Old 03-03-2011, 09:38 AM   #4
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106

Quote:
Originally Posted by irit View Post
If you are doing poly(A) enrichment, that would be the source of the 3' bias. I don't know whether there would be bias without it (say, with ribodepletion or DSN treatment of total RNA).
Hi, irit. No doubt that polyA enrichment sounds like a perfect culprit. What are your thoughts on cDNA fragmentation though? From "RNA-Seq: a revolutionary tool for transcriptomics":

"Conversely, cDNA fragmentation is usually strongly biased towards the identification of sequences from the 3' ends of transcripts, and thereby provides valuable information about the precise identity of these ends."
Old 03-05-2011, 06:13 AM   #5
VanessaS
Member
 
Location: Dallass

Join Date: Nov 2009
Posts: 49

In our protocol, the cDNA isn't fragmented, the RNA is.
Old 03-05-2011, 10:07 AM   #6
Seqasaurus
Member
 
Location: EU

Join Date: Sep 2010
Posts: 24

I'd have thought poly(A) selection would be the main source of 3' bias. Maybe also fragmentation of cDNA. On 454 at least, cDNAs of a certain size range don't nebulize well, which would bias reads from those fragments towards the transcript ends. If I remember correctly, this is no longer a problem with newer, more rapid methods for non-normalized cDNA sequencing. For full-length, normalized cDNA sequencing (on 454), coligation (before nebulization) probably helps reduce the bias. I'm not sure whether fragmentation produces bias on Illumina, as I'm not familiar with the library prep. I do know that we also have our cDNA coligated before nebulization for subsequent RNA-seq on the HiSeq 2000.

Someone correct me if I'm wrong.
Old 03-05-2011, 12:34 PM   #7
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106

Quote:
Originally Posted by VanessaS View Post
In our protocol, the cDNA isn't fragmented, the RNA is.
Hi, Vanessa. So, would you say it's definitely a result of the poly(A) selection?? Thanks!
Old 03-06-2011, 09:45 PM   #8
VanessaS
Member
 
Location: Dallass

Join Date: Nov 2009
Posts: 49

Quote:
Originally Posted by JohnK View Post
Hi, Vanessa. So, would you say it's definitely a result of the poly(A) selection?? Thanks!
I'm not qualified to say anything is definite, other than it's not from shearing the cDNA. Maybe someone can comment on the kinds of biases, if any, introduced by enzymatic shearing of the RNA? We use RNase III. I thought it was non-specific, so no bias?
Old 03-07-2011, 09:32 AM   #9
JohnK
Senior Member
 
Location: Los Angeles, China.

Join Date: Feb 2010
Posts: 106

Quote:
Originally Posted by VanessaS View Post
I'm not qualified to say anything is definite, other than it's not from shearing the cDNA. Maybe someone can comment on the kinds of biases, if any, introduced by enzymatic shearing of the RNA? We use RNase III. I thought it was non-specific, so no bias?
Hey, Vanessa. Not so much the fragmentation of the RNA, but the poly(A) purification step, which you might expect to definitely generate a 3' bias as someone stated above... What do you think of that?
Old 03-11-2011, 09:37 AM   #10
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269

Consider RNA stability issues too. Combined with poly(A) selection, this can result in a dramatic enrichment of terminal fragments.
Old 03-11-2011, 09:50 AM   #11
pbluescript
Senior Member
 
Location: Boston

Join Date: Nov 2009
Posts: 224

Another contributor (depending on how you isolate and prepare your RNA) would be priming with oligo dT for the reverse transcription.
Old 03-11-2011, 11:39 AM   #12
NextGenSeq
Senior Member
 
Location: USA

Join Date: Apr 2009
Posts: 482

Also, the 5' cap on mRNA stabilizes the 5' end of RNA over the 3' end.
Old 04-15-2011, 05:16 AM   #13
roryk
Member
 
Location: boston

Join Date: Aug 2010
Posts: 15

I was wondering if other people were also noticing a 3' bias in their Illumina-prepped RNA-Seq samples made with the 8-sample bead-based poly-A selection kit. For shorter transcripts (< 6 kb or so) I do not notice any 3' bias, but for longer transcripts there is definitely a fairly severe bias towards the 3' end of the transcript. I also see a peak at the 5' end of the longer transcripts, which makes me think it is not due to degradation. Is that a reasonable thought? My total RNA looked great as assayed on a Bioanalyzer, but I'm not sure whether that is still true at the mRNA step; I wasn't sure how to check that. I visualized the size of the fragmented RNA by converting it to cDNA and running it on a gel and saw a fairly broad smear, so at least I know the entire mRNA library was not degraded. I've done hundreds of total RNA preps without RNase contamination, so it's hard for me to imagine that I am somehow introducing RNase during the poly-A selection, but this 3' bias sure does look like that is exactly what has happened.

Could the 3' bias in longer transcripts, as people have suggested above in the thread, simply be a byproduct of the poly-A selection? There are a couple of questionable vortexing steps in the Illumina protocol; I'm not worried that vortexing alone would shear the RNA since I do it as a standard part of my total RNA prep, but they do have you vortex the RNA-bound beads briefly. Could that be shearing the RNA? The flopping ends of the RNA banging into those beads, would that shear the longer transcripts? If so, why do I see a peak at the 5' end too? The 5' peak is about half the size of the 3' peak.

I visualized the 3'-5' bias using Simon Andrews' excellent SeqMonk visualizer, using view->probe trend plot on a probe list of mRNA. The 3' bias is there if I look at a probe list of the CDS as well. It is not there if I cut the annotations up into single exons and run. It is also much, much less pronounced (think about 10% difference) if I look at single exons which are > 5kb and there is no spike at both ends of the probe list. Am I maybe not understanding what the probe-trend plot is showing me?
Old 04-15-2011, 06:34 AM   #14
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871

Quote:
Originally Posted by roryk View Post
I visualized the 3'-5' bias using Simon Andrews' excellent SeqMonk visualizer, using view->probe trend plot on a probe list of mRNA. The 3' bias is there if I look at a probe list of the CDS as well. It is not there if I cut the annotations up into single exons and run. It is also much, much less pronounced (think about 10% difference) if I look at single exons which are > 5kb and there is no spike at both ends of the probe list. Am I maybe not understanding what the probe-trend plot is showing me?
Rory - glad to hear you're liking SeqMonk!

If you've put probes over mRNA features and then done a trend plot then the peak you see at the end might not be due to true 3' bias.

Different transcripts will have exons at different places along their length. Therefore the trend plot for any individual transcript will go up and down as you pass in and out of an exon. If you average over all transcripts then you'll see the combined signals from all of the transcripts doing this which will even itself out for the most part - however the only places you're guaranteed to be in an exon are at the beginning and end of each transcript, so a trend plot over all transcripts will probably show a peak at each end because of the higher probability of being in an exon. Since 3' exons are generally larger than 5' exons you'll probably also see a bigger peak at the 3' end.

What you'd need for a true view of the trend over a spliced transcript would be to concatenate the exons for each transcript together and do a trend plot over those, missing out the introns. This could actually be a good addition to the program, so I'll look at adding it in a future release.

This same problem wouldn't apply to trend plots over exons where you would expect the signal from the reads to be continuous.
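The exon-concatenation idea described above can be sketched in a few lines of Python. This is only an illustration of the principle, not how SeqMonk does or will implement it; the function names `spliced_offset` and `trend`, the half-open `(start, end)` exon intervals, and the use of a single position per read are all assumptions made for the example.

```python
def spliced_offset(exons, pos):
    """Map a genomic position to its offset within the concatenated exons.

    exons: sorted, non-overlapping (start, end) half-open intervals.
    Returns None when pos lies in an intron or outside the transcript.
    """
    offset = 0
    for start, end in exons:
        if start <= pos < end:
            return offset + (pos - start)
        offset += end - start
    return None


def trend(exons, strand, read_positions, n_bins=10):
    """Count reads per relative-position bin along the spliced transcript.

    Bin 0 is the 5' end and the last bin is the 3' end, whatever the strand.
    """
    length = sum(end - start for start, end in exons)
    bins = [0] * n_bins
    for pos in read_positions:
        off = spliced_offset(exons, pos)
        if off is None:
            continue  # intronic or outside the transcript: ignored
        if strand == "-":
            off = length - 1 - off  # flip so that offset 0 is the 5' end
        bins[off * n_bins // length] += 1
    return bins


# Two 50 bp exons separated by a long intron (hypothetical coordinates).
exons = [(100, 150), (1000, 1050)]
print(trend(exons, "+", [100, 101, 500, 1049]))
```

Because introns contribute no positions at all, a transcript's profile no longer rises and falls as it passes in and out of exons, so averaging such profiles over many transcripts should not create end peaks on its own.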
Old 04-15-2011, 07:34 AM   #15
roryk
Member
 
Location: boston

Join Date: Aug 2010
Posts: 15

Quote:
Originally Posted by simonandrews View Post
Rory - glad to hear you're liking SeqMonk!

If you've put probes over mRNA features and then done a trend plot then the peak you see at the end might not be due to true 3' bias.

Different transcripts will have exons at different places along their length. Therefore the trend plot for any individual transcript will go up and down as you pass in and out of an exon. If you average over all transcripts then you'll see the combined signals from all of the transcripts doing this which will even itself out for the most part - however the only places you're guaranteed to be in an exon are at the beginning and end of each transcript, so a trend plot over all transcripts will probably show a peak at each end because of the higher probability of being in an exon. Since 3' exons are generally larger than 5' exons you'll probably also see a bigger peak at the 3' end.

What you'd need for a true view of the trend over a spliced transcript would be to concatenate the exons for each transcript together and do a trend plot over those, missing out the introns. This could actually be a good addition to the program, so I'll look at adding it in a future release.

This same problem wouldn't apply to trend plots over exons where you would expect the signal from the reads to be continuous.
Ahh, that makes complete sense. I had it in my head that when I put probes over the mRNA features I was putting them on a stitched-together version of the mRNA with the introns spliced out. Thank you so much for the clarification.

Thanks again, Simon. I say again because you have answered about 15 other questions I have had while looking at my data when you were answering forum posts of other people!
Old 04-15-2011, 08:33 AM   #16
roryk
Member
 
Location: boston

Join Date: Aug 2010
Posts: 15

Even looking at single exons which are very large (6 kb), I can see there is a bit of 3' bias. Is this something to be concerned about for downstream quantitation? I have seen several papers where they look at coverage across entire transcripts and it appears to be mostly uniform; not so here. I attached an image of the probe trend plot for all exons > 6 kb.
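One rough way to put a number on this kind of end bias is to compare mean coverage at the two ends of an exon or spliced transcript. The sketch below is a hypothetical helper, not something SeqMonk or Cufflinks provides; it assumes `coverage` is a list of per-base read depths ordered 5' to 3', and the 20% window is an arbitrary choice.

```python
def end_bias_ratio(coverage, fraction=0.2):
    """Ratio of mean coverage in the 3'-most window to the 5'-most window.

    coverage: per-base depths ordered from 5' to 3' (an assumption here).
    Values near 1.0 suggest even ends; values above 1.0 suggest 3' bias.
    """
    n = max(1, int(len(coverage) * fraction))
    five_prime = sum(coverage[:n]) / n
    three_prime = sum(coverage[-n:]) / n
    if five_prime == 0:
        # Avoid dividing by zero when the 5' window has no reads at all.
        return float("inf") if three_prime > 0 else float("nan")
    return three_prime / five_prime


# Toy example: depth doubles in the 3' half of a 10 bp feature.
print(end_bias_ratio([10] * 5 + [20] * 5))
```

A single ratio hides the shape of the profile, so it is best read alongside a trend plot rather than instead of one.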

Old 04-17-2011, 09:08 AM   #17
censinis
Junior Member
 
Location: Italy

Join Date: Nov 2008
Posts: 4
Directional RNA-Seq for bacterial transcriptome analysis...

Hi guys,

I am particularly interested in directional RNA-seq on the Illumina HiSeq 2000. Some authors (such as N. Croucher at Sanger) have already described a few approaches, but I am wondering whether anyone has tested them with bacterial total RNA. In particular, I am looking for protocols for ribosomal RNA depletion, RNA fragmentation and reverse transcription. Could you be so kind as to help me?

Thanks in advance
Best
SC
Old 04-17-2011, 10:26 AM   #18
adarob
Member
 
Location: Berkeley, CA

Join Date: Jul 2010
Posts: 71

This type of bias (as well as sequence-specific bias) is corrected for in Cufflinks. The importance of doing this correction is detailed in our paper here: http://genomebiology.com/2011/12/3/R22/
Old 05-10-2011, 04:44 AM   #19
kalidaemon
Member
 
Location: Boston

Join Date: Sep 2010
Posts: 14
SeqMonk bug?

I'm also trying to visualize and correct for potential 3' bias in my RNA-Seq data set and want to try SeqMonk. The problem is that I can't get it to run on my PC, which runs Windows XP. Have other people run into this problem? What have you done to fix it?
Old 05-10-2011, 04:49 AM   #20
simonandrews
Simon Andrews
 
Location: Babraham Inst, Cambridge, UK

Join Date: May 2009
Posts: 871

Quote:
Originally Posted by kalidaemon View Post
I'm also trying to visualize and correct for potential 3' bias in my RNA-Seq data set and want to try SeqMonk. The problem is that I can't get it to run on my PC, which runs Windows XP. Have other people run into this problem? What have you done to fix it?
WinXP is not the problem. If SeqMonk won't start at all then it's either:
  • You don't have Java installed (or it hasn't been added to your path)
  • You have less than 2GB RAM in your machine

If you don't have Java installed then just get the latest version from java.com and install it.

If you have less than 2GB RAM in your machine then you'll need to lower the default memory allocation in the configuration which is shipped with SeqMonk. Instructions for how to do this can be found here.
All times are GMT -8.