**ECO** · 03-03-2011, 09:52 AM

Moving to RNA-seq.

**irit** · 03-03-2011, 10:29 AM

If you are doing poly(A) enrichment, that would be the source of the 3' bias. I don't know whether there should be any bias without it (say, ribodepletion or DSN from total RNA).

**JohnK** · 03-03-2011, 10:38 AM

Originally posted by irit:

> If you are doing poly(A) enrichment, that would be the source of the 3' bias. I don't know whether there should be any bias without it (say, ribodepletion or DSN from total RNA).

Hi, irit. No doubt that polyA enrichment sounds like a perfect culprit. What are your thoughts on cDNA fragmentation though? From "RNA-Seq: a revolutionary tool for transcriptomics":

"Conversely, cDNA fragmentation is usually strongly biased towards the identification of sequences from the 3' ends of transcripts, and thereby provides valuable information about the precise identity of these ends."

**VanessaS** · 03-05-2011, 07:13 AM

In our protocol, the cDNA isn't fragmented, the RNA is.

**Seqasaurus** · 03-05-2011, 11:07 AM

I'd have thought poly(A) selection would be the main source of 3' bias, perhaps along with cDNA fragmentation. On 454 at least, cDNAs in a certain size range don't nebulize well, which would bias reads from those fragments towards the transcript ends. If I remember correctly, this is no longer a problem with the newer, more rapid methods for non-normalized cDNA sequencing. For full-length, normalized cDNA sequencing (on 454), coligation before nebulization probably helps reduce the bias. I'm not sure whether fragmentation produces bias on Illumina, as I'm not familiar with the library prep, but I do know that we also have our cDNA coligated before nebulization for subsequent RNA-seq on the HiSeq 2000.

Someone correct me if I'm wrong.

**JohnK** · 03-05-2011, 01:34 PM

Originally posted by VanessaS:

> In our protocol, the cDNA isn't fragmented, the RNA is.

Hi, Vanessa. So, would you say it's definitely a result of the poly(A) selection?? Thanks!

**VanessaS** · 03-06-2011, 10:45 PM

Originally posted by JohnK:

> Hi, Vanessa. So, would you say it's definitely a result of the poly(A) selection?? Thanks!

I'm not qualified to say anything is definite, other than that it's not from shearing the cDNA. Maybe someone can comment on the kinds of biases, if any, introduced by enzymatic shearing of the RNA? We use RNase III. I thought it was non-specific, so no bias?

**JohnK** · 03-07-2011, 10:32 AM

Originally posted by VanessaS:

> I'm not qualified to say anything is definite, other than that it's not from shearing the cDNA. Maybe someone can comment on the kinds of biases, if any, introduced by enzymatic shearing of the RNA? We use RNase III. I thought it was non-specific, so no bias?

Hey, Vanessa. Not so much the fragmentation of the RNA, but the poly(A) purification step, which you might expect to definitely generate a 3' bias as someone stated above... What do you think of that?

**steven** · 03-11-2011, 10:37 AM

Consider RNA stability issues too. Combined with poly(A) selection, this can result in a dramatic enrichment of terminal fragments.

**pbluescript** · 03-11-2011, 10:50 AM

Another contributor (depending on how you isolate and prepare your RNA) would be priming with oligo dT for the reverse transcription.

**NextGenSeq** · 03-11-2011, 12:39 PM

Also, the 5' cap on mRNA stabilizes the 5' end of RNA over the 3' end.

**roryk** · 04-15-2011, 05:16 AM

I was wondering whether other people are also noticing a 3' bias in their Illumina-prepped RNA-Seq samples when using the 8-sample bead-based poly(A) selection kit. For shorter transcripts (< 6 kb or so) I do not notice any 3' bias, but for longer transcripts there is definitely a fairly severe bias towards the 3' end of the transcript. I also see a peak at the 5' end of the longer transcripts, which makes me think it is not due to degradation. Is that a reasonable thought? My total RNA looked great on a Bioanalyzer, but I'm not sure whether that is still true after the mRNA selection step, and I wasn't sure how to check it. I visualized the size of the fragmented RNA by converting it to cDNA and running it on a gel, and saw a fairly broad smear, so at least I know the entire mRNA library was not degraded. I've done hundreds of total RNA preps without RNase contamination, so it's hard for me to imagine that I am somehow introducing RNase during the poly(A) selection, but this 3' bias sure looks as if that is exactly what has happened.

Could the 3' bias in longer transcripts, as people have suggested above in the thread, simply be a byproduct of the poly-A selection? There are a couple of questionable vortexing steps in the Illumina protocol; I'm not worried that vortexing alone would shear the RNA since I do it as a standard part of my total RNA prep, but they do have you vortex the RNA-bound beads briefly. Could that be shearing the RNA? The flopping ends of the RNA banging into those beads, would that shear the longer transcripts? If so, why do I see a peak at the 5' end too? The 5' peak is about half the size of the 3' peak.

I visualized the 3'-5' bias using Simon Andrews' excellent SeqMonk visualizer, using view->probe trend plot on a probe list of mRNA. The 3' bias is there if I look at a probe list of the CDS as well. It is not there if I cut the annotations up into single exons and run the plot again. It is also much, much less pronounced (think about a 10% difference) if I look at single exons which are > 5 kb, and there is no spike at either end of the probe list. Am I maybe not understanding what the probe trend plot is showing me?

**simonandrews** · 04-15-2011, 06:34 AM

Originally posted by roryk:

> I visualized the 3'-5' bias using Simon Andrews' excellent SeqMonk visualizer, using view->probe trend plot on a probe list of mRNA. The 3' bias is there if I look at a probe list of the CDS as well. It is not there if I cut the annotations up into single exons and run the plot again. It is also much, much less pronounced (think about a 10% difference) if I look at single exons which are > 5 kb, and there is no spike at either end of the probe list. Am I maybe not understanding what the probe trend plot is showing me?

Rory - glad to hear you're liking SeqMonk!

If you've put probes over mRNA features and then done a trend plot then the peak you see at the end might not be due to true 3' bias.

Different transcripts will have exons at different places along their length. Therefore the trend plot for any individual transcript will go up and down as you pass in and out of an exon. If you average over all transcripts then you'll see the combined signals from all of the transcripts doing this which will even itself out for the most part - however the only places you're guaranteed to be in an exon are at the beginning and end of each transcript, so a trend plot over all transcripts will probably show a peak at each end because of the higher probability of being in an exon. Since 3' exons are generally larger than 5' exons you'll probably also see a bigger peak at the 3' end.
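The averaging effect described in the paragraph above can be illustrated with a small simulation. This is only a toy sketch under stated assumptions, not how SeqMonk computes its trend plot: the function name `exon_occupancy_profile` and the random transcript model (every transcript scaled to unit length, always starting and ending in an exon, with 1 to 5 randomly placed introns) are hypothetical. It also treats both ends symmetrically, so it does not model 3' exons being larger than 5' exons.

```python
import random

def exon_occupancy_profile(n_transcripts=2000, n_bins=20, seed=0):
    """Fraction of simulated transcripts that are exonic in each relative-position bin.

    Toy model (an assumption for illustration): each transcript spans the
    interval [0, 1), begins and ends with an exon, and has 1-5 introns whose
    boundaries are placed uniformly at random.
    """
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_transcripts):
        n_introns = rng.randint(1, 5)
        # Two boundaries per intron, so segments alternate exon, intron, ..., exon.
        cuts = sorted(rng.random() for _ in range(2 * n_introns))
        bounds = [0.0] + cuts + [1.0]
        exons = [(bounds[i], bounds[i + 1]) for i in range(0, len(bounds) - 1, 2)]
        for b in range(n_bins):
            midpoint = (b + 0.5) / n_bins
            if any(start <= midpoint < end for start, end in exons):
                counts[b] += 1
    return [c / n_transcripts for c in counts]

if __name__ == "__main__":
    for i, frac in enumerate(exon_occupancy_profile()):
        print(f"bin {i:2d}  {frac:.2f}  " + "#" * int(frac * 40))
```

Even with perfectly uniform read coverage over exons, a profile like this is highest in the first and last bins, simply because those are the only positions where every transcript is guaranteed to be exonic.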

What you'd need for a true view of the trend over a spliced transcript would be to concatenate the exons for each transcript together and do a trend plot over those, missing out the introns. This could actually be a good addition to the program, so I'll look at adding that in a future release.

This same problem wouldn't apply to trend plots over exons where you would expect the signal from the reads to be continuous.
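The exon-concatenation idea in this post can be sketched outside SeqMonk as a coordinate conversion: map each read's genomic position onto the spliced transcript, then bin by relative position. This is a minimal sketch, not SeqMonk's implementation. The function names `spliced_offset` and `trend_profile`, the half-open `(start, end)` exon convention, and the use of a single position per read are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Exon = Tuple[int, int]  # half-open genomic interval [start, end)

def spliced_offset(exons: List[Exon], pos: int, strand: str = "+") -> Optional[int]:
    """Return the 0-based offset of genomic position `pos` along the spliced
    transcript (measured 5' to 3'), or None if `pos` is intronic or outside."""
    exons = sorted(exons)
    total = sum(end - start for start, end in exons)
    upstream = 0  # exonic bases seen so far, in genomic order
    for start, end in exons:
        if start <= pos < end:
            offset = upstream + (pos - start)
            # On the minus strand the 5' end is the rightmost genomic base.
            return offset if strand == "+" else total - 1 - offset
        upstream += end - start
    return None

def trend_profile(exons: List[Exon], read_positions: List[int],
                  strand: str = "+", n_bins: int = 10) -> List[int]:
    """Count reads per relative-position bin over the concatenated exons,
    skipping any read that does not fall inside an exon."""
    total = sum(end - start for start, end in exons)
    bins = [0] * n_bins
    for pos in read_positions:
        offset = spliced_offset(exons, pos, strand)
        if offset is not None:
            bins[offset * n_bins // total] += 1
    return bins
```

For a hypothetical two-exon transcript with exons `(100, 200)` and `(500, 600)`, genomic position 550 maps to spliced offset 150, while position 300 is intronic and is skipped. Summing such profiles over many transcripts gives a trend that is not distorted by intron placement.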

**roryk** · 04-15-2011, 07:34 AM

Originally posted by simonandrews:

> Rory - glad to hear you're liking SeqMonk!
>
> If you've put probes over mRNA features and then done a trend plot then the peak you see at the end might not be due to true 3' bias.
>
> Different transcripts will have exons at different places along their length. Therefore the trend plot for any individual transcript will go up and down as you pass in and out of an exon. If you average over all transcripts then you'll see the combined signals from all of the transcripts doing this which will even itself out for the most part - however the only places you're guaranteed to be in an exon are at the beginning and end of each transcript, so a trend plot over all transcripts will probably show a peak at each end because of the higher probability of being in an exon. Since 3' exons are generally larger than 5' exons you'll probably also see a bigger peak at the 3' end.
>
> What you'd need for a true view of the trend over a spliced transcript would be to concatenate the exons for each transcript together and do a trend plot over those, missing out the introns. This could actually be a good addition to the program, so I'll look at adding that in a future release.
>
> This same problem wouldn't apply to trend plots over exons where you would expect the signal from the reads to be continuous.

Ahh, that makes complete sense. I had it in my head that when I put probes over the mRNA features I was putting them on a stitched-together version of the mRNA with the introns spliced out. Thank you so much for the clarification.

Thanks again, Simon. I say again because you have answered about 15 other questions I have had while looking at my data when you were answering forum posts of other people!

3' Bias in RNA-Seq
