SEQanswers

Old 06-29-2011, 04:31 AM   #1
pogaora
Member
 
Location: Dublin

Join Date: Oct 2008
Posts: 11
Default RIN values for RNAseq

Hi All,

We have some RNA samples from frozen tissue which we want to have sequenced using the Illumina platform. However, some of the samples have low RIN values on the Agilent Bioanalyzer. We would really like to get the sequence but can't afford to blow the money it would cost. We have a range of RIN values from 3.6 to 10. BGI (our potential sequence service provider) suggests a RIN > 7 for samples.

My question is, has anybody had a good (preferably!) experience of sequencing low RIN value RNA. Did you use DSN in the library prep?

Also, in another project, we are thinking of sending material from FFPE tissue - any experience there?

Thanks a lot,
Peadar
Old 06-29-2011, 06:09 AM   #2
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Hi Pogaora,
RIN is a metric offered by the Agilent Bioanalyzer as an estimate of the extent of degradation of total RNA.

Generally the amplicons sequenced by second generation sequencers will be quite limited in length (less than 400 bp inserts). Therefore, length of your RNA, per se, is not an issue. The issue we are really speaking to here is: what bias do I introduce, when my RNA is somewhat degraded, by the method I use to avoid the rRNA from the large and small ribosomal subunits, which often makes up >90% of total RNA? Basically there are a handful of methodologies employed:

(1) Isolate polyA+ RNAs
(2) Remove ribosomal by hybridization+immobilization
(3) Prime first strand cDNA synthesis from the polyA tail
(4) Normalize the total RNA sample to bring the highly expressed RNAs (eg, rRNA) down near to the level of more lowly expressed RNA (eg, messenger RNA) -- this would include the DSN method you mention.
(5) Degrade all messages without a 5' cap. (Epicentre terminal exonuclease)

(1) or (2) are very common. If you have a low RIN value, neither works well. PolyA purification of degraded RNA (or priming cDNA synthesis from the polyA tail) will pull out only the most 3' segment of the RNA population. That is, the full length of all your messenger RNA is present, for the most part, even in a highly degraded sample. But because the more 5' sequence has become detached from the polyA tail, it is no longer accessible. Below a RIN of 7 you will likely be seeing strong 3' bias in your sequences. This may be especially problematic because 3' untranslated regions (UTRs) of messages may harbor repetitive and/or low complexity sequences which may thwart even mapping them uniquely.
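The 3' bias argument above can be illustrated with a toy model. This is only a sketch, assuming breaks occur independently and uniformly along the transcript (a Poisson process) and that polyA selection recovers only the fragment still attached to the tail. The function name and the break rates are hypothetical and are not taken from any kit or dataset.

```python
import math

def polya_retention(distance_from_tail_nt, breaks_per_kb):
    """Probability that a base is still attached to the polyA tail.

    Toy assumption: breaks are Poisson-distributed along the transcript,
    so the chance of zero breaks over d nucleotides is exp(-rate * d).
    """
    rate_per_nt = breaks_per_kb / 1000.0
    return math.exp(-rate_per_nt * distance_from_tail_nt)

if __name__ == "__main__":
    # Hypothetical break rates, from lightly to heavily degraded RNA.
    for breaks_per_kb in (0.1, 1.0, 3.0):
        row = [polya_retention(d, breaks_per_kb) for d in (100, 500, 1000, 2000)]
        print(breaks_per_kb, ["%.3f" % p for p in row])
```

Under this model the recoverable signal falls off exponentially with distance from the polyA tail, which is the shape one would look for in a coverage plot.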

Conversely ribosomal RNA removal methodologies that work by hybridization generally target a handful of well-conserved sites on the ribosomal RNA. So you will see higher percentages of rRNA sequence resulting from this depletion method as the amount of degradation increases. Epicentre's "Ribo-zero" kits are touted as being able to remove ribosomal RNA from somewhat more degraded RNA than, well, the other kit is Invitrogen's "Ribo-minus". I presume they do this by increasing the number of rRNA binding oligo sites.

Normalization methodologies may well be very attractive in situations where the actual RNA counts are unimportant (eg, de novo transcriptome). Normalization methodologies rely on the kinetics of hybridization. Most current kits normalize by denaturing double-stranded cDNA, allowing a limited hybridization time, followed by degradation of double-stranded molecules with a double stranded nuclease (DSN). The more highly expressed molecules stand a better chance of finding their complement strand in a given amount of time and so they can be preferentially removed on that basis. Multiple cycles of this process will probably be necessary to remove the majority of the rRNA. The old school method of fractionating the single from double stranded molecules was hydroxylapatite (HA) chromatography. Not sure why it has fallen out of favor.
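The kinetic idea behind normalization can be sketched with the ideal second-order reassociation equation, C(t) = C0 / (1 + k*C0*t). This is a simplified illustration, assuming each species reanneals only with its own complement and that everything double-stranded is removed by the nuclease. The rate constant, time and starting abundances below are hypothetical, in arbitrary units.

```python
def single_stranded_remaining(c0, k, t):
    """Concentration still single-stranded after time t.

    Ideal second-order reassociation: C(t) = C0 / (1 + k * C0 * t).
    """
    return c0 / (1.0 + k * c0 * t)

def normalize(abundances, k, t):
    """Apply one hybridize-then-digest cycle to every species."""
    return {name: single_stranded_remaining(c0, k, t)
            for name, c0 in abundances.items()}

if __name__ == "__main__":
    # Hypothetical starting abundances (arbitrary units).
    start = {"rRNA": 900.0, "abundant_mRNA": 90.0, "rare_mRNA": 1.0}
    after = normalize(start, k=0.01, t=10.0)
    for name in start:
        print(name, start[name], round(after[name], 3))
```

Because the abundant species lose a much larger fraction of their molecules than the rare ones, the spread between them shrinks with each cycle, which is also why absolute counts are no longer meaningful afterwards.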

I do not think TEX (method 5) would work well with degraded RNA because TEX requires a 5' phosphate to initiate exonucleolytic activity. I think (and please anyone that knows otherwise correct me if I am wrong here) most processes that result in RNA degradation result in 5'-hydroxyls and 3'-phosphates.

That is the overview. The answer is going to be application specific as well. If you just want to count transcripts, ignore splice variants and you know that 3'-UTRs are uniquely mappable in your genome of interest, then degradation will have little net negative effect for you. If you are going to use the Applied Biosystems (SOLiD) RNAseq library construction methodology you have to keep in mind that ends present in your sample prior to fragmentation probably are not ligatable to the adapters in that kit. Although treatment with T4-PNK prior to fragmentation might mitigate that issue.

A further issue is that degradation might occur during ribosomal depletion. So, for libraries where this issue is critical, running a pico RNA chip prior to fragmentation would be useful. If, for example, you saw nearly all of your polyA+ RNA is less than 500 nt. prior to fragmentation, you would know nearly all your resulting sequence will be localized to less than 500 bases from the polyadenylation site. However, some protocols make this nearly impossible (eg, Illumina's TruSeq RNA prep kit.) As RNAseq becomes a replacement for microarray-based expression analysis methodologies there is a strong need for less expensive and higher-throughput library construction methodologies. So it is not surprising that this would begin to interfere with QC steps that previously would have warned against continuing with a given sample. It is a trade-off that needs to be weighed.

--
Phillip
Old 01-24-2012, 01:39 AM   #3
mnandita
Junior Member
 
Location: Bangalore, India

Join Date: Dec 2011
Posts: 7
Default

Has anyone looked at RNA-seq data from samples with low RIN? We recently sequenced a few samples and were unable to detect any kind of 3' sequence bias from within the raw reads. Not what I expected.
Old 01-24-2012, 04:17 AM   #4
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Were the samples actually degraded, or just gave a low RIN score for some other reason? What species/tissue was the RNA from? What library construction and sequencing method did you use? What method did you use to detect "any kind of 3' sequence bias from within the raw reads"?

As implied above, yes I have seen 3' bias from RNA with lower RIN scores using a polyA mRNA isolation method. When we repeated the experiment using RiboZero ribo-depletion we did not see the 3' bias.

--
Phillip
Old 01-24-2012, 06:25 PM   #5
mnandita
Junior Member
 
Location: Bangalore, India

Join Date: Dec 2011
Posts: 7
Default

Phillip,

Samples were from a few different species - mammalian, worm and reptile. RINs ranged from 4-6. Typically such samples would not even pass QC but these were all we had. We had also seen good samples from the same species, so we believe that the low RIN is adequately telling of degradation.

Illumina TruSeq RNA-seq kit, library protocol involves polyA purification.

To detect 3' bias, we did the following:
1) Took the assembled transcripts (this had already been done by the bioinformaticians)
2) Aligned raw reads back to the assembled transcripts and calculated the ratio of coverage at the 3' end (defined as the last 15% of the full length of a polyadenylated transcript) to the total coverage of that transcript.

In all these cases (n = 4), we picked up only a small fraction of assembled transcripts that showed a greater than 5x ratio of coverage at the 3' end versus total coverage for a transcript as calculated above. (Fewer than 10 instances, where I would expect a lot more.)
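For reference, the calculation described above can be sketched as follows. This is one possible reading, assuming the ratio compares mean depth in the 3'-most 15% of a transcript with mean depth over the whole transcript; the function name and the per-base depth input are hypothetical and not taken from the pipeline used here.

```python
def three_prime_ratio(depth, tail_fraction=0.15):
    """Mean depth in the 3' tail divided by mean depth over the transcript.

    depth: per-base read depth along one transcript, ordered 5' -> 3'.
    tail_fraction: fraction of the transcript length counted as the 3' end.
    """
    if not depth:
        raise ValueError("empty depth vector")
    total = sum(depth)
    if total == 0:
        return float("nan")  # no coverage, ratio undefined
    tail_len = max(1, int(round(len(depth) * tail_fraction)))
    tail_mean = sum(depth[-tail_len:]) / tail_len
    overall_mean = total / len(depth)
    return tail_mean / overall_mean

if __name__ == "__main__":
    flat = [10] * 1000                      # uniform coverage
    all_in_tail = [0] * 850 + [10] * 150    # every read in the last 15%
    print(three_prime_ratio(flat), three_prime_ratio(all_in_tail))
```

Under this reading the ratio cannot exceed 1/0.15, roughly 6.7, and a 5x threshold is only reached when about 75% of a transcript's coverage sits in its last 15%. If the ratio was computed this way, the threshold may simply be very strict.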

I am wondering if
1) Assembly algorithms filter out biased transcripts, so that they never make it to assembly? (I need to still understand the assembly process as that is not done by me)

2) Our calculation method as described is flawed? Maybe 15% is too long?

3) Anything else?

How did you detect 3' bias in your data set?

Thanks,
Nandita
Old 01-25-2012, 08:37 AM   #6
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

So naively one would expect that the extent of degradation present in the large/small sub-unit rRNAs reflects that of the whole sample. RIN is meant to assess that degradation. But it does not always get it right. RNA prep methods, for example, that pull down all the small RNAs (eg Trizol) can result in lower RINs, or even a failure to calculate a RIN for a lane. And, obviously, strange rRNA patterns generally result in low RIN scores whether the samples are degraded or not.

I generally check for 3' bias by loading up a .bam file in IGV and looking at some fairly highly expressed genes. If I see a pattern where most of the reads are mapping towards the 3' ends of genes and nearly none towards the 5' end, then I have evidence of an RNA degradation issue.

We try to get people to re-isolate RNA if it looks degraded. So bias is usually not a big issue.

Most of our libraries were either SOLiD Total RNA seq kit or Illumina RNA TruSeq. Either of these is susceptible to 3' bias when fed degraded total RNA that is depleted of rRNA using oligo dT capture (polyA selection) methods.

Normally I think of samples with a RIN of 4-6 being moderately degraded. As to whether libraries constructed from such samples would generate 5x as many reads mapped to the 3' 15% of the transcript or not, I could not say. Sounds like your data says "no".

--
Phillip
Old 01-25-2012, 08:42 AM   #7
mnkyboy
Member
 
Location: Seattle, WA

Join Date: Mar 2009
Posts: 87
Default

We routinely do degraded and FFPE RNA samples. With mRNA-seq it's a disaster; however, whole transcriptome RNA-seq seems to be less sensitive to degraded RNA. For this we use the RiboZero FFPE kits and a standard TruSeq whole transcriptome library with 50 or 2x50 bp reads. And since we prefer whole transcriptome for our research it works out fairly well.

We prefer RINs over 8 but sometimes, with our samples and our collaborators' samples, we do not have a choice.
Old 06-20-2013, 01:08 PM   #8
Enrique Zudaire
Member
 
Location: DC

Join Date: Feb 2013
Posts: 10
Default

Quote:
Originally Posted by pmiguel View Post
So naively one would expect that the extent of degradation present in the large/small sub-unit rRNAs reflects that of the whole sample. RIN is meant to assess that degradation. But it does not always get it right. RNA prep methods, for example, that pull down all the small RNAs (eg Trizol) can result in lower RINs, or even a failure to calculate a RIN for a lane. And, obviously, strange rRNA patterns generally result in low RIN scores whether the samples are degraded or not.

I generally check for 3' bias by loading up a .bam file in IGV and looking at some fairly highly expressed genes. If I see a pattern where most of the reads are mapping towards the 3' ends of genes and nearly none towards the 5' end, then I have evidence of an RNA degradation issue.

We try to get people to re-isolate RNA if it looks degraded. So bias is usually not a big issue.

Most of our libraries were either SOLiD Total RNA seq kit or Illumina RNA TruSeq. Either of these is susceptible to 3' bias when fed degraded total RNA that is depleted of rRNA using oligo dT capture (polyA selection) methods.

Normally I think of samples with a RIN of 4-6 being moderately degraded. As to whether libraries constructed from such samples would generate 5x as many reads mapped to the 3' 15% of the transcript or not, I could not say. Sounds like your data says "no".

--
Phillip
Hi Phillip,

I saw your post and I was wondering if you could help. I'm trying to understand the consequences of low RIN values for differential gene expression analysis.

We have run Illumina TruSeq on samples with low RIN values (ie 3.5) in parallel with samples with higher RIN values (>7). The mapped reads for a typical sample with low RIN are ~380M and for samples with high RIN ~420M. So, I do see fewer mapped reads in samples with lower RIN values. However, would this really matter when assessing differential gene expression, since we will be using FPKM?
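For context, the usual FPKM definition already divides by the total number of mapped fragments, so a difference in depth alone is scaled out. A minimal sketch, with hypothetical counts and transcript length chosen only to mirror the ~380M and ~420M library sizes mentioned above:

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

if __name__ == "__main__":
    # Hypothetical gene of 2,000 bp with counts proportional to library size.
    low_rin = fpkm(3800, 2000, 380e6)
    high_rin = fpkm(4200, 2000, 420e6)
    print(low_rin, high_rin)
```

What this scaling does not do is correct for where along the transcript the reads fall, so any positional bias that differs between samples would still carry through to the FPKM values.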

Also, you suggest having a look at some highly expressed genes with IGV and looking for 3' bias. Attached are the images of GAPDH in the BAMs for two samples with disparate RIN values (B70 RIN=3.5 and B71 RIN=7.1). I don't see much of a 3' bias (this is the case for other highly expressed genes and across samples). Is that your assessment as well? If so, can I trust the data generated with samples with low RIN values? And what is really the consequence of low RIN values?

Thank you in advance for the help.
Kike
Attached Images
File Type: png igv_panel_GAPDH_Sample_B70_sorted_PE.bam.png (12.1 KB, 91 views)
File Type: png igv_panel_GAPDH_Sample_B71_sorted_PE.bam.png (14.5 KB, 67 views)
Old 06-22-2013, 11:49 AM   #9
DonDolowy
Member
 
Location: Freiburg

Join Date: Oct 2012
Posts: 56
Default

An easy way to check if you have 3' bias is to use the program RSeQC. With that you can calculate the gene body coverage.
https://code.google.com/p/rseqc/
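
The idea of a gene body coverage profile can be sketched in a few lines. This is not RSeQC's code, only an illustration of the general approach under simple assumptions: each transcript is given as a per-base depth list ordered 5' to 3', resampled into percentile bins, scaled to its own peak, and averaged. All function names here are hypothetical.

```python
def gene_body_profile(depth, n_bins=100):
    """Resample a 5' -> 3' per-base depth vector into n_bins mean-depth bins."""
    n = len(depth)
    if n < n_bins:
        raise ValueError("transcript shorter than number of bins")
    profile = []
    for i in range(n_bins):
        start = i * n // n_bins
        end = (i + 1) * n // n_bins
        window = depth[start:end]
        profile.append(sum(window) / len(window))
    return profile

def average_profile(depth_vectors, n_bins=100):
    """Average peak-scaled profiles so each transcript contributes equally."""
    totals = [0.0] * n_bins
    used = 0
    for depth in depth_vectors:
        if len(depth) < n_bins or sum(depth) == 0:
            continue  # skip transcripts that are too short or uncovered
        prof = gene_body_profile(depth, n_bins)
        peak = max(prof)
        for i, value in enumerate(prof):
            totals[i] += value / peak
        used += 1
    if used == 0:
        raise ValueError("no usable transcripts")
    return [t / used for t in totals]
```

A flat averaged profile suggests even coverage, while a curve that rises steadily towards the last bins points to 3' bias.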
Old 06-23-2014, 07:10 AM   #10
earonesty
Member
 
Location: United States of America

Join Date: Mar 2011
Posts: 52
Default

Quote:
Originally Posted by DonDolowy View Post
An easy way to check if you have 3' bias is to use the program RSeQC. With that you can calculate the gene body coverage.
https://code.google.com/p/rseqc/
Also, sam-stats -R calculates coverage, bias and a skewness metric for RNA-seq. Median skewness can be a single-metric summary of coverage bias per sample.

More about coverage skewness:
https://code.google.com/p/ea-utils/wiki/SamStatsDetails
Old 06-23-2014, 09:07 AM   #11
Enrique Zudaire
Member
 
Location: DC

Join Date: Feb 2013
Posts: 10
Default

Thank you DonDolowy and earonesty.

We do use RNAseQC which does coverage. I have never used sam-stats and I like the idea of the skewness metric.

In our datasets I haven't been able to find a strong correlation between RIN value and 3' bias (more like a weak trend). This is important for us since our samples may never have great RIN values. In addition, it is not clear to me what 3' bias does to differential gene expression analysis and, if it does something, to what extent. Any thoughts on this?

Kike
Old 06-23-2014, 12:25 PM   #12
kopi-o
Senior Member
 
Location: Stockholm, Sweden

Join Date: Feb 2008
Posts: 319
Default

You might want to check out this paper: http://www.plosone.org/article/info%...l.pone.0091851
Old 06-24-2014, 01:05 PM   #13
earonesty
Member
 
Location: United States of America

Join Date: Mar 2011
Posts: 52
Default

Quote:
Originally Posted by Enrique Zudaire View Post
Thank you DonDolowy and earonesty.

We do use RNAseQC which does coverage. I have never used sam-stats and I like the idea of the skewness metric.

In our datasets I haven't been able to find a strong correlation between RIN value and 3' bias (more like a weak trend). This is important for us since our samples may never have great RIN values. In addition, it is not clear to me what 3' bias does to differential gene expression analysis and, if it does something, to what extent. Any thoughts on this?

Kike
I've found that outliers in median coverage skewness tend to have higher variability, thus reducing the power if these samples are included in group comparisons.
Old 07-13-2014, 07:23 AM   #14
Genohub
Registered Vendor
 
Location: genohub.com

Join Date: Mar 2013
Posts: 210
Default Coverage biases with rRNA depletion and poly-A selection

There are pretty considerable coverage biases with rRNA depletion and poly(A) selection methods, that up to now have not been well characterized: http://blog.genohub.com/rrna-depleti...as-in-rna-seq/.

I'd be interested in learning about tools / analysis methods that attempt to account for these exon - level expression biases.

- Genohub
Old 07-15-2014, 09:18 AM   #15
earonesty
Member
 
Location: United States of America

Join Date: Mar 2011
Posts: 52
Default

Quote:
Originally Posted by Genohub View Post
There are pretty considerable coverage biases with rRNA depletion and poly(A) selection methods, that up to now have not been well characterized: http://blog.genohub.com/rrna-depleti...as-in-rna-seq/.

I'd be interested in learning about tools / analysis methods that attempt to account for these exon - level expression biases.

- Genohub
sam-stats with the -R option outputs a "skewness" value for each transcript. Because of huge variation in reference, there's no absolute good number for this value, but you can compare the median skewness among samples and see which samples had more/less bias. Degraded samples generally have more bias.
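
To make the idea concrete, here is a rough sketch of a per-transcript coverage skewness and its per-sample median. It is not the sam-stats implementation, whose exact formula may differ. It assumes the standard moment-based skewness of base position weighted by read depth, with positions ordered 5' to 3', so coverage piled up at the 3' end gives a negative value. Function names are hypothetical.

```python
import math
import statistics

def coverage_skewness(depth):
    """Moment-based skewness of position, weighted by per-base depth."""
    total = sum(depth)
    if total == 0:
        return float("nan")  # no coverage
    mean = sum(i * d for i, d in enumerate(depth)) / total
    var = sum(d * (i - mean) ** 2 for i, d in enumerate(depth)) / total
    if var == 0:
        return float("nan")  # all coverage on a single base
    third = sum(d * (i - mean) ** 3 for i, d in enumerate(depth)) / total
    return third / var ** 1.5

def median_skewness(depth_vectors):
    """Median skewness over transcripts, ignoring undefined values."""
    values = [coverage_skewness(d) for d in depth_vectors]
    values = [v for v in values if not math.isnan(v)]
    return statistics.median(values)
```

As noted above, only the comparison between samples is informative, since the value for any one transcript depends heavily on the reference.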