SEQanswers

Old 03-30-2011, 10:46 AM   #1
Jakob
Junior Member
 
Location: Dresden

Join Date: Mar 2011
Posts: 3
What's the best approach for an RNA-seq project aiming at splicing and mRNA levels?

We are about to embark on an RNAseq project looking at the effects of a null mutant mouse lacking an RNA binding protein that affects mRNA stability/translation and splicing. So, we want to explore both mRNA levels as well as mRNA exon composition in mutants vs wild-type.


1) Machine

In theory, we have access to 454, SOLiD, and Illumina, although the latter is best established in the facility. Read length is obviously important for a splicing analysis (an argument for 454), but would paired-end reads on Illumina also do the trick?

2) RNA isolation & library
We were thinking of using a standard RNeasy Plus purification, which does not capture RNAs under 200 nt. Are there any downsides you can think of? Or is there a different isolation method you consider superior for our purpose?
This would be followed by library generation in the facility, which as far as I understand uses a typical oligo(dT)-primed cDNA synthesis. RNA quantity is not an issue, so we don't need error-prone amplification.

3) Type of read
Should we go for maximal read length and paired-end reads, or is less good enough?
What total number of reads should we aim for to be able to do the analyses we want?
Of course, longer, paired, and more numerous reads will give us better data, but what's a good minimum starting point?

Would be great to hear your opinions.

Regards, j
Old 03-30-2011, 11:09 AM   #2
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

For differential splicing analysis, number of reads may well be more important than length of reads. I don't think a 454 can give you enough reads to achieve statistical power. Even with Illumina, you may be better off with many shorter reads. What kind of effects do you expect your mutation may have?
Old 03-30-2011, 03:37 PM   #3
Camg
Member
 
Location: Vancouver

Join Date: Jan 2011
Posts: 21

I'm doing a similar type of analysis of splicing and expression levels. I think the new Illumina HiSeq machines are probably the way to go because coverage is so important, and they can produce hundreds of millions of 100 bp reads per lane. Depth is important if you want to see enough reads over splice junctions, and it improves statistical power for expression quantification (as mentioned in the response above).

I've found that some genes are expressed at quite low levels making it difficult to assess expression and splicing. Although going with shorter reads will get you more coverage and probably be better for expression analysis, I think longer reads might be helpful in analyzing splicing, although it depends exactly what you're looking for there and what your transcriptome/genes look like. Also, you have to remember that you're probably going to trim off the last 10-15nt from your reads because they're of low quality and affect alignment/mapping.
I went with single reads in order to increase my number of unique reads per lane (and per $), but if you have the funding paired-end couldn't hurt.
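The fixed-length 3' trimming mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name `trim_3prime` and the default of 10 bases are assumptions made for the example, and dedicated trimming tools that cut by quality score are usually a better choice than a fixed cut.

```python
from typing import Tuple


def trim_3prime(seq: str, qual: str, n: int = 10) -> Tuple[str, str]:
    """Drop the last n bases, and their quality scores, from a single read.

    seq and qual are the sequence and quality lines of one FASTQ record.
    The function name and default n are hypothetical, chosen for illustration.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if n <= 0:
        # Nothing to trim; return the read unchanged.
        return seq, qual
    # Slicing with a negative stop removes the last n characters.
    return seq[:-n], qual[:-n]


# Example: trim 3 bases from a 10-base read.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGTAC", "IIIIIIIIII", n=3)
```

Note that a 100 bp read trimmed by 10-15 nt leaves 85-90 bp, which is worth keeping in mind when weighing read length against read number.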

What exactly are you trying to examine in terms of intron splicing? Alternative splicing or something else?
Old 03-30-2011, 07:47 PM   #4
edge
Senior Member
 
Location: China

Join Date: Sep 2009
Posts: 199

Hi Camg,

I have recently started doing RNA-seq research as well.
Do you know of any program or software for identifying alternative splice sites from assembled RNA-seq reads without a reference genome?
Thanks.
Old 03-31-2011, 06:46 AM   #5
Jakob
Junior Member
 
Location: Dresden

Join Date: Mar 2011
Posts: 3

Thanks to Simon and Camg for replying. Regarding your questions:

Simon wrote: "What kind of effects do you expect your mutation may have?";
Camg wrote: "What exactly are you trying to examine in terms of intron splicing? Alternative splicing or something else?"


I'm expecting suppression of alternative exons to be relieved in the mutant, since the protein typically acts to exclude optional exons. In addition, it can affect overall mRNA levels by modifying their stability.

So, I need to compromise between read number, which, as Simon pointed out, would be better for differential gene expression, and read length, which would be better for analyzing alternative exon splicing. The question is: what's a good compromise?

Any immediate suggestions regarding RNA isolation?

Regards, j
Old 03-31-2011, 07:13 AM   #6
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

Long reads are useful to identify which transcript you have. Imagine your gene has two cassette exons, with a constitutive exon in between. Your data shows that both exons are spliced out in half of the transcript molecules. You may be interested to know whether there is correlation: It could be the case, e.g., (a) that a transcript either has both exons or none of them, or (b) whenever one exon is present the other is absent, or (c) there is no correlation between the two exons and all four possibilities happen about equally often. Long reads help you decide between such possibilities, as they could span from one exon to the other.
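To make the three scenarios concrete, here is a minimal Python sketch. It assumes you already have, for each long read spanning both cassette exons, a pair of flags saying whether each exon is present; how those flags are obtained from alignments is not covered. The function name `exon_cooccurrence` and the toy inputs are hypothetical. It tabulates the four combinations and reports the phi coefficient, which is near +1 in case (a), near -1 in case (b), and near 0 in case (c).

```python
from math import sqrt
from typing import Iterable, Tuple


def exon_cooccurrence(reads: Iterable[Tuple[bool, bool]]):
    """Tabulate exon A / exon B presence over reads that span both exons.

    Returns ((both, only_a, only_b, neither), phi), where phi is the
    phi coefficient of the 2x2 table. Names here are illustrative only.
    """
    both = only_a = only_b = neither = 0
    for has_a, has_b in reads:
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    # Product of the row and column totals of the 2x2 table.
    denom = sqrt(
        (both + only_a) * (only_b + neither)
        * (both + only_b) * (only_a + neither)
    )
    # phi is undefined when any row or column total is zero.
    phi = (both * neither - only_a * only_b) / denom if denom else float("nan")
    return (both, only_a, only_b, neither), phi


# Toy inputs (made up) for the three scenarios in the text.
case_a = [(True, True)] * 5 + [(False, False)] * 5    # both exons or none
case_b = [(True, False)] * 5 + [(False, True)] * 5    # mutually exclusive
case_c = [(True, True), (True, False), (False, True), (False, False)] * 5

phi_a = exon_cooccurrence(case_a)[1]
phi_b = exon_cooccurrence(case_b)[1]
phi_c = exon_cooccurrence(case_c)[1]
```

Short reads that cover only one of the two exons cannot fill in this table, which is why read length matters for this particular question.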

However, getting long reads is more expensive than getting short reads. And if you just want to know whether your treatment causes exons that are usually spliced out to be retained, you might not need them, as things are much easier then. Just count, for each sample, the number of reads falling onto the gene of interest and the number of reads among these that overlap with the alternative exon. Does the fraction of reads from this gene that fall onto this exon increase significantly from control to treatment? If you just want to compare the number of reads mapping onto the exon with the number of reads mapping onto any other part of the gene, rather short reads are fully sufficient.

So: In order to distinguish transcripts and see correlation between the usage of several facultative exons, read length helps a lot. But if you only want to know for each exon individually whether its usage in transcripts changes due to your treatment, better invest your money in read number than in read length.

Calculating the ratios is simple; testing whether a difference is statistically significant is challenging. I say this because we are currently working on a tool to perform such an analysis. It should be ready for release soon.
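The ratio calculation itself can be sketched as follows. This is a minimal illustration, assuming per-sample read counts for one gene and one alternative exon are already available; the function names and all counts are made up for the example. It deliberately stops at the ratios and does not attempt the significance test, which is the hard part.

```python
from statistics import mean
from typing import Dict, List, Tuple


def exon_fraction(exon_reads: int, gene_reads: int) -> float:
    """Fraction of the reads on a gene that overlap the alternative exon."""
    if exon_reads > gene_reads:
        raise ValueError("exon reads cannot exceed reads on the whole gene")
    if gene_reads == 0:
        # No reads on the gene: the fraction is undefined.
        return float("nan")
    return exon_reads / gene_reads


def mean_fraction_by_condition(
    counts: Dict[str, List[Tuple[int, int]]]
) -> Dict[str, float]:
    """Average the per-replicate exon fractions within each condition.

    counts maps a condition name to a list of (exon_reads, gene_reads)
    pairs, one pair per replicate.
    """
    return {
        condition: mean(exon_fraction(e, g) for e, g in replicates)
        for condition, replicates in counts.items()
    }


# Hypothetical counts: two replicates per condition, one gene, one exon.
counts = {
    "wild_type": [(40, 1000), (60, 1000)],
    "mutant": [(150, 1000), (250, 1000)],
}
fractions = mean_fraction_by_condition(counts)
```

Keeping the replicates separate until the last step makes the spread between them visible, and that spread is what any proper test has to account for.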

Finally: Please don't forget to do your experiment at least in duplicates.
Old 03-31-2011, 10:56 AM   #7
Camg
Member
 
Location: Vancouver

Join Date: Jan 2011
Posts: 21

Quote:
Originally Posted by edge
Hi Camg,

I have recently started doing RNA-seq research as well.
Do you know of any program or software for identifying alternative splice sites from assembled RNA-seq reads without a reference genome?
Thanks.
I'm using TopHat, which requires a reference genome. You're going to need to do de novo assembly. I think there are versions of ABySS and SOAP that can do this, and probably some others. Since I'm not doing de novo assembly, I'm not really familiar with what you'll need to do, but it seems like identifying alternative splicing from a de novo transcriptome could be pretty tricky. Has anyone done this?
I suppose if you have good coverage and you see the spliced and unspliced versions of your transcripts, then it should be possible. Sorry I couldn't be of more help.