SEQanswers

Old 02-04-2015, 09:27 AM   #1
wingtec
Member
 
Location: Charlottesville, VA

Join Date: Apr 2010
Posts: 33
How many million reads required to have a 20x coverage for rat RNA

Hi All,

I'm seeking a piece of advice: we are trying to obtain an average coverage of 20x for RNA-Seq of rat tissues. How many million reads should we aim for per library to reach that kind of coverage?

Much thanks in advance!


Wing
Old 02-04-2015, 09:54 AM   #2
ymc
Senior Member
 
Location: Hong Kong

Join Date: Mar 2010
Posts: 498

Well, it is very hard to tell because different tissues will have different parts of the transcriptome expressed.

If $$$ is not an issue, 100M 2x100 reads should very likely be overkill for what you want.

Good luck!
Old 02-04-2015, 02:20 PM   #3
RNA
Junior Member
 
Location: California

Join Date: Aug 2010
Posts: 3

"20X coverage" for RNA-Seq is difficult to define since the copy number varies for transcripts across at least 4 orders of magnitude within a tissue. Therefore estimating "coverage" for RNA-Seq is not nearly as straightforward as it is for DNA applications.

For very highly expressed transcripts, as little as 1 Million reads will easily give you 20X coverage.

But for rare transcripts, you can collect 1 Billion or more reads and still not ever get to 20X coverage.

And of course this issue varies depending upon which tissue you are studying as well...a transcript may be easy to study in liver, but be virtually absent in brain.
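To put rough numbers on the point above, here is a minimal sketch of per-transcript coverage. Everything in it is an illustrative assumption rather than measured data: the function names, the 100 bp read length, the 2,000 bp transcript length, and the two read fractions (chosen 4 orders of magnitude apart to match the range mentioned above).

```python
def transcript_coverage(total_reads, read_length, read_fraction, transcript_length):
    """Average fold coverage of a single transcript.

    read_fraction is the (assumed) share of all reads in the library
    that originate from this transcript.
    """
    bases_on_transcript = total_reads * read_fraction * read_length
    return bases_on_transcript / transcript_length


def reads_needed(target_coverage, read_length, read_fraction, transcript_length):
    """Total library reads needed for one transcript to reach target_coverage."""
    return target_coverage * transcript_length / (read_length * read_fraction)


# Hypothetical values, for illustration only.
READ_LENGTH = 100          # bp per read
TRANSCRIPT_LENGTH = 2000   # bp
ABUNDANT_FRACTION = 1e-3   # 1 read in 1,000 comes from this transcript
RARE_FRACTION = 1e-7       # 4 orders of magnitude rarer

# An abundant transcript clears 20x with only 1 million reads (about 50x here).
print(transcript_coverage(1_000_000, READ_LENGTH, ABUNDANT_FRACTION, TRANSCRIPT_LENGTH))

# A rare transcript stays below 20x even with 1 billion reads (about 5x here).
print(transcript_coverage(1_000_000_000, READ_LENGTH, RARE_FRACTION, TRANSCRIPT_LENGTH))

# Reads required for the rare transcript to reach 20x (about 4 billion here).
print(reads_needed(20, READ_LENGTH, RARE_FRACTION, TRANSCRIPT_LENGTH))
```

The same library gives very different coverage per transcript, which is why a single "20X" figure does not translate into one read count for RNA-Seq.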

For mRNA sequencing (TruSeq Stranded mRNA Kits) we usually recommend 50 Million paired-end 2 X 75 bp reads...you can always go to 100M if you want deeper coverage...but beyond that the cost-benefit ratio of collecting more reads on a single sample really falls off dramatically.
Old 02-04-2015, 06:49 PM   #4
wingtec
Member
 
Location: Charlottesville, VA

Join Date: Apr 2010
Posts: 33

Thanks much to y'all. This is very useful and practical help!

Wing
Old 02-05-2015, 05:41 AM   #5
wingtec
Member
 
Location: Charlottesville, VA

Join Date: Apr 2010
Posts: 33

With all that said, allow me to twist the question a bit.

Say I already have some Affy microarray data and I want to improve on, or at least confirm, the array data with RNA-Seq. The Affy chip used was the HG ST gene array and the experiment was done with n=3. We now want to do n=3 in RNA-Seq as well. Will 20M clean reads of PE 2x100 give similar or better coverage than the array data?

Thanks

Wing
Old 02-05-2015, 07:28 AM   #6
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,313

Long, long ago, when we did SOLiD runs, an ABI applications specialist told us 5M reads was equivalent to an Affy Chip. But I don't know what that was based on.
Possibly there are comparisons in the literature?
--
Phillip
Old 02-05-2015, 09:35 AM   #7
AllSeq
Registered Vendor
 
Location: San Diego, CA

Join Date: Oct 2013
Posts: 138

People have a lot of opinions on the amount of coverage needed for RNA-Seq - it almost turns into a religious debate! Generally speaking, 10M reads should give you 'array-like' coverage. 20M PE reads (which I'm defining as 20M clusters) would be even better. If cost is a major issue, you could either reduce the number of clusters or go for SE reads. PE is nice, but unless you're going to do the hard work of trying to figure out splice isoforms, it's probably not necessary.
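Since "reads" and "clusters" get mixed up easily, here is a small sketch of the arithmetic, assuming one cluster yields one read for SE and two reads for PE. The function name is made up for illustration, and the 100 bp read length for the SE case is an assumption (only the 2x100 PE configuration comes from the question).

```python
def sequencing_yield(clusters, read_length, paired_end=True):
    """Return (number of reads, total bases) for a given number of clusters.

    Assumes every cluster produces one read (SE) or two reads (PE),
    each of the same length.
    """
    reads_per_cluster = 2 if paired_end else 1
    reads = clusters * reads_per_cluster
    return reads, reads * read_length


# 20M clusters sequenced as PE 2x100.
pe_reads, pe_bases = sequencing_yield(20_000_000, 100, paired_end=True)

# 10M clusters sequenced as SE; the 100 bp length is an assumed value.
se_reads, se_bases = sequencing_yield(10_000_000, 100, paired_end=False)

print(f"PE: {pe_reads:,} reads, {pe_bases / 1e9:.1f} Gb")
print(f"SE: {se_reads:,} reads, {se_bases / 1e9:.1f} Gb")
```

When comparing quotes or recommendations, it helps to check whether a number like "20M" counts clusters (fragments) or individual reads, since for PE data the two differ by a factor of two.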

Good luck with the experiment!
__________________
AllSeq - The Sequencing Marketplace
info@AllSeq.com
www.AllSeq.com
Old 02-05-2015, 10:33 AM   #8
RNA
Junior Member
 
Location: California

Join Date: Aug 2010
Posts: 3

I would highly recommend this blog from CoreGenomics which tries to address this issue using the data published by the SEQC group last year:

http://core-genomics.blogspot.com/20...not-quite.html

Bottom line is that many independent groups have come to the same conclusion: 10M to 20M single-end 50 bp reads (from libraries made with polyA mRNA preps) will give gene-level expression values that are better than an AFFY array.

These days, what with the lower price of sequencing etc., I always try to default to 25M paired-end 2x75 bp reads. This data will persist for a long time and can be used by lots of different pipelines to do more advanced analysis of splicing, fusions, and novel transcript discovery than can be done with 50 bp SE reads alone.
Old 02-05-2015, 10:48 AM   #9
mbblack
Senior Member
 
Location: Research Triangle Park, NC

Join Date: Aug 2009
Posts: 245

Note that regardless of depth of coverage, you may well not be able to "confirm" some array results with an independent RNA-seq experiment. Just because you detect a given gene as significantly differentially expressed in one experiment does not mean you will do so in the other experiment. Sometimes the overlap in DEGs is great, but sometimes it can be quite low.

You may get better correspondence (better confirmation) in the end from ontology enrichment comparisons of the genes selected from the two experiments than from a direct comparison of significant gene lists, particularly given that n=3 biological replicates is a bare minimum.

Array equivalence is a two-part issue to my mind. First is the issue of equivalent sensitivity: how much RNA-seq coverage will give you equivalent statistical sensitivity for detecting change? How much coverage you need to pick up either an equivalent number of DEGs or largely the same set of DEGs is a different issue. Typically, the coverage needed for the former is far less than for the latter. 5-10M reads per sample will equal or exceed array sensitivity, but you'd be better off with 30-50M reads per sample if you want a good chance of getting high overlap in detected DEGs between the two experiments (in my experience).
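For the direct gene-list comparison mentioned above, here is a minimal sketch of how the overlap between two DEG lists could be quantified. The function name and the gene identifiers are placeholders invented for illustration; real lists would come from your own array and RNA-seq analyses.

```python
def deg_overlap(array_degs, rnaseq_degs):
    """Summarize the overlap between two lists of DEG identifiers."""
    array_set = set(array_degs)
    rnaseq_set = set(rnaseq_degs)
    shared = array_set & rnaseq_set
    union = array_set | rnaseq_set
    return {
        # Genes called significant in both experiments.
        "shared": sorted(shared),
        # Share of array DEGs that RNA-seq also calls significant.
        "fraction_of_array_confirmed": len(shared) / len(array_set) if array_set else 0.0,
        # Jaccard index: shared genes divided by all genes in either list.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder identifiers, not real results.
array_list = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
rnaseq_list = ["GENE_B", "GENE_C", "GENE_E"]

summary = deg_overlap(array_list, rnaseq_list)
print(summary)
```

A low overlap here does not by itself mean either experiment is wrong; as noted above, comparing enriched ontology terms between the two lists may show better agreement than the gene-by-gene comparison.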
__________________
Michael Black, Ph.D.
ScitoVation LLC. RTP, N.C.
Tags
rna-seq advice
