#1 |
Member
Location: Charlottesville, VA Join Date: Apr 2010
Posts: 34
Hi All,
I'm looking for some advice. We are trying to obtain an average coverage of 20x for RNA-Seq of rat tissues. How many million reads should we aim for per library to reach that coverage? Many thanks in advance!
Wing
#2 |
Senior Member
Location: Hong Kong Join Date: Mar 2010
Posts: 498
Well, it is very hard to tell, because different tissues express different parts of the transcriptome.
If $$$ is not an issue, 100M 2x100 reads is very likely overkill for what you want. Good luck!
#3 |
Junior Member
Location: California Join Date: Aug 2010
Posts: 3
"20X coverage" for RNA-Seq is difficult to define, since transcript copy number varies across at least 4 orders of magnitude within a tissue. Estimating "coverage" for RNA-Seq is therefore not nearly as straightforward as it is for DNA applications.
For very highly expressed transcripts, as few as 1 million reads will easily give you 20X coverage. For rare transcripts, you can collect 1 billion or more reads and still never reach 20X coverage. And of course this varies with the tissue you are studying: a transcript may be easy to study in liver but virtually absent in brain.
For mRNA sequencing (TruSeq Stranded mRNA Kits) we usually recommend 50 million paired-end 2 x 75 bp reads. You can always go to 100M if you want deeper coverage, but beyond that the cost-benefit ratio of collecting more reads on a single sample falls off dramatically.
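To make the arithmetic behind this concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that average coverage of a transcript is roughly (reads from that transcript x sequenced bases per read) / transcript length, and that a transcript receives reads in proportion to its share of the sequenced RNA. The function name, the 2 kb transcript length, and the abundance fractions are made-up illustrative values, not measurements.

```python
def reads_needed(target_coverage, transcript_len_bp, read_fraction, bases_per_read):
    """Total library reads needed for one transcript to reach target_coverage.

    read_fraction: assumed share of all reads that come from this transcript.
    bases_per_read: sequenced bases per read (or per pair, e.g. 150 for 2x75).
    """
    if not 0 < read_fraction <= 1:
        raise ValueError("read_fraction must be in (0, 1]")
    # Reads that must land on the transcript itself to reach the target depth.
    reads_on_transcript = target_coverage * transcript_len_bp / bases_per_read
    # Scale up by the transcript's share of the library.
    return reads_on_transcript / read_fraction


# Hypothetical 2 kb transcripts whose abundance spans 4 orders of magnitude.
for fraction in (1e-3, 1e-5, 1e-7):
    total = reads_needed(20, 2000, fraction, 150)
    print(f"read fraction {fraction:.0e}: ~{total / 1e6:.2f} M read pairs for 20x")
```

With these made-up numbers the requirement runs from well under a million read pairs for the abundant transcript to more than two billion for the rare one, which is why a single "20X" target does not translate into one read count for a whole transcriptome.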
#4 |
Member
Location: Charlottesville, VA Join Date: Apr 2010
Posts: 34
Thanks very much to you all. These are very useful and practical suggestions!
Wing
#5 |
Member
Location: Charlottesville, VA Join Date: Apr 2010
Posts: 34
With all that said, let me twist the question a bit.
Say I already have some Affy microarray data and I want to improve on, or at least confirm, the array data with RNA-Seq. The Affy chip used was the HG ST gene array, and the experiment was done with n=3. We now want to run RNA-Seq with n=3 as well. Will 20M clean PE 2x100 reads give similar or better coverage than the array data? Thanks,
Wing
#6 | |
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
Possibly there are comparisons in the literature?
-- Phillip
#7 | |
Registered Vendor
Location: San Diego, CA Join Date: Oct 2013
Posts: 138
Good luck with the experiment!
#8 |
Junior Member
Location: California Join Date: Aug 2010
Posts: 3
I would highly recommend this blog post from CoreGenomics, which tries to address this issue using the data published by the SEQC group last year:
http://core-genomics.blogspot.com/20...not-quite.html
The bottom line is that many independent groups have come to the same conclusion: 10M to 20M single-end 50 bp reads (from libraries made with polyA mRNA preps) will give gene-level expression values that are better than those from an Affy array. These days, with the lower price of sequencing, I always try to default to 25M paired-end 2x75 bp reads. This data will persist for a long time and can be used by many different pipelines for more advanced analysis of splicing, fusions, and novel transcript discovery than is possible with 50 bp SE reads alone.
#9 | |
Senior Member
Location: Research Triangle Park, NC Join Date: Aug 2009
Posts: 245
You may get better correspondence (better confirmation) in the end from ontology enrichment comparisons of the genes selected from the two experiments than from a direct comparison of significant gene lists, particularly given that n=3 is the bare minimum for biological replication.
Array equivalence is a two-part issue to my mind. The first is equivalent sensitivity: how much RNA-seq coverage will give you equivalent statistical sensitivity for detecting change? The second is how much coverage you need to pick up either the equivalent number of DEGs or largely the same set of DEGs. Typically, the former requires far less coverage than the latter. 5-10M reads per sample will equal or exceed array sensitivity, but you would be better off with 30-50M reads per sample if you want a good chance of high overlap in detected DEGs between the two experiments (in my experience).
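One simple way to put a number on the "overlap in detected DEGs" between the array and RNA-seq experiments is to compare the two gene lists as sets. The sketch below is only an illustration: the function name and the gene identifiers are hypothetical, and it assumes both lists already use the same gene identifiers.

```python
def deg_overlap(array_degs, rnaseq_degs):
    """Summarise the overlap between two lists of differentially expressed genes."""
    a, b = set(array_degs), set(rnaseq_degs)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        # Share of the array DEGs that RNA-seq also detected.
        "frac_of_array": len(shared) / len(a) if a else 0.0,
        # Share of the RNA-seq DEGs that the array also detected.
        "frac_of_rnaseq": len(shared) / len(b) if b else 0.0,
        # Jaccard index: shared genes over all genes found by either platform.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical gene lists, for illustration only.
array_hits = ["geneA", "geneB", "geneC", "geneD"]
rnaseq_hits = ["geneB", "geneC", "geneD", "geneE", "geneF"]
print(deg_overlap(array_hits, rnaseq_hits))
```

Reporting the fraction in both directions is a deliberate choice here: a deeper RNA-seq run will usually call more DEGs than the array, so the two fractions can differ a lot even when the experiments agree well.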
__________________
Michael Black, Ph.D. ScitoVation LLC. RTP, N.C.