SEQanswers

09-22-2010, 09:37 AM   #1
James
Member
 
Location: Cardiff

Join Date: Mar 2010
Posts: 23
How to work out coverage?

Hi everyone,

I want to work out what coverage/depth of sequencing we have on our last RNA-seq run. We are working on a small genome, so ideally we would multiplex to bring the cost down and do more experiments. Before deciding whether we can multiplex, I need to work out how much coverage we have and then how much we need.

Cheers, J
09-22-2010, 11:29 AM   #2
malachig
Senior Member
 
Location: WashU

Join Date: Aug 2010
Posts: 117

The issue of coverage in RNA-seq data (transcriptomes) is arguably more complicated than it is for whole genome shotgun sequencing or exome sequencing.

In sequencing the genome, you want to be reasonably sure that you are covering the majority of the genome with sufficient depth to genotype and identify mutations at each position. Since each portion of the genome is present in approximately equal amounts (i.e. the target space is approximately uniform in representation) the amount of sequencing needed to achieve a particular level of coverage can be predicted.
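To make the "can be predicted" part concrete, here is a rough sketch of the usual back-of-the-envelope calculation for genome sequencing. It assumes reads land uniformly at random across the genome (an idealization), so mean depth is reads × read length / genome size, and the fraction of bases hit at least once follows the Poisson approximation 1 - e^(-depth). The function names and the numbers in the usage lines are made up for illustration, not taken from any real run.

```python
import math

def expected_depth(n_reads, read_length, genome_size):
    """Mean per-base depth, assuming reads are placed uniformly at random."""
    return n_reads * read_length / genome_size

def fraction_covered(depth):
    """Expected fraction of bases covered at least once (Poisson approximation)."""
    return 1.0 - math.exp(-depth)

def reads_needed(target_depth, read_length, genome_size):
    """Number of reads required to reach a target mean depth."""
    return math.ceil(target_depth * genome_size / read_length)

# Hypothetical example: 1 million 100 bp reads on a 5 Mb genome.
depth = expected_depth(1_000_000, 100, 5_000_000)
print(depth, fraction_covered(depth))
print(reads_needed(30, 100, 5_000_000))
```

This only holds when the target is roughly uniform in representation, which is exactly the assumption that breaks down for transcriptomes.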

In transcriptomes, the situation is not as simple (IMHO). Although the size of the organism's genome may be important, and having a small genome may help you, there are other important factors to consider. One is the dynamic range of gene expression levels, which varies from species to species and tissue to tissue. It is also highly dependent on the library construction method. Is the library polyA+? If not, how are ribosomal sequences removed? Was a library normalization method applied? In some transcriptome libraries I have seen, a small number of highly expressed genes consumed a huge percentage of all reads. This was particularly pronounced in libraries created with RiboMinus RNA compared to polyA+ RNA. Another issue is what you hope to get out of the data. You can use RNA-seq libraries to profile gene-level expression and differential expression, and to identify mutations, RNA editing, gene fusions, alternative splicing, etc. Do you need to be sure to profile genes with very low expression levels (say down to 1 copy per cell or even less)?

One strategy is to think about which of these analysis options you are going to pursue. Then think about which would require the most depth to satisfy. Since you have some example data, see how well this data performs at increasing levels of depth for this task. That is, make saturation curves for the metric of interest.
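As a minimal sketch of what a saturation curve could look like in practice: subsample the reads at increasing fractions and recompute the metric at each depth. The example below uses "number of genes detected with at least `min_reads` reads" as a stand-in metric. It assumes you already have one gene assignment per aligned read; the function name, parameters, and toy data are all hypothetical, and you would swap in whatever metric matters for your analysis.

```python
import random
from collections import Counter

def saturation_curve(read_gene_ids, fractions, min_reads=1, seed=0):
    """Return (n_reads_sampled, n_genes_detected) for each sampling fraction.

    read_gene_ids: one gene identifier per aligned read (assumed input).
    fractions: fractions of the library to subsample, e.g. [0.1, 0.5, 1.0].
    min_reads: reads required before a gene counts as detected.
    """
    rng = random.Random(seed)
    shuffled = list(read_gene_ids)
    rng.shuffle(shuffled)  # a prefix of a shuffled list is a random subsample
    curve = []
    for frac in fractions:
        n = int(round(frac * len(shuffled)))
        counts = Counter(shuffled[:n])
        detected = sum(1 for c in counts.values() if c >= min_reads)
        curve.append((n, detected))
    return curve

# Toy library with a very skewed expression profile (made-up numbers):
# one gene takes 90% of the reads, the rarest gene has a single read.
toy_reads = ["geneA"] * 900 + ["geneB"] * 90 + ["geneC"] * 9 + ["geneD"]
curve = saturation_curve(toy_reads, [0.1, 0.25, 0.5, 1.0])
for n, detected in curve:
    print(n, detected)
```

If the curve has flattened well before you reach the full library, you probably have room to multiplex for that metric; if it is still climbing at full depth, you likely do not.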

Several papers have described the relationship between library depth and output for different applications of RNA-seq. The supplementary materials of the following paper describe some of these: ALEXA-seq (see Supplementary Text and Supplementary Figure 4)