05-12-2015, 07:42 PM   #1
Registered Vendor

Join Date: Mar 2013
Posts: 210
Coverage and Read Depth Recommendations by Sequencing Application

Genohub is developing an evolving coverage and read depth guide based on references in the field. We'd like to ask this community for feedback and references to improve it.

- Genohub
05-12-2015, 08:34 PM   #2
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Whole-genome sequencing:

I don't see why indel-calling needs 4x the coverage of SNP-calling; 20x per ploidy seems fine to me for indel-calling, as it does for SNP-calling. In fact, I suggest you mention somewhere on the page that the recommendations are for diploid genomes. You state:
"The coverage values below apply to most organisms while the read recommendations are for mammalian species with genome sizes of ~3Gb"
but that does not really cover the issue of ploidy.
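
As a rough sketch of the arithmetic behind a per-ploidy recommendation: the snippet below reads "20x per ploidy" as 20x for each haploid copy (that reading is my assumption), and converts the resulting total coverage into a read count with the usual coverage = reads * read length / genome size relation. The 150 bp read length and the function names are illustrative, not from the guide.

```python
import math

def total_coverage(per_copy_coverage, ploidy):
    """Total coverage so that each haploid copy gets `per_copy_coverage` (assumed reading)."""
    return per_copy_coverage * ploidy

def reads_needed(coverage, haploid_genome_bp, read_length_bp):
    """Reads required for `coverage`, from coverage = reads * read_length / genome_size."""
    return math.ceil(coverage * haploid_genome_bp / read_length_bp)

# Diploid genome of ~3 Gb, hypothetical 150 bp single-end reads.
diploid = total_coverage(20, ploidy=2)
print(diploid, reads_needed(diploid, 3_000_000_000, 150))
```

For a tetraploid organism the same 20x per copy would double the total again, which is why stating the assumed ploidy matters.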

For CNVs... "1-8x" coverage seems really low to me. I would reject any data that calls virtually anything at 1x. It's important to mention the difference between amplified and unamplified libraries. I don't think amplified libraries are reliable for CNVs, due to amplification biases and randomness. Most of the time, you will probably see a 2x jump in coverage over a duplicated region using highly-amplified 8-fold coverage data... but I would not stake someone's life on that. The bias is reduced as you decrease the number of amplification cycles, but I don't know of a specific study that has analyzed this effect.
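
To give a feel for why very low coverage is noisy for CNV calling, here is a minimal sketch that assumes read counts in a fixed window follow a Poisson distribution. That is an idealization: it ignores exactly the amplification bias mentioned above, as well as GC bias and mappability, so real data will be noisier. The window size, read length, and function names are hypothetical.

```python
import math

def expected_reads_in_window(coverage, window_bp, read_length_bp):
    """Mean number of reads starting in a window at a given average coverage."""
    return coverage * window_bp / read_length_bp

def relative_noise(expected_reads):
    """Coefficient of variation of a Poisson count: sd / mean = 1 / sqrt(mean)."""
    return 1 / math.sqrt(expected_reads)

# Compare 1x and 8x in a hypothetical 1 kb window with 100 bp reads.
for cov in (1, 8):
    n = expected_reads_in_window(cov, window_bp=1000, read_length_bp=100)
    print(cov, n, round(relative_noise(n), 3))
```

Under these assumptions a 1 kb window at 1x holds about 10 reads, so sampling noise alone is roughly a third of the signal before any library bias is added.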

Whole-exome sequencing:

Calling a SNP homozygous at 3x coverage will be wrong (purely in terms of hom/het) ~1/8 of the time. I can hardly recommend a process that is wrong 1/8 of the time, though I should mention that when I wrote a variant caller, I got the best results when calling variants at coverage as low as 3x. But I still don't recommend it as a guideline for planning things, particularly for exome-capture, which has an inherent ref-bias.
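
One way to arrive at a figure like 1/8: if a site is truly heterozygous and each read independently samples the alternate allele with probability 0.5, the chance that all three reads show the alternate allele is 0.5^3. The sketch below assumes that simple model, with no sequencing error. The `alt_fraction` parameter is my addition, to show how a reference bias (alt fraction below 0.5) changes the number.

```python
def prob_het_looks_hom_alt(depth, alt_fraction=0.5):
    """P(every read shows the alt allele | site is heterozygous)."""
    return alt_fraction ** depth

def prob_het_shows_one_allele(depth, alt_fraction=0.5):
    """P(all reads agree on either allele | site is heterozygous)."""
    return alt_fraction ** depth + (1 - alt_fraction) ** depth

for depth in (3, 5, 10):
    print(depth, prob_het_looks_hom_alt(depth), prob_het_shows_one_allele(depth))
```

Note that this is the probability of the observation given a het site, not the probability that a homozygous call is wrong; the latter also depends on how common het and hom sites are in the sample.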

I had very good luck in calling indels from exome-capture data (consistent in trio studies, etc.) but I assume it may be highly bait-system dependent. I only know about the ones that were called successfully, not what was missed, and I assume the ref-bias from baits is much more severe on indels than SNPs. So the recommendation of not selecting exome-capture with the intention of looking for indels seems appropriate. But I would still highly recommend that people who already have exome-capture data look for indels.

Transcriptome Sequencing/RNA-seq:

If people are interested in differential splicing, you should encourage them to use the longest possible reads (and paired reads). Also, the recommendations you have there are for a number of reads, but what is important is the transcriptome coverage, which varies by genome size and the percentage of the genome that is coding. I suggest you make your recommendations in terms of transcriptome coverage rather than a set number of reads (which does not consider read length, genome size, or transcriptome size).
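
A minimal sketch of converting a read count into average transcriptome coverage, under the assumption that transcriptome size can be approximated as genome size times the transcribed fraction. The example numbers are hypothetical placeholders, not recommendations, and since expression levels are very uneven the average says little about coverage of low-abundance transcripts.

```python
def transcriptome_size(genome_bp, transcribed_fraction):
    """Approximate transcriptome size as a fraction of the genome (assumption)."""
    return genome_bp * transcribed_fraction

def transcriptome_coverage(num_reads, read_length_bp, transcriptome_bp, paired=False):
    """Average coverage = sequenced bases / transcriptome size."""
    bases = num_reads * read_length_bp * (2 if paired else 1)
    return bases / transcriptome_bp

# Hypothetical: 30 M reads of 100 bp against a 60 Mb transcriptome.
print(transcriptome_coverage(30_000_000, 100, 60_000_000))
print(transcriptome_coverage(30_000_000, 100, 60_000_000, paired=True))
```

The same 30 M reads give very different coverage for an organism with a small transcriptome than for one with a large one, which is the point of stating the recommendation as coverage.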

I have not directly used the other categories so I'll defer to those who have.

Last edited by Brian Bushnell; 05-12-2015 at 08:37 PM.

