SEQanswers

hanco 10-18-2015 06:26 PM

Difference between coverage, depth of coverage and targeted coverage
 
Hi everybody,

I have a question about coverage, depth of coverage and targeted coverage.
What is the difference between those three?

If the targeted coverage is, for example, 25000x, does that mean there are 25,000 reads?
And which one is better or more specific, 1000x coverage or 25000x?

I am sorry if this is a stupid question; please note that I am very new to this and my background is not in bioinformatics.

Regards,
Hanco

Bukowski 10-19-2015 12:00 AM

Targeted coverage would normally imply the coverage you were aiming for with an experiment - i.e. in a perfect experiment with perfect pooling you would achieve this coverage. However you don't get perfect experiments, so you may not achieve it.

Coverage and depth of coverage mean the same thing.

Your coverage is related not only to the number of reads, but also to the size of the thing you are trying to sequence.

If you have a 100bp region covered by one 100-base read, then the coverage is 1x.
If you have a 200bp region covered by one 100-base read, then the coverage is 0.5x.
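
The two examples above follow from dividing the total number of sequenced bases by the size of the region. A minimal sketch of that arithmetic, assuming every read has the same length and falls entirely within the region (the function name `mean_coverage` is only illustrative):

```python
def mean_coverage(num_reads: int, read_length: int, region_length: int) -> float:
    """Average depth = total sequenced bases / size of the target region.

    Assumes all reads have the same length and lie fully inside the region.
    """
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return num_reads * read_length / region_length


# The two cases from the post: one 100-base read over 100bp and over 200bp.
print(mean_coverage(1, 100, 100))
print(mean_coverage(1, 100, 200))
```

This is an average over the region; the depth at any single position can be higher or lower depending on where the reads actually land.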

More coverage is better for most applications, with the possible exception of genome assembly.

hanco 10-19-2015 01:14 AM

Thank you so much for your clear explanation.

I have another question.
So can I decide whether I have good data based on coverage?
For example, if I have a 200bp region covered by 30 200-base reads, that means I have 30x coverage. How can I know whether this 30x coverage is good or not?

Again, thank you for your response, I really appreciate it.

Bukowski 10-19-2015 01:24 AM

I assume this relates to variant calling?

Firstly each base of each read will have a base quality score.
Each read will have a mapping score.

Whether something is 'good or not' is dependent on the application. Calling variants in a region where your read mapping qualities are low, or your base quality scores are low, is more likely to result in false positives.
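
For reference, base quality and mapping quality scores are normally reported on the Phred scale, where Q = -10 * log10(P) and P is the estimated probability that the base call (or the read placement) is wrong. A small sketch of the conversion, with illustrative function names:

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred-scaled score."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


# Q20 corresponds to roughly a 1 in 100 chance of error, Q30 to 1 in 1000.
for q in (10, 20, 30):
    print(q, phred_to_error_prob(q))
```

So a region where most bases are around Q10 carries about a 10% per-base error rate, which is why low scores there tend to produce false positive calls.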

However if you're using the data for genotyping, each variant will have a quality score which will take all this information into account.

Required coverage is driven by application. 30x is fine for calling germline SNPs in diploid model organisms. >1000x is fine for calling somatic SNPs with a variant allele frequency of 5% - however you can't call somatic variants of that nature with 30x coverage (not enough data).
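
To see why 30x is not enough data for a 5% variant, it helps to count how many variant-supporting reads you would expect. The sketch below treats each read as an independent draw that carries the variant with probability equal to the allele frequency (a simple binomial model that ignores sequencing error). The requirement of at least 5 supporting reads is an assumption chosen for illustration, not a rule from any particular caller.

```python
from math import comb


def prob_at_least(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(X >= min_alt_reads) for X ~ Binomial(depth, vaf)."""
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


VAF = 0.05          # 5% variant allele frequency, as in the post
MIN_ALT_READS = 5   # hypothetical evidence threshold, for illustration only

for depth in (30, 1000):
    expected = depth * VAF
    p = prob_at_least(depth, VAF, MIN_ALT_READS)
    print(f"{depth}x: expected variant reads = {expected:.1f}, "
          f"P(at least {MIN_ALT_READS}) = {p:.3f}")
```

At 30x you expect only one or two variant reads, which is hard to tell apart from sequencing errors; at 1000x you expect about 50.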

Coverage is only *one* metric that you need to use when assessing the quality of your dataset.

hanco 10-19-2015 02:22 AM

Yes, actually my question relates to variant calling.

I am using Ion PGM Ampliseq Cancer Panel Hotspot v2 and Torrent Variant Caller.
Usually I have only low coverage, like I mentioned before, 30 200-base reads. Sometimes this kind of data will not appear in TVC. I was wondering whether the coverage of my data is too low or whether I set the stringency too high.

I always set the variant frequency to "somatic", but I am a little confused because the user guide says that for somatic workflows the threshold is set to 4% frequency for SNPs and 20% for indels. But as you can see in the picture, the minimum variant allele frequency is 0.02 (is that equal to 20%?).

http://i57.tinypic.com/2ahuvj4.jpg

I am sorry if my questions are off the original topic.
Your help is much appreciated, thank you.

Bukowski 10-19-2015 04:18 AM

I think 0.02 refers to 2% in this context, not 20%.

The stated min_coverage means variants at less than 100x will not be shown (I assume; I've never used TVC).
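
As a rough illustration of how two thresholds like these would interact, here is a hypothetical filter. It is not TVC's actual logic; the function and parameter names are made up, and the default values 100 and 0.02 are just the numbers discussed in this thread.

```python
def passes_thresholds(
    depth: int,
    alt_reads: int,
    min_coverage: int = 100,        # assumed from the thread, not verified against TVC
    min_allele_freq: float = 0.02,  # 0.02 is a fraction, i.e. 2%
) -> bool:
    """Hypothetical filter: require enough depth AND a high enough allele fraction."""
    if depth < min_coverage:
        return False
    return alt_reads / depth >= min_allele_freq


# A site with 30 reads is rejected on depth alone, however many carry the variant.
print(passes_thresholds(depth=30, alt_reads=15))
# A site with 1000 reads and 50 variant reads (5%) passes both checks.
print(passes_thresholds(depth=1000, alt_reads=50))
```

Under a filter of this shape, an amplicon covered by only 30 reads would never show a variant, which would match what is described above.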

hanco 10-19-2015 04:36 PM

Yes, I thought so too. That is why I got so confused. These numbers are all set automatically when I choose "somatic" mode.
I suppose I should start another thread to discuss this.

Anyway, thank you so much for all your kind help and clear explanations. Really appreciate it.

