10-19-2015, 01:24 AM   #4
Bukowski
Senior Member
 
Location: UK

Join Date: Jan 2010
Posts: 390

I assume this relates to variant calling?

First, each base of each read will have a base quality score.
Each read will also have a mapping quality score.
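
To make those two scores concrete: both are normally Phred-scaled, i.e. Q = -10 * log10(P), where P is the probability that the base call (or the mapping position) is wrong. A minimal Python sketch, assuming the common Phred+33 FASTQ encoding; the function names and the example quality string are made up for illustration:

```python
def phred_to_error_prob(q):
    """Convert a Phred-scaled quality (base quality or MAPQ) to an error probability."""
    return 10 ** (-q / 10)


def decode_fastq_quals(qual_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes Phred+33 encoding; very old Illumina data may need offset=64.
    """
    return [ord(c) - offset for c in qual_string]


# Hypothetical quality string for a 5-base read.
quals = decode_fastq_quals("II5+!")
for q in quals:
    print(q, phred_to_error_prob(q))
```

So Q20 means a 1 in 100 chance the call is wrong, and Q30 means 1 in 1000.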

Whether something is 'good or not' depends on the application. Calling variants in a region where your read mapping qualities or base quality scores are low is more likely to produce false positives.

However, if you're using the data for genotyping, each variant call will have its own quality score, which takes all this information into account.

Required coverage is driven by the application. 30x is fine for calling germline SNPs in diploid model organisms. >1000x is fine for calling somatic SNPs with a variant allele frequency of 5%. However, you can't call somatic variants of that nature with 30x coverage: at 5% you would expect only one or two variant-supporting reads at any given site, which is not enough data.
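
A rough way to see the 30x vs 1000x difference is a simple binomial model: treat each read covering the site as carrying the variant with probability equal to the allele frequency. The sketch below is only a back-of-the-envelope illustration; it ignores sequencing error and mapping bias, and the threshold of 5 supporting reads is an arbitrary assumption, not something any particular caller uses.

```python
from math import comb


def prob_at_least_k_alt_reads(depth, vaf, k):
    """P(at least k variant-supporting reads) under a Binomial(depth, vaf) model.

    Simplified: no sequencing error, every read samples the variant
    allele independently with probability `vaf`. Intended for small k.
    """
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i) for i in range(k)
    )
    return 1.0 - p_fewer


MIN_ALT_READS = 5  # hypothetical evidence threshold, chosen for illustration

for depth in (30, 1000):
    expected = depth * 0.05
    p = prob_at_least_k_alt_reads(depth, 0.05, MIN_ALT_READS)
    print(f"{depth}x: expected alt reads = {expected:.1f}, "
          f"P(>= {MIN_ALT_READS} alt reads) = {p:.3f}")
```

Under these assumptions, a 5% variant at 30x very rarely reaches 5 supporting reads, while at 1000x it almost always does.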

Coverage is only *one* metric that you need to use when assessing the quality of your dataset.