While looking at some exome data I was struck by how much my final variant list changes depending on the filtering approach. Using variant quality score recalibration alone with GATK, I ended up with 40,000 variants in a single exome (single sample, using --maxGaussians 4 and -percentBad 0.05). Using the GATK-recommended hard filtering settings, on the other hand, yielded around 30,000 variants.
I was wondering what other people's experiences with variant filtering are: how do you go about it, and what sort of numbers do you see?
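So that we are comparing like with like, here is a rough sketch of how counts from two filtered call sets could be tallied. It is only an illustration, not part of any GATK tooling: the function names are made up, it assumes standard tab-separated VCF records, and it treats a FILTER value of "." (no filter applied) the same as PASS, which may not suit every pipeline.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def passing_variants(lines):
    """Collect (chrom, pos, ref, alt) keys for records that passed filtering.

    Assumption: FILTER == "PASS" or "." counts as passing.
    Multi-allelic records are split into one key per ALT allele.
    """
    keys = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, flt = fields[:7]
        if flt in ("PASS", "."):
            for allele in alt.split(","):
                keys.add((chrom, int(pos), ref, allele))
    return keys


def compare(a, b):
    """Summarise how two sets of variant keys overlap."""
    return {"only_a": len(a - b), "only_b": len(b - a), "shared": len(a & b)}
```

With two files (the names here are hypothetical), usage would look something like `compare(passing_variants(open_vcf("recalibrated.vcf")), passing_variants(open_vcf("hard_filtered.vcf")))`. Reporting the shared count alongside the totals makes it easier to tell whether one approach is simply stricter or whether the two are keeping different variants.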