Hello everyone:
I am processing 74 germline exome sequencing samples from Mexican pediatric patients with GATK 4.1.7.0. After running VariantRecalibrator in SNP mode, the 2D projection of the Gaussian mixture model does not show a clear high-quality cluster of SNPs (green area). Accordingly, the tranches plot shows no TP variants, and Ti/Tv is below 2.0
(attached below as plots.pdf)
The 74 BAMs included in the joint genotyping have 60-85% of bases at 20X. The script and resource data sets I used are the following:
What does this mean? Do you have any suggestions?
Code:
gatk VariantRecalibrator \
  -R Homo_sapiens_assembly19.fasta \
  -V input.vcf \
  --resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.vcf \
  --resource:omni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.vcf \
  --resource:1000G,known=false,training=true,truth=false,prior=10.0 1000G_phase1.snps.high_confidence.b37.vcf \
  --resource:dbsnp,known=true,training=false,truth=false,prior=2.0 dbsnp_138.b37.vcf \
  -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR \
  -mode SNP \
  -O output.recal \
  --tranches-file Px_piloto.tranches \
  --rscript-file file.plots.R
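
For reference, the Ti/Tv value I mention is the ratio of transitions (A<->G, C<->T) to transversions (all other base changes) among the called SNPs. Below is a minimal Python sketch of that calculation, assuming biallelic SNPs supplied as (REF, ALT) pairs. The function name `titv_ratio` and the example input are hypothetical and only for illustration; this is not how GATK itself computes or reports the value.

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_ratio(snps):
    """Return transitions / transversions for an iterable of (REF, ALT) pairs.

    Hypothetical helper for illustration only. Records that are not
    single-base A/C/G/T substitutions (indels, symbolic alleles) are skipped.
    """
    ti = 0
    tv = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        # Keep only single-base substitutions between standard bases.
        if len(ref) != 1 or len(alt) != 1 or ref == alt:
            continue
        if ref not in "ACGT" or alt not in "ACGT":
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    # With no transversions the ratio is undefined; return NaN rather than raise.
    return ti / tv if tv else float("nan")


# Example with made-up calls: two transitions, one transversion, one indel (skipped).
example = [("A", "G"), ("C", "T"), ("A", "C"), ("AT", "A")]
print(titv_ratio(example))
```

Since only 4 of the 12 possible single-base substitutions are transitions, calls spread evenly across all substitution types would give a ratio of about 0.5, which is why a low Ti/Tv is usually read as a sign of many false-positive SNPs.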