Dear all
I am finding it difficult to understand some features of the VCF (Variant Call Format) files I get when calling SNPs from Illumina reads mapped with bwa and piled up with samtools. I hope you can help me.
When looking at the VCF file (of a single sequenced individual, for example; straight from bcftools view, before varFilter) I noticed that the overall DP of the position and the genotype DP are quite different, the genotype DP often being drastically lower than the position DP. I understand that (as the manual states) these numbers may differ because of a quality filter, but I find it hard to believe that such a filter would reduce coverage from hundreds to fewer than 10, which is what I observe.
Does anyone know what this quality filter is exactly?
I also noticed that this pattern is more frequent at neighboring SNPs (with 0 to 2 sites in between).
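In case it helps anyone reproduce the comparison, here is a minimal Python sketch of how the two depths can be pulled out of one VCF data line. It assumes a single-sample VCF in which DP appears both in the INFO column and in the FORMAT/sample columns; the function name `depth_pair` and the example line are made up for illustration and are not real data.

```python
def depth_pair(vcf_line):
    """Return (INFO DP, genotype DP) for the first sample of one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    info, fmt, sample = fields[7], fields[8], fields[9]
    # INFO is a ';'-separated list of KEY=VALUE pairs (bare flags are skipped)
    info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    # FORMAT names the ':'-separated per-sample values, in order
    sample_map = dict(zip(fmt.split(":"), sample.split(":")))
    return int(info_map["DP"]), int(sample_map["DP"])


# Hypothetical example line, invented for illustration only
line = "chr1\t1000\t.\tA\tG\t50\t.\tDP=250;AF1=0.5\tGT:PL:DP:GQ\t0/1:80,0,90:8:99"
info_dp, gt_dp = depth_pair(line)
print(info_dp, gt_dp)
```

Running this over every non-header line and keeping the positions would make it easy to check whether the large drops really cluster at adjacent SNPs.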
Any ideas?
Thanks!