Hi all,
I called SNPs using samtools mpileup and got strange results like:
Gm01 2923425 . A G,C 999 . DP=308;VDB=0.0224;AF1=0.6;G3=0.4,1.742e-06,0.6;HWE=0.00948;AC1=8;DP4=46,45,111,93;MQ=55;FQ=999;PV4=0.61,1,4.6e-09,1 GT:PLP:SP:GQ 1/1:255,157,0,255,157,255:52:0:99 0/1:255,255,255,169,169,0:56:0:3 0/0:200,255,255,0,255,255:59:2:48 1/1:255,102,0,255,102,255:34:0:99 0/0:0,154,255,154,255,255:51:0:99 0/1:255,255,255,90,90,0:30:0:3 1/1:255,39,0,255,39,255:13:0:37
According to the VCF format specification:
the ordering of genotypes for the likelihoods is given by: F(j/k) = (k*(k+1)/2)+j. In other words, for biallelic sites the ordering is: AA,AB,BB; for triallelic sites the ordering is: AA,AB,BB,AC,BC,CC, etc.
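To double-check my reading of that formula, here is a minimal Python sketch that enumerates the PL ordering for a given number of alleles. The function names `pl_index` and `genotype_order` are my own and are not part of samtools or any VCF library; alleles are numbered as in the GT field (0 = REF, 1 = first ALT, 2 = second ALT).

```python
from typing import List


def pl_index(j: int, k: int) -> int:
    """Position of genotype j/k in the PL list, using F(j/k) = k*(k+1)/2 + j."""
    if j > k:  # the formula assumes j <= k, so sort the two alleles
        j, k = k, j
    return k * (k + 1) // 2 + j


def genotype_order(n_alleles: int) -> List[str]:
    """List genotypes in PL order for n_alleles alleles (REF included)."""
    order = [""] * (n_alleles * (n_alleles + 1) // 2)
    for k in range(n_alleles):
        for j in range(k + 1):
            order[pl_index(j, k)] = f"{j}/{k}"
    return order


print(genotype_order(2))  # ['0/0', '0/1', '1/1']
print(genotype_order(3))  # ['0/0', '0/1', '1/1', '0/2', '1/2', '2/2']
```

With REF=A and ALT=G,C, the triallelic order 0/0, 0/1, 1/1, 0/2, 1/2, 2/2 corresponds to AA, AG, GG, AC, GC, CC.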
However, in my case, for example,
GT:PLP:SP:GQ 0/1:255,255,255,169,169,0:56:0:3
the sixth field of the PL string has the lowest value, "0", so the most likely genotype should be CC and the GT field should be 2/2 (C is the second ALT allele), but I get 0/1 in this case.
Another example:
0/0:200,255,255,0,255,255:59:2:48
Here the fourth field is "0", so the most likely genotype should be A/C and the GT field should be 0/2, but I get 0/0 in this case.
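To show the mismatch for every sample in the record, here is a small Python sketch that puts the reported GT next to the genotype with the lowest PL. It assumes GT is the first sample field and PL the second, as in the record above, and that ties go to the first minimum; `best_by_pl` and `SAMPLES` are names I made up for this check.

```python
# Sample columns copied from the VCF record above.
SAMPLES = [
    "1/1:255,157,0,255,157,255:52:0:99",
    "0/1:255,255,255,169,169,0:56:0:3",
    "0/0:200,255,255,0,255,255:59:2:48",
    "1/1:255,102,0,255,102,255:34:0:99",
    "0/0:0,154,255,154,255,255:51:0:99",
    "0/1:255,255,255,90,90,0:30:0:3",
    "1/1:255,39,0,255,39,255:13:0:37",
]


def genotype_order(n_alleles):
    """Genotypes in PL order: 0/0, 0/1, 1/1, 0/2, 1/2, 2/2, ..."""
    return [f"{j}/{k}" for k in range(n_alleles) for j in range(k + 1)]


def best_by_pl(sample, n_alleles=3):
    """Return (reported GT, genotype with the lowest PL) for one sample column."""
    fields = sample.split(":")
    gt = fields[0]                                # assumed: GT is field 1
    pls = [int(x) for x in fields[1].split(",")]  # assumed: PL is field 2
    order = genotype_order(n_alleles)
    return gt, order[pls.index(min(pls))]         # first minimum wins on ties


for s in SAMPLES:
    gt, best = best_by_pl(s)
    flag = "" if gt == best else "  <-- mismatch"
    print(f"GT={gt}  lowest PL={best}{flag}")
```

On this record the three mismatching samples are the two 0/1 calls (lowest PL at 2/2) and the 0/0 call with PL 200,255,255,0,255,255 (lowest PL at 0/2); the other four samples agree.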
I am confused about this.