03-15-2017, 06:43 AM   #1
Junior Member
Location: Tórshavn, Faroe Islands

Join Date: Mar 2017
Posts: 2
Spurious dosage coefficient of determination in imputed VCF file

I have a VCF file with dosage r^2 in the INFO field. The r^2 value should lie between 0 and 1, but my file contains both negative values and values above 1.

Is there a fundamental problem with my data? I might add that this is whole-exome data where the off-target regions have been imputed using Beagle.

I have pasted some data extracted with VCFtools as an example. As you can see, there are huge numbers (both positive and negative) and a lot of zeros.

Dosage r^2 example (columns: chromosome, position, REF, ALT, dosage r^2):
1 10177 A AC 0
1 10235 T TA 0
1 10352 T TA 0
1 10642 G A 0
1 11008 C G 0.01
1 11012 C G 0.01
1 11063 T G 0

More dosage r^2 examples:
One allele with dr2=0, one with a huge number:
1 66381 TATATA AATATA,T 0,5.10663e+28
One with high correlation, another with a huge (negative) number:
1 769829 C A,G 0.82,-7.97911e+26
There are also really tiny numbers, which are plausible but suspicious:
1 15274 A G,T 0,3.66383e-14
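
For anyone who wants to reproduce the check, here is a minimal Python sketch of how the out-of-range values could be flagged. It assumes whitespace-separated lines laid out like the excerpts above (chromosome, position, REF, ALT, then one r^2 value per ALT allele, comma-separated). The helper name flag_out_of_range is made up for illustration and is not part of VCFtools or Beagle.

```python
import math


def flag_out_of_range(lines):
    """Return (chrom, pos, alt, value) for each per-allele r^2 outside [0, 1]."""
    flagged = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        chrom, pos, _ref, alts, values = fields
        # Multi-allelic sites carry one value per ALT allele, in the same order.
        for alt, raw in zip(alts.split(","), values.split(",")):
            try:
                value = float(raw)
            except ValueError:
                continue  # non-numeric entry, e.g. a header or a missing value
            if math.isnan(value) or not 0.0 <= value <= 1.0:
                flagged.append((chrom, int(pos), alt, value))
    return flagged


# Rows copied from the excerpts above.
rows = [
    "1 11008 C G 0.01",
    "1 66381 TATATA AATATA,T 0,5.10663e+28",
    "1 769829 C A,G 0.82,-7.97911e+26",
    "1 15274 A G,T 0,3.66383e-14",
]

for hit in flag_out_of_range(rows):
    print(hit)
```

On these four rows it reports only the T allele at 66381 and the G allele at 769829. The tiny value at 15274 is inside [0, 1], so it passes this check even though it looks suspicious.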
olavur