Hi @all,
I'd like to know whether anyone has experience with monovar (http://www.nature.com/nmeth/journal/...meth.3835.html).
I'm asking because multi-sample genotyping doesn't seem to work properly. Here are two of many examples to illustrate the problem:
0/1:425,1:428:2:4,2,3240
0/1:705,1:706:2:4,2,3240
0/1:239,0:240:2:4,2,3240
1/1:8,145:154:7:3240,12,0
1/1:1,248:249:7:3240,12,0
1/1:2,57:59:7:1486,12,0
1/1:2,118:120:7:3240,12,0
1/1:1,27:28:7:693,12,0
1/1:2,36:38:7:915,12,0
1/1:0,8:8:7:224,12,0
1/1:1,47:48:7:1248,12,0
1/1:0,76:76:7:2046,12,0
1/1:0,64:64:7:1724,12,0
1/1:0,116:116:7:3240,12,0
1/1:0,13:13:7:355,12,0
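The first 3 records are controls; the remaining ones are tumor cells. To me it looks like the algorithm assumes an allelic dropout (ADO) in the control records simply because most of the other cells are called hom alt: the controls have at most 1 alt read among several hundred reads, yet they are genotyped 0/1. What is the logic behind this assumption?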
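To make the mismatch concrete, here is a minimal Python sketch that parses the control columns and prints the alt-allele fraction next to the called genotype. It assumes the sample fields are laid out as GT:AD:DP:GQ:PL, which is my reading of the output and not something I checked against the monovar documentation; `parse_call` and `alt_fraction` are hypothetical helper names.

```python
from typing import NamedTuple, Optional, Tuple


class Call(NamedTuple):
    gt: str
    ref_reads: int
    alt_reads: int
    depth: int
    gq: int
    pl: Tuple[int, ...]


def parse_call(field: str) -> Optional[Call]:
    """Parse one sample column, ASSUMED to be GT:AD:DP:GQ:PL.

    Returns None for a missing call ("./.").
    """
    if field.startswith("./."):
        return None
    gt, ad, dp, gq, pl = field.split(":")
    ref_reads, alt_reads = (int(x) for x in ad.split(","))
    pl_values = tuple(int(x) for x in pl.split(","))
    return Call(gt, ref_reads, alt_reads, int(dp), int(gq), pl_values)


def alt_fraction(call: Call) -> float:
    """Fraction of ref+alt reads that support the alt allele."""
    total = call.ref_reads + call.alt_reads
    return call.alt_reads / total if total else 0.0


# The three control records from the first example above.
controls = [
    "0/1:425,1:428:2:4,2,3240",
    "0/1:705,1:706:2:4,2,3240",
    "0/1:239,0:240:2:4,2,3240",
]

for field in controls:
    call = parse_call(field)
    print(call.gt, f"alt fraction = {alt_fraction(call):.4f}", f"GQ = {call.gq}")
```

All three controls come out as 0/1 with an alt fraction far below what one would expect for a real het.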
It seems to work fine when the site has a better mixture of genotypes:
0/1:22,51:73:3:1127,0,357
0/1:63,44:108:3:847,0,1346
0/1:11,47:59:3:1064,0,109
0/0:38,0:38:4:1,5,996
./.
0/0:144,0:144:4:1,5,3240
0/0:19,0:19:4:1,5,500
1/1:0,117:117:4:3240,8,1
1/1:0,1:1:2:31,4,3
1/1:0,78:78:4:2062,8,1
1/1:0,54:54:4:1421,8,1
1/1:0,167:167:4:3240,8,1
1/1:0,74:74:4:1960,8,1
1/1:0,320:321:4:3240,8,1
1/1:0,104:104:4:3240,8,1
The big question for me now is: how do I filter unreliable calls? Depth was my first choice, but it's probably not sufficient. The genotype quality always seems to be low, especially where ADO might have occurred (i.e. where the GT is hom alt or hom ref). But then only hets with a large difference between the likelihoods would count as reliable, and everything else would not.
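For illustration, this is roughly the filter I have in mind, as a minimal Python sketch. It again assumes the columns are GT:AD:DP:GQ:PL with PL in the usual 0/0, 0/1, 1/1 order; `is_reliable`, `pl_gap`, `min_depth` and `min_gap` are hypothetical names, and the threshold values are placeholders, not recommendations.

```python
def pl_gap(pl_values):
    """Gap between the best (lowest) and second-best PL value."""
    best, second = sorted(pl_values)[:2]
    return second - best


def is_reliable(field, min_depth=10, min_gap=20):
    """Keep a call only if it has enough depth AND a clear PL winner.

    The field layout GT:AD:DP:GQ:PL is an assumption; the default
    thresholds are arbitrary placeholders.
    """
    if field.startswith("./."):
        return False  # missing call
    gt, ad, dp, gq, pl = field.split(":")
    pl_values = [int(x) for x in pl.split(",")]
    return int(dp) >= min_depth and pl_gap(pl_values) >= min_gap


examples = [
    "0/1:22,51:73:3:1127,0,357",   # het from the second example
    "0/1:425,1:428:2:4,2,3240",    # control from the first example
    "1/1:0,117:117:4:3240,8,1",    # hom alt from the second example
    "1/1:0,1:1:2:31,4,3",          # hom alt with depth 1
]

for field in examples:
    print(field, "->", "keep" if is_reliable(field) else "drop")
```

With these placeholder thresholds the three hets from the second example pass (PL gaps of 357, 847 and 109), while every hom ref and hom alt call shown above is dropped because its PL gap is 12 or less, regardless of depth. That is exactly the behaviour I'm unsure about.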
Any ideas or suggestions are welcome.