SEQanswers

05-23-2013, 09:34 AM   #1
darren.obbard
Junior Member
 
Location: Edinburgh, UK

Join Date: Jan 2012
Posts: 5
VCF Indel encoding question

Hi,

I'm having trouble understanding a VCF file produced by GATK. I have read the VCF standard, but I'm obviously missing something.

I /thought/ I understood how SNPs and short indels are represented, but clearly I do not. Below is an excerpt containing the sites I cannot interpret. I suspect it has something to do with GATK quality filters that I don't understand...

The excerpt below was generated using:

GATK -l INFO -I my.bam -R my.fa -T UnifiedGenotyper -S LENIENT -nt 8 --heterozygosity 0.1 -o test.vcf --genotype_likelihoods_model BOTH --min_base_quality_score 10 --output_mode EMIT_ALL_SITES -ploidy 2

Thanks!

Darren

-------------------------------------------------------
Code:
CH1	225	.	T	G	12.71	LowQual	AC=1;AF=0.500;AN=2;BaseQRankSum=1.978;DP=59;Dels=0.03;FS=0.000;HaplotypeScore=10.2840;MLEAC=1;MLEAF=0.500;MQ=70.25;MQ0=8;MQRankSum=-5.349;QD=0.22;ReadPosRankSum=-3.188	GT:AD:DP:GQ:PL	0/1:41,16:55:20:20,0,1435
CH1	226	.	T	.	121.53	.	AN=2;DP=59;MQ=70.25;MQ0=8	GT:DP	0/0:43
CH1	227	.	A	.	121.53	.	AN=2;DP=59;MQ=70.25;MQ0=8	GT:DP	0/0:43
CH1	228	.	T	.	121.53	.	AN=2;DP=59;MQ=70.25;MQ0=8	GT:DP	0/0:43
CH1	229	.	A	.	115.53	.	AN=2;DP=57;MQ=69.66;MQ0=8	GT:DP	0/0:38
CH1	230	.	C	.	115.53	.	AN=2;DP=57;MQ=69.66;MQ0=8	GT:DP	0/0:38
CH1	231	.	T	.	115.53	.	AN=2;DP=57;MQ=69.66;MQ0=8	GT:DP	0/0:36
CH1	232	.	G	.	115.53	.	AN=2;DP=57;MQ=69.66;MQ0=8	GT:DP	0/0:36
CH1	233	.	C	.	115.53	.	AN=2;DP=57;MQ=69.66;MQ0=8	GT:DP	0/0:37
CH1	234	.	A	.	139.53	.	AN=2;DP=70;MQ=59.20;MQ0=14	GT:DP	0/0:63
CH1	235	.	A	.	175.53	.	AN=2;DP=84;MQ=51.67;MQ0=15	GT:DP	0/0:79
CH1	236	.	A	.	175.53	.	AN=2;DP=84;MQ=51.67;MQ0=15	GT:DP	0/0:79
CH1	237	.	T	.	175.53	.	AN=2;DP=85;MQ=51.37;MQ0=16	GT:DP	0/0:80
CH1	238	.	A	.	175.53	.	AN=2;DP=102;MQ=46.90;MQ0=28	GT:DP	0/0:97
CH1	238	.	A	AGAAAGAAAGCTTGTA	83.73	.	AC=1;AF=0.500;AN=2;BaseQRankSum=6.172;DP=102;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=46.90;MQ0=0;MQRankSum=-6.190;QD=0.05;ReadPosRankSum=-5.733	GT:AD:DP:GQ:PL	0/1:27,25:57:99:121,0,4853
CH1	239	.	A	.	175.53	.	AN=2;DP=102;MQ=46.90;MQ0=28	GT:DP	0/0:101
CH1	240	.	T	.	175.53	.	AN=2;DP=102;MQ=46.90;MQ0=28	GT:DP	0/0:98
CH1	241	.	A	.	169.53	.	AN=2;DP=108;MQ=44.14;MQ0=29	GT:DP	0/0:107
CH1	242	.	T	.	169.53	.	AN=2;DP=109;MQ=43.94;MQ0=29	GT:DP	0/0:103
CH1	242	.	T	.	118.27	.	AN=2;DP=109;MQ=43.94;MQ0=29	GT:AD:DP	0/0:27:55
CH1	243	.	C	.	172.53	.	AN=2;DP=110;MQ=43.76;MQ0=29	GT:DP	0/0:108
CH1	243	.	CTTTT	.	118.27	.	AN=2;DP=110;MQ=43.76;MQ0=29	GT:AD:DP	0/0:27:56
CH1	244	.	T	.	91.53	.	AN=2;DP=110;MQ=43.76;MQ0=29	GT:DP	0/0:61
CH1	245	.	T	.	91.53	.	AN=2;DP=110;MQ=43.76;MQ0=29	GT:DP	0/0:53
CH1	246	.	T	.	73.53	.	AN=2;DP=110;MQ=43.76;MQ0=29	GT:DP	0/0:41
CH1	247	.	T	.	91.53	.	AN=2;DP=110;MQ=43.76;MQ0=29	GT:DP	0/0:46
CH1	248	.	A	.	172.53	.	AN=2;DP=116;MQ=42.61;MQ0=31	GT:DP	0/0:100
CH1	249	.	A	.	172.53	.	AN=2;DP=116;MQ=42.61;MQ0=31	GT:DP	0/0:100
CH1	250	.	T	.	172.53	.	AN=2;DP=117;MQ=42.43;MQ0=32	GT:DP	0/0:101
CH1	251	.	T	.	169.53	.	AN=2;DP=117;MQ=42.43;MQ0=32	GT:DP	0/0:96
CH1	251	.	T	.	118.27	.	AN=2;DP=117;MQ=42.43;MQ0=32	GT:AD:DP	0/0:27:56
CH1	252	.	C	.	172.53	.	AN=2;DP=117;MQ=42.43;MQ0=32	GT:DP	0/0:113
CH1	253	.	C	.	172.53	.	AN=2;DP=117;MQ=42.43;MQ0=32	GT:DP	0/0:110
CH1	254	.	T	.	172.53	.	AN=2;DP=117;MQ=42.43;MQ0=32	GT:DP	0/0:111
CH1	255	.	T	.	172.53	.	AN=2;DP=117;MQ=42.43;MQ0=32	GT:DP	0/0:111
CH1	256	.	T	.	172.53	.	AN=2;DP=117;MQ=42.43;MQ0=32	GT:DP	0/0:111
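Line numbers below count from the first row of the excerpt above:
Line 1 (position 225) is a SNP.
Lines 14 and 15 (position 238) are an indel that I do understand.
Lines 19 and 20 (position 242) I do /not/ understand.
Lines 21 and 22 (position 243) I do /not/ understand.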
---------------------------------------------------
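For anyone wanting to pick out the puzzling rows mechanically, here is a minimal Python sketch. It labels each record from the lengths of its REF and ALT alleles and flags positions that occur more than once. The function names `classify` and `summarise` are made up for illustration, the sample rows are cut down to the first seven columns of the excerpt, and the code assumes a single ALT allele per record (no comma-separated alternates). It only describes the records; it does not explain why GATK emitted them.

```python
from collections import Counter


def classify(ref, alt):
    """Label a record from the lengths of its REF and ALT alleles."""
    if alt == ".":
        # No alternate allele was called at this record.
        return "ref-only" if len(ref) == 1 else "ref-only (multi-base REF)"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "MNP"  # same length, more than one base


def summarise(vcf_lines):
    """Return (chrom, pos, label, repeated) for every data line."""
    rows = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        rows.append((chrom, int(pos), classify(ref, alt)))
    # Count how often each (chrom, pos) pair occurs in the input.
    counts = Counter((chrom, pos) for chrom, pos, _ in rows)
    return [(c, p, label, counts[(c, p)] > 1) for c, p, label in rows]


# Rows taken from the excerpt above, truncated after the FILTER column.
sample = [
    "CH1\t225\t.\tT\tG\t12.71\tLowQual",
    "CH1\t238\t.\tA\t.\t175.53\t.",
    "CH1\t238\t.\tA\tAGAAAGAAAGCTTGTA\t83.73\t.",
    "CH1\t243\t.\tC\t.\t172.53\t.",
    "CH1\t243\t.\tCTTTT\t.\t118.27\t.",
]

for row in summarise(sample):
    print(row)
```

On these rows, both records at position 238 and both at position 243 come back flagged as repeated, and the `CTTTT` record is labelled as a reference-only record with a multi-base REF.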
06-06-2013, 12:38 AM   #2
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275

Darren, maybe you should re-post with the lines numbered. You can use the Unix "nl" command to do this.
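If "nl" is not to hand, a rough Python equivalent is sketched below. It prefixes each non-empty line with a running number and leaves blank lines unnumbered, which is approximately what "nl" does by default. The helper name `number_lines` is hypothetical, and the spacing is only an approximation of the real tool's output.

```python
def number_lines(lines, start=1):
    """Prefix each non-empty line with a running number.

    Blank lines are passed through without a number.
    """
    numbered = []
    n = start
    for line in lines:
        text = line.rstrip("\n")
        if text:
            # Right-align the number in six columns, then a tab.
            numbered.append(f"{n:>6}\t{text}")
            n += 1
        else:
            numbered.append(text)
    return numbered


# Two rows from the excerpt, truncated to the first five columns.
excerpt = [
    "CH1\t225\t.\tT\tG\n",
    "CH1\t226\t.\tT\t.\n",
]
for out in number_lines(excerpt):
    print(out)
```

To number a whole file, pass the open file object instead of a list, for example `number_lines(open("test.vcf"))`.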