SEQanswers

Old 12-22-2011, 12:00 PM   #1
ashkot
Member
 
Location: Cupertino, CA

Join Date: Nov 2011
Posts: 59
Interpreting 1000 Genomes data

Hi all, I generated a VCF file from 1000 Genomes data using samtools etc. and then annotated it with ANNOVAR. Below are a few lines from the annotated file. I am trying to work out whether this person has a certain SNP or not. For example, the first line gives the bases T and C, followed by het. Does this mean the person is TC at that location, and hence het in the next column? If so, the second line has A and G but says hom; how can that be? Alternatively, does it mean that only het/hom should be used for interpretation, and that T and C are the reference and risk alleles? If that is the case, then when someone is hom, how can I find out the genotype, since hom could mean AA or GG?

snp132 rs9276002 6 32694540 32694540 T C het 6.98 4 37
snp132 rs9276003 6 32694550 32694550 A G hom 105 5 37
snp132 rs9276004 6 32694564 32694564 C T hom 110 5 37
snp132 rs9276005 6 32694567 32694567 T C hom 110 5 37
snp132 rs9276006 6 32694582 32694582 G C hom 88.5 4 37
snp132 rs9276007 6 32694604 32694604 A G hom 16.9 2 37
snp132 rs9276008 6 32694633 32694633 C T hom 17.8 2 37
snp132 rs9276009 6 32694641 32694641 T C hom 13.9 2 37
snp132 rs9276010 6 32694686 32694686 A T het 8.65 3 37
snp132 rs9276013 6 32694724 32694724 T A hom 16.9 2 37
snp132 rs9276015 6 32694759 32694759 C G hom 10.2 2 37
snp132 rs9276017 6 32695022 32695022 T C hom 34 4 60

Appreciate any help.

Thank you.
Old 12-23-2011, 03:34 AM   #2
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151

It looks like the output you are reading gives you the reference allele and then the alternative allele, not the genotype.

Which individual are you looking at?
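
To make that concrete, here is a minimal sketch of how one line of the annotated output could be turned into a genotype. It assumes the columns are database, rsID, chromosome, start, end, reference allele, alternative allele and zygosity, and that hom means homozygous for the alternative allele, which is how variant-only calls are usually reported. Both assumptions should be checked against the ANNOVAR documentation. The function name annovar_genotype is made up for illustration.

```python
def annovar_genotype(line):
    """Return (rsid, genotype) for one annotated line.

    Assumed column layout (not confirmed in this thread):
    db, rsID, chrom, start, end, ref, alt, zygosity, ...
    """
    fields = line.split()
    rsid, ref, alt, zygosity = fields[1], fields[5], fields[6], fields[7]
    if zygosity == "het":
        # One copy of the reference allele, one of the alternative allele.
        return rsid, f"{ref}/{alt}"
    if zygosity == "hom":
        # Assumption: "hom" means two copies of the alternative allele.
        return rsid, f"{alt}/{alt}"
    raise ValueError(f"unexpected zygosity: {zygosity!r}")


print(annovar_genotype("snp132 rs9276002 6 32694540 32694540 T C het 6.98 4 37"))
print(annovar_genotype("snp132 rs9276003 6 32694550 32694550 A G hom 105 5 37"))
```

Under those assumptions the first line would be read as T/C and the second as G/G, not A/G.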
Old 12-23-2011, 12:28 PM   #3
ashkot
Member
 
Location: Cupertino, CA

Join Date: Nov 2011
Posts: 59

I am looking at HG00096, the very first one in the FTP list. In the meantime I also looked at another sample I had, and this one does have the genotypes; see the following lines.

11 219398 . G A 45 . DP=9;AF1=0.5;AC1=1;DP4=3,2,4,0;MQ=43;FQ=47.9;PV4=0.44,1,0.097,1 GT:PL:GQ 0/1:75,0,92:78
11 219452 . C G 72.3 . DP=5;AF1=1;AC1=2;DP4=0,0,4,1;MQ=46;FQ=-42 GT:PL:GQ 1/1:105,15,0:27
11 220401 . C T 36 . DP=7;AF1=0.5;AC1=1;DP4=2,2,2,1;MQ=45;FQ=39;PV4=1,1,1,1 GT:PL:GQ 0/1:66,0,88:69
11 220919 . T C 26 . DP=9;AF1=0.5;AC1=1;DP4=2,2,2,3;MQ=24;FQ=19.5;PV4=1,0.095,1,1 GT:PL:GQ 0/1:56,0,47:50
11 221195 . T C 6.98 . DP=8;AF1=0.4999;AC1=1;DP4=0,5,2,1;MQ=35;FQ=9.53;PV4=0.11,0.46,1,1 GT:PL:GQ 0/1:36,0,77:37
11 221322 . G T 26 . DP=8;AF1=0.5;AC1=1;DP4=1,3,2,1;MQ=51;FQ=28.8;PV4=0.49,1,1,0.31 GT:PL:GQ 0/1:56,0,69:59
11 222620 . T C 23 . DP=7;AF1=0.5;AC1=1;DP4=2,2,3,0;MQ=41;FQ=26;PV4=0.43,1,0.22,0.35 GT:PL:GQ 0/1:53,0,82:56
11 223119 . T C 40 . DP=3;AF1=1;AC1=2;DP4=0,0,2,1;MQ=37;FQ=-36 GT:PL:GQ 1/1:72,9,0:16
11 225466 . T C 30 . DP=6;AF1=0.5;AC1=1;DP4=2,1,1,2;MQ=53;FQ=32.6;PV4=1,0.1,0.058,1 GT:PL:GQ 0/1:60,0,70:63
11 230135 . T C 16.1 . DP=4;AF1=1;AC1=2;DP4=0,0,3,0;MQ=24;FQ=-36 GT:PL:GQ 1/1:48,9,0:15
11 230368 . C T 20 . DP=6;AF1=0.5;AC1=1;DP4=2,1,2,1;MQ=46;FQ=22.8;PV4=1,0.089,1,0.44 GT:PL:GQ 0/1:50,0,63:53
11 230751 . A G 6.21 . DP=3;AF1=0.5019;AC1=1;DP4=1,0,0,2;MQ=46;FQ=-7.1;PV4=0.33,0.35,1,0.066 GT:PL:GQ


There is not a single location where the genotype is 0/0, i.e. REF/REF, across the entire file.

Can you please let me know if I may have missed something.

Thanks,
Old 12-23-2011, 10:30 PM   #4
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151

I would first make certain that you are passing ANNOVAR the individual you think you are, and then I would contact the ANNOVAR developers to point out a possible bug.
Old 01-04-2012, 05:05 PM   #5
ashkot
Member
 
Location: Cupertino, CA

Join Date: Nov 2011
Posts: 59

The data shown above is from the VCF file BEFORE it is input into ANNOVAR. I need some help understanding the genotype info in the 1K Genomes files. Is there anywhere I can look?
Old 01-05-2012, 12:27 AM   #6
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151

There is general documentation about VCF files here:

http://www.1000genomes.org/wiki/Anal...mat-version-41

The vcftools community has a couple of mailing lists which you might find helpful:

http://sourceforge.net/mail/?group_id=279407
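
As a quick illustration of what the format documentation describes, here is a minimal sketch of decoding the GT field in VCF lines like the ones posted earlier in the thread. In GT, 0 stands for the REF allele and 1 for the first ALT allele, so 0/1 is REF/ALT and 1/1 is ALT/ALT. The helper name decode_genotype is hypothetical, and the sketch assumes a single-sample VCF with the sample in the tenth column.

```python
def decode_genotype(vcf_line):
    """Return (chrom, pos, genotype as bases) for a single-sample VCF record."""
    fields = vcf_line.split()
    if len(fields) < 10:
        raise ValueError("record has no sample column")
    chrom, pos, _id, ref, alt = fields[:5]
    # FORMAT (column 9) names the colon-separated values in the sample column.
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    gt = sample["GT"]
    # Allele index 0 is REF; 1, 2, ... index into the comma-separated ALT list.
    alleles = [ref] + alt.split(",")
    sep = "|" if "|" in gt else "/"
    bases = [alleles[int(i)] if i != "." else "." for i in gt.split(sep)]
    return chrom, int(pos), sep.join(bases)


line = ("11 219398 . G A 45 . "
        "DP=9;AF1=0.5;AC1=1;DP4=3,2,4,0;MQ=43;FQ=47.9;PV4=0.44,1,0.097,1 "
        "GT:PL:GQ 0/1:75,0,92:78")
print(decode_genotype(line))
```

If the file was produced by a caller run in variant-only mode (an assumption here, since the exact commands were not posted), sites where the sample is 0/0 would not be written at all, which could explain why none appear.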