SEQanswers

12-19-2011, 11:48 AM   #1
ashkot
Member
 
Location: Cupertino, CA

Join Date: Nov 2011
Posts: 59
Annotating 1000 Genomes data

Hi all,
I am working with 1000 Genomes data and I have completed all the analysis on my files of interest.

What I am really after is annotating my data. I have been using ANNOVAR, which works great; however, I have come across an issue.

First I used dbSNP 129 and hg18 for annotation; this time most variants were not annotated. I was then advised to use buildver 37 and hg19, and I used dbSNP 132. This time around most variants were annotated, but my variant of interest was not. Thinking it might simply not be annotated, I looked at the filtered file, but even the genomic coordinates for that variant did not appear there.

Can someone clearly explain how assembly versions, dbSNP builds, etc. fit together?

Happy Holidays to all !

Ashwin
12-20-2011, 01:36 AM   #2
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151

Ashkot,

Are you looking for functional annotation of your variants?

Do you know what assembly version your variants have their coordinates in?
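
If the assembly is not recorded anywhere obvious, one rough way to check is to look at the chromosome 1 length in the file header, since it differs between NCBI36/hg18 (247,249,719 bp) and GRCh37/hg19 (249,250,621 bp). The sketch below is only an illustration: the function name `guess_assembly` is made up here, and it assumes the header has either a VCF `##contig` line with `ID` listed first or a SAM/BAM `@SQ` line with `SN` before `LN`.

```python
import re

# Chromosome 1 lengths for the two assemblies discussed in this thread.
CHR1_LENGTHS = {
    247249719: "NCBI36/hg18",
    249250621: "GRCh37/hg19",
}

# VCF header style:  ##contig=<ID=1,length=249250621,...>
CONTIG_RE = re.compile(r"##contig=<ID=(?:chr)?1,.*?length=(\d+)")
# SAM/BAM header style:  @SQ<TAB>SN:1<TAB>LN:249250621
SQ_RE = re.compile(r"@SQ\t.*?SN:(?:chr)?1\t.*?LN:(\d+)")


def guess_assembly(header_lines):
    """Return an assembly name based on the chr1 length, or None if unknown."""
    for line in header_lines:
        match = CONTIG_RE.match(line) or SQ_RE.match(line)
        if match:
            return CHR1_LENGTHS.get(int(match.group(1)))
    return None
```

The header lines could come from the top of a VCF file or from the text header of a BAM file. If the function returns None, the header either lacks chromosome 1 or uses an assembly not listed in the table.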

thanks
12-20-2011, 11:40 AM   #3
ashkot
Member
 
Location: Cupertino, CA

Join Date: Nov 2011
Posts: 59

Hi, the dataset that I am working on is one of the .bam files that I downloaded from 1000 Genomes.

As for annotation, all I really want is to have dbSNP rs numbers in the VCF files. I did a lot of downstream analysis, such as converting the .bam into BCF and then piping it into ANNOVAR to generate an annotated VCF file. However, this VCF file does not have any rs IDs listed.

The coordinate system that I want to use is GRCh37. I am just wondering what the most straightforward way to accomplish this is.

thanks,
ashwin
12-20-2011, 10:52 PM   #4
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151

You will probably find the dbSNP VCF files useful:

ftp://ftp.ncbi.nih.gov/snp/organisms...9606/VCF/v4.0/
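
As a rough illustration of how a dbSNP VCF can supply rs numbers, the sketch below fills in the ID column of a call set by matching on chromosome, position, reference allele and alternate allele. The function names are made up for this example, both files are assumed to be on the same assembly, and the dbSNP records are held in an in-memory dictionary, which is only sensible for a small region.

```python
def strip_chr(chrom):
    """Normalize 'chr1' and '1' to the same chromosome name."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def load_rsids(dbsnp_lines):
    """Map (chrom, pos, ref, alt) to an rs ID from dbSNP VCF lines."""
    rsids = {}
    for line in dbsnp_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, rsid, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):
            rsids[(strip_chr(chrom), pos, ref, alt)] = rsid
    return rsids


def annotate(vcf_lines, rsids):
    """Yield VCF lines with an empty ID column filled in where dbSNP matches."""
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("#") or not line:
            yield line  # pass header and blank lines through unchanged
            continue
        fields = line.split("\t")
        chrom, pos, variant_id, ref, alts = fields[:5]
        if variant_id == ".":
            keys = [(strip_chr(chrom), pos, ref, alt) for alt in alts.split(",")]
            hits = [rsids[key] for key in keys if key in rsids]
            if hits:
                # The VCF ID column holds a semicolon-separated list.
                fields[2] = ";".join(dict.fromkeys(hits))
        yield "\t".join(fields)
```

Records that have no match keep "." in the ID column, so a variant missing from the chosen dbSNP build is easy to spot afterwards. For a full dbSNP file an indexed lookup would be more practical than loading everything into a dictionary.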
12-22-2011, 10:53 AM   #5
ashkot
Member
 
Location: Cupertino, CA

Join Date: Nov 2011
Posts: 59

Thanks, I think this might do the trick.