SEQanswers

Old 06-03-2013, 01:46 PM   #1
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
1000 Genomes Data Download/Tabix

Hi,

I am trying to download 1000 Genomes data using tabix, but I am unsure how to do this. I searched through previous threads and could not find a step-by-step tutorial.

I have tabix downloaded on PuTTY, but am not sure where to go from here. Step by step instructions or a tutorial would be much appreciated.

Thanks in advance!
Old 06-04-2013, 04:04 AM   #2
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151
Have you tried our example command lines from the FAQ?

http://www.1000genomes.org/faq/how-d...ction-vcf-file
Old 06-04-2013, 06:50 AM   #3
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
Yes, I tried copying and pasting that command line, but it does not work and says that it is not a function. Please help! Thanks!
Old 06-04-2013, 07:14 AM   #4
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151
Can you post your full command line and any error messages you see so we can help?
Old 06-04-2013, 10:15 PM   #5
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
I entered:

tabix -h ftp://ftp.1000genomes.ebi.ac.uk/vol1...lease/20100804 ALL.2of4intersection.20100804.genotypes.vcf.gz 2:39967768-39967768

and this came up:

[kftp_connect_file] 550 Could not get file size.
[main] fail to open the data file.


thanks!
Old 06-04-2013, 10:50 PM   #6
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151
Apologies, this is due to a typo in the example: there is a space between the FTP directory and the filename where there should be a /.
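
A minimal sketch of that fix, for anyone hitting the same error. The directory URL below is a hypothetical placeholder, because the real path is elided in this thread; the file name and region are the ones from the command posted above. The helper `join_url` is an illustrative name, not part of tabix. The sketch only prints the resulting command, so it runs without tabix or network access.

```shell
#!/usr/bin/env bash
# Assumption: placeholder directory, NOT the real 1000 Genomes release path.
release_dir="ftp://example.org/path/to/release"
vcf_name="ALL.2of4intersection.20100804.genotypes.vcf.gz"
region="2:39967768-39967768"

# Join a directory URL and a file name with exactly one "/" between them.
join_url() {
  local dir="${1%/}"    # drop one trailing slash, if present
  local file="${2#/}"   # drop one leading slash, if present
  printf '%s/%s\n' "$dir" "$file"
}

vcf_url="$(join_url "$release_dir" "$vcf_name")"

# Print the command rather than running it.
echo "tabix -h $vcf_url $region"
```

With a space instead of the slash, tabix treats the directory as the data file and the file name as a region, which would be consistent with the "Could not get file size" message above.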
Old 06-06-2013, 02:34 PM   #7
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
I tried replacing the space with a / and I got the same error message.

I entered this:

tabix -h ftp://ftp.1000genomes.ebi.ac.uk/vol1...notypes.vcf.gz 2:39967768-39967768

and got the same results.

help please! thank you!
Old 06-06-2013, 02:37 PM   #8
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
Wait, never mind. I got it to work. Thank you so much! I will let you know if I have any more questions!
Old 06-10-2013, 11:52 AM   #9
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
I finished downloading the data I need via tabix, but now I am unsure how to use it. How do you convert it into genotypes that are usable in PLINK for an LD analysis?

thanks!
Old 06-10-2013, 12:00 PM   #10
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Quote:
Originally Posted by strongside24
How do you convert it into genotypes that are usable in PLINK for an LD analysis?

thanks!
Perhaps like this: http://www.1000genomes.org/faq/can-i...linkped-format
Old 06-17-2013, 11:24 AM   #11
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
I tried that command, and it said this:

"Can't open perl script 'vcf_to_ped_converter.pl': No such file or directory"

What is this script, and how do I get it?

Much appreciated!
Old 06-17-2013, 11:41 AM   #12
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Quote:
Originally Posted by strongside24
I tried that command, and it said this:

"Can't open perl script 'vcf_to_ped_converter.pl': No such file or directory"

What is this script, and how do I get it?

Much appreciated!
It seems to be available here: ftp://ftp.1000genomes.ebi.ac.uk/vol1...r/version_1.1/

Last edited by GenoMax; 06-17-2013 at 12:01 PM.
Old 06-17-2013, 11:53 AM   #13
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13

Actually scratch that.

I'm using the online VCF to PED Converter here: http://browser.1000genomes.org/Homo_...Data/Haploview

But when I download everything, it does not pull all the variants; it seems to leave out the introns. For example, I tried pulling MAPK1. I entered the region I found through the browser and got 1619 variants rather than 6777. I'm assuming the introns are left out. Is there a way to pull/download all variants?

Thanks!!
Old 06-17-2013, 11:56 AM   #14
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
Also, is the online converter the same as the VCF to PED converter GenoMax referenced above? And is there a way to pull introns from this converter as well? Thanks!
Old 06-17-2013, 12:04 PM   #15
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Quote:
Originally Posted by strongside24
Also, is the online converter the same as the VCF to PED converter GenoMax referenced above? Thanks!
One would hope so since there appears to be only one version available for download.

Perhaps Laura can confirm by tomorrow.
Old 06-17-2013, 12:51 PM   #16
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
OK, sounds good. Thank you! Any instructions/info on how to get introns as well?
Old 06-18-2013, 12:52 AM   #17
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151
The VCF to PED form and script should not skip any variants that are variable in the individuals you ask them to consider.

Can you give a specific example of the variants you think are missing so that we can investigate?
Old 06-18-2013, 08:25 PM   #18
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
I tried running it for MAPK1. I entered the region I found through the browser and got 1619 variants rather than 6777. Thanks!
Old 06-19-2013, 12:31 AM   #19
laura
Senior Member
 
Location: Cambridge UK

Join Date: Sep 2008
Posts: 151
Can you give me the specific coordinates that you are using and let me know where your 6777 count comes from?

When I look at our vcf file in ftp://ftp.1000genomes.ebi.ac.uk/vol1...ease/20110521/

laura@pg-trace-001[20110521]:tabix ALL.chr22.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz 22:22108789-22221970 | cut -f1-8

There are only 1591 sites, and only 1532 of them are SNPs.

thanks
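
A rough sketch of how counts like these can be taken from tabix output. The records below are made-up toy data, not real 1000 Genomes lines, and the SNP test (single-base REF and single-base ALT) is a simplifying assumption that may differ from how the figures above were produced. The function names `count_sites` and `count_snps` are illustrative.

```shell
#!/usr/bin/env bash
# Toy VCF body lines (tab-separated, first five columns only); invented for illustration.
toy_vcf="$(printf '22\t100\trs1\tA\tG\n22\t200\trs2\tAT\tA\n22\t300\trs3\tC\tT\n22\t400\trs4\tG\tGA\n')"

# Count all non-header records read from standard input.
count_sites() {
  awk -F'\t' '!/^#/ { n++ } END { print n + 0 }'
}

# Count records whose REF (column 4) and ALT (column 5) are each a single base.
# Multi-allelic ALT fields such as "G,T" are not counted by this simple test.
count_snps() {
  awk -F'\t' '!/^#/ && $4 ~ /^[ACGT]$/ && $5 ~ /^[ACGT]$/ { n++ } END { print n + 0 }'
}

echo "sites: $(printf '%s\n' "$toy_vcf" | count_sites)"
echo "snps:  $(printf '%s\n' "$toy_vcf" | count_snps)"
```

On real data, the same functions could be fed from the tabix command above, for example by piping its output into `count_sites` instead of `cut -f1-8`.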
Old 06-23-2013, 10:15 AM   #20
strongside24
Member
 
Location: san francisco

Join Date: Jun 2013
Posts: 13
Those are the results when I use the online converter. But we decided to use the script instead of the online converter so I wouldn't have to save the data onto my computer.

I'm trying to run the converter script again and it won't work. I downloaded the VCF to PED converter mentioned above and ran this command:

perl vcf_to_ped_converter.pl -vcf ftp://ftp.1000genomes.ebi.ac.uk/vol1...notypes.vcf.gz -sample_panel_file ftp://ftp.1000genomes.ebi.ac.uk/vol1...L.sample_panel -region 13:32889611-32973805 -population GBR -population FIN

and got this as a result:

Can't open perl script "vcf_to_ped_convert": No such file or directory

I'm assuming I installed/downloaded the script wrong. Am I correct?
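
One quick check, sketched under the assumption that the error simply means perl cannot find a file with that name at the path given: confirm that the script exists and is readable before running it. The helper `check_script` is an illustrative name, and the demonstration uses a throwaway directory with an empty stand-in file rather than the real download.

```shell
#!/usr/bin/env bash
# Report whether a script file exists and is readable at the given path.
check_script() {
  local path="$1"
  if [ -r "$path" ]; then
    echo "found: $path"
  else
    echo "missing: $path (cd to the download directory or give perl the full path)"
    return 1
  fi
}

# Demonstration in a temporary directory, so nothing real is touched.
demo_dir="$(mktemp -d)"
touch "$demo_dir/vcf_to_ped_converter.pl"   # empty stand-in for the downloaded script

check_script "$demo_dir/vcf_to_ped_converter.pl"
# The shortened name from the error message does not match any file here.
check_script "$demo_dir/vcf_to_ped_convert" || true
```

If the check reports the file as missing, the name typed on the command line and the name of the saved file may differ, or the command may have been run from a different directory.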