SEQanswers

Old 07-16-2010, 09:28 AM   #1
cliff
Member
 
Location: USA

Join Date: Oct 2009
Posts: 41
Where to download hg19?

Dear All

Where can I download the hg19 reference files? I need to map my Illumina reads to hg19 using BWA.

Any help would be appreciated.

-C
Old 07-16-2010, 10:10 AM   #2
thaley
Junior Member
 
Location: Boston

Join Date: Jul 2010
Posts: 4

http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/

I get my references from UCSC.

Cheers
Old 07-16-2010, 11:17 AM   #3
cliff
Member
 
Location: USA

Join Date: Oct 2009
Posts: 41

Thanks, thaley, I just found that page too. Here is a question: how do I use twoBitToFa to convert hg19.2bit to hg19.fa?

I just tried

./twoBitToFa hg19.2bit hg19.fa

but it said "Floating point exception"..
Old 07-16-2010, 11:27 AM   #4
thaley
Junior Member
 
Location: Boston

Join Date: Jul 2010
Posts: 4

Hmm. Did you follow the directions on UCSC for the tool (build the source, etc.)?

Honestly, I got my references in .fa format before they started using this 2bit format. Sorry I can't be more help.

Offhand, I would double-check that the downloaded file is not truncated and that the source for the 2bit tools builds successfully.

...or if someone knows of an alternate location to get the .fa files, that would be the easiest.
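
A truncated download can be ruled out by comparing checksums. The following is a minimal sketch, assuming `md5sum` is available and that you have a trusted checksum for the file (I believe the UCSC download directories list one, but check the directory listing). `verify_md5` is a hypothetical helper name, not part of any UCSC tool.

```shell
# Hypothetical helper: compare a file's MD5 digest with an expected value.
# Usage: verify_md5 <file> <expected_md5>
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual, expected $expected)" >&2
        return 1
    fi
}
```

For example, `verify_md5 hg19.2bit <digest from the checksum list>`. A mismatch would suggest re-downloading the file before suspecting the twoBitToFa build.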
Old 07-16-2010, 11:56 AM   #5
aleferna
Senior Member
 
Location: sweden

Join Date: Sep 2009
Posts: 121
Try this one, it's one file per chromosome

http://hgdownload.cse.ucsc.edu/golde...chromFa.tar.gz
Old 07-16-2010, 12:00 PM   #6
cliff
Member
 
Location: USA

Join Date: Oct 2009
Posts: 41

Aleferna

Thanks, I have that one too. I am thinking of trying

cat chr*.fa > hg19.fa

But I am just not sure whether this concatenated hg19.fa is different from the one converted from hg19.2bit...
Old 07-16-2010, 02:27 PM   #7
aleferna
Senior Member
 
Location: sweden

Join Date: Sep 2009
Posts: 121

It should be the same, but check whether the archive includes the M chromosome and the haplotype chromosomes. I'm not sure about that; you might have to separate those out before doing the cat.
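
A plain `cat chr*.fa` would also pick up any haplotype or unplaced-sequence files matched by the glob, and in shell sort order (chr10 before chr2). A minimal sketch of a more selective concatenation follows, assuming the archive unpacks to files named chr1.fa through chr22.fa, chrX.fa, chrY.fa and chrM.fa alongside any extras. `concat_primary` is a hypothetical helper name, and whether to include chrM or the extra files depends on your analysis.

```shell
# Hypothetical helper: concatenate only the primary chromosomes, in
# karyotype order, leaving out any other chr*.fa files in the directory.
# Usage: concat_primary <dir_with_chr_files> <output.fa>
concat_primary() {
    dir=$1
    out=$2
    : > "$out"    # start with an empty output file
    for c in $(seq 1 22) X Y M; do
        f="$dir/chr$c.fa"
        if [ -f "$f" ]; then
            cat "$f" >> "$out"
        else
            # Assumed file naming may not match; report instead of failing.
            echo "warning: $f not found, skipping" >&2
        fi
    done
}
```

For example, `concat_primary . hg19.fa` run in the directory where the archive was unpacked. Running `grep '^>' hg19.fa` afterwards lists which sequences actually ended up in the file.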
Old 07-18-2010, 03:58 PM   #8
mard
Member
 
Location: Melbourne

Join Date: Jan 2010
Posts: 21

I used the 1000 genomes hg19 reference sequence from:

ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz

It already has the haplotype chromosomes removed.
Old 07-19-2010, 07:52 AM   #9
cliff
Member
 
Location: USA

Join Date: Oct 2009
Posts: 41

mard:

Thanks for your response! Is this 1000 Genomes hg19 reference sequence different from the one from UCSC? All the files I have been using were downloaded from UCSC, and I hope there won't be any discrepancies between these different versions of hg19.

Thanks

-C
Old 07-19-2010, 08:30 AM   #10
mfischer
Junior Member
 
Location: Austria

Join Date: Mar 2010
Posts: 9

Hi cliff,

According to ftp://ftp.1000genomes.ebi.ac.uk/vol1...k_v37.fasta.gz, the 1000 Genomes hg19 reference was built as follows:

Quote:
10th October 2009

Here are the steps used to produce this version of the human reference sequence to be used for the
main production project of the 1000 Genomes.

1. Download individual chrs from ensembl ftp

ftp://ftp.ensembl.org/pub/current_fa...o_sapiens/dna/

2. Download the newer version of the MT (NC_012920) from:

http://www.ncbi.nlm.nih.gov/nuccore/251831106

3. Create a reference with chrs1-22, X, Y, NC_012920 MT, and include the non-chromosomal supercontigs. The new single fasta is posted:

ftp://ftp.sanger.ac.uk/pub/1000genom...ect_reference/

UCSC states in http://genome.ucsc.edu/cgi-bin/hgGateway:
Quote:
Note on chrM
Since the release of the UCSC hg19 assembly, the Homo sapiens mitochondrion sequence (represented as "chrM" in the Genome Browser) has been replaced in GenBank with the record NC_012920. We have not replaced the original sequence, NC_001807, in the hg19 Genome Browser. We plan to use the Revised Cambridge Reference Sequence (rCRS) in the next human assembly release.

Apart from UCSC's older version of the mitochondrial sequence and the haplotype chromosomes that UCSC includes, the 1000 Genomes reference should be identical to UCSC's.
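
One way to check for discrepancies yourself is to list the sequence names and lengths in each FASTA file and compare the two listings. This is a minimal sketch using awk; `fasta_lengths` is a hypothetical helper name, and the file names in the usage note are placeholders. It compares only names and lengths, not the bases themselves.

```shell
# Hypothetical helper: print "<name><TAB><length>" for every sequence in a
# FASTA file. The name is the first word of the header line, without ">".
# Usage: fasta_lengths <file.fa>
fasta_lengths() {
    awk '
        /^>/ {
            # New header: flush the previous record, then start a new one.
            if (name != "") print name "\t" len
            name = substr($1, 2)
            len = 0
            next
        }
        { len += length($0) }    # sequence line: add its length
        END { if (name != "") print name "\t" len }
    ' "$1"
}
```

For example, run `fasta_lengths ucsc_hg19.fa > ucsc.txt` and `fasta_lengths other_hg19.fa > other.txt`, then `diff ucsc.txt other.txt`. Naming conventions can differ between sources, so a name mismatch does not by itself mean the sequences differ; the lengths are the more telling column.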

Cheers
All times are GMT -8.