SEQanswers

Old 08-02-2013, 02:43 PM   #1
fghd
Junior Member
 
Location: Canada

Join Date: Aug 2013
Posts: 7
Download Human Genome

I want to download the assembled human genome from the NCBI site to calculate some statistics on it. I found GRCh37 at the link below:
http://www.ncbi.nlm.nih.gov/assembly...imary_Assembly

It includes all the chromosomes, but I don't know which format I should use for my calculations (GenBank or FASTA), or how to access the data, since clicking on FASTA loads all of the chromosome data in the web page. Is there any way to save this data to a file instead of loading it in the browser?

Thanks
Old 08-02-2013, 02:56 PM   #2
vivek_
Bioinformatician
 
Location: Denmark

Join Date: Jul 2012
Posts: 158

You can download/save the FASTA files from the FTP site:

ftp://ftp.ncbi.nlm.nih.gov/genbank/g...mosomes/FASTA/
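
For anyone who would rather script the download than click through the listing, here is a minimal Python sketch that streams one file to disk instead of loading it in a browser. The URL in it is a made-up placeholder (the address above is truncated in this thread), and the `download` helper name is my own, so treat it as an illustration rather than the one right way to do it.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url: str, dest: str) -> Path:
    """Stream the resource at `url` into the local file `dest`."""
    dest_path = Path(dest)
    # copyfileobj copies in chunks, so a whole chromosome never has to fit in memory.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


if __name__ == "__main__":
    # Hypothetical placeholder, NOT a real address: paste the full URL of one
    # file from the FTP directory listing here.
    example_url = "ftp://ftp.example.org/path/to/chr1.fa.gz"
    print("Would download:", example_url)
    # download(example_url, "chr1.fa.gz")
```

Calling `download` once per chromosome file in the directory gives you local copies that any FASTA-aware tool can read. Command-line tools such as wget or curl do the same job if you prefer the shell.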
Old 08-02-2013, 03:20 PM   #3
fghd
Junior Member
 
Location: Canada

Join Date: Aug 2013
Posts: 7

Thanks for your response.
Could you please explain how you found these files (for my future use)? This link (ftp://ftp.ncbi.nlm.nih.gov/genbank/g...ns/GRCh37.p13/) has the same title but different content.

Also, is it correct to open each chromosome page from this link (http://www.ncbi.nlm.nih.gov/nuccore/...7?report=fasta), let it load the FASTA data in the web page, and then copy and paste it into a Notepad file?

As I am new to bioinformatics, could you please explain what the chr.rm.out files are? Or do you know of any documentation that would help me better understand these files and how to analyze them?
Thanks
Old 08-02-2013, 04:48 PM   #4
fghd
Junior Member
 
Location: Canada

Join Date: Aug 2013
Posts: 7

To reach these FASTA files:
ftp://ftp.ncbi.nlm.nih.gov/genbank/g...apiens/GRCh37/ < Primary-assembly < assembled_chromosomes/
But Primary-Assembly is one of 10 assembly units; what about the others (ALT_REF_LOCI_?)?
Old 08-02-2013, 06:55 PM   #5
rhinoceros
Senior Member
 
Location: sub-surface moon base

Join Date: Apr 2013
Posts: 372

Quote:
Originally Posted by fghd View Post
To reach these FASTA files:
ftp://ftp.ncbi.nlm.nih.gov/genbank/g...apiens/GRCh37/ < Primary-assembly < assembled_chromosomes/
But Primary-Assembly is one of 10 assembly units; what about the others (ALT_REF_LOCI_?)?

NCBI's FTP site is full of README files; reading them would probably answer all your questions. The rm.out files you asked about are probably RepeatMasker output. And no, you shouldn't use Notepad. If you want to do bioinformatics, you'd better get familiar with the command line.
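
Since the original question was about calculating statistics on the downloaded files, here is a rough Python sketch of one way to do that without opening anything in an editor. It assumes the files are ordinary FASTA (optionally gzip-compressed); the function names and the choice of statistics (length, GC fraction, count of N bases) are mine, picked only as an example.

```python
import gzip
import io
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def summarize(name, counts):
    """Turn per-base counts into (name, length, gc_fraction, n_count)."""
    length = sum(counts.values())
    gc = counts["G"] + counts["C"]
    acgt = gc + counts["A"] + counts["T"]
    # GC fraction is taken over unambiguous bases only, so N runs do not dilute it.
    gc_fraction = gc / acgt if acgt else 0.0
    return name, length, gc_fraction, counts["N"]


def fasta_stats(lines):
    """Yield one summary tuple per FASTA record, reading line by line."""
    name, counts = None, Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield summarize(name, counts)
            parts = line[1:].split()
            name, counts = (parts[0] if parts else ""), Counter()
        elif name is not None:
            # Upper-case so soft-masked (lower-case) repeats are counted too.
            counts.update(line.upper())
    if name is not None:
        yield summarize(name, counts)


if __name__ == "__main__":
    # Tiny in-memory demo; for a real file use: with open_text("chr1.fa.gz") as fh: ...
    demo = io.StringIO(">demo_seq example\nACGTNNGG\nacgt\n")
    for record in fasta_stats(demo):
        print(record)
```

Because it reads one line at a time, this approach should cope with chromosome-sized files that Notepad cannot handle.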
Tags
fasta format, genbank, grch37, human genome
