SEQanswers

Old 01-21-2012, 09:47 AM   #1
vyahhi
Junior Member
 
Location: St. Petersburg / San Diego

Join Date: Feb 2011
Posts: 4
Genome references database with full taxonomy

Dear experts,

Is there a database of fully sequenced reference genomes annotated with taxonomy? Ideally it would come as one or more FASTA files with the full taxonomy in the headers, like
>Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacteriales;Enterobacteriaceae;Escherichia-Shigella;Escherichia coli str. K-12 substr. DH10B
There are 16S/18S rRNA databases like arb-silva.de that do this, but I couldn't find an equivalent for full genomes.

If no such database exists, what is the simplest way to mine this data from other databases (assuming good programming skills)?

Thank you!
Old 01-21-2012, 04:22 PM   #2
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 836

Here's the ftp link for the NCBI taxonomy database:

ftp://ftp.ncbi.nih.gov/pub/taxonomy/

I'm not quite sure what you want to use this for. Using FASTA files with full taxa in the headers seems incredibly inefficient.

If you have NCBI GI numbers for your sequences, you can look up the taxon ID for each one in the gi_taxid files:

ftp://ftp.ncbi.nih.gov/pub/taxonomy/gi_taxid.readme

And then use the nodes.dmp file from the taxdump database to traverse the complete taxonomy:

ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_readme.txt
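
The traversal described above could be sketched roughly as follows in Python. This is only an illustration, not a tested pipeline: it assumes the taxdump layout in which fields are separated by TAB | TAB, with tax_id, parent tax_id and rank as the first three columns of nodes.dmp, and tax_id, name, unique name and name class as the columns of names.dmp. The function names (split_dmp, parse_nodes, parse_names, lineage) are made up for this example.

```python
def split_dmp(line):
    # Taxdump rows separate fields with TAB | TAB and end with TAB |.
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


def parse_nodes(lines):
    # Assumed nodes.dmp columns: tax_id, parent tax_id, rank, ...
    # Only the first three columns are used here.
    parents, ranks = {}, {}
    for line in lines:
        fields = split_dmp(line)
        taxid = int(fields[0])
        parents[taxid] = int(fields[1])
        ranks[taxid] = fields[2]
    return parents, ranks


def parse_names(lines):
    # Assumed names.dmp columns: tax_id, name, unique name, name class.
    # Keep only the scientific name for each taxon.
    names = {}
    for line in lines:
        fields = split_dmp(line)
        if fields[3] == "scientific name":
            names[int(fields[0])] = fields[1]
    return names


def lineage(taxid, parents, names):
    # Walk parent links until the root, i.e. the node that is its own parent.
    # The root itself is left out of the result.
    path = []
    while parents[taxid] != taxid:
        path.append(names.get(taxid, str(taxid)))
        taxid = parents[taxid]
    return ";".join(reversed(path))
```

With the real files, one would pass open file handles, e.g. parse_nodes(open("nodes.dmp")) and parse_names(open("names.dmp")), and then call lineage(taxid, parents, names) for each taxid of interest. The returned string includes every node on the path, ranked or not; the ranks dictionary can be used to keep only levels such as phylum, class or family if a header in the style requested above is wanted.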

I guess you could also use the gi_taxid files to do a reverse lookup, retrieving all the NCBI sequence GIs with a particular taxid.
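
Both lookups, GI to taxid and taxid back to GIs, might look something like this in Python. It is a sketch under the assumption that each line of a gi_taxid dump holds two whitespace-separated integers, the GI followed by the taxid; check the readme linked above for the actual layout and file names. The function names are invented for illustration. The input is streamed line by line because these dumps are large.

```python
from collections import defaultdict


def taxids_for_gis(lines, wanted_gis):
    # Forward lookup: return {gi: taxid} for the GIs of interest only.
    wanted = set(wanted_gis)
    found = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        gi = int(parts[0])
        if gi in wanted:
            found[gi] = int(parts[1])
    return found


def gis_for_taxids(lines, wanted_taxids):
    # Reverse lookup: return {taxid: [gi, ...]} for the taxids of interest.
    wanted = set(wanted_taxids)
    found = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        taxid = int(parts[1])
        if taxid in wanted:
            found[taxid].append(int(parts[0]))
    return dict(found)
```

Each function makes a single pass over the file, so it is cheaper to collect all the GIs or taxids you need first and query them in one call than to scan the dump once per sequence.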