SEQanswers

10-29-2012, 12:29 PM   #1
mkdir
Member

protein gi number to description

Hello,

I have GI numbers from blastp -outfmt 6 results. Which programs and databases can convert a list of GI numbers to protein descriptions? Any help is appreciated. Thank you very much.

10-30-2012, 02:18 AM   #2
maubp
Peter (Biopython etc)

In theory BLAST could do this for you using the descriptions in the database. Please email NCBI to let them know people would use this:
http://blastedbio.blogspot.co.uk/201...criptions.html

But as to the general question, you want to go from a protein GI number, e.g. GI:7525018, to a description, here something like "ATP synthase CF1 alpha subunit"?

You could use the NCBI Entrez API to look up a record by protein GI number, see http://www.ncbi.nlm.nih.gov/books/NBK25499/
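
As a rough sketch of the Entrez route, the helper below only builds an ESummary request URL for one or more GI numbers; it does not contact NCBI itself. The function name esummary_url is made up for illustration, and the choice of the esummary.fcgi endpoint with db=protein is an assumption about which E-utility suits this lookup (the record title in the response should carry the description).

```shell
# Build an E-utilities ESummary URL for one or more protein GI numbers.
# Assumptions: esummary.fcgi with db=protein is the endpoint you want;
# the function name is hypothetical and not part of any NCBI tool.
esummary_url() {
    # Join all arguments with commas, the usual form for multiple IDs.
    ids=$(IFS=,; printf '%s' "$*")
    echo "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=protein&id=${ids}"
}

# Example call (needs network access), left commented out:
# curl -s "$(esummary_url 7525018)"
```

The response is XML, so you would still need to pull the title out of it; that parsing step is not shown here.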

Another online alternative is the TogoWS REST API (http://togows.dbcls.jp/site/en/), which makes this very easy thanks to its option to extract a specific field from a record. It can be used at the command line, in a script, or in your web browser:

Code:
$ curl http://togows.dbcls.jp/entry/protein/7525018/definition
ATP synthase CF1 alpha subunit (chloroplast) [Arabidopsis thaliana].
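
For a whole list of GI numbers rather than one, something like the following might do. It is only a sketch: the function name togows_urls and the input file name gi_list.txt are hypothetical, and it simply repeats the same /entry/protein/<gi>/definition URL pattern shown above, once per GI.

```shell
# Read GI numbers (one per line) on stdin and print one TogoWS
# "definition" URL per GI. The function name is hypothetical.
togows_urls() {
    while IFS= read -r gi; do
        # Skip blank lines so we never build a URL with an empty ID.
        if [ -n "$gi" ]; then
            echo "http://togows.dbcls.jp/entry/protein/${gi}/definition"
        fi
    done
}

# Example call (needs network access), left commented out:
# togows_urls < gi_list.txt | xargs -n 1 curl -s
```

This makes one request per GI, so for long lists a local lookup is likely to be kinder to the server.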

Last edited by maubp; 10-30-2012 at 03:16 AM. Reason: Fixing tags; suggest emailing NCBI about BLAST

10-30-2012, 05:42 AM   #3
kmcarr
Senior Member

Quote:
Originally Posted by mkdir
Hello,

I got gi numbers for blastp -outfmt 6 results. I am wondering which programs and database can convert the gi number list to protein descriptions. Any help? Thank you very much.

Since you used '-outfmt 6', I'm guessing that you ran your BLAST search locally, using BLAST+, with a local database. If so, you already have the tools and data you need: blastdbcmd, which is part of the BLAST+ distribution. This command should give you what you want.

Code:
$ blastdbcmd -db <your_blastdb_name> -entry_batch <gi_list_file> -outfmt '%g %t' -target_only -out <output_file_name>
The gi_list_file contains the GIs you want to look up, one per line. The -outfmt string above prints "GI<space>definition", one record per line. You can adjust it as desired (run blastdbcmd -help for a full description of the possible formats).
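
If you do not have the GI list as a file yet, a sketch like this could pull it out of the tabular BLAST results. It assumes the subject IDs in column 2 look like gi|7525018|ref|NP_051044.1| (which depends on how your database was built), and the function name extract_gis and the file names in the usage line are hypothetical.

```shell
# Extract unique GI numbers from BLAST tabular output (-outfmt 6).
# Assumption: column 2 holds subject IDs of the form
# "gi|<number>|<db>|<accession>|"; other ID styles are skipped.
extract_gis() {
    awk -F'\t' '{
        n = split($2, parts, "|")
        # Keep the GI only the first time it is seen, preserving order.
        if (n >= 2 && parts[1] == "gi" && !seen[parts[2]]++) print parts[2]
    }' "$@"
}

# Example call, left commented out:
# extract_gis blast_results.tsv > gi_list.txt
```

The resulting file can then be passed to -entry_batch in the blastdbcmd command above.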

Last edited by kmcarr; 10-30-2012 at 05:44 AM. Reason: Added '-target_only' to command line. This will output only the first definition if multiple definitions map to the gi

10-30-2012, 06:01 AM   #4
maubp
Peter (Biopython etc)

That's rather neat, kmcarr. In this case, for just the one GI number in my example:
Code:
$ blastdbcmd -db /data/blastdb/ncbi/nr -entry 7525018 -outfmt '%t' -target_only
ATP synthase CF1 alpha subunit [Arabidopsis thaliana]
Nice

10-30-2012, 01:02 PM   #5
mkdir
Member

Thank you very much kmcarr and maubp. This worked out.