Old 06-17-2013, 09:18 AM   #1
Location: /home/bob

Join Date: Jun 2012
Posts: 59
Best way to find host organism for protein IDs


I inherited an RNA-Seq analysis from someone else, and I'm trying to make sense of one step in it. The project looked for differentially expressed genes using DESeq, which works nicely. The DESeq output is then compared against a GTF file for the reference genome, and the protein IDs for each of the hits are extracted. A program called fastacmd is then used to fetch the amino acid sequences for those protein IDs. That all makes sense to me. However, those protein sequences are then BLASTed against the KEGG database, and the three-letter organism code of the best hit for each sequence is used to assign the host organism to that gene. This doesn't really make sense to me, since the header of the FASTA file generated by fastacmd already contains the organism name, so I'm hoping someone else can help. The work was originally done about a year ago, so no one can quite remember the reasoning behind it, and the BLAST results from KEGG gave some interesting output, so I want to be able to validate it.

Can anyone offer some insight into their logic, or perhaps suggest a better way?
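For reference, here is a rough sketch of reading the organism name straight from the fastacmd headers, which is the alternative hinted at above. It assumes NCBI-style deflines that end with the organism in square brackets; the example header, ID, and sequence are made up for illustration, and the function names are hypothetical.

```python
import re

# Assumption: the organism name is the last "[...]" group at the end of the defline.
ORGANISM_RE = re.compile(r"\[([^\[\]]+)\]\s*$")


def organism_from_defline(defline):
    """Return the text inside the trailing square brackets, or None if absent."""
    match = ORGANISM_RE.search(defline)
    return match.group(1) if match else None


def organisms_by_id(fasta_lines):
    """Map each header's first whitespace-delimited token to its organism name."""
    result = {}
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        header = line[1:].strip()
        if not header:
            continue  # skip empty headers
        seq_id = header.split(None, 1)[0]
        result[seq_id] = organism_from_defline(header)
    return result


# Made-up record, only to show the expected header layout.
example = [
    ">gi|0000|ref|XP_000000.1| hypothetical protein [Example organism]",
    "MKTAYIAK",
]
print(organisms_by_id(example))
```

Headers without a bracketed organism come back as None, so those could be listed and checked by hand before comparing against the KEGG best hits.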

bob-loblaw
Old 06-17-2013, 09:30 AM   #2
Senior Member
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,087

Perhaps you could install the taxdb database as indicated in this post:

And then also use the information in this blog post:
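As a rough illustration of where taxdb helps: with it installed, BLAST+ can report taxonomy fields such as `staxids` and `sscinames` in tabular output. The sketch below assumes a BLAST+ run (not the legacy tools mentioned in the first post) with a hypothetical column order of `qseqid sseqid evalue bitscore staxids sscinames`, and picks the organism of the highest-bitscore hit per query. The sample rows are made up.

```python
import csv
import io

# Assumed column order for the tabular output; adjust to match the actual run.
COLUMNS = ["qseqid", "sseqid", "evalue", "bitscore", "staxids", "sscinames"]


def best_hit_organisms(handle):
    """Return {query ID: scientific name of its highest-bitscore hit}."""
    best = {}
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        score = float(row["bitscore"])
        current = best.get(row["qseqid"])
        # Keep the first hit seen at the top score for each query.
        if current is None or score > current[0]:
            best[row["qseqid"]] = (score, row["sscinames"])
    return {query: name for query, (_score, name) in best.items()}


# Made-up rows, only to show the expected layout.
sample = io.StringIO(
    "q1\ts1\t1e-50\t200\t1\tOrganism A\n"
    "q1\ts2\t1e-10\t80\t2\tOrganism B\n"
    "q2\ts3\t1e-5\t50\t2\tOrganism B\n"
)
print(best_hit_organisms(sample))
```

The resulting names could then be compared against the organism assigned from the KEGG three-letter code for each protein ID.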
GenoMax