Hi,
I have several annotated bacterial genomes and would like to map pathway information to the coding sequences in each genome. In the past I used Blast2GO to query KEGG, but I no longer have access to it. So I've been looking at free command-line programs, mostly R-based: ReactomePA and KEGGREST (the MetaCyc tools look good, but I don't think they have a command-line option?). However, KEGGREST and ReactomePA require specific accessions as input (usually an Entrez Gene ID), and the only accessions present in my PGAAP-annotated file are RefSeq IDs (and a few SwissProt IDs). I've tried several tools (e.g., MyGene.Info via Bioconductor) to convert the RefSeq IDs to Gene IDs, and found that most of the RefSeq IDs do not map to any Gene ID. So, how can I get pathway information for these sequences?
Thanks!