
**GenoMax** · 03-07-2014, 04:17 AM

See this thread (you will need to use the suggestion in post #3 in reverse): http://seqanswers.com/forums/showthread.php?t=9390

NCBI's e-Utilities may also help: http://www.ncbi.nlm.nih.gov/books/NBK179288/

**bigmw** · 03-10-2014, 05:31 PM

The pathview package has a function, id2eg, which converts various types of gene IDs to Entrez Gene IDs for major research species. Check the help info:
library(pathview)
?id2eg

Meanwhile, the gage package has a dedicated vignette, "Gene set and data preparation"; see Section 5, "gene or transcript ID conversion":

http://bioconductor.org/packages/release/bioc/vignettes/gage/inst/doc/dataPrep.pdf

**Parharn** · 06-10-2014, 01:42 AM

Thanks. I work on S. pombe and I cannot find an annotation package for it on Bioconductor.
What should I do? And what should I put for org?

> gnames.eg=pathview::id2eg(gnames, category="symbol", org="????")

**bigmw** · 06-10-2014, 09:34 AM

The id2eg function in the pathview package works only if an annotation package exists, which is not the case for S. pombe.
If you just need your gene set data in Entrez Gene IDs, you can use the kegg.gsets function in the gage package:
> library(gage)
> data(korg)
> grep("pombe", korg[,2])
[1] 126
> korg[126,]
kegg.code scientific.name
"spo" "Schizosaccharomyces pombe"
common.name entrez.gnodes
"fission yeast" "0"
kegg.geneid ncbi.geneid
"SPAC144.03" "2542823"
> kg.spo=kegg.gsets(species="spo", id.type="entrez")
…

If you need to convert the gene IDs in your input data, you can follow the thread GenoMax referred to above and download the gene_info data file from the NCBI FTP site:
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz
Then, under a unix/linux shell, do:
gunzip gene_info.gz
egrep '^4896[[:space:]]' gene_info > sp.gene_info.txt

Columns 2-6 are (Entrez) GeneID, Symbol, LocusTag, Synonyms, and dbXrefs. Note that the S. pombe taxonomy ID is 4896.
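As a rough sketch of how the column layout above could be used, the following shell snippet pulls a two-column Symbol-to-Entrez GeneID table out of a gene_info-style file. The function name symbol_to_entrez and the toy rows (GeneIDs, symbols, locus tags) are made up for illustration; only the tab-separated column order and the taxonomy ID 4896 come from the description above.

```shell
#!/bin/sh
# Sketch: build a Symbol -> Entrez GeneID table from a gene_info-style file.
# symbol_to_entrez is a hypothetical helper name, not part of any NCBI tool.
# Usage: symbol_to_entrez TAXID FILE
symbol_to_entrez() {
    # Match column 1 (tax_id) exactly, so that e.g. 48960 is not picked up,
    # then print column 3 (Symbol) and column 2 (GeneID), tab-separated.
    awk -F'\t' -v taxid="$1" '$1 == taxid { print $3 "\t" $2 }' "$2"
}

# Toy input: the GeneIDs, symbols and locus tags below are placeholders,
# not real S. pombe records. Columns follow the gene_info order:
# tax_id, GeneID, Symbol, LocusTag, Synonyms, dbXrefs.
demo=$(mktemp)
printf '4896\t1000001\tgeneA\tLOCUS_A\t-\t-\n'     > "$demo"
printf '48960\t1000002\tgeneB\tLOCUS_B\t-\t-\n'   >> "$demo"
printf '4896\t1000003\tgeneC\tLOCUS_C\tsynC\t-\n' >> "$demo"

# Prints the two rows whose tax_id is exactly 4896 (geneA and geneC).
symbol_to_entrez 4896 "$demo"
rm -f "$demo"
```

On the real file this would be called as symbol_to_entrez 4896 gene_info, with the output redirected to a file of your choice; the resulting table can then be read into R and used as a lookup when renaming the rows of an expression matrix.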

Alternatively, you can use the Bioconductor biomaRt package to do the ID conversion.

**Parharn** · 06-11-2014, 05:30 AM

Thanks bigmw! Could you be a bit clearer about where I should apply these commands in the process? I am not sure whether I need Entrez IDs or not. I am sure that if I want to use my Cufflinks data I have to convert the IDs, but is it the same if I want to do the analysis with DESeq2, for instance?

Also, part 3.2 starts with:

> library(TxDb.Hsapiens.UCSC.hg19.knownGene)

I need help with finding the corresponding package for S. pombe instead of "TxDb.Hsapiens.UCSC.hg19.knownGene"!

Sorry, I am totally confused by all the IDs and libraries! I would appreciate it if you could give me some more help.

Entrez ID for GAGE