  • Parharn
    Member
    • Jul 2013
    • 84

    Entrez ID for GAGE

    How does one convert gene symbols into Entrez Gene IDs for using the data with GAGE?
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    See this thread (you will need to use the suggestion in post #3 in reverse): http://seqanswers.com/forums/showthread.php?t=9390

    NCBI's e-Utilities may also help: http://www.ncbi.nlm.nih.gov/books/NBK179288/


    • bigmw
      Senior Member
      • Aug 2013
      • 124

      #3
      The pathview package has a function, id2eg, which converts various types of gene IDs to Entrez Gene IDs for major research species. Check the help page:
      library(pathview)
      ?id2eg

      The gage package also has a dedicated vignette, “Gene set and data preparation”; see section 5, “gene or transcript ID conversion”.


      • Parharn
        Member
        • Jul 2013
        • 84

        #4
        Thanks. I work on S. pombe and I cannot find an annotation package for it on Bioconductor.
        What should I do? And what should I put for org?

        > gnames.eg=pathview::id2eg(gnames, category="symbol", org="????")


        • bigmw
          Senior Member
          • Aug 2013
          • 124

          #5
          The id2eg function in the pathview package works only if an annotation package exists for the species, which is not the case for S. pombe.
          If you just need your gene set data in Entrez Gene IDs, you can use the kegg.gsets function in the gage package:
          > grep("pombe", korg[,2])
          [1] 126
          > korg[126,]
          kegg.code scientific.name
          "spo" "Schizosaccharomyces pombe"
          common.name entrez.gnodes
          "fission yeast" "0"
          kegg.geneid ncbi.geneid
          "SPAC144.03" "2542823"
          > kg.spo=kegg.gsets(species="spo", id.type="entrez")


          If you need to convert the gene IDs in your input data, you can follow the thread GenoMax referred to above and download the gene_info data file from the NCBI FTP site:
          ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz
          Under a Unix/Linux shell, run:
          gunzip gene_info.gz
          egrep '^4896[[:space:]]' gene_info > sp.gene_info.txt

          Columns 2-6 are (Entrez) GeneID, Symbol, LocusTag, Synonyms, and dbXrefs. Note that the S. pombe taxonomy ID is 4896.
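          As a rough sketch of how those columns could be used, the shell function below looks up each input identifier in the species subset of gene_info and prints its Entrez GeneID. The function name symbols_to_entrez and the input file my_symbols.txt are hypothetical, chosen only for illustration. It assumes the input has one gene symbol or locus tag per line, and that matching on the Symbol column first and the LocusTag column second is acceptable for your data.

```shell
# Map gene symbols (or locus tags) to Entrez Gene IDs from a gene_info subset.
# Usage: symbols_to_entrez GENE_INFO SYMBOLS
#   GENE_INFO: tab-separated gene_info lines (column 2 = GeneID, 3 = Symbol, 4 = LocusTag)
#   SYMBOLS:   one gene symbol or locus tag per line (hypothetical input file)
# Prints "input<TAB>GeneID"; inputs with no match get "NA".
symbols_to_entrez() {
    awk -F'\t' -v OFS='\t' '
        # First file: build lookup tables keyed by Symbol and by LocusTag.
        NR == FNR {
            if ($0 ~ /^#/) next                              # skip a header line if present
            if (!($3 in by_symbol)) by_symbol[$3] = $2       # keep the first GeneID seen
            if ($4 != "-" && !($4 in by_tag)) by_tag[$4] = $2
            next
        }
        # Second file: look up each input, trying Symbol before LocusTag.
        {
            key = $1
            if (key in by_symbol)   print key, by_symbol[key]
            else if (key in by_tag) print key, by_tag[key]
            else                    print key, "NA"
        }
    ' "$1" "$2"
}
```

          For example, `symbols_to_entrez sp.gene_info.txt my_symbols.txt > symbol2entrez.txt` would write a two-column table that can be joined back to the expression data. Identifiers reported as NA may be listed only in the Synonyms column, which this sketch does not search.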

          Alternatively, you can use the Bioconductor biomaRt package to do the ID conversion.


          • Parharn
            Member
            • Jul 2013
            • 84

            #6
            Thanks bigmw! Could you clarify where in the process I should apply these commands? I am not sure whether I need Entrez IDs or not. I am sure that if I want to use my Cufflinks data I have to convert the IDs, but is it the same if I want to do the analysis with DESeq2, for instance?

            Also, part 3.2 starts with:

            > library(TxDb.Hsapiens.UCSC.hg19.knownGene)

            I need help finding the corresponding package for S. pombe to use instead of "TxDb.Hsapiens.UCSC.hg19.knownGene"!

            Sorry, I am totally confused by all the IDs and libraries! I would appreciate some more help.
