  • GOseq errors

    Hi

    Below is the code I used and the errors returned for the GOseq R package. I would appreciate any suggestions as to how to resolve them.

    Regards

    Bill

    ________________________________________________________

    > library(goseq)
    > de.genes <- scan("DE_genes.txt", what=character() )
    Read 2744 items
    > all.genes <- scan("all_genes.txt", what=character() )
    Read 35180 items
    > genes = as.integer(all.genes %in% de.genes)
    > names(genes) = all.genes
    > pwf = nullp(genes, "mm9", "ensGene")
    Loading mm9 length data...
    > mapping=read.table("innatedb.in",header=TRUE,sep="\t")
    > innatedb=split(mapping$Pathway,mapping$ID)
    > pathways=goseq(genes,pwf,gene2cat=innatedb)
    Using manually entered categories.
    Error in summary(map)[, 1] : incorrect number of dimensions

    > head(pathways, n = 50)
    Error in head(pathways, n = 50) : object 'pathways' not found
    > enriched.pathways = pathways$category[pathways$upval < 0.01]
    Error: object 'pathways' not found
    > enriched.pathways
    Error: object 'enriched.pathways' not found

  • #2
    I ran into the same problem using manual annotations. If you look at the code, the goseq function restricts the gene2cat list to names that are present in the row names of the data.frame returned by nullp. If the names of the data used to build the list are not set as the row names of that data.frame, nothing matches, and goseq chokes in the 'reversemapping' function because it is dealing with an empty list. This fixed the problem for me:

    N = nullp(DEgenes = DETECTED_VECTOR,bias.data=LENGTHS_VECTOR)
    rownames(N) <- names(LENGTHS_VECTOR) ##this is what fixes it
    go = goseq(N,gene2cat=GENES_TO_CAT_LIST,method='Wallenius')

    Obviously you'll have to tailor it a bit to your situation, but I hope that helps.
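
A quick way to see whether this is what is going wrong is to check the overlap yourself before calling goseq. The following is only a minimal base-R sketch of that check: `pwf_ids` stands in for `rownames()` of the nullp output and `gene2cat` for the manual annotation list, and all IDs are made up for illustration.

```r
# Toy stand-ins (hypothetical): IDs as they would appear in the row names of
# the nullp output, and a gene -> category list as passed to gene2cat.
pwf_ids <- c("geneA", "geneB", "geneC")
gene2cat <- list(geneA = c("path1", "path2"),
                 geneB = "path1",
                 geneZ = "path3")

# Keep only the entries whose names are known to the pwf, mirroring the
# restriction described in the post above.
kept <- gene2cat[names(gene2cat) %in% pwf_ids]

n_kept  <- length(kept)
dropped <- setdiff(names(gene2cat), pwf_ids)

# An empty result here is the situation that leads to the error.
if (n_kept == 0) {
  stop("No gene2cat names match the pwf row names; check the ID format.")
}
message(n_kept, " of ", length(gene2cat), " annotated genes match the pwf")
```

If `n_kept` comes out as zero on real data, the row names are missing or the two sources use different ID formats.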

    -Colin



    • #3
      Hi,

      I'm running into the same problem I think.

      > sessionInfo()
      R version 2.12.2 (2011-02-25)
      Platform: i386-apple-darwin9.8.0/i386 (32-bit)

      other attached packages:
      [1] goseq_1.2.0 geneLenDataBase_0.99.5 BiasedUrn_1.03


      Here's a snippet of my code:

      > all <- read.delim("qd_de_gene_size_down_only.txt",header=T,stringsAsFactors=FALSE)
      > head(all)
      gene CDS DE
      1 au5.g1.t1 927 0
      2 au5.g10.t1 795 0

      >
      > #list of assayed genes
      > assayed.genes<-all$gene
      > head(assayed.genes)
      [1] "au5.g1.t1" "au5.g10.t1" "au5.g100.t1" "au5.g1000.t1"
      [5] "au5.g10000.t1" "au5.g10001.t1"
      > is.vector(assayed.genes)
      [1] TRUE
      >
      > #read gene lengths (CDS)
      > gene.length<-all$CDS
      > head(gene.length)
      [1] 927 795 1941 2317 207 4239
      > is.vector(gene.length)
      [1] TRUE
      >
      > #list of DE genes (0|1) caution: use UP_ONLY file for goseq, or DOWN_only
      > de.genes<-all$DE
      > head(de.genes)
      [1] 0 0 0 0 0 0
      > is.vector(de.genes)
      [1] TRUE
      >
      >
      > #gene vector
      > gene.vector<-as.integer(de.genes)
      > names(gene.vector)<-assayed.genes
      >
      > #KEGG annotations data frame
      > #au5.g10000_t1 520259 Ribosome
      > go<-read.delim("../ids_kegg_map_associations.txt",header=FALSE,stringsAsFactors=FALSE)
      > head(go)
      V1 V2 V3
      1 au5.g10000_t1 520259 map03010
      2 au5.g10001_t1 520260 map00480

      > go.terms=data.frame(go$V1,go$V3)
      > head(go.terms)
      go.V1 go.V3
      1 au5.g10000_t1 map03010
      2 au5.g10001_t1 map00480
      3 au5.g10002_t1 <NA>
      4 au5.g10003_t1 <NA>
      5 au5.g10004_t1 <NA>
      6 au5.g10005_t1 <NA>
      > is.data.frame(go.terms)
      [1] TRUE

      > pwf=nullp(gene.vector,bias.data=gene.length)
      >
      > stats=goseq(pwf,gene2cat=go.terms,method='Wallenius')#good for R 2.12
      Using manually entered categories.
      Error in summary(map)[, 1] : incorrect number of dimensions
      In addition: Warning message:
      In goseq(pwf, gene2cat = go.terms, method = "Wallenius") :
      Gene column could not be identified in gene2cat conclusively, using the one headed go.V1
      >


      If I set the row names as suggested above:

      rownames(pwf) <- names(gene.length) ##this is what fixes it

      I get the same error:

      > pwf=nullp(gene.vector,bias.data=gene.length)
      > rownames(pwf) <- names(gene.length) ##this is what fixes it
      > stats=goseq(pwf,gene2cat=go.terms,method='Wallenius')#good for R 2.12
      Using manually entered categories.
      Error in summary(map)[, 1] : incorrect number of dimensions
      Calls: goseq -> reversemapping
      In addition: Warning message:
      In goseq(pwf, gene2cat = go.terms, method = "Wallenius") :
      Gene column could not be identified in gene2cat conclusively, using the one headed go.V1
      Execution halted


      Any suggestions welcomed - I'm missing something simple I suspect.

      Charles



      • #4
        I found the problem: the gene IDs are formatted differently in the two imported files.

        all: au5.g1.t1

        go.terms: au5.g1_t1


        dope!
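
For anyone hitting the same mismatch, one possible fix is to rewrite the annotation IDs into the format used by the expression table before calling goseq. This is a sketch only: it uses the two ID styles shown earlier in the thread and assumes the sole difference is `_t` versus `.t` in front of the trailing transcript number.

```r
# IDs copied from the two files shown earlier in the thread.
expr_ids <- c("au5.g10000.t1", "au5.g10001.t1")
anno_ids <- c("au5.g10000_t1", "au5.g10001_t1")

# Before the fix, no annotation ID matches an expression ID.
overlap_before <- sum(anno_ids %in% expr_ids)

# Replace the trailing "_t<digits>" with ".t<digits>" so both use one format.
anno_fixed <- sub("_t([0-9]+)$", ".t\\1", anno_ids)
overlap_after <- sum(anno_fixed %in% expr_ids)
```

Comparing `overlap_before` and `overlap_after` on the full ID vectors shows whether the rewrite actually reconciled the two files.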

        Charles



        • #5
          Error in summary(map)[, 1] : incorrect number of dimensions

          Examples in the goseq pdf file dated 1 April 2010 show goseq() being called with the first argument "genes" and second argument "pwf". This is wrong. The "genes" argument is not used with goseq(). Its first argument is "pwf".
          See ?goseq
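
To illustrate why the old call form misfires, here is a small base-R sketch. `goseq_like` is a hypothetical stand-in, not the real function; its leading arguments (`pwf`, `genome`, `id`, `gene2cat`) are assumed to mirror what ?goseq documents, so check your installed version. It only reports which caller expression R binds to which argument.

```r
# Hypothetical stand-in with the leading argument order assumed from ?goseq.
goseq_like <- function(pwf, genome, id, gene2cat = NULL) {
  # Report which caller expression was bound to which formal argument.
  # match.call() does not evaluate the arguments, so none need to exist.
  bound <- as.list(match.call())[-1]
  vapply(bound, deparse, character(1))
}

# Call form from the old examples: 'genes' lands in pwf, 'pwf' in genome.
old_style <- goseq_like(genes, pwf, gene2cat = innatedb)

# Call form with pwf first, as described above.
new_style <- goseq_like(pwf, gene2cat = innatedb)

print(old_style)
print(new_style)
```

With the old form, the 0/1 gene vector is treated as the pwf, which would explain why the category mapping finds nothing to match against.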



          • #6
            Hi all,
            I am also having the same problem. My ID format is the same but still I get the same error when I run this.

            > stats=goseq(pwf,gene2cat=go.terms,method='Wallenius')
            Using manually entered categories.
            Error in summary(map)[, 1] : incorrect number of dimensions
            In addition: Warning message:
            In goseq(pwf, gene2cat = go.terms, method = "Wallenius") :
            Gene column could not be identified in gene2cat conclusively, using the one headed go.ID

            Does anyone know the source of the error? What is "dimensions" mentioned in the message?
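
On the "dimensions" question, the failing expression `summary(map)[, 1]` indexes the result of `summary()` as if it were a two-dimensional table. The sketch below, in base R with made-up gene and pathway names, compares what `summary()` returns for a filled list and for an empty one, the empty case being what the earlier posts suggest goseq ends up with when no IDs match.

```r
# A gene2cat-style list with entries, and an empty one (what is left when no
# names survive the filtering step).
filled <- list(geneA = c("path1", "path2"), geneB = "path1")
empty  <- list()

dim_filled <- dim(summary(filled))  # rows = list entries, columns = summary fields
dim_empty  <- dim(summary(empty))   # the empty list gives no dimensions

# Indexing with [, 1] needs two dimensions; capture what happens when the
# list is empty instead of letting the script stop.
result <- tryCatch(summary(empty)[, 1],
                   error = function(e) conditionMessage(e))
print(result)
```

So the message most likely points to an empty category mapping, not to anything wrong with the shape of the input data frame itself.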

            Thanks!
            Melis



            • #7
              Originally posted by tedtoal:
              "Examples in the goseq pdf file dated 1 April 2010 show goseq() being called with the first argument "genes" and second argument "pwf". This is wrong. The "genes" argument is not used with goseq(). Its first argument is "pwf".
              See ?goseq"

              Thanks for the hint. I ran into the same problem as everyone above, and your hint helped me fix it.
