Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generally Applicable Gene-set Analysis (GAGE) problem

    I am trying to use GAGE to do some pathway analysis based on RNA-Seq data. The vignette is nicely written, but there is an error that I do not know if it is my fault in my script.

    I will show you the error first:
    Code:
    > head(expression.kegg.esg.up$coreGeneSets,4)
    $`hsa00190 Oxidative phosphorylation`
    character(0)
    
    $`hsa03010 Ribosome`
    character(0)
    
    $`hsa04141 Protein processing in endoplasmic reticulum`
    character(0)
    
    $`hsa04120 Ubiquitin mediated proteolysis`
    character(0)
    
    > head(expression.kegg.esg.dn$coreGeneSets,4)
    $`hsa04610 Complement and coagulation cascades`
    character(0)
    
    $`hsa04640 Hematopoietic cell lineage`
    character(0)
    
    $`hsa00140 Steroid hormone biosynthesis`
    character(0)
    Apparently, these are the list of significant pathway identified. I couldnt figure why the gene sets are missing in each pathway, unlike what is shown in the vignette. It is required for next step of the analysis.
    (link: http://bioconductor.org/packages/2.1...html/gage.html)

    I am thinking that because I might have probably converted the Entrez ID of Kegg.gs to Symbol instead. But, I am clueless on how it caused this error.

    My script:
    Code:
    #read data
    data = read.table("data.txt")
    row = read.table("rownames3.txt")
    col = read.table("colnames3.txt")
    length(col$V1)
    expression = matrix(scan("data3.txt"),byrow=T,ncol=length(col$V1),dimnames=list(row$V1,col$V1))
    str(expression)
    head(expression)
    
    #reading Gene set data input
    filename=system.file("extdata/c2.demo.gmt",package="gage")
    demo.gs=readList(filename)
    cn=colnames(expression)
    hn=grep('A710003',cn,ignore.case=T)
    dn=grep('A710004',cn,ignore.case=T)
    data(kegg.gs)
    data(egSymb)
    
    #convert it back to entrez id
    head(rownames(expression))
    kegg.gs.sym<-lapply(kegg.gs,eg2sym)
    lapply(kegg.gs.sym[1:3],head)
    
    #preliminra
    expression.kegg.p <- gage(expression,gsets=kegg.gs.sym, ref=hn, samp=dn, saaTest=gs.KSTest)
    write.table(rbind(expression.kegg.p$greater,expression.kegg.p$less), file="kegg_kstest.txt",sep="\t")
    expression.kegg.2d.p <- gage(expression,gsets=kegg.gs.sym, ref=hn, samp=dn, saaTest=gs.KSTest, same.dir=F)
    write.table(rbind(expression.kegg.2d.p$greater,expression.kegg.2d.p$less), file="kegg_kstest_2d.txt",sep="\t")
    
    #find sig pathway
    expression.kegg.sig = sigGeneSet(expression.kegg.p, outname="expression.kegg")
    expression.kegg.2d.sig = sigGeneSet(expression.kegg.2d.p, outname="expression.kegg.2d")
    write.table(expression.kegg.2d.sig$greater, file="kegg_kstest_2d_sig.txt",sep="\t")
    
    #sort and count significant gene sets based on q-value or p-va
    expression.kegg.esg.up <- esset.grp(expression.kegg.p$greater,expression,gsets=kegg.gs.sym, ref=hn,samp=dn,test4up=T,output=T,outname="expression.kegg.up",make.plot=F)
    expression.kegg.esg.dn <- esset.grp(expression.kegg.p$less,expression,gsets=kegg.gs.sym, ref=hn,samp=dn,test4up=T,output=T,outname="expression.kegg.dn",make.plot=F)
    head(expression.kegg.esg.up$setGroups,4)
    head(expression.kegg.esg.up$essentialSets,4)
    head(expression.kegg.esg.up$coreGeneSets,4)
    head(expression.kegg.esg.dn$coreGeneSets,4)
    My problem is that I can't proceed to the next step of analysis unless I ensure that the pathways contain the genes.
    Many thanks.

  • #2
    It would be helpful if you can post your input data here too.

    Comment

    Latest Articles

    Collapse

    • seqadmin
      Current Approaches to Protein Sequencing
      by seqadmin


      Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
      04-04-2024, 04:25 PM
    • seqadmin
      Strategies for Sequencing Challenging Samples
      by seqadmin


      Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
      03-22-2024, 06:39 AM

    ad_right_rmr

    Collapse

    News

    Collapse

    Topics Statistics Last Post
    Started by seqadmin, 04-11-2024, 12:08 PM
    0 responses
    18 views
    0 likes
    Last Post seqadmin  
    Started by seqadmin, 04-10-2024, 10:19 PM
    0 responses
    22 views
    0 likes
    Last Post seqadmin  
    Started by seqadmin, 04-10-2024, 09:21 AM
    0 responses
    17 views
    0 likes
    Last Post seqadmin  
    Started by seqadmin, 04-04-2024, 09:00 AM
    0 responses
    48 views
    0 likes
    Last Post seqadmin  
    Working...
    X