**entrez** · 03-30-2014, 08:00 AM

Thanks for the input. I agree with your impression of these two packages. Do you see differences in the results between GAGE and GOseq?

**sindrle** · 03-30-2014, 12:16 PM

I have not done a direct comparison yet, but I will in the future.

**bigmw** · 04-07-2014, 12:10 PM

Thanks for the good words on GAGE. Let us know if you have more comments/questions.

Originally posted by sindrle View Post

I have done a quick test with GOseq, but I must admit I like GAGE better at first glance. It is easy to follow, with a nice manual, nice plots, and lots of results and possibilities. I think it really facilitates further analysis.

But I'm going to give GOseq another go for sure!

**tigerxu** · 07-07-2014, 03:13 AM

Originally posted by bigmw View Post

I forgot that the sigGeneSet function has been updated to give users more control over the margins and font size. sigGeneSet calls an internal function, heatmap2, to generate the heatmaps, so check the arguments of this function:
args(gage:::heatmap2)
The two relevant arguments here are margins and cexRow, which control the margins for column/row names and the row name font size. You may do something like:
kegg.sig<-sigGeneSet(cnts.kegg.p,outname="~/RNAseq/13_Acute-Changes/14_GAGE_native_A1A2/A1A2All/A1A2All.kegg",pdf.size = c(7,12), margins = c(5,10))

I have a question about the margins argument of the sigGeneSet function. When I run the following command, I get an error:
> rcount.kegg.sig<-sigGeneSet(rcount.kegg.p, outname="sig.kegg",pdf.size=c(7,12),margins=c(5, 10))
Error in heatmap2(gs.data, Colv = F, Rowv = F, dendrogram = "none", col = cols, :
formal argument "margins" matched by multiple actual arguments

Can anyone help me?

Thanks!

**bigmw** · 07-07-2014, 08:54 AM

You may want to check the version of the gage package you are running, which can be seen by:
sessionInfo()
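
If you only need the version of one package rather than the full session listing, base R's packageVersion() can be compared against a minimum version. This is a minimal sketch; has_min_version is a hypothetical helper name, and the version strings are only placeholders.

```r
# Hypothetical helper: does an installed package meet a minimum version?
# Returns NA when the package is not installed.
has_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA)
  }
  # package_version objects can be compared directly with a version string
  packageVersion(pkg) >= min_version
}

# "stats" ships with every R installation, so it serves as a safe demo
print(has_min_version("stats", "1.0.0"))

# For gage this gives TRUE/FALSE, or NA if gage is not installed
print(has_min_version("gage", "2.14.3"))
```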

**tigerxu** · 07-09-2014, 03:06 AM

Originally posted by bigmw View Post

You may want to check the version of the gage package you are running, which can be seen by:
sessionInfo()

other attached packages:
[1] gage_2.14.2 GenomicAlignments_1.0.2
[3] BSgenome_1.32.0 Rsamtools_1.16.1
[5] Biostrings_2.32.0 XVector_0.4.0
[7] DESeq2_1.4.5 RcppArmadillo_0.4.300.8.0
[9] Rcpp_0.11.2 GenomicRanges_1.16.3
[11] GenomeInfoDb_1.0.2 IRanges_1.22.9
[13] BiocGenerics_0.10.0

Is this not the right version of gage?

**bigmw** · 07-09-2014, 09:45 AM

This is the latest version. Do you still get the problem?

**tigerxu** · 07-09-2014, 10:39 AM

Originally posted by bigmw View Post

This is the latest version. Do you still get the problem?

The problem is still there. However, I have modified the margins parameter inside the sigGeneSet function of the gage package, and it works now!

**bigmw** · 07-10-2014, 06:05 AM

I just checked the source code for sigGeneSet and the internal function gs.heatmap. There does seem to be a potential conflict in the margins argument. I will have the problem fixed. You can check for the updated version 2.14.3 in the next couple of days here:

gage

http://www.bioconductor.org/packages/release/bioc/html/gage.html

GAGE is a published method for gene set (enrichment or GSEA) or pathway analysis. GAGE is generally applicable independent of microarray or RNA-Seq data attributes including sample sizes, experimental designs, assay platforms, and other types of heterogeneity, and consistently achieves superior performance over other frequently used methods. In the gage package, we provide functions for basic GAGE analysis, result processing and presentation. We have also built pipeline routines for multiple GAGE analyses in a batch, comparison between parallel analyses, and combined analysis of heterogeneous data from different sources/studies. In addition, we provide demo microarray data and commonly used gene set data based on KEGG pathways and GO terms. These functions and data are also useful for gene set analysis using other methods.
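
For readers wondering how this kind of error arises in general, here is a minimal sketch. It is not the actual gage source: inner_plot, wrapper_conflict and wrapper_merged are hypothetical stand-ins. It assumes the conflict comes from a wrapper that sets margins itself while also forwarding the caller's arguments through `...`.

```r
# Stand-in for an internal plotting helper; NOT the real gage:::heatmap2.
inner_plot <- function(x, margins = c(5, 5), ...) {
  margins
}

# This wrapper hard-codes margins AND forwards `...`, so a caller-supplied
# margins reaches inner_plot twice.
wrapper_conflict <- function(x, ...) {
  inner_plot(x, margins = c(5, 5), ...)
}

# This wrapper merges caller arguments over its defaults before the call,
# so each argument is passed exactly once.
wrapper_merged <- function(x, ...) {
  args <- modifyList(list(margins = c(5, 5)), list(...))
  do.call(inner_plot, c(list(x), args))
}

# Capture the error message instead of stopping the script
err <- tryCatch(
  wrapper_conflict(1, margins = c(5, 10)),
  error = function(e) conditionMessage(e)
)
print(err)

print(wrapper_merged(1, margins = c(5, 10)))  # caller's margins win
print(wrapper_merged(1))                      # falls back to the default
```

Merging with modifyList() is only one way to let a caller override a default; whether the package fix takes this route is not stated in the thread.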

**tigerxu** · 07-10-2014, 06:26 AM

Originally posted by bigmw View Post

I just checked the source code for sigGeneSet and the internal function gs.heatmap. There does seem to be a potential conflict in the margins argument. I will have the problem fixed. You can check for the updated version 2.14.3 in the next couple of days here:
http://www.bioconductor.org/packages...html/gage.html

Okay, thanks! I will try version 2.14.3 later.

**tigerxu** · 07-11-2014, 12:26 PM

I have followed the default gage and pathview workflows on the example RNA-seq dataset. I also ran gage and pathview on the fold changes inferred by DESeq2. The two pipelines give different results: the one based on the DESeq2 fold changes generates far fewer significant pathways. For example:

> gage.kegg.sig<-sigGeneSet(gage.kegg.p, outname="sig.kegg",pdf.size=c(7,8))
[1] "there are 22 signficantly up-regulated gene sets"
[1] "there are 17 signficantly down-regulated gene sets"

> deseq2.kegg.sig<-sigGeneSet(deseq2.kegg.p, outname="deseq2.sig.kegg",pdf.size=c(7,8))
[1] "gs.data needs to be a matrix-like object!"
[1] "No heatmap produced for down-regulated gene sets, only 1 or none signficant."
[1] "gs.data needs to be a matrix-like object!"
[1] "there are 7 signficantly up-regulated gene sets"
[1] "there are 0 signficantly down-regulated gene sets"

I'm wondering which pipeline is more reliable for biological interpretation. Why does the pipeline based on DESeq2 return far fewer pathways? Can anyone give me some advice?

Thanks!

**crazyhottommy** · 07-14-2014, 12:56 PM

Hi there, thank you for making this awesome tool.

I am working with mouse data, and I want to know how to convert the gene sets into gene symbol format.

kg.mouse<- kegg.gsets("mouse")
kegg.gs<- kg.mouse$kg.sets[kg.mouse$sigmet.idx]
lapply(kegg.gs[1:3],head)

The eg2sym function is only for human data, so I cannot do the following:

data(egSymb)
kegg.gs.sym<- lapply(kegg.gs, eg2sym)

Thank you!
Tommy

**bigmw** · 07-15-2014, 12:49 PM

The pathview package provides two functions: eg2id and id2eg, for ID mapping/conversion for major research species. For details:
?pathview::eg2id

BTW, I would suggest converting your data IDs from symbol to Entrez Gene, rather than your gene set IDs from Entrez to symbol. The former should be much faster, as it only needs to call the conversion function once.
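
As a rough illustration of the "convert the data once" idea, here is a base-R sketch on toy data. The symbols, IDs and the id_map table are all made up for the example; in practice the map would come from a conversion function such as id2eg, assumed here to return one row per input ID with the Entrez ID in the second column.

```r
# Toy fold changes keyed by gene symbol (symbols and values are illustrative)
fc_by_symbol <- c(GeneA = 1.8, GeneB = -2.1, GeneC = 0.4, GeneD = -0.9)

# Hypothetical two-column map shaped like an ID-conversion result:
# column 1 = input symbol, column 2 = Entrez Gene ID (NA when unmapped)
id_map <- cbind(
  SYMBOL   = c("GeneA", "GeneB", "GeneC", "GeneD"),
  ENTREZID = c("1001", "1002", NA, "1004")
)

# Look up every symbol in a single pass, then drop genes without an Entrez ID
entrez <- id_map[match(names(fc_by_symbol), id_map[, "SYMBOL"]), "ENTREZID"]
keep <- !is.na(entrez)
fc_by_entrez <- setNames(fc_by_symbol[keep], entrez[keep])

# Gene sets keyed by Entrez ID can now be matched without further conversion
toy_gene_set <- c("1001", "1002", "9999")
n_matched <- sum(names(fc_by_entrez) %in% toy_gene_set)

print(fc_by_entrez)
print(n_matched)
```

Dropping unmapped genes before renaming avoids ending up with NA names in the vector that is later matched against the gene sets.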

**bigmw** · 07-15-2014, 05:09 PM

BTW, gage has a separate tutorial on data preparation. You can check Section 5, on gene or transcript ID conversion:

http://www.bioconductor.org/packages/release/bioc/vignettes/gage/inst/doc/dataPrep.pdf

**crazyhottommy** · 07-15-2014, 06:46 PM

Originally posted by bigmw View Post

BTW, gage has a separate tutorial on data preparation. You can check Section 5, on gene or transcript ID conversion:
http://www.bioconductor.org/packages...c/dataPrep.pdf

Thank you, I followed it after running DESeq. 1724 differentially expressed genes were used for the pathway analysis.

res <- nbinomTest( cds, 'control', 'treat' )

resSig <- res[ res$padj < 0.01 & (res$log2FoldChange >1| res$log2FoldChange < -1), ]

resSig <- na.omit(resSig)

require(gage)
data(kegg.gs)
deseq.fc<- resSig$log2FoldChange
names(deseq.fc)<- resSig$id
sum(is.infinite(deseq.fc)) # there are some infinite numbers, if use DESeq2, no such problem.
deseq.fc[deseq.fc>10]=10
deseq.fc[deseq.fc < -10] = -10  # spaces matter: "deseq.fc<-10" would be parsed as an assignment
exp.fc<- deseq.fc

#kegg.gsets works with 3000 KEGG species
data(korg)
head(korg[,1:3], n=20)

#let's get the annotation files for mouse and convert the gene set to gene symbol format
kg.mouse<- kegg.gsets("mouse")
kegg.gs<- kg.mouse$kg.sets[kg.mouse$sigmet.idx]
lapply(kegg.gs[1:3], head)

# to convert IDs among gene/transcript ID to Entrez GeneID or reverse, use eg2id and id2eg in the pathview package
library(pathview)
data(bods)
bods

gene.symbol.eg<- id2eg(ids=names(exp.fc), category='SYMBOL', org='Mm')
# convert the gene symbol to Entrez Gene ID
head(gene.symbol.eg, n=100)
head(gene.symbol.eg[,2], n=10)

names(exp.fc)<- gene.symbol.eg[,2]

fc.kegg.p<- gage(exp.fc, gsets= kegg.gs, ref=NULL, samp=NULL)
sel<- fc.kegg.p$greater[,"q.val"] < 0.1 & !is.na(fc.kegg.p$greater[,"q.val"])
table(sel)

sel.l<- fc.kegg.p$less[,"q.val"] < 0.1 & !is.na(fc.kegg.p$less[,"q.val"])
table(sel.l)

> table(sel.l)
sel.l
FALSE
202

> table(sel)
sel
FALSE
202

Am I doing it right?
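
One detail worth checking when capping fold changes in R: without spaces, `x<-10` is parsed as the assignment `x <- 10`, not as the comparison `x < -10`. Here is a minimal sketch on a toy vector (the values and the limit of 10 are illustrative) that avoids the comparison altogether by clamping with pmin/pmax.

```r
# Toy log2 fold changes, including infinite values
fc <- c(a = -Inf, b = -2.5, c = 0.7, d = 14, e = Inf)

# Clamp to [-10, 10]; pmin/pmax keep the names of the first argument
capped <- pmax(pmin(fc, 10), -10)
print(capped)

# Show how R parses the expression without spaces: the top-level call
# is the assignment operator `<-`, not the comparison `<`
parsed <- quote(y<-10)
is_assignment <- identical(parsed[[1]], as.name("<-"))
print(is_assignment)
```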
