Old 07-17-2014, 10:29 AM   #81

This is normal. Different analysis procedures give you different p-values, although you can still see a lot of consistency between them (the top hits or the order of significant pathways). We expect the native gage/pathview workflow to be more sensitive than the joint workflows because of the design of the GAGE analysis procedure. Importantly, GAGE takes the sample size into account by default, but the average fold-change scores output by other tools don't carry that information.
For details, an earlier thread discussed the same question:
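To make the difference concrete, here is a minimal sketch of the two workflows. The object names (cnts.norm, ref.idx, samp.idx, res) are placeholders for your own data, not from the original post; the gage() calls follow the standard Bioconductor usage:

```r
library(gage)
data(kegg.gs)

## Native workflow: gage() receives the full normalized expression matrix
## (genes x samples) plus the reference and sample column indices, so its
## per-gene statistics reflect group sizes and within-group variance.
# native.p <- gage(cnts.norm, gsets = kegg.gs, ref = ref.idx, samp = samp.idx)

## Joint workflow: gage() receives only a single fold-change vector from
## DESeq2. Sample-size and variance information has already been collapsed
## into one number per gene, so the test tends to be less sensitive.
# fc <- res$log2FoldChange        # res: DESeq2 results object (assumed)
# names(fc) <- rownames(res)      # gene IDs matching the gene sets
# deseq2.p <- gage(fc, gsets = kegg.gs, ref = NULL, samp = NULL)
```

With ref = NULL and samp = NULL, gage treats the input as a pre-computed per-gene score vector rather than raw per-sample data, which is why the joint pipeline returns fewer significant gene sets.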

Originally Posted by tigerxu:
I have followed the default workflows of gage and pathview on the example RNA-seq dataset. I also used the fold changes inferred by DESeq2, followed by gage and pathview. I found that the two pipelines output different results: the pipeline based on the DESeq2 fold changes generates far fewer significant pathways. For example:

> gage.kegg.sig<-sigGeneSet(gage.kegg.p, outname="sig.kegg",pdf.size=c(7,8))
[1] "there are 22 signficantly up-regulated gene sets"
[1] "there are 17 signficantly down-regulated gene sets"

> deseq2.kegg.sig<-sigGeneSet(deseq2.kegg.p, outname="deseq2.sig.kegg",pdf.size=c(7,8))
[1] " needs to be a matrix-like object!"
[1] "No heatmap produced for down-regulated gene sets, only 1 or none signficant."
[1] " needs to be a matrix-like object!"
[1] "there are 7 signficantly up-regulated gene sets"
[1] "there are 0 signficantly down-regulated gene sets"

I'm wondering which pipeline is more reliable for biological interpretation. Why does the pipeline based on DESeq2 return far fewer pathways? Can anyone give me some advice?

bigmw