Hello,
I am trying to find enriched pathways following an RNA-seq analysis with DESeq2, in a plant species that has no reference genome. However, it does not work as I expected.
I used the Bioconductor package "mygene" to get the Entrez identifiers corresponding to the UniProt identifiers in my gene list. I then built my fold-change vector from the log2 fold changes reported by DESeq2, using the Entrez identifiers as names.
Code:
head(foldchanges)
    111151       6920  100303206      24328       8668      20517
 0.5957113 -0.4976848  0.4986454 -0.1833950 -0.3897194  0.5718210
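For reference, a vector like this can be built roughly as follows. This is only a minimal sketch in base R: the data frame `res` and its column `entrez` are hypothetical stand-ins for a DESeq2 results table joined with the mygene mapping, and the rows are toy data reusing values from the output above.

```r
# Toy stand-in for a DESeq2 results table with mapped Entrez IDs.
# `res` and the column name `entrez` are assumptions for illustration.
res <- data.frame(
  entrez = c("111151", "6920", NA, "24328", "6920"),
  log2FoldChange = c(0.5957113, -0.4976848, 0.4986454, -0.1833950, -0.3897194),
  stringsAsFactors = FALSE
)

# Drop rows with no mapped identifier or no fold-change estimate.
keep <- !is.na(res$entrez) & !is.na(res$log2FoldChange)
res <- res[keep, ]

# Keep only the first row for each identifier so the names are unique.
res <- res[!duplicated(res$entrez), ]

# Named numeric vector: values are log2 fold changes, names are Entrez IDs.
foldchanges <- setNames(res$log2FoldChange, res$entrez)
print(head(foldchanges))
```

Keeping the first row per identifier is just one possible choice; averaging the duplicates would be another.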
However, when I use the gage function, I have trouble finding enriched pathways. My data come from a plant, so I thought of using the KEGG pathways for Arabidopsis thaliana, as follows:
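Code:
library(gage)
# Download the Arabidopsis thaliana KEGG gene sets
kegg.ath = kegg.gsets("ath", id.type = "entrez")
# Keep only the signaling and metabolic pathways
kegg.ath.sigmet = kegg.ath$kg.sets[kegg.ath$sigmet.idx]
keggres = gage(foldchanges, gsets = kegg.ath.sigmet, same.dir = TRUE)
But I get this kind of result: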
$greater
                                     p.geomean stat.mean p.val q.val set.size exp1
ath00970 Aminoacyl-tRNA biosynthesis        NA       NaN    NA    NA        0   NA
ath02010 ABC transporters                   NA       NaN    NA    NA        0   NA
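As far as I understand, a set.size of 0 means that none of the names in foldchanges were found among the identifiers of that gene set. A quick way to check this is to count the overlap directly. The sketch below is a minimal base R example: `set_overlap` is a hypothetical helper (not part of gage), and `toy_sets` is made-up data standing in for a real gene-set list such as kegg.ath.sigmet, with one set using locus-style identifiers and one using numeric identifiers.

```r
# Hypothetical helper: for each gene set, count how many of its
# identifiers appear among the names of the fold-change vector.
set_overlap <- function(fc, gsets) {
  vapply(gsets, function(ids) sum(unique(ids) %in% names(fc)), integer(1))
}

# Toy fold-change vector reusing three names and values from the post.
fc <- c("111151" = 0.5957113, "6920" = -0.4976848, "24328" = -0.1833950)

# Toy gene sets (assumptions for illustration only).
toy_sets <- list(
  set_locus_ids   = c("AT1G01010", "AT1G01020"),  # locus-style identifiers
  set_numeric_ids = c("6920", "24328", "999999")  # numeric identifiers
)

overlap <- set_overlap(fc, toy_sets)
print(overlap)
```

Running the same helper on the real objects, for example `set_overlap(foldchanges, kegg.ath.sigmet)`, together with `head(kegg.ath.sigmet[[1]])`, should show whether the two sides use the same kind of identifier at all.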
However, when I tried, out of curiosity, to use the Homo sapiens pathways with the following code, it seemed to work better:
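Code:
library(gageData)  # provides kegg.sets.hs and sigmet.idx.hs
data(kegg.sets.hs)
data(sigmet.idx.hs)
# Keep only the signaling and metabolic pathways
kegg.sets.hs = kegg.sets.hs[sigmet.idx.hs]
keggres = gage(foldchanges, gsets = kegg.sets.hs, same.dir = TRUE)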
$greater
                                      p.geomean  stat.mean     p.val     q.val set.size      exp1
hsa00010 Glycolysis / Gluconeogenesis 0.1426633 1.10118079 0.1426633 0.9257147       10 0.1426633
hsa00240 Pyrimidine metabolism        0.1566751 1.03046501 0.1566751 0.9257147       13 0.1566751
Can someone give me a clue about what is going on?
Best regards,
BioLion