SEQanswers

Old 10-21-2013, 04:07 PM   #1
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123
New RNA-Seq Pathway and Gene-set Analysis Workflows in R/Bioconductor

The gage package (2.12.0) now includes a new tutorial, "RNA-Seq Data Pathway and Gene-set Analysis Workflows". Note that you need to update to the current release versions of R (3.0.2) and Bioconductor (2.13) to use all the features. Please check it out:
http://bioconductor.org/packages/rel...html/gage.html
http://bioconductor.org/packages/rel...eqWorkflow.pdf

We first cover a full workflow, from preparation, read counting, data preprocessing and gene set testing to pathway visualization, in about 40 lines of code. The same workflow can be used for GO analysis or other types of gene set analysis too. We also describe joint workflows, i.e. doing the gene-level analysis with one of the major RNA-Seq analysis tools (DESeq/DESeq2, edgeR, limma and Cufflinks) and feeding the results into GAGE/Pathview for pathway analysis or visualization. All these workflows are implemented in R/Bioconductor.
Comments and questions are welcome. Thanks!
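
As a rough illustration of the preprocessing and gene set test steps in the native workflow, the sketch below normalizes a count matrix by library size, log-transforms it and defines a wrapper around gage(). The simulated matrix, the 4-vs-4 column layout, the +8 offset and the helper name run_kegg_test are assumptions for illustration (the tutorial works from real counts), and the wrapper is only defined here, not called.

```r
# Simulated raw counts: 200 genes x 8 samples (hypothetical stand-in for real data)
set.seed(1)
cnts <- matrix(rpois(200 * 8, lambda = 50), nrow = 200,
               dimnames = list(as.character(1:200), paste0("s", 1:8)))

# Drop genes with zero counts across all samples
cnts <- cnts[rowSums(cnts) != 0, , drop = FALSE]

# Library-size normalization: each sample is scaled by its library size
# relative to the geometric mean library size
libsizes <- colSums(cnts)
size.factor <- libsizes / exp(mean(log(libsizes)))
cnts.norm <- t(t(cnts) / size.factor)

# log2 transform with a small offset to stabilize low counts
cnts.norm <- log2(cnts.norm + 8)

# Assumed column layout: first 4 columns are samples, last 4 are references
samp.idx <- 1:4
ref.idx <- 5:8

# Hypothetical wrapper for the gene set test; requires the gage package
# and row names that use the same ID type as the gene sets
run_kegg_test <- function(expr, gsets, ref, samp) {
  gage::gage(expr, gsets = gsets, ref = ref, samp = samp, compare = "unpaired")
}
```

With real data and gage installed, the next step would be something like data(kegg.gs) followed by run_kegg_test(cnts.norm, kegg.gs, ref.idx, samp.idx).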

Last edited by bigmw; 10-21-2013 at 04:52 PM.
Old 10-22-2013, 10:53 AM   #2
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

GAGE and Pathview can be used independently of each other. GAGE does pathway and gene-set analysis, and works on types of gene sets other than pathways, such as GO, coexpressed/coregulated gene sets, TF or miRNA target lists, etc. Pathview can integrate and visualize user data on pathway graphs independently of the pathway analysis procedure.

Pathview package available at:
http://bioconductor.org/packages/rel.../pathview.html
Here is the info page with example output:
http://pathview.r-forge.r-project.org/
Old 10-22-2013, 01:58 PM   #3
entrez
Junior Member
 
Location: NY

Join Date: Nov 2010
Posts: 7

I have both gage and pathview installed on my computer. I tried to follow the example in the native workflow. Things work well, except that I didn't get 4 samples showing up in the same graph (or nodes with 4 slices) as in Figure 2 of the workflow document. Instead I got 4 separate graphs. What might I have done wrong?
Old 10-22-2013, 08:13 PM   #4
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

What versions of gage, pathview and Bioconductor do you have?
Old 10-23-2013, 06:30 AM   #5
entrez
Junior Member
 
Location: NY

Join Date: Nov 2010
Posts: 7

gage 2.10.0, pathview 1.1.4 and Bioconductor 2.12
Old 10-24-2013, 07:15 AM   #6
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

pathview 1.1.4 does not show multiple samples/states in the same graph. You need to upgrade to the current release, which is 1.2.0: http://bioconductor.org/packages/rel.../pathview.html.
I recommend an overall upgrade to R 3.0.2/Bioconductor 2.13, which will also update pathview and gage to their latest versions.
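
A quick way to see whether an upgrade is needed is to compare version strings in R. This is only a sketch: needs_upgrade is a made-up helper name, and the loop simply reports whatever versions happen to be installed locally.

```r
# TRUE when the installed version is older than the required one
needs_upgrade <- function(installed, required) {
  package_version(installed) < package_version(required)
}

# pathview version reported earlier in the thread vs. the 1.2.0 release
needs_upgrade("1.1.4", "1.2.0")

# Report what is installed locally, if anything
for (pkg in c("gage", "pathview")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    cat(pkg, as.character(packageVersion(pkg)), "\n")
  } else {
    cat(pkg, "is not installed\n")
  }
}
```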
Old 10-25-2013, 06:32 AM   #7
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

If you don't know how to upgrade Bioconductor, please check:
http://www.bioconductor.org/install/...uctor-packages
Here is a workaround if you run into problems:
https://stat.ethz.ch/pipermail/bioco...er/055642.html
Old 10-27-2013, 05:21 PM   #8
entrez
Junior Member
 
Location: NY

Join Date: Nov 2010
Posts: 7

Can I use the workflow (with necessary changes) for microarray data analysis? If so, how?
Old 10-28-2013, 09:42 AM   #9
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

The GAGE/Pathview workflow can be applied to microarray data analysis. Please check the main tutorials of gage and pathview for details:
http://bioconductor.org/packages/rel...t/doc/gage.pdf
http://bioconductor.org/packages/rel...c/pathview.pdf
Old 10-29-2013, 05:52 PM   #10
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

Pathview is actually applicable to any data that can be mapped to pathways, including gene, protein, metabolite, genetic, literature and other data. The tutorial includes examples with metabolite/compound data too.
Old 11-15-2013, 02:36 PM   #11
crazyhottommy
Senior Member
 
Location: Gainesville

Join Date: Apr 2012
Posts: 140

Hi, I was playing around with GAGE and have one question. I got the count table from HTSeq, and the row IDs are gene names. How can I change the gene names to GO term IDs?

Thanks
Old 11-16-2013, 08:06 AM   #12
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

You don't have to change gene names/IDs to GO term IDs. GAGE (like other gene set analysis tools) requires two major input data objects: your expression data (vector or matrix-like) and a gene set list (a list of gene ID vectors). Make sure the gene IDs in the expression data and the gene set list are of the same type, e.g. both Entrez Gene IDs or both gene symbols.
You may want to go through the basics and common uses of gage described in the main gage vignette:
http://bioconductor.org/packages/rel...t/doc/gage.pdf
If you want a quick start, sections 1, 6 and 7 (pages 1, 4-8) would be enough. You will see examples for both KEGG and GO analysis.
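
To make the two inputs concrete, here is a small base R sketch of an expression matrix and a gene set list, plus a check that their gene IDs actually overlap. All IDs, values and the helper name id_overlap are invented for illustration.

```r
# Toy expression matrix with gene IDs as row names (made-up values)
set.seed(1)
expr <- matrix(rnorm(12), nrow = 4,
               dimnames = list(c("10", "100", "1000", "10000"),
                               c("a", "b", "c")))

# Toy gene set list: a named list of gene ID vectors.
# "TP53" is a gene symbol mixed in on purpose to show a mismatched ID type.
gsets <- list(setA = c("10", "100", "999"),
              setB = c("1000", "TP53"))

# Fraction of each gene set that is found among the expression row names
id_overlap <- function(expr, gsets) {
  vapply(gsets, function(g) mean(g %in% rownames(expr)), numeric(1))
}

id_overlap(expr, gsets)
```

An overlap near zero for every set usually means the two inputs use different ID types and one of them needs to be converted first.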
Old 11-17-2013, 05:38 PM   #13
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

If you follow the RNA-Seq workflows (links in the first post above), you can actually start the demo examples from Step 2. In other words, you can start with the pre-mapped raw read count data (from the previous steps), i.e. hnrnp.cnts stored in gageData. I suggest you run the demo example and explore the gage/pathview functions and their input/output data yourself.
Old 11-17-2013, 08:35 PM   #14
crazyhottommy
Senior Member
 
Location: Gainesville

Join Date: Apr 2012
Posts: 140

Thank you!
Old 01-05-2014, 11:03 PM   #15
wilson90
Member
 
Location: Singapore

Join Date: May 2012
Posts: 48

In your vignette, we are supposed to provide our own annotation file.
Where did you obtain "kegg.gs"?
I also want to use GO annotation, so where can I obtain "GO.gs" in R?
Thank you.

Frustrated user
Old 01-06-2014, 02:28 PM   #16
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

The gage package has a function, kegg.gsets, that generates up-to-date pathway gene set data in real time for ~2300 KEGG species and KEGG Orthology (with species="ko").
The gageData package provides KEGG and GO gene sets for 4 common research species: human, mouse, rat and budding yeast.
You may want to go through the main vignette and other documents of the gage package (besides the RNA-Seq workflow tutorial):
http://bioconductor.org/packages/rel...t/doc/gage.pdf
http://bioconductor.org/packages/rel...html/gage.html

gageData is available at:
http://bioconductor.org/packages/rel.../gageData.html
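
As a sketch of what these gene set objects look like and how they might be obtained: a gene set list is a named list of gene ID vectors. The toy list below is invented, and get_gene_sets is a hypothetical helper that is only defined, not called. It assumes that kegg.gs and go.gs are bundled as datasets in gage and that kegg.gsets returns a list with kg.sets and sigmet.idx elements, which should be checked against the documentation of the installed version.

```r
# What a gene set list looks like: a named list of gene ID character vectors.
# The set names and IDs here are invented for illustration.
toy.gs <- list(
  "hsa00000 Example pathway" = c("10", "100", "1000"),
  "GO:0000000 example term"  = c("100", "2000")
)
str(toy.gs)

# Hypothetical helper, wrapped in a function so that nothing is loaded or
# downloaded unless it is called explicitly.
get_gene_sets <- function(species = "hsa") {
  if (!requireNamespace("gage", quietly = TRUE)) stop("gage is not installed")
  env <- new.env()
  # Bundled human gene sets (assumed to ship with gage)
  data(list = c("kegg.gs", "go.gs"), package = "gage", envir = env)
  # Fresh KEGG gene sets for the given species; needs network access
  kg <- gage::kegg.gsets(species)
  list(kegg.bundled = env$kegg.gs,
       go.bundled   = env$go.gs,
       kegg.fresh   = kg$kg.sets[kg$sigmet.idx])  # signaling + metabolic pathways
}
```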
Old 02-21-2014, 09:04 AM   #17
shriram
Member
 
Location: UK

Join Date: May 2010
Posts: 13

Hi
I am using pathview for yeast, but I get the following error while retrieving pathway information for 19 different pathways.
[1] "Downloading xml files for sce04113 Meiosis - yeast, 1/19 pathways.."
[1] "Downloading png files for sce04113 Meiosis - yeast, 1/19 pathways.."
Download of sce04113 Meiosis - yeast xml and png files failed!
Failed to download KEGG xml/png files, sce04113 Meiosis - yeast skipped!

Same functionality works fine with human data.

below is my R version
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

Thanks
Shriram
Old 02-21-2014, 10:17 AM   #18
shriram
Member
 
Location: UK

Join Date: May 2010
Posts: 13

Quote:
Originally Posted by shriram
Hi
I am using pathview for yeast however I get following error while retrieving pathway information for 19 different pathways.
[1] "Downloading xml files for sce04113 Meiosis - yeast, 1/19 pathways.."
[1] "Downloading png files for sce04113 Meiosis - yeast, 1/19 pathways.."
Download of sce04113 Meiosis - yeast xml and png files failed!
Failed to download KEGG xml/png files, sce04113 Meiosis - yeast skipped!

Same functionality works fine with human data.

below is my R version
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

Thanks
Shriram
############
Issue resolved
by taking substring of actual pathway name in kegg and specifying gene.idtype="KEGG"
path.ids <- substr(path.ids, 1, 8)
############
Old 02-22-2014, 07:36 AM   #19
bigmw
Senior Member
 
Location: US

Join Date: Aug 2013
Posts: 123

gene.idtype="KEGG" specifies the ID type used for the gene.data. It is not related to the error message, which indicates a download problem. As shown in your solution, this download problem is due to the wrong pathway IDs.

Quote:
Originally Posted by shriram
############
Issue resolved
by taking substring of actual pathway name in kegg and specifying gene.idtype="KEGG"
path.ids <- substr(path.ids, 1, 8)
############
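
The pathway ID fix can be sketched in base R as follows. The second pathway name is added here only as an extra example, and the sub() variant is an alternative that does not assume a fixed ID length.

```r
# Full gene set names of the form "<pathway ID> <pathway title>"
path.names <- c("sce04113 Meiosis - yeast", "sce04111 Cell cycle - yeast")

# Keep only the 8-character pathway ID, as in the fix quoted above
path.ids <- substr(path.names, 1, 8)

# Alternative: cut at the first space instead of at a fixed width
path.ids2 <- sub(" .*$", "", path.names)

path.ids
```

The trimmed IDs, rather than the full names, are what should be passed on as pathway IDs.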
Old 03-12-2014, 08:02 AM   #20
shocker8786
Member
 
Location: Urbana Illinois

Join Date: Jan 2013
Posts: 28

I have a question about using GAGE with data from cufflinks, as described in the RNA-Seq workflow tutorial. I have RNAseq data from pigs that was aligned using Tophat and analyzed for DEGs using cufflinks. I'm going through the process listed in the cufflinks section, but I'm running into an error. Below are the commands I've been entering. Everything runs fine until I get to the last command.

> cuff.res=read.delim(file="gene_exp.diff", sep="\t")
> cuff.fc=cuff.res$log2.fold_change
> gnames=cuff.res$gene
> sel=gnames!="-"
> gnames=as.character(gnames[sel])
> cuff.fc=cuff.fc[sel]
> names(cuff.fc)=gnames
> gnames.eg=pathview::id2eg(gnames, category ="symbol")
> sel2=gnames.eg[,2]>""
> cuff.fc=cuff.fc[sel2]
> names(cuff.fc)=gnames.eg[sel2,2]
> range(exp.fc)
Error: object 'exp.fc' not found

Do you know what the issue could be? I'm just starting out with RNAseq data and using R, and I haven't been able to find anyone else with this issue. Thanks.
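
The error message itself says that exp.fc was never created. Every earlier command builds cuff.fc, so the last command presumably needs either range(cuff.fc) or an assignment of cuff.fc to exp.fc first. The sketch below replays the same steps on a simulated table. The gene names, fold changes and the symbol-to-Entrez mapping are all made up, and the mapping matrix only imitates the two-column shape used in the commands above.

```r
# Simulated stand-in for the Cufflinks gene_exp.diff table (values invented)
cuff.res <- data.frame(
  gene = c("GENE1", "-", "GENE2", "GENE3"),
  log2.fold_change = c(1.5, 0.2, -2.0, 0.7),
  stringsAsFactors = FALSE
)

# Fold changes named by gene symbol, dropping rows without a gene name
cuff.fc <- cuff.res$log2.fold_change
gnames <- cuff.res$gene
sel <- gnames != "-"
cuff.fc <- cuff.fc[sel]
names(cuff.fc) <- gnames[sel]

# Hypothetical symbol-to-Entrez mapping; GENE2 has no match
gnames.eg <- cbind(SYMBOL = c("GENE1", "GENE2", "GENE3"),
                   ENTREZID = c("101", NA, "103"))

# Keep only genes that were mapped, and rename them by Entrez Gene ID
sel2 <- !is.na(gnames.eg[, 2]) & gnames.eg[, 2] != ""
cuff.fc <- cuff.fc[sel2]
names(cuff.fc) <- gnames.eg[sel2, 2]

# Define exp.fc before using it
exp.fc <- cuff.fc
range(exp.fc)
```

It may also be worth checking the species setting of the ID mapping step, since the data here are from pig and the mapping call above does not specify an organism.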

Tags
gene set, pathway analysis, r/bioconductor, rna-seq, visualization
