
  • #61
    Thanks for the input. I agree with your impression of these two packages. Do you see differences between the results from GAGE and GOseq?

    Comment


    • #62
      I have not done a direct comparison yet, but I will in the future.

      Comment


      • #63
        Thanks for the good words on GAGE. Let us know if you have more comments/questions.

        Originally posted by sindrle View Post
        I have done a quick test with GOseq, but I must admit I like GAGE better at first glance. Easy to follow, nice manual, nice plots, lots of results and possibilities. I think it really facilitates further analysis.

        But I'm going to give GOseq another go for sure!

        Comment


        • #64
          Originally posted by bigmw View Post
          I forgot that the sigGeneSet function has been updated to give users more control over the margins and font size. sigGeneSet calls an internal function, heatmap2, to generate the heatmaps, so check the arguments of this function:
          args(gage:::heatmap2)
          The two relevant arguments here are margins and cexRow, which control the margins for column/row names and the row name font size. You may do something like:
          kegg.sig<-sigGeneSet(cnts.kegg.p,outname="~/RNAseq/13_Acute-Changes/14_GAGE_native_A1A2/A1A2All/A1A2All.kegg",pdf.size = c(7,12), margins = c(5,10))
          I have a question about the margins argument of the sigGeneSet function. When I run the following command, I get an error:
          > rcount.kegg.sig<-sigGeneSet(rcount.kegg.p, outname="sig.kegg",pdf.size=c(7,12),margins=c(5, 10))
          Error in heatmap2(gs.data, Colv = F, Rowv = F, dendrogram = "none", col = cols, :
          formal argument "margins" matched by multiple actual arguments

          Can anyone help me?

          Thanks!

          Comment


          • #65
            You may want to check the version of the gage package you are running, which can be seen by:
            sessionInfo()
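As a lighter alternative to scanning the full sessionInfo() output, the package version can be queried and compared directly. This is a minimal sketch: the 2.14.3 threshold is only an illustrative assumption, and the gage lookup is guarded so the snippet also runs where gage is not installed.

```r
# Compare version numbers the way R does for packages.
installed <- package_version("2.14.2")  # the gage version reported in this thread
required  <- package_version("2.14.3")  # illustrative threshold (assumption)
needs_update <- installed < required
print(needs_update)

# Query the installed gage version, if the package is available.
if (requireNamespace("gage", quietly = TRUE)) {
  print(utils::packageVersion("gage"))
}
```

Comparing package_version objects avoids the pitfalls of comparing version strings alphabetically.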

            Comment


            • #66
              Originally posted by bigmw View Post
              You may want to check the version of the gage package you are running, which can be seen by:
              sessionInfo()
              other attached packages:
              [1] gage_2.14.2 GenomicAlignments_1.0.2
              [3] BSgenome_1.32.0 Rsamtools_1.16.1
              [5] Biostrings_2.32.0 XVector_0.4.0
              [7] DESeq2_1.4.5 RcppArmadillo_0.4.300.8.0
              [9] Rcpp_0.11.2 GenomicRanges_1.16.3
              [11] GenomeInfoDb_1.0.2 IRanges_1.22.9
              [13] BiocGenerics_0.10.0

              Is this not the right version of gage?

              Comment


              • #67
                This is the latest version. Do you still get the problem?

                Comment


                • #68
                  Originally posted by bigmw View Post
                  This is the latest version. Do you still get the problem?
                  The problem is still there. However, I modified the margins parameter inside the sigGeneSet function of the gage package, and now it works!

                  Comment


                  • #69
                    I just checked the source code for sigGeneSet and the internal function gs.heatmap. There does seem to be a potential conflict in the margins argument. I will have the problem fixed. You can check for the updated version, 2.14.3, in the next couple of days here:
                    GAGE is a published method for gene set (enrichment or GSEA) or pathway analysis. GAGE is generally applicable independent of microarray or RNA-Seq data attributes including sample sizes, experimental designs, assay platforms, and other types of heterogeneity, and consistently achieves superior performance over other frequently used methods. In the gage package, we provide functions for basic GAGE analysis, result processing and presentation. We have also built pipeline routines for multiple GAGE analyses in a batch, comparison between parallel analyses, and combined analysis of heterogeneous data from different sources/studies. In addition, we provide demo microarray data and commonly used gene set data based on KEGG pathways and GO terms. These functions and data are also useful for gene set analysis using other methods.
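For readers wondering how the error 'formal argument "margins" matched by multiple actual arguments' can arise, here is a minimal base R sketch. The functions inner_plot, wrapper and wrapper_fixed are hypothetical stand-ins, not the gage source code; the assumption is only that a wrapper sets margins itself and also forwards the user's `...` to the same inner function.

```r
# Hypothetical inner function that accepts a margins argument.
inner_plot <- function(x, margins = c(5, 5), ...) {
  margins
}

# Hypothetical wrapper: it hard-codes margins AND forwards `...`,
# so a user-supplied margins reaches inner_plot twice.
wrapper <- function(x, ...) {
  inner_plot(x, margins = c(5, 6), ...)
}

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  wrapper(1, margins = c(5, 10)),
  error = function(e) conditionMessage(e)
)
print(msg)

# One possible fix: make margins a named argument of the wrapper,
# so it is no longer part of `...` and is passed on exactly once.
wrapper_fixed <- function(x, margins = c(5, 6), ...) {
  inner_plot(x, margins = margins, ...)
}
print(wrapper_fixed(1, margins = c(5, 10)))
```

With the named argument in place, the user's value overrides the default instead of colliding with it.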

                    Comment


                    • #70
                      Originally posted by bigmw View Post
                      I just checked the source code for sigGeneSet and the internal function gs.heatmap. There does seem to be a potential conflict in the margins argument. I will have the problem fixed. You can check for the updated version, 2.14.3, in the next couple of days here:
                      http://www.bioconductor.org/packages...html/gage.html
                      Okay, thanks! I will try version 2.14.3 later.

                      Comment


                      • #71
                        I have followed the default workflows of gage and pathview on the example RNA-seq dataset. I also used the fold changes inferred by DESeq2 as input, followed by gage and pathview. I found that the two pipelines give different results: the pipeline based on the DESeq2 fold changes yields far fewer significant pathways. For example:

                        > gage.kegg.sig<-sigGeneSet(gage.kegg.p, outname="sig.kegg",pdf.size=c(7,8))
                        [1] "there are 22 signficantly up-regulated gene sets"
                        [1] "there are 17 signficantly down-regulated gene sets"

                        > deseq2.kegg.sig<-sigGeneSet(deseq2.kegg.p, outname="deseq2.sig.kegg",pdf.size=c(7,8))
                        [1] "gs.data needs to be a matrix-like object!"
                        [1] "No heatmap produced for down-regulated gene sets, only 1 or none signficant."
                        [1] "gs.data needs to be a matrix-like object!"
                        [1] "there are 7 signficantly up-regulated gene sets"
                        [1] "there are 0 signficantly down-regulated gene sets"

                        I'm wondering which pipeline is more reliable for biological interpretation. Why does the pipeline based on DESeq2 return far fewer pathways? Can anyone give me some advice?

                        Thanks!
                        Last edited by tigerxu; 07-11-2014, 12:29 PM.

                        Comment


                        • #72
                          Hi there, thank you for making this awesome tool.

                          I am working with mouse data, and I want to know how to convert the gene sets into gene symbol format.

                          kg.mouse<- kegg.gsets("mouse")
                          kegg.gs<- kg.mouse$kg.sets[kg.mouse$sigmet.idx]
                          lapply(kegg.gs[1:3],head)


                          The eg2sym function only works for human data, so I cannot do the following:

                          data(egSymb)
                          kegg.gs.sym<- lapply(kegg.gs, eg2sym)

                          Thank you!
                          Tommy

                          Comment


                          • #73
                            The pathview package provides two functions: eg2id and id2eg, for ID mapping/conversion for major research species. For details:
                            ?pathview::eg2id

                            BTW, I would suggest converting your data IDs from symbols to Entrez Gene IDs, rather than your gene set IDs from Entrez to symbols. The former should be much faster, as it only needs to call the conversion function once.
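A minimal base R sketch of the "convert the data once" idea. The lookup table sym2eg and its IDs are made up for illustration; in practice the mapping would come from a conversion function such as pathview's id2eg, which is not called here so the snippet stays self-contained.

```r
# Toy symbol -> Entrez Gene lookup (made-up IDs, illustration only).
sym2eg <- c(GeneA = "1001", GeneB = "1002", GeneC = "1003")

# Toy fold-change vector named by gene symbol; GeneX has no mapping.
fc <- c(GeneA = 1.5, GeneB = -2.0, GeneX = 0.3)

# Map all names in a single lookup; unmapped symbols become NA.
eg <- unname(sym2eg[names(fc)])

# Drop unmapped genes, then rename the data by Entrez Gene ID.
keep <- !is.na(eg)
fc.eg <- fc[keep]
names(fc.eg) <- eg[keep]
print(fc.eg)
```

The gene sets stay in their original Entrez Gene format, so only one vector of names has to be converted, and unmapped genes are removed explicitly rather than ending up with NA names.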

                            Comment


                            • #74
                              BTW, gage has a separate tutorial on data preparation; you can check Section 5 -- gene or transcript ID conversion:
                              http://www.bioconductor.org/packages...c/dataPrep.pdf

                              Comment


                              • #75
                                Originally posted by bigmw View Post
                                BTW, gage has a separate tutorial on data preparation; you can check Section 5 -- gene or transcript ID conversion:
                                http://www.bioconductor.org/packages...c/dataPrep.pdf
                                Thank you, I followed it after running DESeq. 1724 differentially expressed genes were used for the pathway analysis.

                                res <- nbinomTest( cds, 'control', 'treat' )

                                resSig <- res[ res$padj < 0.01 & (res$log2FoldChange >1| res$log2FoldChange < -1), ]

                                resSig <- na.omit(resSig)

                                require(gage)
                                data(kegg.gs)
                                deseq.fc<- resSig$log2FoldChange
                                names(deseq.fc)<- resSig$id
                                sum(is.infinite(deseq.fc)) # there are some infinite values; DESeq2 does not have this problem
                                deseq.fc[deseq.fc > 10] = 10
                                deseq.fc[deseq.fc < -10] = -10 # the spaces matter: "deseq.fc<-10" is parsed as an assignment
                                exp.fc<- deseq.fc

                                #kegg.gsets works with 3000 KEGG species
                                data(korg)
                                head(korg[,1:3], n=20)


                                #let's get the annotation files for mouse and convert the gene set to gene symbol format
                                kg.mouse<- kegg.gsets("mouse")
                                kegg.gs<- kg.mouse$kg.sets[kg.mouse$sigmet.idx]
                                lapply(kegg.gs[1:3],head)



                                # to convert between gene/transcript IDs and Entrez Gene IDs, use eg2id and id2eg in the pathview package
                                library(pathview)
                                data(bods)
                                bods

                                gene.symbol.eg<- id2eg(ids=names(exp.fc), category='SYMBOL', org='Mm')
                                # convert the gene symbol to Entrez Gene ID
                                head(gene.symbol.eg, n=100)
                                head(gene.symbol.eg[,2], n=10)

                                names(exp.fc)<- gene.symbol.eg[,2]

                                fc.kegg.p<- gage(exp.fc, gsets= kegg.gs, ref=NULL, samp=NULL)
                                sel<- fc.kegg.p$greater[,"q.val"] < 0.1 & !is.na(fc.kegg.p$greater[,"q.val"])
                                table(sel)

                                sel.l<- fc.kegg.p$less[,"q.val"] < 0.1 & !is.na(fc.kegg.p$less[,"q.val"])
                                table(sel.l)



                                > table(sel.l)
                                sel.l
                                FALSE
                                202

                                > table(sel)
                                sel
                                FALSE
                                202

                                Am I doing it right?
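One detail worth double-checking when capping fold changes as in the code above: R parses `x<-10` (no spaces) as an assignment, so a comparison against a negative number needs spaces, as in `x < -10`. The sketch below uses a toy vector, not the poster's data, and shows two equivalent ways of capping values to the range [-10, 10].

```r
# Toy fold changes, including infinite values (illustration only).
fc <- c(a = -Inf, b = -3, c = 12, d = Inf)

# Option 1: clamp with pmin/pmax; names are kept from the first argument.
capped <- pmax(pmin(fc, 10), -10)

# Option 2: logical indexing, with explicit spaces around the comparison.
capped2 <- fc
capped2[capped2 > 10] <- 10
capped2[capped2 < -10] <- -10

print(capped)
print(capped2)
```

Both options give the same result; the pmin/pmax form sidesteps the `<-` parsing issue entirely.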

                                Comment
