  • R or Galaxy for RNA-Seq

    Hi All,

    I wonder if the community could share its thoughts on this question: for a beginner trying to learn how to analyze RNA-Seq data, and assuming he or she is reasonably versed in R, is R or Galaxy the better investment of time?

    Thanks

    Wintec

  • #2
    Wintec,

    I don't have an objective opinion here, but I'll post some thoughts anyway.

    First, are you thinking of installing your own Galaxy, or of using the Galaxy Project public server ("main" http://usegalaxy.org/), or one of the other public servers (http://galaxyproject.org/wiki/PublicGalaxyServers)? You can get a pretty good idea of Galaxy's capabilities (as a user) in an hour by walking through the exercise at http://usegalaxy.org/galaxy101

    I would argue that both Galaxy and R are worth the time to learn. (Which does not answer your question.) Both will likely be broadly useful.


    • #3
      Thank you for sharing your thoughts. Even though you said you did not answer my question, I felt you did, in a sense.

      At present we do not have a local installation of Galaxy, and using the public portal is painfully slow.

      Wingtec


      • #4
        Dear Wintec,

        In my opinion, at this stage R is the way to go, using packages like edgeR or DESeq. The time spent learning can be seen as a good investment in the future.

        However, it really depends on what question you are trying to answer. I think the tools offered on the public Galaxy at the moment are not as well suited to differential expression analysis (e.g. between different treatments) as the R packages mentioned, while they are probably better for determining expression levels within a given system. Hence, if you want to know whether gene X is more highly expressed than gene Y within a given cell, Cufflinks as implemented in Galaxy is probably the better choice. In my experience Cufflinks can have a bit of a temper, though.

        I think the Galaxy community is working on wrapping edgeR and DESeq on the public site as well, but it might take some time.

        Hence I repeat my initial assessment, especially as someone who faced the same question not so long ago: learning R is worthwhile and, at this stage, likely the better choice.

        Cheers
        Seb


        • #5
          It partly depends what step you are on and what you want to do.

          For any statistical analysis of RNA-seq data I would stick with R. You have a greater range of tools and better tools. Unless things have changed and I am wrong, you are limited to the Cufflinks suite of tools for differential expression analysis. There are far better tools out there, namely DESeq and edgeR.

          Everything that is done in Galaxy can be done outside of Galaxy and more.


          • #6
            Originally posted by chadn737
            you are limited to the Cufflinks suite of tools for differential expression analysis. There are far better tools out there, namely DESeq and edgeR.
            Why are DESeq and edgeR "far better" than Cufflinks?

            Do they take into account alternative splicing for instance?
            I have strong concerns with the "exon union" approach.. adding read counts across different transcripts of the same gene just sounds wrong to me.
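
            To make the concern concrete, here is a minimal sketch of exon-union counting on a hypothetical gene with two transcripts. The coordinates, the reads, and the helper names `merge_intervals` and `count_reads` are all made up for illustration and are not taken from any of the tools mentioned; real counters also handle strand, multi-mapping reads, and spliced alignments.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_reads(reads, intervals):
    """Count reads that overlap at least one of the given intervals."""
    return sum(
        any(r_start < end and r_end > start for start, end in intervals)
        for r_start, r_end in reads
    )


# Hypothetical gene: both transcripts share exon 1 but differ in exon 2.
transcript_a = [(100, 200), (300, 400)]
transcript_b = [(100, 200), (500, 600)]

# The "exon union" model collapses all transcripts into one set of exons.
exon_union = merge_intervals(transcript_a + transcript_b)

# Hypothetical aligned reads as (start, end) pairs.
reads = [(120, 170), (310, 360), (520, 570), (700, 750)]

gene_count = count_reads(reads, exon_union)  # one number for the whole gene
count_a = count_reads(reads, transcript_a)   # reads compatible with transcript A
count_b = count_reads(reads, transcript_b)   # reads compatible with transcript B

print(exon_union, gene_count, count_a, count_b)
```

            The single gene-level count hides which transcript the isoform-specific reads came from, so a switch between isoforms could leave the union count unchanged.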

            Originally posted by chadn737 View Post
            Everything that is done in Galaxy can be done outside of Galaxy and more.
            Of course, but then what about learning R, Perl, Python, C, ... and the assembler, so that you can do everything yourself? The point of using some existing tool is that you don't have to reinvent the wheel each time.

            Bottom line:
            - Galaxy is easier to learn/use
            - R is more powerful/advanced

            By the way, aren't we comparing a programming language with a piece of software?

            Personally I would first try to use things like SAMtools, BEDtools, HTSeq, Cufflinks, Scripture, DESeq, edgeR.. and learn some shell scripting to pipeline them. I would actually start with Jon's thread.


            • #7
              Originally posted by steven
              Why are DESeq and edgeR "far better" than Cufflinks?

              Do they take into account alternative splicing for instance?
              I have strong concerns with the "exon union" approach.. adding read counts across different transcripts of the same gene just sounds wrong to me.
              Well, originally because Cuffdiff did not account for sample-to-sample variability. It does now, I believe, but the funny thing is that it actually bases this on the same model as DESeq. They even say so on their website, in the description of how Cuffdiff works.

              Also, if you are concerned about the exon-union approach, there are other tools besides Cufflinks, such as DEXSeq, from the same authors as DESeq.

              You can read about the comparison between Cufflinks and DEXSeq here:



              Of course, but then what about learning R, Perl, Python, C, ... and the assembler, so that you can do everything yourself? The point of using some existing tool is that you don't have to reinvent the wheel each time.

              Bottom line:
              - Galaxy is easier to learn/use
              - R is more powerful/advanced
              Learning how to use preexisting tools on the command line, along with maybe a little bit of a programming language, is not reinventing the wheel.

              I'm just saying that when you use an all-encompassing program like Galaxy you limit your options, and furthermore you are completely dependent upon its creators to update and expand it.

              It really becomes a question of time versus power, utility, and self-development. Learning to use R packages like DESeq or edgeR will also get you started in learning R, and R can be used for a lot more than just differential expression, so it's like the gift that keeps on giving. You start using it for other problems because it has that utility.

              I'm a straight-up molecular biologist, and when I started doing next-gen work I couldn't even change to a different folder on the command line; for a while we even used some commercial point-and-click software for our analysis. It took some effort to learn everything, but what I have learned allows me to do far, far more than if I had limited myself to one set of tools because they were easy to use.

              So despite the difficulties, I would recommend that any beginner actually take the time to do it the slightly harder way. It will pay off.

              Personally I would first try to use things like SAMtools, BEDtools, HTSeq, Cufflinks, Scripture, DESeq, edgeR.. and learn some shell scripting to pipeline them. I would actually start with Jon's thread.
              I agree completely.
              Last edited by chadn737; 02-17-2012, 12:57 PM.
