SEQanswers

Old 02-13-2012, 06:04 AM   #1
wingtec
Member
 
Location: Charlottesville, VA

Join Date: Apr 2010
Posts: 33
R or Galaxy for RNA-Seq

Hi All,

I wonder if the community might share their thoughts on this question: for a beginner who is reasonably versed in R and wants to learn how to analyze RNA-Seq data, is R or Galaxy the better investment of time?

Thanks

Wintec
Old 02-13-2012, 08:26 AM   #2
tnabtaf
Member
 
Location: Oregon

Join Date: Jan 2011
Posts: 53

Wintec,

I don't have an objective opinion here, but I'll post some thoughts anyway.

First, are you thinking of installing your own Galaxy, using the Galaxy Project public server ("main", http://usegalaxy.org/), or using one of the other public servers (http://galaxyproject.org/wiki/PublicGalaxyServers)? You can get a pretty good idea of Galaxy's capabilities (as a user) in an hour by walking through the exercise at http://usegalaxy.org/galaxy101.

I would argue that both Galaxy and R are worth the time to learn (which does not answer your question). Both will likely be broadly useful.
Old 02-13-2012, 01:07 PM   #3
wingtec
Member
 
Location: Charlottesville, VA

Join Date: Apr 2010
Posts: 33

Thank you for sharing your thoughts. Even though you said you did not answer my question, I feel you did, in a sense.

At present we do not have a local installation of Galaxy, and using the public portal is painfully slow.

Wingtec
Old 02-13-2012, 01:12 PM   #4
NextGenSeb
Member
 
Location: Melbourne

Join Date: Jan 2012
Posts: 15

Dear Wintec,

In my opinion, at this stage R is the way to go, using packages like edgeR or DESeq. The time spent learning can be seen as a good investment in the future.

However, it really depends on what question you are trying to answer. I think the tools offered on the public Galaxy at the moment are not as well suited to differential expression analysis (e.g. between different treatments) as the R packages mentioned, while they are probably better for determining expression levels within a given system. Hence, if you want to know whether gene X is more highly expressed than gene Y within a given cell, Cufflinks as implemented in Galaxy is probably the better choice. In my experience Cufflinks can have a bit of a temper, though.
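
The two kinds of question above call for different normalizations, which a minimal Python sketch can make concrete. All counts, gene lengths, and library sizes here are invented, and the helper names `fpkm` and `cpm` are illustrative rather than taken from any package. Real tools such as Cufflinks, edgeR, and DESeq do considerably more than this.

```python
# Toy illustration: within-sample vs. between-sample comparisons.
# All counts, gene lengths and library sizes are invented for the example.
import math


def fpkm(count, gene_length_bp, library_size):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return count * 1e9 / (gene_length_bp * library_size)


def cpm(count, library_size):
    """Counts per million mapped fragments (no length correction)."""
    return count * 1e6 / library_size


# Within one sample: gene Y has more reads, but it is also four times longer,
# so comparing gene X with gene Y needs a length correction.
library_size = 20_000_000
fpkm_x = fpkm(500, 1_000, library_size)
fpkm_y = fpkm(1_000, 4_000, library_size)

# Between two samples: the gene is the same, so its length cancels out and
# only sequencing depth has to be accounted for.
cpm_control = cpm(500, 20_000_000)
cpm_treated = cpm(1_000, 40_000_000)
log2_fc = math.log2(cpm_treated / cpm_control)

print(f"FPKM gene X: {fpkm_x:.1f}, gene Y: {fpkm_y:.1f}")
print(f"log2 fold change (treated vs control): {log2_fc:.2f}")
```

In this toy case gene X comes out as more highly expressed than gene Y despite having fewer reads, and the doubled raw count in the treated sample is explained entirely by the doubled sequencing depth.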

I think the Galaxy community is working on wrapping edgeR and DESeq for the public site as well, but it might take some time.

Hence I repeat my initial assessment, especially as someone who faced the same question not so long ago: learning R is worthwhile and, at this stage, likely the better choice.

Cheers
Seb
Old 02-13-2012, 01:21 PM   #5
chadn737
Senior Member
 
Location: US

Join Date: Jan 2009
Posts: 392

It partly depends on what step you are on and what you want to do.

For any statistical analysis of RNA-seq data I would stick with R. You have a greater range of tools and better tools. Unless things have changed and I am wrong, in Galaxy you are limited to the Cufflinks suite of tools for differential expression analysis. There are far better tools out there, namely DESeq and edgeR.

Everything that is done in Galaxy can be done outside of Galaxy and more.
Old 02-17-2012, 11:02 AM   #6
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269

Quote:
Originally Posted by chadn737
you are limited to the Cufflinks suite of tools for differential expression analysis. There are far better tools out there, namely DESeq and edgeR.

Why are DESeq and edgeR "far better" than Cufflinks?

Do they take into account alternative splicing for instance?
I have strong concerns with the "exon union" approach.. adding read counts across different transcripts of the same gene just sounds wrong to me.
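
The concern about "exon union" counting can be illustrated with a small Python sketch. The isoform coordinates and read positions are invented, reads are reduced to single positions, and the helpers `merge_intervals` and `count_reads` are hypothetical names. This is a toy model, not how any particular counting tool is implemented.

```python
# Toy model of "exon union" counting; coordinates and reads are invented.


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_reads(read_positions, intervals):
    """Count reads whose position falls inside any of the intervals."""
    return sum(any(s <= pos < e for s, e in intervals) for pos in read_positions)


# Two hypothetical isoforms of one gene; the short one skips the middle exon.
isoform_long = [(0, 100), (200, 300), (400, 500)]
isoform_short = [(0, 100), (400, 500)]
gene_union = merge_intervals(isoform_long + isoform_short)

# Condition A has no reads in the middle exon; condition B has two.
reads_a = [10, 50, 420, 480]
reads_b = [10, 220, 280, 480]

union_a = count_reads(reads_a, gene_union)
union_b = count_reads(reads_b, gene_union)
middle_a = count_reads(reads_a, [(200, 300)])
middle_b = count_reads(reads_b, [(200, 300)])

print(f"gene-level (union) counts: A={union_a}, B={union_b}")
print(f"middle-exon counts:        A={middle_a}, B={middle_b}")
```

Both conditions get the same gene-level count even though the middle exon is used only in condition B, which is the kind of isoform difference a purely gene-level count cannot show.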

Quote:
Originally Posted by chadn737
Everything that is done in Galaxy can be done outside of Galaxy and more.

Of course, but then what about learning R, Perl, Python, C, ... and the assembler, so that you can do everything yourself? The point of using some existing tool is that you don't have to reinvent the wheel each time.

Bottom line:
- Galaxy is easier to learn/use
- R is more powerful/advanced

By the way, aren't we comparing a programming language with a piece of software?

Personally I would first try to use things like SAMtools, BEDtools, HTSeq, Cufflinks, Scripture, DESeq, edgeR.. and learn some shell scripting to pipeline them. I would actually start with Jon's thread.
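
As a rough idea of what "pipelining" such tools looks like, here is a minimal sketch written in Python rather than shell. The sample name and annotation file are hypothetical, and the options shown (`samtools sort -o`, `samtools index`, `htseq-count -f bam -r pos -s no`) are assumed to match the installed versions; older releases used a different syntax, so check each tool's help first. By default the script only prints the commands.

```python
# Sketch of chaining command-line tools from Python.
# File names are hypothetical; tool options are assumptions to verify locally.
import shlex
import subprocess


def build_pipeline(sample, annotation):
    """Return the commands for one sample as argument lists."""
    sorted_bam = f"{sample}.sorted.bam"
    return [
        # Sort the alignments by coordinate.
        ["samtools", "sort", "-o", sorted_bam, f"{sample}.bam"],
        # Index the sorted BAM file.
        ["samtools", "index", sorted_bam],
        # Count reads per gene; htseq-count writes the table to stdout.
        ["htseq-count", "-f", "bam", "-r", "pos", "-s", "no", sorted_bam, annotation],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


commands = build_pipeline("sample1", "genes.gtf")
run_pipeline(commands, dry_run=True)
```

Keeping each command as an argument list avoids shell quoting problems, and `check=True` stops the pipeline at the first failing step when the commands are actually run.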
Old 02-17-2012, 11:53 AM   #7
chadn737
Senior Member
 
Location: US

Join Date: Jan 2009
Posts: 392

Quote:
Originally Posted by steven
Why are DESeq and edgeR "far better" than Cufflinks?

Do they take into account alternative splicing for instance?
I have strong concerns with the "exon union" approach.. adding read counts across different transcripts of the same gene just sounds wrong to me.

Well, originally because Cuffdiff did not account for sample-to-sample variability. I believe it does now, but the funny thing is that it bases this on the same model as DESeq. They even say so on their website, in the description of how Cuffdiff works.
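
To show why sample-to-sample variability matters, here is a minimal Python sketch comparing the variance a Poisson model would assume with a negative binomial variance of the form mean + dispersion × mean². The replicate counts are invented, and the crude method-of-moments estimate in `moment_dispersion` is a stand-in for illustration only. It is not the estimator used by DESeq, edgeR, or Cuffdiff, which pool information across genes.

```python
# Toy comparison of Poisson vs. negative binomial variance for replicates.
# The replicate counts are invented for illustration.
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Negative binomial variance: Poisson noise plus extra variability."""
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Crude method-of-moments dispersion estimate, floored at zero."""
    mu = mean(counts)
    var = variance(counts)  # sample variance (n - 1 denominator)
    return max((var - mu) / mu ** 2, 0.0)


replicates = [80, 100, 120, 100]  # one gene, four biological replicates
mu = mean(replicates)
observed_var = variance(replicates)
alpha = moment_dispersion(replicates)

print(f"mean count:            {mu}")
print(f"Poisson variance:      {mu}")
print(f"observed variance:     {observed_var:.1f}")
print(f"estimated dispersion:  {alpha:.4f}")
print(f"NB variance at mean:   {nb_variance(mu, alpha):.1f}")
```

With these numbers the observed variance is well above the mean, so a test that assumes Poisson noise alone would understate the variability between replicates.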

Also, if you are concerned about the exon-union approach, there are other tools besides Cufflinks, such as DEXSeq, from the authors of DESeq.

You can read about the comparison between Cufflinks and DEXSeq here:

http://www-huber.embl.de/pub/pdf/npre20126837-1.pdf

Quote:
Of course, but then what about learning R, Perl, Python, C, ... and the assembler, so that you can do everything yourself? The point of using some existing tool is that you don't have to reinvent the wheel each time.

Bottom line:
- Galaxy is easier to learn/use
- R is more powerful/advanced

Learning how to use preexisting tools on the command line, along with maybe a little bit of a programming language, is not reinventing the wheel.

I'm just saying that when you use an all-encompassing program like Galaxy, you limit your options, and furthermore you are completely dependent on its creators to update and expand it.

It really becomes a question of time versus power, utility, and self-development. Learning to use R packages like DESeq or edgeR will also get you started in learning R, and R can be used for a lot more than just differential expression, so it's like the gift that keeps on giving. You start using it for other problems because it has that utility.

I'm a straight-up molecular biologist, and when I started doing next-gen work I couldn't even change to a different folder on the command line. For a while we even used some commercial point-and-click software for our analysis. It took some effort to learn everything, but what I have learned allows me to do far, far more than if I had limited myself to one set of tools because they were easy to use.

So despite the difficulties, I would recommend that any beginner actually take the time to do it the slightly harder way. It will pay off.

Quote:
Personally I would first try to use things like SAMtools, BEDtools, HTSeq, Cufflinks, Scripture, DESeq, edgeR.. and learn some shell scripting to pipeline them. I would actually start with Jon's thread.

I agree completely.

Last edited by chadn737; 02-17-2012 at 11:57 AM.