SEQanswers

Old 03-22-2010, 01:33 PM   #1
nekrut
Member
 
Location: Penn State

Join Date: Apr 2009
Posts: 22
Free & Open Environment for NGS analysis: Galaxy (http://usegalaxy.org)

The Galaxy team is announcing the launch of the first free public resource for NGS analysis at http://usegalaxy.org. This service is the beginning of our campaign to provide free web-based utilities for NGS analysis that later in the year will take advantage of Cloud resources (see http://bit.ly/aMUkpo).

At present there are three main groups of tools (you can find them in the left pane of http://usegalaxy.org):

1. NGS QC and manipulation - contains a variety of tools for dealing with all flavors of fastq datasets as well as outputs of SOLiD and 454 instruments.
2. NGS Mapping - currently includes bowtie (Illumina & SOLiD), BWA (Illumina), and lastz (454) mappers. PerM (SOLiD) is on the way and more will be added in the coming months. Transcriptome tools (e.g., TopHat) are also in the final stages of development.
3. NGS SAMTools - includes a variety of utilities for SAM/BAM manipulation. Some are based on the samtools library, some are written by the Galaxy team.
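
As a rough illustration of the kind of SAM manipulation the third group covers, here is a minimal Python sketch that splits one SAM alignment line into its leading fields and decodes the bitwise FLAG. It is not taken from Galaxy or samtools; the function names and the example read are made up for illustration, and the bit meanings follow the FLAG table in the SAM format specification.

```python
# Bit values and meanings of the SAM FLAG field (per the SAM specification).
SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse_strand"),
    (0x20, "mate_reverse_strand"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


def parse_sam_line(line):
    """Parse the first six mandatory, tab-separated SAM fields (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    return {
        "qname": fields[0],
        "flag": flag,
        "flags": decode_flag(flag),
        "rname": fields[2],
        "pos": int(fields[3]),   # 1-based leftmost mapping position
        "mapq": int(fields[4]),
        "cigar": fields[5],
    }


# A made-up alignment line for a 36 bp paired-end read.
example = "\t".join(
    ["read1", "99", "chr1", "100", "60", "36M", "=", "300", "236", "A" * 36, "I" * 36]
)
record = parse_sam_line(example)
print(record["qname"], record["rname"], record["pos"], record["flags"])
```

Header lines (those starting with `@`) would need to be skipped before calling `parse_sam_line`; the sketch leaves that out.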

The Galaxy team does not like to read documentation and expects that others don't either. This is why we make short movies called quickies. To see what Galaxy can do, see these:

Example 1 - mapping mate-paired SOLiD data
Example 2 - mapping SOLiD (or Illumina) data against a custom genome
Example 3 - mapping paired-end Illumina run and visualizing results at the UCSC Browser

Enjoy and send us feedback!

Those who want to contribute tools, brains, or coding skills should consider attending the Galaxy Developer Conference (http://www.galaxyproject.org/dev2010; Cold Spring Harbor Lab; immediately after the Biology of Genomes) by e-mailing us ([email protected]). We might sponsor your participation!

This free service is brought to you by NIH (NHGRI), NSF, the Huck Institutes for the Life Sciences and the Institute for CyberScience at Penn State University, Emory University and the Pennsylvania Department of Public Health.

Last edited by nekrut; 03-24-2010 at 07:10 AM. Reason: small corrections
Old 03-22-2010, 10:40 PM   #2
ECO
--Site Admin--
 
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,355
Default

This is amazing by the way. If there is anything the site or community can do to support this effort, don't hesitate to let me/us know.

Galaxy =
Old 03-22-2010, 11:47 PM   #3
thinkRNA
Member
 
Location: Carlsbad,CA

Join Date: Jan 2010
Posts: 94
Default

This is simply mind-blowing and must be a lot of work. I love the videos. Now, I can think of at least 5 companies that are going to be put out of business because they are charging 1000s of $ for this service.
Are you all thinking about including other analysis tools in Galaxy, such as the Cufflinks software suite, DEGseq, ELAND/ERANGE, or other R statistical packages, in the near future (and if so, by when)?

Last edited by thinkRNA; 03-22-2010 at 11:58 PM.
Old 03-23-2010, 01:47 AM   #4
dawe
Senior Member
 
Location: 45°30'25.22"N / 9°15'53.00"E

Join Date: Apr 2009
Posts: 258
Default

Quote:
Originally Posted by nekrut View Post
Those who want to contribute tools, brains, or coding skills should consider attending the Galaxy Developer Conference (http://www.galaxyproject.org/dev2010) by e-mailing us ([email protected]). We might sponsor your participation!
See you there!
Old 03-23-2010, 02:44 AM   #5
KevinLam
Senior Member
 
Location: SEA

Join Date: Nov 2009
Posts: 197
Default

Is it a typo that BWA doesn't support SOLiD, or does Galaxy have no support for BWA on SOLiD reads?

Forgot to add: I am blown away by the movies too.

Last edited by KevinLam; 03-23-2010 at 02:51 AM.
Old 03-23-2010, 07:28 AM   #6
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,541
Default

Looks very impressive

I have one suggestion for enhancement: Right now, users would have to provide their Roche 454 data already as FASTA+QUAL or merged into FASTQ. One feature that should be fairly straightforward to add would be SFF to Sanger FASTQ (or SFF to FASTA, SFF to QUAL). I'd suggest looking at sff_extract for this (also in Python), which can handle paired end 454 data too.
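
To make the "FASTA+QUAL merged into FASTQ" step concrete, here is a minimal pure-Python sketch of that merge. It is not sff_extract or any Galaxy tool; the function names are hypothetical, and it assumes both files list the same reads in the same order, with QUAL scores as space-separated integers and Sanger (Phred+33) encoding on output.

```python
def read_records(lines):
    """Yield (name, body_lines) for each '>'-headed record in FASTA-style text."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, chunks
            # Keep only the identifier, dropping any description after it.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, chunks


def fasta_qual_to_fastq(fasta_lines, qual_lines):
    """Merge paired FASTA and QUAL records into Sanger FASTQ text."""
    out = []
    pairs = zip(read_records(fasta_lines), read_records(qual_lines))
    for (seq_name, seq_chunks), (qual_name, qual_chunks) in pairs:
        if seq_name != qual_name:
            raise ValueError(f"record mismatch: {seq_name} vs {qual_name}")
        seq = "".join(seq_chunks)
        scores = [int(s) for s in " ".join(qual_chunks).split()]
        if len(scores) != len(seq):
            raise ValueError(f"length mismatch in {seq_name}")
        # Sanger FASTQ stores each Phred score as the ASCII character score + 33.
        quals = "".join(chr(q + 33) for q in scores)
        out.append(f"@{seq_name}\n{seq}\n+\n{quals}\n")
    return "".join(out)


fasta = [">r1 example read", "ACGT"]
qual = [">r1 example read", "40 30 20 10"]
print(fasta_qual_to_fastq(fasta, qual), end="")
```

Because `zip` stops at the shorter input, a real converter would also want to check that both files contain the same number of records.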

Peter
Old 03-23-2010, 08:06 AM   #7
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

I guess bwa's solid support was too buggy at the time of developing galaxy. It should become better now. For solid, I think it is important to include bfast as perm/bowtie do not do gapped alignment. Bwa does gapped alignment for SOLiD, but not as good as bfast. Now I think gapped alignment is crucial to accurate variant discovery, more important than I thought before. Several other publications have already emphasized this point.
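
A toy Python sketch of why gapped alignment matters for reads that span an indel. This is only an illustration, not how BWA, BFAST, bowtie, or PerM work internally; the sequences and function names are invented. A read carrying a single-base deletion is compared to the reference twice: position by position (ungapped), and with a standard edit-distance dynamic program that allows gaps.

```python
def mismatches_ungapped(read, ref):
    """Count mismatches when the read is laid against the reference with no gaps."""
    return sum(a != b for a, b in zip(read, ref))


def edit_distance(read, ref):
    """Levenshtein distance: the minimum number of substitutions, insertions and deletions."""
    prev = list(range(len(ref) + 1))
    for i, r in enumerate(read, 1):
        cur = [i]
        for j, f in enumerate(ref, 1):
            cur.append(min(
                prev[j] + 1,             # base present in the read but not the reference
                cur[j - 1] + 1,          # base present in the reference but not the read
                prev[j - 1] + (r != f),  # match or substitution
            ))
        prev = cur
    return prev[-1]


ref = "ACGTTGCAAGCTTAGGCTAC"
read = ref[:8] + ref[9:]  # the same sequence with one base deleted

print("ungapped mismatches:", mismatches_ungapped(read, ref))
print("gapped edit distance:", edit_distance(read, ref))
```

Without gaps, every base downstream of the deletion is shifted by one, so the read looks like it carries many mismatches and may be discarded or reported as a cluster of false SNPs. With gaps allowed, the same read is explained by a single deletion.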
Old 03-23-2010, 10:07 AM   #8
thinkRNA
Member
 
Location: Carlsbad,CA

Join Date: Jan 2010
Posts: 94
Default

Quote:
Originally Posted by lh3 View Post
I guess bwa's solid support was too buggy at the time of developing galaxy. It should become better now. For solid, I think it is important to include bfast as perm/bowtie do not do gapped alignment. Bwa does gapped alignment for SOLiD, but not as good as bfast. Now I think gapped alignment is crucial to accurate variant discovery, more important than I thought before. Several other publications have already emphasized this point.

Are you talking about splice-variant discovery or SNP discovery? I am guessing it's the latter.
Old 03-23-2010, 11:47 AM   #9
nekrut
Member
 
Location: Penn State

Join Date: Apr 2009
Posts: 22
Default

Quote:
Originally Posted by maubp View Post
Looks very impressive

I have one suggestion for enhancement: Right now, users would have to provide their Roche 454 data already as FASTA+QUAL or merged into FASTQ. One feature that should be fairly straightforward to add would be SFF to Sanger FASTQ (or SFF to FASTA, SFF to QUAL). I'd suggest looking at sff_extract for this (also in Python), which can handle paired end 454 data too.

Peter
Dear Peter:

Yes, this is on the to-do list, and since we have 454s here this will be happening soon.
Old 03-23-2010, 11:53 AM   #10
nekrut
Member
 
Location: Penn State

Join Date: Apr 2009
Posts: 22
Default

Quote:
Originally Posted by lh3 View Post
I guess bwa's solid support was too buggy at the time of developing galaxy. It should become better now. For solid, I think it is important to include bfast as perm/bowtie do not do gapped alignment. Bwa does gapped alignment for SOLiD, but not as good as bfast. Now I think gapped alignment is crucial to accurate variant discovery, more important than I thought before. Several other publications have already emphasized this point.
At the time, the challenge was the interpretation of the SAM output produced by BWA on SOLiD reads. It will be quite simple to enable BWA for SOLiD in Galaxy, since all cs indices are already built. Do you think it's time to enable BWA for SOLiD again? Speaking of bfast, we really need Nils' input on making index generation a bit more user friendly.
Old 03-23-2010, 12:41 PM   #11
nekrut
Member
 
Location: Penn State

Join Date: Apr 2009
Posts: 22
Default

Quote:
Originally Posted by thinkRNA View Post
Are you all thinking about including other analysis tools into Galaxy like cufflinks software suite, Degseq and Eland/Erange, other R statistical packages in the near future (if so,by when?)
Cufflinks by early summer. No plans for DegSeq yet (but integrating tools into Galaxy is easy, so anyone can do this). Speaking of Eland/Erange - we give priority to Open Source tools.
Old 03-23-2010, 06:57 PM   #12
KevinLam
Senior Member
 
Location: SEA

Join Date: Nov 2009
Posts: 197
Default

Quote:
Originally Posted by lh3 View Post
I guess bwa's solid support was too buggy at the time of developing galaxy. It should become better now. For solid, I think it is important to include bfast as perm/bowtie do not do gapped alignment. Bwa does gapped alignment for SOLiD, but not as good as bfast. Now I think gapped alignment is crucial to accurate variant discovery, more important than I thought before. Several other publications have already emphasized this point.
interesting revelation!
lh3: slightly OT. How would you compare bfast mapping vs bioscope's mapreads then?

Galaxy: I think you should ask ABI to write wrappers for their binaries in Galaxy. I think they should be more than happy to have better support for their platform. And Bioscope is still ^%*^% proprietary &^(&*^ software. This seriously limits its widespread adoption.
Old 03-23-2010, 07:01 PM   #13
nekrut
Member
 
Location: Penn State

Join Date: Apr 2009
Posts: 22
Default

Quote:
Originally Posted by KevinLam View Post
interesting revelation!
Galaxy: I think you should ask ABI to write wrappers for their binaries in Galaxy. I think they should be more than happy to have better support for their platform. And Bioscope is still ^%*^% proprietary &^(&*^ software. This seriously limits its widespread adoption.
We prefer to stay on the open source side of the world: how else do you know what the software actually does?
Old 03-23-2010, 07:05 PM   #14
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by nekrut View Post
Speaking of bfast, we really need Nils' input on making index generation a bit more user friendly.
I believe we previously planned to design a script that would find the "optimal" parameters for building the indexes. The indexes in the manual work very well and support most types of sequence data (lengths and technologies) and genomes (long and short). Given this information, it should be trivial to get BFAST support up and running. Feel free to PM me or email me.
Old 03-23-2010, 08:37 PM   #15
KevinLam
Senior Member
 
Location: SEA

Join Date: Nov 2009
Posts: 197
Default

Quote:
Originally Posted by nekrut View Post
We prefer to stay on the open source side of the world: how else do you know what the software actually does?
I think we can lobby ABI to release it as open source.
I mean, they aren't selling the software anyway.

For people stuck with service providers, it's either Corona Lite, which has very little documentation, or open source tools.

If no one uses ABI tools, then what's the point?
Old 03-24-2010, 07:29 AM   #16
nekrut
Member
 
Location: Penn State

Join Date: Apr 2009
Posts: 22
Default

Quote:
Originally Posted by nilshomer View Post
The indexes in the manual work very well and support most types of sequence data (lengths and technologies) and genomes (long and short). Given this information, it should be trivial to get BFAST support up and running. Feel free to PM me or email me.
Great. We'll give it a shot and make a wrapper in a few weeks.
Old 03-25-2010, 07:51 AM   #17
nekrut
Member
 
Location: Penn State

Join Date: Apr 2009
Posts: 22
New Quickies explaining fastq manipulation

Just added two movies explaining fastq manipulation: basic and advanced.
Old 03-26-2010, 07:19 AM   #18
kopi-o
Senior Member
 
Location: Stockholm, Sweden

Join Date: Feb 2008
Posts: 319
Default

This is great.

Just one question - is the web interface for the cloud Galaxy meant to be working? I successfully connected to the AMI and was able to log in using ssh - but I couldn't connect to the public DNS where the web interface is supposed to reside.
Old 03-26-2010, 08:18 AM   #19
kopi-o
Senior Member
 
Location: Stockholm, Sweden

Join Date: Feb 2008
Posts: 319
Default

Never mind, connecting worked like a charm when I followed the detailed (and presumably very recently updated) instructions at http://bitbucket.org/galaxy/galaxy-central/wiki/cloud.

This seems extremely useful!
Old 03-26-2010, 01:03 PM   #20
tweist
Junior Member
 
Location: boston, usa

Join Date: Aug 2008
Posts: 3
Default

This is a great platform, with many useful, interoperable tools available in one place. However, uploading big datasets to Galaxy is a challenge and a roadblock. I tried to upload an Illumina dataset of a few Gb, and it is still not done after a day and a half. Any plans to enable Aspera-like tools to speed up data transfer?

Also, I think a BLAST tool that allows searching against an arbitrary database would be useful too. Any plans to add this feature?

Great job!