Hi all,
We've just released a new piece of software from the Babraham Bioinformatics group called Cluster Flow.
Cluster Flow is a command-line program which uses GRIDEngine or LSF cluster environments to run analysis pipelines.
- Routine analyses are very quick to run, for example: cf --genome GRCh37 fastq_bowtie *fq.gz
- Pipelines use identical parameters, standardising analysis and making results more reproducible
- Integrated parallelisation tools help prevent your cluster becoming overloaded
- All commands and output are logged in files for future reference
- Intuitive commands and a comprehensive manual make Cluster Flow easy to use
- Works out of the box (almost - see the YouTube tutorial)
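The example command in the first bullet has three parts: the --genome option, the pipeline name, and the input files. As a minimal sketch of that layout, the shell function below prints the command instead of running it, so it can be checked before anything is submitted to the cluster. Note that build_cf_cmd is a hypothetical helper, not part of Cluster Flow, and the file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Cluster Flow): print a cf command
# for a given genome, pipeline and list of input files, without running it.
build_cf_cmd() {
    local genome="$1" pipeline="$2"
    shift 2
    # Only the --genome option and the argument order shown in the
    # post are used here; no other cf flags are assumed.
    echo "cf --genome ${genome} ${pipeline} $*"
}

# Illustrative file names; in practice a glob such as *fq.gz would be passed.
build_cf_cmd GRCh37 fastq_bowtie sample_1.fq.gz sample_2.fq.gz
```

Once the printed line looks right, the same command can be typed directly at the prompt.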
How Cluster Flow differs from other pipeline tools:
- Very lightweight and flexible
- Pipelines and configurations can easily be generated on a project-specific basis if required
- New modules and pipelines are very easy to write (see video tutorial)
We have been using Cluster Flow on our GRIDEngine cluster for some months and it's working well. In fact, I think it's fair to say that most of our bioinformatics group use it on an almost daily basis now. Testing on LSF systems has been limited; it was done with the help of a friend at the EBI, where it seems to work OK.
At the time of writing, Cluster Flow comes bundled with pipelines and modules to run the following programs:
- Bismark
- Bowtie (1 and 2)
- FastQ Screen
- FastQC
- HiCUP
- SRA dump (abi and FastQ)
- Tophat
- Trim Galore!
It comes with typical pipelines to process data using these modules, some with additional parameters (e.g. for miRNA alignment or RRBS methylation data).
We've written these pipelines as we've needed them - Cluster Flow comes with an example module which you can use to help you write your own. If you do use Cluster Flow and write any new modules or pipelines, please let us know as we're keen to expand the number of available analyses that it can run.
Cluster Flow is released under the GPL v3 licence and can be downloaded from the Babraham Bioinformatics website: http://www.bioinformatics.babraham.a...s/clusterflow/