  • Cluster Flow: A pipelining tool to automate and standardise bioinformatics analyses

    Hi all,

    We've just released a new piece of software from the Babraham Bioinformatics group called Cluster Flow.

    Cluster Flow is a command-line program which uses GRIDEngine or LSF cluster environments to run analysis pipelines.
    • Routine analyses are very quick to run, for example: cf --genome GRCh37 fastq_bowtie *fq.gz
    • Pipelines use identical parameters, standardising analysis and making results more reproducible
    • Integrated parallelisation tools help prevent your cluster becoming overloaded
    • All commands and output are logged in files for future reference
    • Intuitive commands and a comprehensive manual make Cluster Flow easy to use
    • Works out of the box (almost - see the YouTube tutorial)


    How Cluster Flow differs from other pipeline tools:
    • Very lightweight and flexible
    • Pipelines and configurations can easily be generated on a project-specific basis if required
    • New modules and pipelines are very easy to write (see video tutorial)


    We have been using Cluster Flow on our GRIDEngine software for some months and it's working well. In fact, I think it's fair to say that most of our bioinformatics group use it on an almost daily basis now. There has been limited testing on LSF systems with the help of a friend at the EBI, where it seems to work ok.

    At the time of writing, Cluster Flow comes bundled with pipelines and modules to run the following programs:

    It comes with typical pipelines to process data using these modules, some with additional parameters (e.g. for miRNA alignment or RRBS methylation data).

    We've written these pipelines as we've needed them - Cluster Flow comes with an example module which you can use to help you write your own. If you do use Cluster Flow and write any new modules or pipelines, please let us know as we're keen to expand the number of available analyses that it can run.

    Cluster Flow is released with a GPL v3 licence and can be downloaded from the Babraham Bioinformatics website: http://www.bioinformatics.babraham.a...s/clusterflow/

  • #2
    Hi all,

    I've just released version 0.2 of Cluster Flow. The main update is that it now supports SLURM clusters, plus it's much easier to customise the job submission commands to be tailored to your environment.

    Cluster Flow now has its own website for documentation: http://ewels.github.io/clusterflow/

    It's now hosted on GitHub - you can download v0.2 from the tagged releases page.

    Cheers,

    Phil



    • #3
      Version 0.3 of Cluster Flow has just been pushed live.

      This one has been brewing for a few months now and is a big update. The main highlights:
      • Report log files are now handled in a clever way to keep their order consistent, even when jobs are running in parallel.
      • E-mails are fancier and flag any errors or warnings, plus they can be given custom text strings to search for in the logs and highlight or flag as warnings.
      • Environment module loading has been tidied up and now needs less configuration and works more robustly. Environment modules can now be given aliases for better compatibility and version specification.
      • Cluster compatibility has been developed heavily and now allows almost complete configuration of the job submission commands via the configuration file.
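
As a rough illustration of the custom search strings mentioned above, here is a minimal Python sketch of scanning a log for user-supplied strings and flagging the matching lines. It is not Cluster Flow's own code or configuration format: the names `HIGHLIGHT_STRINGS` and `flag_log_lines` and the sample log text are all made up for this example.

```python
import re

# Hypothetical list of strings to flag; not Cluster Flow's real configuration.
HIGHLIGHT_STRINGS = ["error", "warning", "exited with status"]


def flag_log_lines(log_text, patterns=HIGHLIGHT_STRINGS):
    """Return (line_number, line) pairs for log lines containing any pattern.

    Matching is case-insensitive and the patterns are treated as literal
    strings, not regular expressions.
    """
    regex = re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)
    flagged = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if regex.search(line):
            flagged.append((number, line))
    return flagged


# Made-up log content, purely for demonstration.
sample_log = "\n".join([
    "Module bowtie started",
    "Warning: 3 reads skipped",
    "Alignment finished",
    "Command exited with status 1",
])

for number, line in flag_log_lines(sample_log):
    print(f"line {number}: {line}")
```

The flagged lines could then be highlighted in a summary e-mail or counted to decide whether a run should be marked as having warnings.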


      You can download v0.3 of Cluster Flow here: https://github.com/ewels/clusterflow/releases/tag/v0.3

      Documentation and new demonstrations can be seen on the docs homepage: http://ewels.github.io/clusterflow/

      Much of this development has been the result of me moving and wanting to run Cluster Flow on a different cluster. I'd like to thank those who have helped out with testing and development, notably the chaps back at Babraham who have had to put up with all of my buggy pre-releases.

      Phil
