  • Setting up a shared analysis platform for NGS (advice is welcome)

    Hi All,

    I wish to set up a shared Linux server (internet-accessible, with some degree of privacy) for my institutional colleagues, who are spread across 4 universities/towns.

    The prerequisites are the following:
    • lots of hot data that needs to be accessible only by its owners
    • some additional shared data storage for training
    • about 10-20 named user slots
    • choice, installation and maintenance of the software done by us
    • I am not asking here which NGS software to select, but rather about the security layer around it and what structure would be the best choice for a secured, shared platform


    My thoughts
    • I thought of using the built-in Unix security system, as I do not have a budget to buy VPN or Sunsecure stuff (maybe working with certificates!)
    • I do not want a graphical user interface, as it would not be fast enough; I want a good command-line platform
    • I want to provide a reasonably strong computing platform (lots of RAM and 8+ CPUs) to users who have nothing better than a laptop or small desktop at hand
    • I want to use the same platform to train people, with the objective that they learn with us (selecting the right apps and pipelines) and then buy their own (bigger) machine for their lab
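
To make the first point concrete, here is a minimal sketch of what I mean by relying on the built-in Unix permissions: a private directory per user (mode 0700) and a group-shared training directory (mode 2770, where the setgid bit makes new files inherit the directory's group). The function names and directory layout are hypothetical, and assigning the shared directory to a common group (chgrp) is left out.

```python
import os
import stat


def make_private_dir(path):
    """Create a directory that only its owner can read, write, or enter (mode 0700)."""
    os.makedirs(path, exist_ok=True)
    os.chmod(path, stat.S_IRWXU)


def make_shared_dir(path):
    """Create a group-shared directory (mode 2770).

    The setgid bit asks the system to give new files the directory's group,
    so members of that group can work on each other's training files.
    """
    os.makedirs(path, exist_ok=True)
    os.chmod(path, stat.S_IRWXU | stat.S_IRWXG | stat.S_ISGID)


def mode_of(path):
    """Return the permission bits of a path, e.g. 0o700."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

This only covers file access on the server itself; how users log in remotely is a separate question.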


    Advice from anyone with comparable experience is very welcome at this point. I do not intend to spend money we do not have on this, but I truly believe that training in NGS analysis is crucial TODAY if we want to perform in this exploding field tomorrow.

    Thanks for your help,
    Stephane
    http://www.bits.vib.be/index.php

  • #2
    We're building something similar using Galaxy as a base. It's pretty flexible, comes with a lot of NGS tools pre-configured but makes it simple to add your own. It also takes care of the account management and data sharing.

    The problems we've found so far (not unique to Galaxy, just generally):

    1) Storage management - No matter how much storage you have, you need to manage the amount of data people are creating. Forcing people to clean up what they no longer need while minimising the risk of losing useful data is a tricky balancing act. We're trying to link raw data files from our main storage array to the local storage so they can only be accessed read-only, and then let people create derived files locally. We're also trying to put up shared copies of relevant publicly accessible datasets so we don't end up with multiple copies of these floating around.
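
As a rough illustration of the read-only linking idea (a sketch, not our actual setup): the function below symlinks each raw file into a `raw` folder inside a user's workspace after removing the write bits from the source file. The name `link_raw_data` and the directory layout are hypothetical, and on a real system the read-only guarantee would more likely come from file ownership or a read-only mount than from a chmod like this.

```python
import os
import stat


def link_raw_data(raw_dir, workspace):
    """Symlink every regular file in raw_dir into workspace/raw.

    Write permission is stripped from each source file first, so the
    links point at data that can be read but not modified in place.
    Returns the list of link paths.
    """
    target_dir = os.path.join(workspace, "raw")
    os.makedirs(target_dir, exist_ok=True)
    linked = []
    for name in sorted(os.listdir(raw_dir)):
        src = os.path.join(raw_dir, name)
        if not os.path.isfile(src):
            continue  # skip sub-directories and other non-files
        mode = stat.S_IMODE(os.stat(src).st_mode)
        os.chmod(src, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
        dst = os.path.join(target_dir, name)
        if not os.path.lexists(dst):  # safe to call repeatedly
            os.symlink(os.path.abspath(src), dst)
        linked.append(dst)
    return linked
```

Derived files then go elsewhere in the workspace, so cleaning up a workspace never touches the raw data.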

    2) Resource management - No matter how much computing resource you provide, it's pretty easy for a single user to take all of it (possibly unintentionally). Some of this comes down to education, but with several users you should look at putting some kind of queueing system in place so that jobs are run in an orderly way.
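
To show the basic idea behind a queue (a toy sketch only; a real multi-user server would use a dedicated batch scheduler): the hypothetical `run_queued` function below accepts any number of commands but never runs more than `max_parallel` of them at once, so a long list of jobs cannot occupy every CPU.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_queued(commands, max_parallel=2):
    """Run commands with a cap on how many execute at the same time.

    commands is a list of argument lists, e.g. [["gzip", "a.fastq"], ...].
    Returns the exit codes in the order the commands were submitted.
    """
    def run_one(argv):
        return subprocess.run(argv).returncode

    # The pool size is the cap: extra jobs wait until a worker is free.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))
```

This does nothing about memory use or fairness between users, which is where a proper scheduler earns its keep.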

    3) Project management - Lots of NGS analysis creates lots of (big) files. Having some way for the users to track what analyses they've done and which files came from which analysis will make everyone's life easier.
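
One lightweight way to approach this tracking, sketched under our own assumptions (the log format and function names are hypothetical, not something Galaxy provides in this form): append one JSON record per analysis with the command, the input and output files, and their checksums, then search the log to see which files came from which input.

```python
import hashlib
import json
import time


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_analysis(log_path, command, inputs, outputs):
    """Append one record describing an analysis step to a JSON-lines log."""
    record = {
        "time": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "command": command,
        "inputs": {path: sha256_of(path) for path in inputs},
        "outputs": {path: sha256_of(path) for path in outputs},
    }
    with open(log_path, "a") as log:
        log.write(json.dumps(record) + "\n")
    return record


def outputs_derived_from(log_path, input_path):
    """List every logged output file that was produced from input_path."""
    found = []
    with open(log_path) as log:
        for line in log:
            record = json.loads(line)
            if input_path in record["inputs"]:
                found.extend(record["outputs"])
    return found
```

The checksums also make it possible to spot duplicate copies of the same file later on.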
