Advice for setting up some lab computers

    Hey all, recently the lab I work for has shown interest in setting up a couple of computers for doing some in-house bioinformatics and all-around lab usage. Our institution does have a high-performance computing cluster, but it's a bit of a black box when it comes to the specifics of the computational setup. Some of this would also be for smaller computing projects, so we don't have to deal with our main cluster's queues every time we decide to test something small.

    Computationally, we'd mostly be doing things involving R and analyzing Illumina sequencing data sets (mapping 50 million reads or more, etc).
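
    For a rough sense of scale when sizing storage (the per-record figure below is my own back-of-envelope assumption, not a measurement), 50 million paired-end 100 bp reads works out to tens of gigabytes of uncompressed FASTQ:

    ```shell
    # Back-of-envelope FASTQ sizing; both numbers are assumptions.
    reads=100000000        # 50M read pairs = 100M reads
    bytes_per_record=250   # ~header + 100 bp sequence + '+' line + 100 quality chars
    echo "$reads $bytes_per_record" | awk '{printf "%.0f GB\n", $1 * $2 / 1e9}'
    # prints: 25 GB
    ```

    Intermediate and output files (SAM/BAM, sorted copies) multiply that further, which is why the storage question matters as much as the CPU one.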

    We're currently debating on whether or not to build one from parts or buy a pre-built system from something like thinkmate.com. Maybe even a Mac Pro since it would take less time to set up? Does anyone have experience buying systems like this? Any suggestions on reliable/supportive vendors?

    We've sort of concluded that if we were to get 2 computers the roles would be divided as such:

    Option A: 1 weaker traditional desktop and 1 relatively powerful server in the traditional sense. The lab members would somehow send jobs to a queue on the server and it would use its own resources to compute. The problem here is that although I do know my way around OSes like Ubuntu or Mint, I have zero experience handling actual servers and the maintenance or setup of such systems. In this case, does anyone have guides for a beginner with respect to server setup and maintenance?
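
    On the queue part: a full cluster scheduler may be overkill for one machine. One lightweight option (whether it fits your lab is an assumption on my part) is task-spooler, packaged as `task-spooler` on Ubuntu, which serializes jobs per user; the file names below are hypothetical:

    ```shell
    # Minimal single-machine job queue with task-spooler (tsp).
    tsp -S 4                             # allow up to 4 jobs to run at once
    tsp bwa mem ref.fa r1.fq r2.fq       # queue an alignment (hypothetical files)
    tsp -l                               # list queued/running/finished jobs
    ```

    Jobs submitted while others are running simply wait their turn, which gets you most of the "send it to the server's queue" behavior without cluster software.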

    -OR-

    Option B: 2 powerful "workstations", where one is primarily reserved for remote usage, though both could be used this way. Perhaps one workstation would have some kind of RAID array set up to act more like a main repository for lab data too. We might connect to the machines through some kind of remote screen sharing, or just SSH if we need to. Both would basically be set up like traditional desktop computers, for ease of use.
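
    On the RAID idea: on Linux this is commonly done in software with mdadm rather than a hardware controller. A sketch, with device names and mount point as hypothetical placeholders:

    ```shell
    # Software RAID-1 mirror from two disks (device names are hypothetical).
    sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
    sudo mkfs.ext4 /dev/md0      # format the array
    sudo mount /dev/md0 /data    # mount as the lab data repository
    cat /proc/mdstat             # check array and sync status
    ```

    Worth remembering that RAID protects against a disk dying, not against accidental deletion or corruption, so it complements backups rather than replacing them.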

    I've heard that certain distros, like CentOS, are better suited to actual server machines than distros like Mint. I'm sure this is a huge topic just by itself, but can someone elaborate a little or recommend some reading?

    In addition to the pre-built question, I'd like a critique of a prospective parts list I put together for the main server machine described in Option A.

    CPU: Intel Xeon E5-2630 V3 2.4GHz 8-Core OEM/Tray Processor
    CPU: Intel Xeon E5-2630 V3 2.4GHz 8-Core OEM/Tray Processor
    CPU Cooler: Cooler Master Hyper 212 EVO 82.9 CFM Sleeve Bearing CPU Cooler
    CPU Cooler: Cooler Master Hyper 212 EVO 82.9 CFM Sleeve Bearing CPU Cooler
    Motherboard: Asus Z10PA-D8 ATX Dual-CPU LGA2011-3 Motherboard
    Memory: Crucial 64GB (4 x 16GB) Registered DDR4-2133 Memory
    Memory: Crucial 64GB (4 x 16GB) Registered DDR4-2133 Memory
    Storage: Intel 730 Series 480GB 2.5" Solid State Drive
    Storage: Western Digital Red Pro 3TB 3.5" 7200RPM Internal Hard Drive
    Storage: Western Digital Red Pro 3TB 3.5" 7200RPM Internal Hard Drive
    Storage: Western Digital Red Pro 3TB 3.5" 7200RPM Internal Hard Drive
    Storage: Western Digital Red Pro 3TB 3.5" 7200RPM Internal Hard Drive
    Video Card: EVGA GeForce GTX 750 1GB Video Card
    Case: Corsair 500R Black ATX Mid Tower Case
    Power Supply: EVGA 750W 80+ Gold Certified Fully-Modular ATX Power Supply
    Monitor: Acer H236HLbid 60Hz 23.0" Monitor


    Other than for an actual server, is using a 2-socket system for a workstation wise? Does it add much in the way of complications, e.g. in getting the thing to POST? What about for simply setting up jobs like alignments to run?

    With regards to "Option B", would it be a good idea to build two workstations similar to the posted build, but with less storage, only 64GB of RAM, and 2 Xeon E5-2620 V3s instead? Or would each of these workstations be better with something like an i7-5960X or some other single-socket setup on an X99 board?

    Thanks to anyone who can help.
    Last edited by bob.chen; 06-25-2015, 01:19 PM.

  • #2
    Option 3. Rent time in the cloud.



    • #3
      @bob.chen: Every option is bound to come with its pluses and minuses. Outside of this short post you know your lab's environment/needs best and can probably decide on a solution.

      If you are primarily a biologist, then leave system administration to those whose day job it is. Putting new systems together is exciting in the short term, but administering them effectively over the long haul can become a chore, and a badly administered system has the potential to cause a security incident.

      If the central compute services provided by your institution are good, then make use of them. Make friends with those people if you feel like you are dealing with a black box. Or buy fractional hardware (a server node or two) that can be attached to the main cluster (the "patron" model used by many institutions). That way they take care of the administration and you can do your own research.
      Last edited by GenoMax; 06-25-2015, 01:15 PM.



      • #4
        @GenoMax

        Thanks for the insight. Can you elaborate what you mean by "fractional hardware"?



        • #5
          Did you mean to have double line entries? Like for the CPU?

          Typically you don't need a graphics card for running BBMap, bwa, etc.

          You might want bigger hard drives.

          Get a bigger monitor, too.

          If you're going to hang this off the internet, make sure you back it up.
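
          Even something very simple goes a long way; a minimal sketch, with hypothetical paths and host name:

          ```shell
          # Nightly mirror of the data volume to a second disk or remote host.
          rsync -a --delete /data/ /mnt/backup/data/        # local second disk
          rsync -a /data/ backupserver:/lab/backup/data/    # or a remote machine
          ```

          Dropped into a cron job, that covers the common failure modes; note `--delete` propagates deletions, so keep at least one copy without it if you want protection against accidental removal.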



          • #6
            Originally posted by bob.chen:
            Can you elaborate what you mean by "fractional hardware"?
            Some central compute facilities allow contributors to pay for a node (or several) that are technically "owned" by you but become part of the main compute cluster. You are generally assured access to "your" nodes via preferential queues when you need them but when idle they become available for other users on the cluster to use. You can similarly access additional nodes on the cluster when you need them even though they may "belong" to other patrons. IT administers the entire collection of nodes as a unit. This way everyone benefits.
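
            As a concrete illustration of what that looks like from the user side (the scheduler and partition names here are assumptions; your site may run Slurm, PBS, SGE, etc.), on a Slurm cluster the preferential access is usually just a partition:

            ```shell
            # Submitting to a lab-owned Slurm partition (names are hypothetical).
            sbatch --partition=mylab-owned --ntasks=8 align.sh   # guaranteed access
            sbatch --partition=general --ntasks=8 align.sh       # idle cycles elsewhere
            squeue -u "$USER"                                    # check job status
            ```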



            • #7
              @Richard_Finney

              The build I posted was meant to be a 2 socket system, so yes two separate CPUs. I included a GPU since the Xeons don't have integrated graphics, so it was mostly just a cheap video output thing that supported multiple monitors. I agree that we'll be needing more screen real estate.



              • #8
                @GenoMax

                I see, it makes sense how this is much more secure than setting something up on our own. Given your advice, we're probably not going to set up a full-fledged server now. Still, having some workstation/general lab computers for internal use remains appealing.

                Are the security issues associated with setting up a workstation equivalent to those that our personal laptops used in the lab are exposed to? Or is there a more complex issue given the nature of remote access to such a workstation?
                Last edited by bob.chen; 06-25-2015, 02:27 PM.



                • #9
                  Originally posted by bob.chen:
                  Are the security issues associated with setting up a workstation equivalent to those that our personal laptops used in the lab are exposed to? Or is there a more complex issue given the nature of remote access to such a workstation?
                  As long as things are configured correctly, security is a manageable risk. It is the avalanche of new exploits released daily that makes keeping up with patches/updates important.

                  If you choose to run web/database servers of any kind, lock them down to access from the local network. Require complex passwords with regular rotation. If you must access the services remotely, make use of a VPN (I assume your institution offers one) mandatory for off-campus access. Once you set things up, ask your local security office to run a security/intrusion scan to ensure things are configured correctly. You should be fine as long as you remain vigilant.
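
                  For example, restricting services to the local network with a host firewall can look like this (the campus address range and database port are placeholder assumptions):

                  ```shell
                  # ufw example: deny everything inbound, then allow SSH and a
                  # database port only from a hypothetical campus network range.
                  sudo ufw default deny incoming
                  sudo ufw allow from 10.0.0.0/8 to any port 22 proto tcp    # SSH
                  sudo ufw allow from 10.0.0.0/8 to any port 5432 proto tcp  # e.g. PostgreSQL
                  sudo ufw enable
                  ```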



                  • #10
                    Thanks so much for your help, we'll keep all this in mind for whatever we end up doing!



                    • #11
                      You might want to get more/different HDs, particularly for a RAID; check out the latest Hard Drive Stats blog post on failure rates of specific hard drive models.
