  • GAIIx, pipeline server config help?

    Dear all,
    We are in the process of ordering a pipeline server for our GAIIx.
    We came up with a config something like this: 4 x 12-core CPUs, 128 GB RAM, 24 x 1 TB Near Line SAS external storage.

    Has anyone built such a server and could answer my questions:
    - I was told that CASAVA requires RedHat or CentOS. Is that true?
    - Which OS is most suitable?
    - Do we have to look out for any incompatibility between the software (CASAVA, Bowtie, etc., all software for miRNA-seq, RNA-seq, genome assembly) and the hardware (number of CPUs, RAM, etc.)?
    - Is there anything specifically useful for this pipeline server that we should be looking for (e.g. external storage linkage/type, CPU brand AMD/Intel, etc.) that we are not aware of?

    Thanks for helping!
    ------------
    SMART - bioinfo.uni-plovdiv.bg

  • #2
    We use both RedHat and CentOS to run the Illumina pipeline. Either OS will work fine. CentOS is free, whereas RedHat requires purchasing a license.

    There should be no major software incompatibilities irrespective of the OS you choose. You may need access to a good systems administrator to make all this work smoothly.

    An easily overlooked thing is a means for data backup. I do not see it mentioned in your config above. You may want to consider that in addition.



    • #3
      Can you suggest some ideas about the data backup?


      • #4
        I assume you are asking about the data backup?

        The answer may depend on your specific application/budget. Is this installation going to service a single lab or are you going to provide this as a service to a wider community?



        • #5
          At the beginning we plan to use it for our lab only.
          The applications will be small RNA-seq and RNA-seq and, once we accumulate experience, maybe de novo genome sequencing.


          • #6
            Depending on your throughput (one run per week?), the "backup" solution could be as simple as a couple of large external HDs (redundant copies, to be cautious), if you intend to keep only the sequence files and the downstream analysis data.

            You could also look at purchasing a backup tape drive (LTO-4 or LTO-5) and then back up the data to tape.

            Note: If you run longer cycles (> 75 bp), the raw data folders can become big (up to 250 GB). If you also want to keep a backup of these, then tape backup may be the cheapest solution.
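
To get a feel for the numbers, here is a rough sizing sketch. It assumes one run per week (only a guess in the post above) and the 250 GB upper bound for a raw data folder. The tape capacities are the native (uncompressed) figures for LTO-4 and LTO-5, and the function name `yearly_backup_estimate` is made up for this illustration.

```python
import math

# Native (uncompressed) cartridge capacities in GB.
LTO_NATIVE_GB = {"LTO-4": 800, "LTO-5": 1500}


def yearly_backup_estimate(runs_per_week, gb_per_run, copies=1):
    """Return (total GB per year, number of tapes needed per LTO generation)."""
    # Assumption: the sequencer runs every week of the year.
    total_gb = runs_per_week * 52 * gb_per_run * copies
    tapes = {gen: math.ceil(total_gb / cap) for gen, cap in LTO_NATIVE_GB.items()}
    return total_gb, tapes


# Assumed workload: one run per week, 250 GB of raw data per run, one copy.
total_gb, tapes = yearly_backup_estimate(runs_per_week=1, gb_per_run=250)
print(f"{total_gb} GB per year")
for gen, count in tapes.items():
    print(f"{gen}: {count} tapes")
```

Keeping only the sequence files instead of the raw folders would shrink `gb_per_run` considerably, which is where a pair of external disks starts to look sufficient.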





            • #7
              Thanks,

              for the external storage we can always add more. I'm more worried about choosing the CPUs and the OS, and whether the software will be optimized to make maximum use of all hardware components.

              Does CASAVA (or other frequently used software) have limits on how many CPUs or how much RAM it can use?

              Which OS would you suggest? And if it is RedHat, do we have to pay for yearly support, which is not cheap?


              • #8
                Since you quoted fairly beefy specs in your original post, I am going to assume that you are not planning to buy the components and put this server together yourself. So as long as the components are standard, there should be no problems with most modern Linux distros.

                Like I said before, either CentOS (which is basically an open source equivalent of RedHat) or RedHat will work. Go with CentOS for starters. There is no need to get the fastest CPUs either (go for more RAM rather than the fastest CPUs). Your jobs will finish a few minutes later, but you would save a lot of money.

                Buying a monolithic quad-socket (multi-core) server (as indicated in your original post) may be fully adequate in your case. Putting together a small Linux cluster might give you more flexibility but would require additional systems admin expertise. If you have never done this sort of thing before, then you would want to make friends with a local systems admin/Linux guru. This stuff is not trivial to set up securely and correctly, especially if you are just starting out.

                Most tasks you are going to be doing will be executed as brute-force parallel jobs (there is not much truly parallel software for alignments etc.). CASAVA (and many of the other programs) will use as many cores as are available. RAM may turn out to be the limiting factor.
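
As an illustration of that last point, here is a minimal sketch of how the number of simultaneous brute-force jobs ends up bounded by whichever runs out first, cores or RAM. The 48 cores and 128 GB come from the config in this thread. The 4 GB per job figure, the `lane_N` labels, and the `align_one` placeholder are assumptions for the example, not properties of CASAVA or of any particular aligner.

```python
from concurrent.futures import ThreadPoolExecutor


def max_parallel_jobs(cores, ram_gb, ram_per_job_gb):
    """Number of independent jobs that fit at once, limited by cores or by RAM."""
    return max(1, min(cores, int(ram_gb // ram_per_job_gb)))


def align_one(sample):
    # Hypothetical placeholder: a real pipeline would launch an external
    # aligner here (for example through subprocess.run) and wait for it.
    return f"{sample}: done"


def run_all(samples, workers):
    # Every sample is an independent job; the pool keeps up to `workers`
    # of them running at the same time and returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(align_one, samples))


# 4 x 12 cores and 128 GB RAM from the thread; 4 GB per job is an assumption.
workers = max_parallel_jobs(cores=48, ram_gb=128, ram_per_job_gb=4)
results = run_all([f"lane_{i}" for i in range(1, 9)], workers)
print(workers, "jobs at once")
print(results)
```

With these assumed numbers the RAM allows fewer simultaneous jobs than there are cores, so adding memory would help more than faster CPUs.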



                • #9
                  Thanks,
                  yes, we will not be assembling this ourselves.
                  Here is the actual offer/quote:

                  Dell PowerEdge R815

                  PowerEdge R815 Rack Chassis, Up to 6x 2.5" HDDs

                  2x AMD Opteron 6168, 12C, 1.9GHz, 12x512K L2/12M L3 Cache, 80W ACP, DDR3-1333MHz

                  Additional 2x AMD Opteron 6168, 12C, 1.9GHz, 12x512K L2/12M L3 Cache, 80W ACP, DDR3-1333MHz

                  128GB Memory for 4 CPUs, DDR3, 1333MHz (16x8GB Dual Ranked LV RDIMMs)

                  4 x 300GB, SAS 6Gbps, 2.5-in, 15K RPM Hard Drive (Hot Plug)

                  16X DVD+/-RW ROM Drive SATA

                  3Yr Basic Warranty - Next Business Day

                  Electronic System Documentation and Dell OpenManage DVD for PowerEdge R815


                  RACK
                  ESTAP SERVERMAX 26U 600x1000mm. Server Cabinet



                  Dell PowerVault MD1220

                  PowerVault MD1220 SAS
                  PV MD12XX Additional Enclosure Management Module

                  24 x 1TB Near Line SAS 6Gbps 7.2k 2.5" HD Hot Plug

                  3Yr Basic Warranty - Next Business Day (Emerging Only)




                  UPS
                  APC Smart-UPS X 3000VA Rack/Tower LCD 200-240V

                  As our building is on top of a hill, we sometimes get power outages, so we will probably have to get a UPS, because I do not know what would happen if there is a power glitch while the machine is running.


                  • #10
                    That looks fine.

                    Use the internal disks for OS/swap space. Set the external disks up with RAID5 or better for data storage. A logical volume manager for disk management can simplify things.

                    A UPS would be essential (if power is unreliable) to condition the power going into the server and to prevent loss of data.
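
For the RAID choice, a quick sketch of the usable capacity of the 24 x 1 TB enclosure from the quote. It assumes all 24 disks go into a single array and reads "RAID5 or better" as RAID6; the helper `usable_tb` is made up for this illustration, and the real figure will be somewhat lower after filesystem overhead and any hot spares.

```python
def usable_tb(disks, disk_tb, parity_disks):
    """Usable capacity of one parity RAID array (RAID5: 1 parity disk, RAID6: 2)."""
    if disks <= parity_disks:
        raise ValueError("need more disks than parity disks")
    return (disks - parity_disks) * disk_tb


# 24 x 1 TB disks as in the PowerVault MD1220 quote above.
raid5_tb = usable_tb(disks=24, disk_tb=1, parity_disks=1)
raid6_tb = usable_tb(disks=24, disk_tb=1, parity_disks=2)
print(f"RAID5: {raid5_tb} TB usable, survives 1 disk failure")
print(f"RAID6: {raid6_tb} TB usable, survives 2 disk failures")
```

The extra disk of parity costs only 1 TB here, which is a small price on an array this wide.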



                    Last edited by GenoMax; 03-16-2012, 11:05 AM. Reason: Internal tape drives not available for R815



                    • #11
                      Does the GAIIx (and its small PC) have something like an internal UPS, or is it completely exposed to power failures?


                      • #12
                        You would definitely want to look into getting a beefy UPS if power is a problem. The GAIIx and its PC do not have a built-in UPS.
