  • Computer Hardware: CPU vs. Memory

    Hi, I am planning to build a computer for next-gen analysis with a tight budget. The main application is de novo assembly, re-sequencing, and RNA-seq.

    I can choose either two AMD 8-core CPUs (16 cores total) with 16G memory or one AMD 8-core CPU with 24G memory. My question is whether I should invest in # of cores or memory capacity in this case.

    Thank you,
    Douglas

  • #2
    Definitely more memory. A lot of people are writing terrible code that wastes tons of memory, and it's better to run programs slowly than not be able to run them at all.



    • #3
      Yes, more memory. Sequencers are only going to spit out more reads.

      Also, more memory usually means programs can potentially run faster.
      SpliceMap: De novo detection of splice junctions from RNA-seq
      Download SpliceMap Comment here



      • #4
        Also look at the max memory the motherboard can hold, because you'll probably want to add more memory later (e.g. using consumables budget or next year's money).



        • #5
          You say "main application" but then list three different applications that have very different requirements. To be fair, resequencing and RNA-Seq share a lot of requirements, a primary one being mapping reads to a reference. Mappers do not require a ton of memory but can be sped up (in a nearly linear fashion) by adding CPUs. As john mentions, sequencers are going to be spitting out more reads, but if your pipeline involves mapping those reads to a reference, more memory won't do you much good at all, while doubling the number of CPUs sure will.

          On the other hand de novo assembly is a memory pig, and most algorithms are not highly threaded, meaning additional cpus will not provide much benefit for this application.
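
To put rough numbers on the mapping-versus-assembly point, here is a minimal sketch using Amdahl's law. The parallel fractions (0.95 for a mapping-like job, 0.30 for an assembly-like job) are illustrative assumptions, not measurements of any particular program, and `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical speedup from Amdahl's law.

    parallel_fraction: share of the runtime that can be spread over cores (0..1).
    cores: number of cores available (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed parallel fractions, chosen only for illustration.
WORKLOADS = {
    "mapping-like (assumed 95% parallel)": 0.95,
    "assembly-like (assumed 30% parallel)": 0.30,
}

for name, fraction in WORKLOADS.items():
    s8 = amdahl_speedup(fraction, 8)
    s16 = amdahl_speedup(fraction, 16)
    print(f"{name}: 8 cores -> {s8:.2f}x, 16 cores -> {s16:.2f}x")
```

With these assumed fractions, going from 8 to 16 cores takes the mapping-like job from roughly 5.9x to 9.1x, while the assembly-like job barely moves (about 1.36x to 1.39x). Real programs will differ, so measure before buying.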

          You really need to define your requirements better. What specific programs do you think you'll be using? What are their resource requirements to perform projects sized similarly to yours?



          • #6
            Thank you all for the great suggestions and comments. Ideally I should build two machines: one for de novo assembly and the other for mapping. Due to the limited budget, I will choose somewhere in between. It is a great suggestion to choose a mainboard with upgrade potential!



            • #7
              I'd also suggest adding an SSD to use as scratch space. Then you need cheap 1 TB drives
              to store your data (SATA-II 7200 rpm should be fine).
              Let us know what machine(s) you end up getting.
              -drd



              • #8
                You might be better off using AWS (the cloud).



                • #9
                  Originally posted by DZhang
                  Hi, I am planning to build a computer for next-gen analysis with a tight budget. The main application is de novo assembly, re-sequencing, and RNA-seq.

                  I can choose either two AMD 8-core CPUs (16 cores total) with 16G memory or one AMD 8-core CPU with 24G memory. My question is whether I should invest in # of cores or memory capacity in this case.

                  Thank you,
                  Douglas
                  More memory... but I would check that the disk I/O is fast and efficient.



                  • #10
                    Originally posted by DZhang
                    Hi, I am planning to build a computer for next-gen analysis with a tight budget. The main application is de novo assembly, re-sequencing, and RNA-seq. I can choose either two AMD 8-core CPUs (16 cores total) with 16G memory or one AMD 8-core CPU with 24G memory. My question is whether I should invest in # of cores or memory capacity in this case.
                    De novo assembly needs more RAM, while re-sequencing (read mapping) and RNA-seq (read mapping + analysis) require less RAM and more CPU.

                    Frankly, the difference between 16GB and 24GB RAM is not that much, and won't help with de novo too much. More important is the RAM per core: your choices are 1 GB/core (x16) or 3 GB/core (x8).

                    I assume you are working on large genomes for which you have references, like human or mouse? In that case I think you will be doing much more read mapping than de novo, so one would think more cores is better, but 1 GB/core is a bit low for mapping to large genomes, so you may have idle CPUs anyway! So the 24GB RAM would probably be my choice in the end.
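
The RAM-per-core arithmetic, and the "idle CPUs" worry that follows from it, can be sketched as below. The 2 GB per mapping process figure is purely an assumption for illustration (actual needs depend on the mapper and the genome), and both function names are hypothetical.

```python
def ram_per_core(total_ram_gb: float, cpus: int, cores_per_cpu: int) -> float:
    """Average RAM available per core, in GB."""
    return total_ram_gb / (cpus * cores_per_cpu)


def usable_cores(total_ram_gb: float, total_cores: int, gb_per_process: float) -> int:
    """How many independent single-core processes fit in RAM at once."""
    fits_in_ram = int(total_ram_gb // gb_per_process)
    return min(total_cores, fits_in_ram)


# The two options from the original question: (RAM in GB, CPUs, cores per CPU).
CONFIGS = {
    "2 x 8-core, 16 GB": (16, 2, 8),
    "1 x 8-core, 24 GB": (24, 1, 8),
}

ASSUMED_GB_PER_PROCESS = 2.0  # assumption, not a measured requirement

for name, (ram_gb, cpus, cores_per_cpu) in CONFIGS.items():
    cores = cpus * cores_per_cpu
    per_core = ram_per_core(ram_gb, cpus, cores_per_cpu)
    busy = usable_cores(ram_gb, cores, ASSUMED_GB_PER_PROCESS)
    print(f"{name}: {per_core:.1f} GB/core, {busy} of {cores} cores usable")
```

Under the assumed 2 GB per process, the 16 GB box can keep only 8 of its 16 cores busy, while the 24 GB box can feed all 8 of its cores. Multithreaded mappers that share one index across threads would change this picture.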

                    The issue of a fast disk subsystem is a crucial one, which usually gets ignored. A good RAID controller or smart use of Linux md software RAID with multiple 7200rpm spindles should be enough on your tight budget. But remember, if your disks are slow, you can't get data into RAM fast, and processes wait on I/O a lot, especially when there are so many cores competing for disk I/O! More RAM helps here too, for disk cache etc.

                    As an aside, does your institute or partner institute have access to a HPC facility where you can get some CPU allocation?



                    • #11
                      Look at the motherboard, because what you want is expandability. Boards for the AMD 6100 typically come with 1, 2 or 4 processor sockets and have 8, 16 or 32 memory slots respectively. Processor slots need to be populated with identical cpus (not all need to be filled), and memory slots should be populated in groups of 4 identical sticks.

                      Your 24GB configuration is likely a 1P board with 4x4GB + 4x2GB, and thus would fill all of your CPU and memory slots (no expandability without throwing away components).

                      The 16GB configuration would likely be a 2p board with 8x2GB or 4x4GB and thus would leave 8 or 12 open memory slots (room to grow).
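
The slot counting above can be checked with a few lines. This is only a sketch: it takes the 8 memory slots per socket from the board description in this post, and the stick layouts are the guesses made above, not confirmed vendor configurations. The helper names are hypothetical.

```python
SLOTS_PER_SOCKET = 8  # 1, 2 or 4 sockets -> 8, 16 or 32 slots, per the post


def total_gb(sticks):
    """Total RAM in GB for a list of (count, size_gb) stick groups."""
    return sum(count * size_gb for count, size_gb in sticks)


def open_slots(sockets, sticks):
    """Memory slots left free on a board with the given number of sockets."""
    used = sum(count for count, _ in sticks)
    return sockets * SLOTS_PER_SOCKET - used


# (label, sockets on the board, stick groups as (count, size in GB))
LAYOUTS = [
    ("1P board, 4x4GB + 4x2GB", 1, [(4, 4), (4, 2)]),
    ("2P board, 8x2GB", 2, [(8, 2)]),
    ("2P board, 4x4GB", 2, [(4, 4)]),
]

for label, sockets, sticks in LAYOUTS:
    print(f"{label}: {total_gb(sticks)} GB installed, "
          f"{open_slots(sockets, sticks)} slots open")
```

The 1P layout comes out at 24 GB with no free slots, and the two 2P layouts at 16 GB with 8 or 12 free slots, matching the numbers given above.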



                      • #12
                        Originally posted by jwfoley
                        Definitely more memory. A lot of people are writing terrible code that wastes tons of memory, and it's better to run programs slowly than not be able to run them at all.
                        AGREED.
                        I am curious, though: if I were to use an SSD as swap, would I be in a sweet spot for $$ vs. speed?
                        But I guess it's a moot question, since for some reason I can't find programs that let you choose whether to write to disk or use RAM.
                        http://kevin-gattaca.blogspot.com/



                        • #13
                          Originally posted by KevinLam
                          AGREED.
                          I am curious though if I were to use a SSD as a swap would I be in a sweet zone for $$ vs speed?
                          An SSD is only marginally faster than an HDD when compared to RAM. A good RAID array of HDDs still beats a single SSD too (for throughput, not latency though).

                          HDD ~ 75 MB/s
                          SSD ~ 300 MB/s
                          RAM ~ 10000 MB/s (!)
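
For a feel of what these rates mean in practice, the sketch below estimates how long it takes to stream a data set once at each quoted throughput. The 100 GB data set size is an arbitrary assumption, the rates are the rough figures listed above, and the calculation ignores latency (which favors RAM even more). `transfer_seconds` is a hypothetical helper name.

```python
def transfer_seconds(size_gb: float, rate_mb_per_s: float) -> float:
    """Seconds needed to move size_gb at a sustained rate (1 GB = 1000 MB)."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate_mb_per_s must be positive")
    return size_gb * 1000.0 / rate_mb_per_s


# Rough sequential throughput figures quoted in the post, in MB/s.
RATES_MB_PER_S = {"HDD": 75, "SSD": 300, "RAM": 10000}

ASSUMED_DATASET_GB = 100  # assumption: size picked only for illustration

for device, rate in RATES_MB_PER_S.items():
    seconds = transfer_seconds(ASSUMED_DATASET_GB, rate)
    print(f"{device}: {seconds:,.0f} s ({seconds / 60:.1f} min)")
```

At these rates the assumed 100 GB takes about 22 minutes from a single HDD, under 6 minutes from the SSD, and about 10 seconds from RAM.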

                          but I guess it's a moot question since for some reason I can't find programs that allow you to choose to write to disk or use RAM.
                          Just use your SSD as your virtual memory / swap disk?

                          Some software is now being intelligently written to exploit RAM/HDD tradeoff, for example this read mapper: Syzygy



                          • #14
                            Originally posted by Torst
                            An SSD is only marginally faster than a HDD when compared to RAM. A good RAID array of HDDs still beats a single SSD too (for throughput, not latency though).

                            HDD ~ 75 MB/s
                            SSD ~ 300 MB/s
                            RAM ~ 10000 MB/s (!)



                            Just use your SSD as your virtual memory / swap disk?

                            Some software is now being intelligently written to exploit RAM/HDD tradeoff, for example this read mapper: Syzygy
                            Well, SSDs vary in speed as well, and you have a point about SATA HDD RAID.
                            But you can just as easily have SATA SSD RAID.
                            4 x SSD would give ~1200 MB/s by your numbers,
                            only 8.33x slower than RAM!

                            By the way, your URL is not formatted properly; it went to some weird site.
                            Last edited by KevinLam; 08-27-2010, 12:27 AM.
                            http://kevin-gattaca.blogspot.com/



                            • #15
                              Originally posted by KevinLam
                              Well SSDs vary in speeds as well and while you have a point about SATA HDD RAID.
                              You can easily have SATA SSD RAID.
                              4 x SSD would have ~ 1.2 GB/s by your numbers
                              Yes you can have SSD RAID of course, and there are plenty of people with Enterprise budgets to do so - but I can't afford it!

                              The other issue is that 4xSSD RAID0 = 1.2 GB/s = 9.6 Gbit/sec. Even SATA3 is only 6.0 Gbit/sec, so you have to start investing in more expensive interconnects like 10GigE, multiple FC, etc. And have a PCIe bus and CPU<->BUS connection that can cope too!
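
The unit conversion behind that comparison, as a minimal sketch. It assumes ideal RAID0 scaling (throughput simply adds up across drives) and uses decimal units, with 300 MB/s per SSD taken from the earlier numbers in this thread. The function names are hypothetical.

```python
SATA3_LINK_GBIT_PER_S = 6.0  # nominal SATA3 link rate quoted in the post


def raid0_throughput_mb_per_s(drives: int, per_drive_mb_per_s: float) -> float:
    """Aggregate throughput assuming perfect RAID0 scaling (an idealization)."""
    return drives * per_drive_mb_per_s


def mb_per_s_to_gbit_per_s(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mb_per_s * 8.0 / 1000.0


aggregate_mb = raid0_throughput_mb_per_s(4, 300)
aggregate_gbit = mb_per_s_to_gbit_per_s(aggregate_mb)

print(f"4 x SSD RAID0: {aggregate_mb:.0f} MB/s = {aggregate_gbit:.1f} Gbit/s")
print(f"Exceeds a single SATA3 link: {aggregate_gbit > SATA3_LINK_GBIT_PER_S}")
```

Four drives at 300 MB/s come to 1200 MB/s, or 9.6 Gbit/s, which is more than one 6.0 Gbit/s SATA3 link can carry.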

                              My point is that it is still a long way away from RAM throughput and latency (SSD = micro/milliseconds, RAM = nanoseconds).
