
  • #16
    Seagate recently announced 8 TB drives: http://www.seagate.com/about/newsroo...ves-pr-master/. Those would go nicely with the Backblaze.
    So 45 bays in the Backblaze times 8 TB = 360 TB.

    3 of these Storage Pod 4.0s would give you just over a petabyte (1,080 TB raw).

    That'd keep the backup tapes busy.
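For anyone who wants to redo the pod arithmetic with different drive sizes, here is a minimal sketch. The 45 bays and 8 TB figures come from the post above; the function names are just illustrative, and the numbers are raw capacity in decimal TB, before any RAID or filesystem overhead.

```python
# Figures taken from the post: 45 drive bays per Storage Pod, 8 TB per drive.
BAYS_PER_POD = 45
DRIVE_TB = 8


def raw_capacity_tb(pods: int, bays: int = BAYS_PER_POD, drive_tb: int = DRIVE_TB) -> int:
    """Raw capacity in decimal TB, ignoring RAID and filesystem overhead."""
    return pods * bays * drive_tb


def pods_needed(target_tb: int, bays: int = BAYS_PER_POD, drive_tb: int = DRIVE_TB) -> int:
    """Smallest number of pods whose raw capacity reaches target_tb."""
    per_pod = bays * drive_tb
    return -(-target_tb // per_pod)  # ceiling division on integers


if __name__ == "__main__":
    print(raw_capacity_tb(1))   # one pod
    print(raw_capacity_tb(3))   # three pods
    print(pods_needed(1000))    # pods to reach 1 PB (1000 TB)
```

Usable space will be lower than these raw figures once redundancy is configured.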



    • #17
      Really appreciate everyone's input here. Are any of you using GATK?



      • #18
        GATK for variant calling (SNPs, Indels) as an adjunct to samtools/bcftools calling. As mbblack said:

        The problem I find with any single pipeline or single commercial package is that data is too varied. There is no one size fits all when it comes to the best or most appropriate tool for analyzing a given data set.
        Sometimes GATK is better, sometimes Samtools.

        Personally I find GATK too "picky" for one-off use. You really need to buy into the whole of the GATK culture. That said there is nothing, as far as I know, wrong with the package.



        • #19
          @norbert: Are you going to need a commercial license for GATK?



          • #20
            I have to agree: don't go with some "pre-packaged" commercial software, and don't buy a workstation from those vendors. You'll get ripped off on the hardware, and the software from those places tends to lag behind what's available publicly. Plus, like others have mentioned, it locks you into the one method they pick. And as I've run across, it can lead to irreproducible results if the company goes under or changes things drastically without saying much to customers. Also, there is a rich community of support for almost every open source software package out there right now, including the people who come to this site. With commercial software you'll be tied down to the vendor and their support. Maybe it will be great, I don't know, but I had a friend who purchased Geneious and never seemed to get all the answers they needed from support. It seems like the reason is that the people you're talking to are not the bioinformaticians who helped design the programs; they are support personnel reading the same docs you are.

            Personally, I think you're best off just getting a Supermicro Intel workstation, putting Ubuntu on it and installing all your own packages. There really isn't a substitute for knowing exactly what's installed on your computer, where it lives, what versions, etc. If you're asking about GATK, I assume you're doing a lot of WGS. So I'd recommend something with a lot of cores (2x8 core or 2x10 core are pretty reasonable these days) and moderate RAM, plus tons of disk space, probably including a NAS or some external solution. For just one 30x genome, we see nearly 1TB of working files, which might be reduced to 300GB-ish once you keep just the FASTQs, the final BAM file and the VCF file.
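To turn those per-genome numbers into a rough disk budget, here is a back-of-the-envelope sketch. The ~1 TB working and ~300 GB retained figures are the ones quoted above; the model itself (each genome currently being processed holds its full working set, every finished genome has been cleaned down to the retained files) is an assumption for illustration, and the function name is hypothetical.

```python
# Per-genome figures quoted in the post for a 30x genome (approximate).
WORKING_GB_PER_GENOME = 1000   # "nearly 1TB of working files"
RETAINED_GB_PER_GENOME = 300   # "300GB-ish" (FASTQs, final BAM, VCF)


def peak_disk_gb(total_genomes: int, in_flight: int) -> int:
    """Estimate peak disk use in GB.

    Assumption (not from the post): at most `in_flight` genomes are being
    processed at once, each holding a full working set, and all other
    genomes are finished and reduced to their retained files.
    """
    if total_genomes < 0 or in_flight < 0:
        raise ValueError("counts must be non-negative")
    in_flight = min(in_flight, total_genomes)
    finished = total_genomes - in_flight
    return in_flight * WORKING_GB_PER_GENOME + finished * RETAINED_GB_PER_GENOME


if __name__ == "__main__":
    # Example: a 100-genome project processed 4 genomes at a time.
    print(peak_disk_gb(100, 4), "GB")
```

Real usage depends on read length, compression and which intermediates you keep, so treat the result as an order-of-magnitude guide.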



            • #21
              Originally posted by GenoMax
              @norbert: Are you going to need a commercial license for GATK?

              Yes, I will need to buy the commercial license.



              • #22
                Thanks everyone for your input! It will help in our discussions on how to move forward.



                • #23
                  Originally posted by norbert
                  My company wants to buy a bioinformatics workstation. I am looking at Knosys, Bina, CLC Bio and Ayrris. Does anyone here have one of these systems? What do you think of it? Do you have any recommendations?
                  Originally posted by cement_head
                  We bought a workstation from CYBERTRONPC for $4500 - 16 core AMD + 128 GB ECC RAM + 120 GB SSD boot + 4 TB RAID. Then we installed BIOLINUX - single click updates ALL of the open source software. We then purchased CLC Genomics Workbench. Total cost => about $11,000 for a pretty swank bioinformatics workstation.
                  Originally posted by fahmida
                  The choice of hardware and software will depend a lot on the problem and nature of data you are working with. For me I always want the option of customizing the environment: from OS to any open source/commercial application.

                  Earlier this year we acquired an HP BL660c Blade with 16 cores, 512GB RAM and ~50TB RAID, which cost us ~$30,000. We mainly use it for small to medium sized plant genome assembly (de novo) and related NGS and comparative genomics problems. It's a Linux machine with almost everything open source except a commercial CLC workbench. I think if your institution/company does not have a policy on brand/vendor etc., you can easily custom-build a machine similar to mine at half the cost.

                  There is some aversion in this thread towards commercial workstations (and I get that)...but for those that did go commercial (@cement_head, @fahmida) were there no formidable open-source options?

                  If that's the case, what about CLC's workbench made it the best choice as opposed to Knosys, Bina and Ayrris?
                  Last edited by Hoss; 08-29-2014, 12:00 PM.

