SEQanswers

Old 09-11-2009, 02:23 AM   #1
Najim
Junior Member
 
Location: Amsterdam

Join Date: Sep 2009
Posts: 3
Computer hardware requirements

Hello everybody,

I am new to next-generation sequencing technology, and I guess I will be harassing you for a while.
Let me start off with my first question. I need to buy a new computer soon, with which I'll be analysing the tremendous amount of sequencing data, so I was wondering about the minimum hardware requirements.
Hope to read from you soon.
Greetings,
Najim
Old 09-11-2009, 04:00 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,539

What organisms are you working on? Small genomes like viruses, modest things like bacteria, or complex genomes like eukaryotes? This has an enormous impact on the amount of data and the amount of memory you'll need.
Old 09-11-2009, 04:54 AM   #3
Najim
Junior Member
 
Location: Amsterdam

Join Date: Sep 2009
Posts: 3

Quote:
Originally Posted by maubp View Post
What organisms are you working on? Small genomes like viruses, modest things like bacteria, or complex genomes like eukaryotes? This has an enormous impact on the amount of data and the amount of memory you'll need.
I'll be working on the human genome. I want to investigate the possible implementation of NGS for molecular diagnostics of genetic disorders involving large genes and/or multiple genes.
Old 10-28-2009, 02:55 PM   #4
captainentropy
Member
 
Location: San Francisco

Join Date: Mar 2009
Posts: 87

Najim, I'm doing lots of Solexa sequencing of ChIP DNA and only need basic peak finding and bioinformatics, so my requirements are modest. I built a workstation with a quad-core CPU, 8GB of RAM and VelociRaptor drives, and it screams. For what you are doing I suspect you will need a lot more RAM. I forget where I read it, but I think one person doing whole-genome sequencing was using an 8-core system with 32GB of RAM. You'll need tons of storage space, though. Don't skimp. Get a 16TB RAID array and come up with a good backup/archiving solution for all your data. I quickly filled up a 2TB array, expanded it to 4TB and am working on getting an 8 or 16TB array. This line of experimentation isn't cheap.
Old 10-28-2009, 06:44 PM   #5
pssclabs
Junior Member
 
Location: NY, NY

Join Date: Sep 2009
Posts: 6

Hi Najim:

We have been working to develop turnkey cluster solutions for different NGS platforms. You may be familiar with our GS FLX Titanium Cluster (www.pssclabs.com/roche), which is available from Roche and works in conjunction with their Titanium sequencer. Since you are sequencing the human genome, you may want to consider a cluster-based solution. Do you have a time constraint (e.g. 24 hours) for run time?
Old 10-29-2009, 05:49 AM   #6
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

Quote:
Originally Posted by Najim View Post
Hello everybody,

I am new to next-generation sequencing technology, and I guess I will be harassing you for a while.
Let me start off with my first question. I need to buy a new computer soon, with which I'll be analysing the tremendous amount of sequencing data, so I was wondering about the minimum hardware requirements.
Hope to read from you soon.
Greetings,
Najim
Can you tell us which sequencing instrument you are getting?
__________________
-drd
Old 11-02-2009, 01:14 AM   #7
Najim
Junior Member
 
Location: Amsterdam

Join Date: Sep 2009
Posts: 3

Thanks for the info, guys.

The main purpose of the computer will be mutation detection in specific genomic regions. I didn't realize I would need that much RAM (32GB).
I reckon the sequence data assembly and part of the data analysis (base-calling) happen on the workstation that is coupled to the sequencer.

We are probably going to get the Illumina sequencer. First, I am planning a pilot sequencing run of about 12 multiplexed samples on one lane (about 1 Mb of sequence each). The samples contain several mutation types, such as nucleotide substitutions (heterozygous as well as homozygous), micro-indels, and large deletions (ranging from one exon to several exons). The run is a success if all mutations fed to the machine are detected (easily). By the way, I am considering using CLC bio software. Any suggestions/comments on that matter are greatly appreciated.
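
One way to sanity-check a pilot like this is to estimate the average read depth per sample. Below is a minimal Python sketch. It assumes that "1 Mb each" is the target region per sample, that reads split evenly across the 12 samples, and that every read lands on target. The lane yield of 1 Gb is a hypothetical placeholder, not a figure from this thread, so substitute the number quoted for your own instrument.

```python
# Rough depth-of-coverage estimate for one multiplexed lane.
# ASSUMPTIONS (not from the thread): even split across samples,
# 100% on-target reads, and a placeholder lane yield.

def mean_depth(lane_yield_bp: float, n_samples: int, target_bp: float) -> float:
    """Average fold coverage per sample for a lane shared by n_samples."""
    if n_samples <= 0 or target_bp <= 0:
        raise ValueError("n_samples and target_bp must be positive")
    return lane_yield_bp / (n_samples * target_bp)


if __name__ == "__main__":
    lane_yield_bp = 1e9   # hypothetical lane yield (1 Gb); replace with your own
    n_samples = 12        # multiplexed samples on the lane (from the post)
    target_bp = 1e6       # ~1 Mb of target sequence per sample (from the post)
    depth = mean_depth(lane_yield_bp, n_samples, target_bp)
    print(f"~{depth:.0f}x average depth per sample")
```

Real runs lose reads to uneven barcode representation and off-target capture, so treat the result as an upper bound rather than a prediction.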

The computer I have in mind has the following specs:
- Memory: 6 GB RAM
- Processor: AMD Phenom II X4
- Hard disk: 1 TB
- Video card: 1 GB

What do you guys think?

Greetings,
Najim
Old 11-02-2009, 05:14 AM   #8
Roald
Director at CLC bio
 
Location: Denmark

Join Date: Aug 2008
Posts: 26

Hi Najim,

Due to the large file sizes there is a lot of IO going on in NGS bioinformatics analysis. For this reason it is a good idea to purchase a machine with fast discs.

For inspiration, you can have a look at the specs of the machine we have put together:
http://www.clcmachine.com/specs.php

Cheers

Roald Forsberg

************Disclaimer: I work at CLC bio*************
Old 11-02-2009, 06:59 PM   #9
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

Quote:
Originally Posted by Roald View Post
Hi Najim,

Due to the large file sizes there is a lot of IO going on in NGS bioinformatics analysis. For this reason it is a good idea to purchase a machine with fast discs.

For inspiration, you can have a look at the specs of the machine we have put together:
http://www.clcmachine.com/specs.php

Cheers

Roald Forsberg

************Disclaimer: I work at CLC bio*************
Roald, can you tell me how much you paid for that box?
__________________
-drd
Old 11-03-2009, 12:56 AM   #10
Roald
Director at CLC bio
 
Location: Denmark

Join Date: Aug 2008
Posts: 26
CLC Genomics Machine

Quote:
Originally Posted by drio View Post
Roald, can you tell me how much you paid for that box?
Hi drio,

The machine is actually a product that we sell bundled with a full software solution. You can read the full specs of hardware and software at http://www.clcmachine.com/

If you write an e-mail to our sales office we are happy to give you a quote. You can find the contact info at http://www.clcbio.com/index.php?id=11

Cheers

Roald Forsberg

************Disclaimer: I work at CLC bio*************
Old 11-03-2009, 01:17 AM   #11
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285

Quote:
Originally Posted by drio View Post
Roald, can you tell me how much you paid for that box?
Get your systems administrator to give you some space on one of their racks (another reason to be nice to sys-admins). You should be able to get such a system (depending on how many Us and vendor) for $2,500 - $5,000.
Old 11-03-2009, 03:26 AM   #12
BaCh
Member
 
Location: Germany

Join Date: May 2008
Posts: 79

Quote:
Originally Posted by nilshomer View Post
You should be able to get such a system (depending on how many Us and vendor) for $2,500 - $5,000.
Ummm ... where do you get 2 x X5550 and 48 GB RAM for USD 5k? Quotes I've seen were more in the USD 9-9.5k range.

B.
Old 11-03-2009, 04:20 AM   #13
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

Quote:
Originally Posted by nilshomer View Post
Get your systems administrator to give you some space on one of their racks (another reason to be nice to sys-admins). You should be able to get such a system (depending on how many Us and vendor) for $2,500 - $5,000.
I wish it was so simple.
And yes, that box is not going to be 2.5k to 5k. It will be closer to 7k.
__________________
-drd
Old 11-03-2009, 11:27 AM   #14
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285

Quote:
Originally Posted by drio View Post
I wish it was so simple.
And yes, that box is not going to be 2.5k to 5k. It will be closer to 7k.
I'll PM you since I haven't received my commission from that company.
Old 11-03-2009, 12:36 PM   #15
What_Da_Seq
Member
 
Location: RTP

Join Date: Jul 2008
Posts: 28

Our box with 2 quad-core Xeons, 32GB of RAM and a 1.5TB RAID 5 array cost less than 3 grand. You have to shop around for the memory and hard drives. And don't forget that if you order a tower server, the fans are seriously loud.
Old 11-03-2009, 06:38 PM   #16
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323

Quote:
Originally Posted by What_Da_Seq View Post
Our box with 2 quad-core Xeons, 32GB of RAM and a 1.5TB RAID 5 array cost less than 3 grand. You have to shop around for the memory and hard drives. And don't forget that if you order a tower server, the fans are seriously loud.
The question is: what computations are you running with this hardware?
What are the HDDs like? SAS? SATA? What RPM?

That machine can probably compute the alignments of a full Illumina FC or full SOLiD slide (fragments -- Illumina: 75bp / SOLiD: 50bp -- human data -- bfast recommended indexes/parameters) in 3 or 4 days?

Now if you add faster CPUs, a faster FSB and _especially_ faster HDDs you can reduce that to one day. Is it worth 3k more to get your computation done in 1 day instead of 3 or 4? It depends.

P.S: of course computing this in your cluster is the way to go. But we are talking here about single machine performance.
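
The "it depends" above can be put into rough numbers. Below is a minimal Python sketch. It assumes the 3k upgrade takes a run from about 3.5 days (the midpoint of 3 to 4) down to 1 day. The number of runs is a hypothetical input, not something stated in this thread.

```python
# Back-of-the-envelope cost per day of compute time saved by a hardware upgrade.
# The 3k price and the 3-4 day vs 1 day timings come from the post above;
# the run count is a HYPOTHETICAL value chosen only for illustration.

def cost_per_day_saved(extra_cost: float, slow_days: float,
                       fast_days: float, runs: int) -> float:
    """Extra hardware cost divided by total days saved over all runs."""
    days_saved = (slow_days - fast_days) * runs
    if days_saved <= 0:
        raise ValueError("upgrade must save time over at least one run")
    return extra_cost / days_saved


if __name__ == "__main__":
    extra_cost = 3000.0   # USD, the "3k more" from the post
    slow_days = 3.5       # midpoint of "3 or 4 days"
    fast_days = 1.0       # "one day" with faster CPUs and disks
    runs = 10             # hypothetical number of alignment runs
    usd = cost_per_day_saved(extra_cost, slow_days, fast_days, runs)
    print(f"${usd:.0f} per day saved over {runs} runs")
```

If a day of waiting costs you more than that figure, the faster box pays for itself. If the machine would sit idle between runs anyway, it probably does not.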
__________________
-drd
Old 11-03-2009, 11:53 PM   #17
Roald
Director at CLC bio
 
Location: Denmark

Join Date: Aug 2008
Posts: 26

Quote:
Originally Posted by drio View Post
The question is: what computations are you running with this hardware?
What are the HDDs like? SAS? SATA? What RPM?

That machine can probably compute the alignments of a full Illumina FC or full SOLiD slide (fragments -- Illumina: 75bp / SOLiD: 50bp -- human data -- bfast recommended indexes/parameters) in 3 or 4 days?

Now if you add faster CPUs, a faster FSB and _especially_ faster HDDs you can reduce that to one day. Is it worth 3k more to get your computation done in 1 day instead of 3 or 4? It depends.

P.S: of course computing this in your cluster is the way to go. But we are talking here about single machine performance.
@drio: to give you an idea of the performance of our setup, please see http://www.clcmachine.com/benchmarks.php Note that the benchmarks were run with our SIMD-accelerated software, so they may not be directly comparable to, e.g., other assemblers.

Because of the heavy IO, fast discs for temporary storage help a lot.

Regarding your PS, we can comfortably handle all our computational needs by just using single machines with specs as discussed in this thread. To us there has been no need for large clusters or clouds yet.

Cheers

Roald

********* Disclaimer: I work for CLC bio ************
Old 04-30-2010, 02:49 AM   #18
ymc
Senior Member
 
Location: Hong Kong

Join Date: Mar 2010
Posts: 493

Quote:
Originally Posted by Roald View Post
Hi Najim,

Due to the large file sizes there is a lot of IO going on in NGS bioinformatics analysis. For this reason it is a good idea to purchase a machine with fast discs.

For inspiration, you can have a look at the specs of the machine we have put together:
http://www.clcmachine.com/specs.php

Cheers

Roald Forsberg

************Disclaimer: I work at CLC bio*************
I find that you can configure a similar box at dell.com. I think the cost is about US$10,000+.
Old 04-30-2010, 02:54 AM   #19
ymc
Senior Member
 
Location: Hong Kong

Join Date: Mar 2010
Posts: 493

Quote:
Originally Posted by nilshomer View Post
Get your systems administrator to give you some space on one of their racks (another reason to be nice to sys-admins). You should be able to get such a system (depending on how many Us and vendor) for $2,500 - $5,000.
I would like you to sell that to me for $5,000

I am in the process of shopping around, too. 48GB of RAM alone is at least $3,500.
Old 04-30-2010, 08:38 AM   #20
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285

Quote:
Originally Posted by ymc View Post
I would like you to sell that to me for $5,000

I am in the process of shopping around, too. 48GB of RAM alone is at least $3,500.
Feel free to PM me. I just looked at the website of a vendor (the RAM is <$2,500). I would make ~$500 selling you that (Intel 5550 and 48GB of RAM) for $5,000.