Hello, friendly SEQanswers community.
Recently, we received a sizeable grant to start up a next-gen sequencing facility. Since I am the only bioinformaticist at my institute (and a biologist by training), I am having some difficulty estimating the computing infrastructure needed to handle an as-yet unknown number of sequencing projects.
However, here are some things that I am fairly certain about:
- We will probably buy one HiSeq 1000 machine.
- We will probably buy one MiSeq machine.
- We may buy non-Illumina technology, e.g., Ion Torrent, Roche 454.
- I would prefer to use open source software only.
Here's what I do not know:
- How often these sequencing machines will be used.
- How many users at the institute will be sequencing.
- How many users at the institute will want direct access to the server to run downstream jobs.
- If users outside of the institute will be allowed access to the servers to run other jobs (e.g., climate studies).
I suppose my problem is manifold. I cannot reliably estimate the usage that these sequencing machines will receive, so I am having trouble coming up with the computing requirements to handle the data.
As my institute has not even placed the order for the sequencing machines (there's still a lot of bickering about which sequencing technologies to choose), should I simply wait and see what we REALLY end up getting before putting together a parts list?
I am thinking about polling the institute and talking with higher-ups to figure out how many groups will be doing sequencing at our currently non-existent facility as this information is probably critical in determining HDD/Memory/CPU requirements.
Nevertheless, what would be an economical and scalable set-up that one might start with, assuming that a single HiSeq machine will be used to capacity every month?
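To make the question concrete, this is roughly the back-of-envelope storage calculation I am trying to fill in. It is only a sketch: the class and function names are my own, and every number in the example is a made-up placeholder, not a vendor specification or a measurement. The real values are exactly what I am hoping to learn.

```python
from dataclasses import dataclass


@dataclass
class SequencerLoad:
    """Expected load from one instrument. All fields are guesses to be replaced."""
    name: str
    runs_per_month: float       # how often the machine is actually used
    raw_gb_per_run: float       # raw data kept per run, in GB
    analysis_multiplier: float  # extra space for alignments etc., relative to raw


def monthly_storage_gb(load: SequencerLoad) -> float:
    """Raw data plus derived files produced by one instrument in one month."""
    return load.runs_per_month * load.raw_gb_per_run * (1.0 + load.analysis_multiplier)


def storage_needed_tb(loads, months: int, copies: int = 2) -> float:
    """Total capacity in TB to retain `months` of data with `copies` replicas."""
    total_gb = sum(monthly_storage_gb(load) for load in loads) * months * copies
    return total_gb / 1000.0


if __name__ == "__main__":
    # Placeholder values only; they are NOT real instrument figures.
    hiseq = SequencerLoad("HiSeq", runs_per_month=3.0, raw_gb_per_run=500.0,
                          analysis_multiplier=2.0)
    print(f"{storage_needed_tb([hiseq], months=12):.1f} TB")
```

Once polling the institute gives me real usage figures, I could add one `SequencerLoad` per machine and rerun this, which is why I kept usage and per-run size as separate parameters.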
Thank you all a thousand times for any information.