Hi
We are planning the IT for our sequencing project. We expect to sequence half a dozen dairy bulls (Bos taurus) per year at 30x coverage for the next few years. I plan to thoroughly investigate the read mapping and variant calling process so as not to lose too many SNPs for our gene discovery work, so we'll be consuming a couple of terabytes in reads and alignments per year. I can rent relatively expensive, high-quality Fibre Channel SAN disk for our compute cluster from the folks in IT, or I can try to buy much more cost-effective SATA disks in a network-attached storage box, e.g. a teeny tiny Isilon (it is NZ...), or even build it myself with a Supermicro 4U storage chassis, 24 1 TB SATA drives and a Linux CD. I expect to make heavy use of either MosaikAligner or Bowtie.
Does anyone have any useful recommendations or experiences on:
a) the value of FibreChannel/SAN disk for I/O performance?
b) Is NFS-based storage for gzipped FASTQ read files good enough with 1 Gb networking, i.e. compute node to SAN-attached I/O node, or compute node to NAS node?
c) Are 7200 rpm, 1 or 2 TB SATA drives a cost-effective way to store reads for alignments?
If I don't have to rent expensive SAN disk we can sequence more animals!
Any thoughts would be appreciated.