SEQanswers

Old 05-30-2011, 01:11 AM   #1
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628
Data Storage after HiSeq Upgrade

Hi Folks,

After the (upcoming) HiSeq upgrade, the local hard disks will be too small to hold the data for a whole run; the data needs to be written to some kind of external storage device (e.g. the Illumina-recommended Isilon systems).

How are you managing the data storage for a running HiSeq?
Are you using Isilon systems or some home-made solutions (Linux/Windows)?
The old iPARs (SAS) can only be upgraded to 7.5TB (less than 7TB with RAID6) which is too small ...
Any experiences or comments on the pros and cons of home-made solutions?

just curious :-)

Sven
Old 05-30-2011, 09:17 AM   #2
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 509

Hi Sven,

We run our HiSeq 2000 on a Dell T7500 with two 2.7 TB hard drives installed (one for each flow cell), which is sufficient local storage for two PE-101bp runs on each (at least with the current chemistry). We copy to an Isilon system for data storage and, after compression, back up to external hard drives (an inelegant solution, but cheap).

Harold
Old 05-30-2011, 10:46 PM   #3
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

Quote:
Originally Posted by HESmith
Hi Sven,

We run our HiSeq 2000 on a Dell T7500 with two 2.7 TB hard drives installed (one for each flow cell), which is sufficient local storage for two PE-101bp runs on each (at least with the current chemistry). We copy to an Isilon system for data storage and, after compression, back up to external hard drives (an inelegant solution, but cheap).

Harold

Hi Harold,

That's how we do it currently: local data storage for one run, then copying to a server after the run has finished. But after the upgrade (600G) we get more than 7TB of data per run, so we need to write to a dedicated (external) system (most people will prefer commercial solutions from e.g. 'isilon' or 'bluearc'). I am curious about the advantages and pitfalls of using non-commercial systems ...

thanks, Sven
Old 05-31-2011, 05:00 AM   #4
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 509

Hi Sven,

I didn't realize that the data size per run would increase that much after upgrading. Even with compression, that's going to fill up most storage systems in relatively short order. Perhaps it's worthwhile to consider cloud computing solutions...

Harold
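
A back-of-the-envelope sketch of how quickly storage fills up at the run sizes mentioned in this thread (more than 7TB per run after the upgrade). The 100TB capacity and the 0.5 compression ratio in the example are made-up assumptions for illustration, not figures from anyone's setup.

```python
def runs_until_full(capacity_tb, run_size_tb, compression_ratio=1.0):
    """Return how many whole runs fit into a storage system.

    capacity_tb       -- usable capacity in TB (hypothetical value)
    run_size_tb       -- raw size of one run in TB
    compression_ratio -- stored size / raw size (1.0 = no compression)
    """
    if capacity_tb < 0 or run_size_tb <= 0 or compression_ratio <= 0:
        raise ValueError("capacity must be >= 0, sizes and ratio must be > 0")
    stored_per_run_tb = run_size_tb * compression_ratio
    return int(capacity_tb // stored_per_run_tb)


if __name__ == "__main__":
    # Assumed 100TB system, 7TB per run, with and without 2:1 compression.
    print(runs_until_full(100, 7))
    print(runs_until_full(100, 7, compression_ratio=0.5))
```

Even with the assumed 2:1 compression, a system of that size holds only a few dozen runs, which is the point being made above.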
Old 06-07-2011, 11:20 AM   #5
AijazS
Junior Member
 
Location: New York

Join Date: Jul 2010
Posts: 3

I assume you want to store the CIF files on disk. We configured RTA to delete the CIF files from the instrument after successful transfer to a remote (Isilon) data storage disk. By doing this you possibly wouldn't need much local disk space; 2.7 TB should suffice.
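
For a home-made setup, the same "delete only after a successful transfer" idea could be sketched roughly as below. This is not RTA's own mechanism or configuration, just an external illustration: the function names, the `*.cif` pattern and the SHA-256 comparison are all assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large files are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def delete_if_transferred(local_dir, remote_dir, pattern="*.cif", dry_run=True):
    """Delete local files only if an identical copy exists under remote_dir.

    Returns the list of local files that were (or, with dry_run, would be)
    deleted. Files that are missing remotely or differ are left untouched.
    """
    local_dir, remote_dir = Path(local_dir), Path(remote_dir)
    deletable = []
    for local_file in sorted(local_dir.rglob(pattern)):
        remote_file = remote_dir / local_file.relative_to(local_dir)
        if not remote_file.is_file():
            continue  # not transferred yet
        if sha256_of(local_file) != sha256_of(remote_file):
            continue  # incomplete or corrupted copy
        deletable.append(local_file)
        if not dry_run:
            local_file.unlink()
    return deletable
```

Running it with `dry_run=True` first shows what would go before anything is actually removed.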
Old 06-07-2011, 11:27 AM   #6
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

Quote:
Originally Posted by AijazS
I assume you want to store the CIF files on disk. We configured RTA to delete the CIF files from the instrument after successful transfer to a remote (Isilon) data storage disk. By doing this you possibly wouldn't need much local disk space; 2.7 TB should suffice.

Data produced during the run is somewhere around 6-8TB, which is too much for local storage on the machine itself. Just another error by design :-)
Deleting files after transfer to whatever system is not a problem (though experience as a sequencing core has taught us to keep more files on disk than may be "necessary") ...

Sven

Last edited by sklages; 06-07-2011 at 10:04 PM. Reason: TB, not GB :-)
Old 06-08-2011, 07:28 AM   #7
lletourn
Member
 
Location: Montreal

Join Date: Oct 2009
Posts: 63

We plugged the HiSeqs into a Pillar Axiom SAN. Our runs of the v3 kit for 100PE (207 cycles, 7 for the index) have an average size of about 4.5TB, no images, cifs and bcls.

With the v2 kits we got about 4.1TB; the 400GB difference is all in the gzipped fastqs.

Although I must admit we haven't pushed the cluster density as high as the v3 allows yet. We still get ~220 million reads per lane though.

We might hit 6TB when we do... we'll see.
Old 06-08-2011, 07:39 AM   #8
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

Quote:
Originally Posted by lletourn
We plugged the HiSeqs into a Pillar Axiom SAN. Our runs of the v3 kit for 100PE (207 cycles, 7 for the index) have an average size of about 4.5TB, no images, cifs and bcls.

With the v2 kits we got about 4.1TB; the 400GB difference is all in the gzipped fastqs.

Although I must admit we haven't pushed the cluster density as high as the v3 allows yet. We still get ~220 million reads per lane though.

We might hit 6TB when we do... we'll see.

Interesting ... so you keep only the fastq files and delete cif/bcl? What if you need to re-basecall or re-convert from bcl for whatever reason?

You're probably right, with increasing cluster densities you'll reach 6TB or more pretty quickly ..
Old 06-08-2011, 07:44 AM   #9
lletourn
Member
 
Location: Montreal

Join Date: Oct 2009
Posts: 63

What I meant was:
We don't keep images
We *do* keep cifs and bcls, but only for a month or two.

If no problems have been seen in the run after a month, we delete everything but the fastqs.

So my 4.5TB is cifs+bcls+fastqs.

Sorry for the confusion.
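
A minimal sketch of such a retention policy, for anyone scripting it themselves: list everything in a run folder that is older than a cutoff and is not a fastq. The suffix list, the 30-day default and the use of file modification time as a stand-in for "age of the run" are assumptions, not a description of the actual setup above. The function only lists candidates; deleting them is left to the caller.

```python
import time
from pathlib import Path

# File endings that are always kept (assumed naming, adjust as needed).
KEEP_SUFFIXES = (".fastq", ".fastq.gz", ".fq", ".fq.gz")


def expired_intermediates(run_dir, max_age_days=30, now=None):
    """Return non-fastq files under run_dir older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    expired = []
    for path in sorted(Path(run_dir).rglob("*")):
        if not path.is_file():
            continue
        if path.name.lower().endswith(KEEP_SUFFIXES):
            continue  # fastqs are kept indefinitely
        if path.stat().st_mtime < cutoff:
            expired.append(path)
    return expired
```

Reviewing the returned list before calling `unlink()` on each entry leaves room for the "no problems were seen in the run" check, which still has to be done by a person.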
Old 06-08-2011, 07:48 AM   #10
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

Ah, .. ok. Now I got it. :-)
Same here (except we don't delete). Thanks for the clarification ..

Last edited by sklages; 06-08-2011 at 07:50 AM. Reason: :-)
All times are GMT -8.