  • Data Storage after HiSeq Upgrade

    Hi Folks,

    After the (upcoming) upgrade of the HiSeq, the local hard disks will be too small to hold the data for a whole run; the data needs to be written to some kind of external storage device (e.g. the Illumina-recommended Isilon systems).

    How are you managing the data storage for a running HiSeq?
    Are you using Isilon systems or some home-made solutions (Linux/Windows)?
    The old iPARs (SAS) can only be upgraded to 7.5 TB (less than 7 TB with RAID6), which is too small ...
    Any experiences or comments on the pros and cons of home-made solutions?

    just curious :-)

    Sven
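
For reference, a rough sketch of the RAID6 arithmetic behind the "less than 7 TB" figure above. RAID6 reserves the equivalent of two disks for parity, so usable capacity is roughly (n - 2) × disk size. The disk count and disk size below are hypothetical (the actual iPAR configuration is not given in this thread), and filesystem overhead is ignored.

```python
def raid6_usable_tb(num_disks: int, disk_tb: float) -> float:
    """Approximate usable capacity (TB) of a RAID6 array.

    RAID6 stores two disks' worth of parity, so the usable space
    is (num_disks - 2) * disk_tb. Filesystem overhead is ignored.
    """
    if num_disks < 4:
        raise ValueError("RAID6 needs at least 4 disks")
    return (num_disks - 2) * disk_tb


# Hypothetical layout: 15 disks of 0.5 TB each, i.e. 7.5 TB raw.
raw_tb = 15 * 0.5
usable_tb = raid6_usable_tb(15, 0.5)
print(f"raw: {raw_tb:.1f} TB, usable with RAID6: {usable_tb:.1f} TB")
```

With these assumed numbers the array ends up at 6.5 TB usable, which is in line with the "less than 7 TB" mentioned above.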

  • #2
    Hi Sven,

    We run our HiSeq 2000 on a Dell T7500 fitted with two 2.7 TB hard drives (one for each flow cell), which is sufficient local storage for two PE-101bp runs on each (at least with the current chemistry). We copy to an Isilon system for data storage and, after compression, back up on external hard drives (an inelegant solution, but cheap).

    Harold

    • #3
      In reply to HESmith (#2):

      Hi Harold,

      That's how we do it currently: local data storage for one run, then copying to a server after the run has finished. But after the upgrade (600G) we get more than 7 TB of data per run, so we need to write to a dedicated (external) system (most people will prefer commercial solutions, e.g. from Isilon or BlueArc). I am curious about the advantages and pitfalls of using non-commercial systems ...

      thanks, Sven

      • #4
        Hi Sven,

        I didn't realize that the data size per run would increase that much after upgrading. Even with compression, that's going to fill up most storage systems in relatively short order. Perhaps it's worthwhile to consider cloud computing solutions...

        Harold

        • #5
          I assume you want to store CIF files on the disk. We configured RTA to delete the CIF files from the instrument after successful transfer to a remote (Isilon) storage disk. That way you probably wouldn't need much local disk space; 2.7 TB should suffice.

          • #6
            In reply to AijazS (#5):

            Data produced during the run is somewhere around 6-8 TB, too much for local storage on the machine itself. Just another error by design :-)
            Deleting files after transfer to whatever system is not a problem (though experience as a sequencing core has taught us to keep more files on disk than may be "necessary") ...

            Sven
            Last edited by sklages; 06-07-2011, 10:04 PM. Reason: TB, not GB :-)

            • #7
              We plugged the HiSeqs into a Pillar Axiom SAN. Our runs with the v3 kit for 100PE (207 cycles, 7 for the index) have an average size of about 4.5 TB, no images, cifs and bcls.

              With the v2 kits we got about 4.1 TB; the 400 GB difference is all in the gzipped fastqs.

              Although I must admit we haven't pushed the cluster density as high as the v3 allows yet. We still get ~220 million reads per lane though.

              We might hit 6 TB when we do... we'll see.
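
To put these per-run sizes in perspective, here is a back-of-the-envelope sketch of how many runs fit on a storage volume. The 100 TB volume is a hypothetical figure, not something from this thread; the per-run sizes are the roughly 4.5 TB and 6 TB mentioned above.

```python
import math


def runs_until_full(capacity_tb: float, tb_per_run: float) -> int:
    """Number of complete runs that fit on a volume of the given size."""
    if tb_per_run <= 0:
        raise ValueError("tb_per_run must be positive")
    return math.floor(capacity_tb / tb_per_run)


# Hypothetical 100 TB volume, with the per-run sizes quoted in this post.
CAPACITY_TB = 100.0
for tb_per_run in (4.5, 6.0):
    n = runs_until_full(CAPACITY_TB, tb_per_run)
    print(f"{tb_per_run:.1f} TB per run: {n} runs fit in {CAPACITY_TB:.0f} TB")
```

Under these assumptions, going from 4.5 TB to 6 TB per run drops the volume from 22 runs to 16, so the retention policy matters more than the exact per-run figure.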

              • #8
                In reply to lletourn (#7):

                Interesting ... so you only keep the fastq files and delete cif/bcl? What if you need to re-basecall or re-convert from bcl for whatever reason?

                You're probably right: with increasing cluster densities you'll get to 6 TB or more pretty fast ...

                • #9
                  What I meant was:
                  We don't keep images
                  We *do* keep cifs and bcls, but only for a month or 2.

                  If after a month no problems were seen in the run we delete everything but the fastqs.

                  So my 4.5 TB is cifs + bcls + fastqs.

                  Sorry for the confusion.
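
A retention policy like the one described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the script actually used here: the file extensions (`.cif`, `.bcl`), the 60-day cutoff, and the function name are all illustrative. It only lists candidates and deletes nothing.

```python
import time
from pathlib import Path

# Assumed extensions of the intermediate files; fastq(.gz) files are never matched.
INTERMEDIATE_SUFFIXES = {".cif", ".bcl"}


def find_expired_intermediates(run_dir, max_age_days=60, now=None):
    """Return cif/bcl files under run_dir last modified more than max_age_days ago.

    Nothing is deleted; the caller decides what to do with the list.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    expired = []
    for path in Path(run_dir).rglob("*"):
        if (
            path.is_file()
            and path.suffix in INTERMEDIATE_SUFFIXES
            and path.stat().st_mtime < cutoff
        ):
            expired.append(path)
    return sorted(expired)
```

Calling `find_expired_intermediates("/path/to/run_folder")` (a placeholder path) and printing the result gives a dry-run list to review before anything is removed, which fits the "only if no problems were seen in the run" condition.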

                  • #10
                    Ah, .. ok. Now I got it. :-)
                    Same here (except we don't delete). Thanks for clarification ..
                    Last edited by sklages; 06-08-2011, 07:50 AM. Reason: :-)
