  • HiSeq 2000 BCL-converter sys. req.

    Hi,

    We are awaiting the installation of a HiSeq system on our site, and are investigating ICT requirements. We are especially looking for hands-on experience on the following items, as the manuals are rather vague.

    1. What is the file size (and/or ratio) before and after BCL->fastq conversion using CASAVA for a full HiSeq 2000 run?
    2. What are typical runtimes for BCL->fastq conversion for a full HiSeq 2000 run? (Please specify the number of available threads and the amount of RAM for comparison.)

    The reason I'm asking is that we have limited computing power directly attached to the machine. So we have two options: convert locally to fastq and send the fastq files over a high-speed network to our HPC, or send the whole data structure to the HPC and convert to fastq there. I would assume that if our sequencer-attached computing power (mainly storage, with 8 cores and 8 GB RAM, optionally a second machine with 8 cores and 16 GB RAM, using NFS) is sufficient, it will be more efficient to convert locally and transfer only the fastq files for further alignment and analysis?

    Any experience or comments welcome :-)

    Geert

  • #2
    Geert,

    To answer your questions:

    1. Raw data folder size will vary depending on the type of run you are doing. We regularly see folder sizes ranging from ~120 GB (50 bp multiplex single end runs) to ~450 GB (100 bp paired end multiplex runs).

    2. Since there are 8 lanes on a flowcell, it is convenient to use 8 cores for the conversion (you could use more, but since the process is not truly parallel you will not see much benefit). We have used both SGE and LSF (the way jobs are run differs slightly between the two, so you will have to tune this to your local cluster setup).

    If you have only one HiSeq then it should be fine to transfer the data to your cluster (or the other machine you mention) to do the conversion. On a gigabit network (end to end) you should see 35-40 MB/sec transfer rates (add up to 3 hours for the copy for the largest runs). If you are able to export a CIFS/samba share from your HPC cluster storage to the HiSeq workstation (and if your network link is reliable) then you can write the data directly to HPC storage (which could save you the copy effort).
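
    As a rough check on the copy estimate, here is a minimal Python sketch that turns a run folder size and a sustained transfer rate into hours. The function name `transfer_hours` is made up for illustration, and it assumes 1 GB = 1000 MB and a constant rate with no protocol overhead or competing traffic, so treat the result as a ballpark figure only.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to copy size_gb gigabytes at rate_mb_per_s.

    Assumes 1 GB = 1000 MB and a constant sustained rate.
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = size_gb * 1000.0 / rate_mb_per_s
    return seconds / 3600.0


if __name__ == "__main__":
    # Folder sizes and transfer rates quoted in this post.
    for size_gb in (120, 450):
        fastest = transfer_hours(size_gb, 40)
        slowest = transfer_hours(size_gb, 35)
        print(f"{size_gb} GB: {fastest:.1f} to {slowest:.1f} hours")
```

    With the numbers above, a ~450 GB run comes out at a little over 3 hours and a ~120 GB run at just under 1 hour, so it is worth budgeting slightly more than 3 hours for the largest runs.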

    The actual fastq conversion and de-multiplexing takes up to 2-3 hours for the largest runs mentioned above. The kind of disk storage you have available on the cluster would influence the run time to some extent since there is a lot of disk I/O during the conversion process.

