  • novoalign multi-threading or para-processing?

    I have a question about parallel processing with novoalign:

    If we have 1 node with 8 cores and 8 lanes of Illumina data to process, there are two ways to run it:

    1. assign 1 lane of data to each core, so that 8 lanes will be processed simultaneously and independently on the 8 cores;

    2. run the 8 lanes of data sequentially using the multi-threading novoalign (-c 8);

    Which one is faster, if there is a difference?

    Thanks in advance !

  • #2
    Use the threading. I have found that it works well, and it saves you from having to do a merging step.


    • #3
      Originally posted by qqcandy
      1. assign 1 lane of data to each core, so that 8 lanes will be processed simultaneously and independently on the 8 cores;
      2. run the 8 lanes of data sequentially using the multi-threading novoalign (-c 8);
      Which one is faster, if there is a difference?
      I would choose Option 2. You will have 8 cores processing a single data set. This will minimize RAM and disk I/O. Option 1 would require 8x as much RAM and cause lots of disk I/O switching between inputs.
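
To make the two options concrete, here is a rough Python sketch that builds and launches the command lines for each. Only the `-c` option comes from this thread. The `-d` (index) and `-f` (reads) options, the assumption that alignments go to stdout, the file names, and all helper names are illustrative assumptions, so check them against your novoalign version before relying on this.

```python
import subprocess
from typing import List, Sequence


def novoalign_cmd(index: str, reads: str, threads: int) -> List[str]:
    """Build one novoalign command line (-d and -f are assumed option names)."""
    return ["novoalign", "-d", index, "-f", reads, "-c", str(threads)]


def option1_cmds(index: str, lanes: Sequence[str]) -> List[List[str]]:
    """Option 1: one single-threaded instance per lane, all started together."""
    return [novoalign_cmd(index, lane, 1) for lane in lanes]


def option2_cmds(index: str, lanes: Sequence[str], cores: int = 8) -> List[List[str]]:
    """Option 2: one multi-threaded instance per lane, run one after another."""
    return [novoalign_cmd(index, lane, cores) for lane in lanes]


def run_concurrently(cmds: Sequence[List[str]], outputs: Sequence[str]) -> List[int]:
    """Start every command at once, then wait for all of them to finish."""
    handles = [open(path, "w") for path in outputs]
    try:
        procs = [subprocess.Popen(cmd, stdout=fh) for cmd, fh in zip(cmds, handles)]
        return [proc.wait() for proc in procs]
    finally:
        for fh in handles:
            fh.close()


def run_sequentially(cmds: Sequence[List[str]], outputs: Sequence[str]) -> List[int]:
    """Run the commands one at a time, each with its own output file."""
    codes = []
    for cmd, path in zip(cmds, outputs):
        with open(path, "w") as fh:
            codes.append(subprocess.run(cmd, stdout=fh).returncode)
    return codes
```

With a hypothetical index `genome.nix` and lanes `lane1.fastq` to `lane8.fastq`, option 1 would be `run_concurrently(option1_cmds(...), outputs)` and option 2 would be `run_sequentially(option2_cmds(...), outputs)`. Either way you end up with one output file per lane.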


      • #4
        Thanks a lot! I think we've got a consensus for option 2


        • #5
          Just to clarify: option 1 wouldn't use 8x the RAM, because the index is in a shared memory segment that is common to all instances of Novoalign. Performance would be similar.
          Colin
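
A back-of-the-envelope sketch of the RAM argument, for what it's worth. The sizes below are made-up placeholders, not measurements of Novoalign, and the function name is hypothetical. The point is only that a shared index is counted once while per-instance working memory is counted per instance.

```python
def total_ram_gb(instances: int, index_gb: float, working_gb: float,
                 shared_index: bool) -> float:
    """Estimate total RAM for several aligner instances on one node.

    index_gb   -- size of the index (placeholder value, not measured)
    working_gb -- private working memory per instance (placeholder value)
    """
    # A shared index occupies memory once; a private index is copied per instance.
    index_copies = 1 if shared_index else instances
    return index_copies * index_gb + instances * working_gb


# Placeholder sizes: 8 GB index, 0.5 GB working memory per instance.
shared = total_ram_gb(8, 8.0, 0.5, shared_index=True)     # 12.0
private = total_ram_gb(8, 8.0, 0.5, shared_index=False)   # 68.0
```

So with a shared index, eight instances cost one index plus eight working sets rather than eight of everything.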


          • #6
            @sparks

            Is that a feature of novoalign, or a feature of Linux? On LSF, we have to request enough memory for each job; otherwise LSF will kill the job for exceeding its memory limit.


            • #7
              Linux allows files to be mapped into memory with the mmap() function, and multiple programs can mmap() and share the same file. Novoalign mmap()s the index, so if multiple copies of Novoalign are running and using the same index, only one copy of the index is in memory.
              LSF I don't know about.
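
A small sketch of the sharing behaviour described above, using Python's `mmap` module on Linux. It is not Novoalign's code: a temporary file stands in for the index, and two mappings inside one process stand in for two separate programs. A change made through one shared mapping shows up in the other, which is what you would expect if both are backed by the same pages rather than by private copies.

```python
import mmap
import os
import tempfile


def demo_shared_mapping() -> bytes:
    """Map one file twice and show that both mappings see the same data."""
    # A small temporary file stands in for the index (hypothetical stand-in).
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"INDEX-v1" + b"\0" * 4088)
        os.close(fd)
        with open(path, "r+b") as writer_file, open(path, "rb") as reader_file:
            # Length 0 maps the whole file. Both mappings are shared (not private copies).
            writer = mmap.mmap(writer_file.fileno(), 0, access=mmap.ACCESS_WRITE)
            reader = mmap.mmap(reader_file.fileno(), 0, access=mmap.ACCESS_READ)
            try:
                writer[0:8] = b"INDEX-v2"   # modify through the first mapping
                return reader[0:8]          # read back through the second mapping
            finally:
                reader.close()
                writer.close()
    finally:
        os.remove(path)
```

Calling `demo_shared_mapping()` should return the updated bytes. In the Novoalign case the mappings would be read-only and live in different processes, but the page sharing works the same way.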


              • #8
                Also check out shm.h (<sys/shm.h>), which lets processes share memory by attaching a shared memory segment to their own address space.
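
For a feel of the create-then-attach pattern, here is a minimal Python sketch. Note the swap: it uses `multiprocessing.shared_memory` (Python 3.8+), not the System V calls from shm.h, so treat it as an analogous technique rather than the same API. The second handle attaches by name, as a separate process would.

```python
from multiprocessing import shared_memory


def demo_attach() -> bytes:
    """Create a shared segment, attach a second handle by name, read it back."""
    owner = shared_memory.SharedMemory(create=True, size=16)
    try:
        owner.buf[:5] = b"hello"
        # A second handle attaches to the same segment by its name,
        # the way another process would after being told the name.
        peer = shared_memory.SharedMemory(name=owner.name)
        try:
            return bytes(peer.buf[:5])
        finally:
            peer.close()
    finally:
        owner.close()
        owner.unlink()  # remove the segment once nobody needs it
```

The owner is responsible for unlinking the segment. Other handles only close their own attachment.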
