
  • trimmomatic-0.36 problem

    Hi,

    I am seeing strange behaviour on my cluster node when running Trimmomatic with multiple threads.

    This is the command line I use:

    java -XX:ParallelGCThreads=4 -XX:+DoEscapeAnalysis -Xmx8g -jar $prog/trimmomatic-0.36.jar PE -threads 16 -phred33 $IN_OUT/ERR532589_1.fastq.gz $IN_OUT/ERR532589_2.fastq.gz $IN_OUT/Out_ERR532589_1.fastq.gz $IN_OUT/Out_unpaired_ERR532589_1.fastq.gz $IN_OUT/Out_ERR532589_2.fastq.gz $IN_OUT/Out_unpaired_ERR532589_2.fastq.gz ILLUMINACLIP:$ADAPTERS/TruSeq3-PE-2.fa:2:40:15:8:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

    I requested 16 threads, but in the log file of my LSF job I see only 4 processes:

    Resource usage summary:

    CPU time : 20157.96 sec.
    Max Memory : 3544 MB
    Average Memory : 2677.95 MB
    Total Requested Memory : 124800.00 MB
    Delta Memory : 121256.00 MB
    (Delta: the difference between Total Requested Memory and Max Memory.)
    Max Processes : 4
    Max Threads : 51


    What is the best way to optimize my command line? Whether I set 4, 6, ... or 16 threads, the behaviour and the elapsed time stay the same.

    I work on a node with 2 CPUs (8 cores each) and 128 GB of memory per node.

    Thank you,
    Laurent

  • #2
    Threads and processes are not the same thing, so seeing 4 processes does not mean only 4 threads were used. Also, that message does not display elapsed time but CPU time, which (for a fully multithreaded program) is the number of threads times the elapsed time, so it won't change as you adjust the number of threads. Your command looks fine to me; you are presumably using 16 worker threads. You can test that by comparing the elapsed time to the CPU time and seeing whether one is in fact 16 times the other (just add "time " before "java" in the command line). Or run top on the node where the process is running and look at the CPU utilization. If you want it to go faster, though, you can try BBDuk instead.
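
The comparison described here boils down to dividing CPU time by elapsed time. A minimal sketch, assuming the elapsed (wall-clock) time has been measured separately, for example with `time java ...`, where bash reports `real`, `user` and `sys` and the CPU time is `user + sys`. The function name `effective_threads` and the 1400-second elapsed time are made up for illustration; only the CPU time comes from the LSF summary in the first post.

```shell
# Average number of busy cores = CPU time / elapsed (wall-clock) time.
effective_threads() {
  awk -v cpu="$1" -v wall="$2" 'BEGIN { printf "%.1f\n", cpu / wall }'
}

# CPU time from the LSF summary in the first post;
# the elapsed time (1400 s) is a hypothetical value.
effective_threads 20157.96 1400
```

A result close to 16 would suggest that all worker threads are busy; a much lower value would suggest that the worker threads spend most of their time waiting.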



    • #3
      Hi,

      Thanks for your help, Brian. I don't know BBDuk, so I'm going to test your program.
      Do you think it's necessary to use Java options such as -XX:ParallelGCThreads=4 -XX:+DoEscapeAnalysis to optimize the process?

      Thank you



      • #4
        No. I never use those. Theoretically "-XX:+DoEscapeAnalysis" might increase speed a tiny amount, but it's the kind of thing that typically gives timing results within the margin of error, and varies substantially between Java versions (both in its effects and what the default setting is); it's entirely possible that it's already enabled by default. There are a lot of non-default Java flags you can add and they mostly just increase the length of your command line. And I don't really see any reason to cap the number of parallel GC threads at 4 in any case, unless you are running on a shared system and are only allowed to use 4 threads max.
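
Whether a flag is already on by default can be checked on the JVM in question. A minimal sketch, assuming a HotSpot JVM, where `-XX:+PrintFlagsFinal` prints one line per flag in the form `type name = value ...`; the helper name `flag_value` is made up for illustration, and the assumed line layout may differ between Java versions.

```shell
# Print the value of one JVM flag from `java -XX:+PrintFlagsFinal` output on stdin.
# Assumes lines of the form "<type> <name> = <value> ..." (":=" marks a non-default value).
flag_value() {
  awk -v name="$1" '$2 == name { print $4 }'
}

# Query the local JVM only if java is installed.
if command -v java >/dev/null 2>&1; then
  java -XX:+PrintFlagsFinal -version 2>/dev/null | flag_value DoEscapeAnalysis
fi
```

If this prints `true` without the flag being passed, adding `-XX:+DoEscapeAnalysis` to the command line changes nothing on that JVM.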

        Well, anyway, I thought I'd test it, as an example...

        Code:
        D:\temp\contam>java -ea -Xmx1g jgi.SplitPairsAndSingles rp in=ecc31.fq.gz unpigz
        Executing jgi.SplitPairsAndSingles [rp, in=ecc31.fq.gz, unpigz]
        
        Set INTERLEAVED to false
        No output stream specified.  To write to stdout, please specify 'out=stdout.fq' or similar.
        
        Input:                          12000000 reads          1797457160 bases.
        Result:                         12000000 reads (100.00%)        1797457160 bases (100.00%)
        Pairs:                          12000000 reads (100.00%)        1797457160 bases (100.00%)
        Singletons:                     0 reads (0.00%)         0 bases (0.00%)
        
        Time:                           20.177 seconds.
        Reads Processed:      12000k    594.73k reads/sec
        Bases Processed:       1797m    89.08m bases/sec
        Code:
        D:\temp\contam>java -XX:ParallelGCThreads=4 -XX:+DoEscapeAnalysis -ea -Xmx1g -cp D:\temp\BBTools_public\BBMap_35.92\bbmap\current jgi.SplitPairsAndSingles rp in=ecc31.fq.gz unpigz
        Executing jgi.SplitPairsAndSingles [rp, in=ecc31.fq.gz, unpigz]
        
        Set INTERLEAVED to false
        No output stream specified.  To write to stdout, please specify 'out=stdout.fq' or similar.
        
        Input:                          12000000 reads          1797457160 bases.
        Result:                         12000000 reads (100.00%)        1797457160 bases (100.00%)
        Pairs:                          12000000 reads (100.00%)        1797457160 bases (100.00%)
        Singletons:                     0 reads (0.00%)         0 bases (0.00%)
        
        Time:                           20.242 seconds.
        Reads Processed:      12000k    592.83k reads/sec
        Bases Processed:       1797m    88.80m bases/sec
        That's a 1.7% speed difference (in favor of NOT using those flags), which is within the margin of error of around 2.5%, after running it multiple times.
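
For the two runs printed above, the gap in elapsed time can be checked with a line of arithmetic. A minimal sketch; the helper name `pct_diff` is made up for illustration. It only covers these two runs, whereas the figures quoted in the post are described as coming from repeated runs.

```shell
# Percentage by which the second time exceeds the first.
pct_diff() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (b - a) / a * 100 }'
}

# Elapsed times of the run without and with the extra JVM flags.
pct_diff 20.177 20.242   # prints 0.32
```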



        • #5
          Brian, thank you for this detailed answer and quick test.
          I have installed BBMap and BBDuk to test it, but I can't find an easy way to convert my Trimmomatic command line below into a BBDuk command line:

          java -Xmx8g -jar trimmomatic-0.36.jar PE -threads 16 -phred33 ERR532589_1.fastq.gz ERR532589_2.fastq.gz Out_ERR532589_1.fastq.gz Out_unpaired_ERR532589_1.fastq.gz Out_ERR532589_2.fastq.gz Out_unpaired_ERR532589_2.fastq.gz ILLUMINACLIP:TruSeq3-PE-2.fa:2:40:15:8:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

          Do you have some examples that could help me, please?

          Thank you,
          Laurent



          • #6
            Hi Laurent,

            There are examples in /bbmap/docs/guides/BBDukGuide.txt

            But for reference, I recommend this command:

            Code:
            bbduk.sh -Xmx1g t=16 in=ERR532589_#.fastq.gz out=Out_ERR532589_#.fastq.gz ref=adapters.fa ktrim=r k=23 mink=11 hdist=1 tbo tpe qtrim=rl trimq=12 minlen=36
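
In BBTools, a `#` in a file name stands for the read number, so `in=ERR532589_#.fastq.gz` refers to the `_1` and `_2` files. A minimal sketch of that expansion; the helper name `expand_hash` is made up for illustration and is not part of BBTools.

```shell
# List the two files a BBTools-style "#" pattern refers to.
expand_hash() {
  for i in 1 2; do
    printf '%s\n' "$1" | sed "s/#/$i/"
  done
}

expand_hash 'ERR532589_#.fastq.gz'
expand_hash 'Out_ERR532589_#.fastq.gz'
```

Note that the command as written has no counterpart for the two Out_unpaired files of the Trimmomatic command.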



            • #7
              Originally posted by mslider
              What is the best way to optimize my command line? Whether I set 4, 6, ... or 16 threads, the behaviour and the elapsed time stay the same.
              I have already given you the long answer by email, but for the benefit of the rest of the community, the key points are:
              1. Use of compressed output is the typical bottleneck in Trimmomatic - this part is (currently) limited to one thread per output file, and in many cases, 2 - 4 worker threads are already enough to move the bottleneck to the output compression threads.
              2. Assuming you want output compression, you probably want to run multiple datasets in parallel (multiple Trimmomatic processes launched via e.g. the shell, xargs or queuing system jobs) with e.g. 4 worker threads each.
              3. Input decompression is also one thread per file, but since decompression is much faster, use of compressed input files only matters if you are really pushing things, beyond e.g. 12 worker threads. And you will need a decent disk setup to avoid it being a bottleneck.
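
Point 2 can be sketched with plain shell job control. This is a minimal sketch, not a tested pipeline: it reuses the Trimmomatic options from the first post with `-threads 4`, the sample IDs other than ERR532589 are placeholders, and the names `trim_cmd`, `run_all` and `DRY_RUN` are made up for illustration. By default it only prints the commands.

```shell
# Build the Trimmomatic command line for one sample, with 4 worker threads.
trim_cmd() {
  s="$1"
  echo "java -Xmx8g -jar trimmomatic-0.36.jar PE -threads 4 -phred33" \
    "${s}_1.fastq.gz ${s}_2.fastq.gz" \
    "Out_${s}_1.fastq.gz Out_unpaired_${s}_1.fastq.gz" \
    "Out_${s}_2.fastq.gz Out_unpaired_${s}_2.fastq.gz" \
    "ILLUMINACLIP:TruSeq3-PE-2.fa:2:40:15:8:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36"
}

# Start one Trimmomatic process per sample and wait for all of them.
# With DRY_RUN=1 (the default) the commands are printed instead of run.
run_all() {
  for sample in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      trim_cmd "$sample"
    else
      $(trim_cmd "$sample") &   # unquoted on purpose: split the string into arguments
    fi
  done
  wait   # returns once every background job has finished
}

# SAMPLE_B to SAMPLE_D are hypothetical sample IDs.
run_all ERR532589 SAMPLE_B SAMPLE_C SAMPLE_D
```

Four processes with four worker threads each would match the 16 cores of the node described in the first post; setting `DRY_RUN=0` launches the jobs for real.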


              Hope this helps



              • #8
                Okay, good.
                Thank you, Tony, I have received your message by email.
                Brian, thank you so much for the command line.

                Mark.
