View Single Post
Old 12-16-2016, 06:18 PM   #16
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

It looks like in both cases, Clumpify did not run out of memory, but was killed by your job scheduling system or OS. This can happen sometimes when the job scheduler is designed to instantly kill processes when virtual memory exceeds a quota; it used to happen on JGI's cluster until we made some adjustments. The basic problem is this:

When Clumpify (or any other program) spawns a subprocess, that uses a fork operation, and the OS temporarily allocates twice the original virtual memory. It seems very strange to me, but here's what happens in practice:

1) You run Clumpify on a .bz2 file, and tell Clumpify to use 16 GB with the flag -Xmx16g, or similar. Even if it only needs 2GB of RAM to store the input, it will still use (slightly more than) 16 GB of virtual memory.
2) Clumpify sees that the input file is .bz2. Java cannot natively process bzipped files, so it starts a subprocess running bzip2 or pbzip2. That means a fork occurs and for a tiny fraction of a second the processes are using 32GB of virtual memory (even though at that point nothing has been loaded, so the physical memory being used is only 40 MB or so). After that fraction of a second, Clumpify will still be using 16 GB of virtual memory and 40 MB of physical memory, and the bzip2 process will be using a few MB of virtual and physical memory.
3) The job scheduler looks at the processes every once in a while to see how much memory they are using. If you are unlucky, it might look right at the exact moment of the fork. Then, if you only scheduled 16 GB and are using 32 GB of virtual memory, it will kill your process, even though you are only using 40 MB of physical memory at that time.

Personally, I consider this to be a major bug in the job schedulers that have this behavior. Also, not allowing programs to over-commit virtual memory (meaning, use more virtual memory than is physically present) is generally a very bad idea. Virtual memory is free, after all. What job scheduler are you using? And do you know what your cluster's policy is for over-comitting virtual memory?

I think that in this case the system will allow the program to execute and not kill it if you request 48 GB, but add the flag "-Xmx12g" to Clumpify. That way, even when it uses a fork operation to read the bzipped input, and potentially another fork operation to write the gzipped output with pigz, it will still stay under the 48 GB kill limit. Alternately you could decompress the input before running Clumpify and tell it not to use pigz with the pigz=f flag, but I think changing the memory settings is a better solution because that won't affect speed.

As for the 25% file size reduction - that's fairly low for NextSeq data with binned quality scores; for gzip in and gzip out, I normally see ~39%. Clumpify can output bzipped data if you name the output file as .bz2; if your pipeline is compatible with bzipped data, that should increase the compression ratio a lot, since .bz2 files are smaller than .gz files. Of course, unless you are using pbzip2 the speed will be much lower; but with pbzip2 and enough cores, .bz2 files compress fast and decompress even faster than .gz.

Anyway, please try with requesting 48GB and using the -Xmx12g flag (or alternately requesting 16GB and using -Xmx4g) and let me know if that resolves the problem.

Oh, I should also mention that if you request 16GB, then even if the program is not doing any forks, you should NOT use the flag -Xmx16g, you should use something like -Xmx13g (roughly 85% of what you requested). Why? -Xmx16g sets the heap size, but Java needs some memory for other things too (like per-thread stack memory, memory for loading classes, memory for the virtual machine, etc). So if you need to set -Xmx manually because the memory autodetection does not work (in which case, I'd like to hear the details about what the program does when you don't define -Xmx, because I want to make it as easy to use as possible) then please allow some overhead. Requesting 16GB and using the flag -Xmx16G is something I would expect to always fail on systems that to not allow virtual memory overcommit. In other words, possibly, your first command would work fine if you just changed the -Xmx16g to -Xmx13g.
Brian Bushnell is offline   Reply With Quote