Old 12-06-2016, 09:14 AM   #6
Senior Member
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,814

If all data can fit in memory, Clumpify needs the amount of time it takes to read and write the file once. If the data cannot fit in memory, it takes around twice that long.
Is there a way to force clumpify to use just memory (if enough is available) instead of writing to disk?

Edit: On second thought, that may not be practical/useful, but I will leave the question in for now to see if @Brian has any pointers.
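For reference, a minimal sketch of the closest workaround I'm aware of: giving Clumpify a large enough heap that everything fits in memory, so it never spills to temp files. This assumes BBTools' clumpify.sh is on the PATH; -Xmx is the standard BBTools flag passed through to the JVM heap size. File names and the heap value are illustrative, not a tested recipe.

```shell
# Raise the JVM heap so Clumpify can hold all reads in memory
# (only helps on a machine that actually has that much free RAM).
clumpify.sh -Xmx48g in=reads.fastq.gz out=clumped.fastq.gz
```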

For a 12G gzipped fastq input file, Clumpify made 28 temp files (each between 400-600M in size).

Edit 2: Final file size was 6.8G, so a significant reduction in size.
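Putting a number on that reduction (a quick back-of-the-envelope check using the sizes reported above, nothing more):

```python
# Gzipped input vs. clumpified+gzipped output, from the figures above.
input_gb = 12.0
output_gb = 6.8

# Fractional size reduction achieved by clumpifying before compression.
reduction = 1 - output_gb / input_gb
print(f"Size reduction: {reduction:.0%}")  # roughly 43%
```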

Last edited by GenoMax; 12-06-2016 at 12:38 PM.