Hi Chiayi,
I can't replicate the slowdown from the -Xmx settings. It appears to result from your filesystem, virtual memory, caching, and overcommit settings, which together are causing swapping to disk. But I'm glad you got it working at a reasonable speed, and hopefully this will help others who have seen extremely slow performance in some situations.
I've identified the cause of the slowdown with optical deduplication. Your dataset contains one huge clump of 293296 reads, with a very large number of duplicates that are not optical duplicates. In that situation the running time can become O(N^2) in the size of the clump, which is very slow (though it is still making progress), because the code currently compares every duplicate to every other duplicate to check whether they are within the distance limit of each other, and it parses both headers on every comparison. I've modified it to be 5x faster now, and I am continuing to make it faster still by sorting based on lane and tile number; hopefully, in most cases, it can become >100x faster.
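To illustrate the general idea, here is a rough Python sketch; it is not the actual implementation. It assumes Illumina-style headers of the form instrument:run:flowcell:lane:tile:x:y and a plain Euclidean distance check, and the names parse_header, optical_duplicate_pairs, and dist_limit are made up for the example. Each header is parsed once, reads are grouped by lane and tile, and only reads on the same tile are compared with each other.

```python
from collections import defaultdict
from itertools import combinations


def parse_header(header):
    """Return (lane, tile, x, y) from an Illumina-style header.

    Assumes the layout instrument:run:flowcell:lane:tile:x:y,
    optionally followed by a space and further fields.
    """
    fields = header.lstrip("@").split()[0].split(":")
    lane, tile, x, y = (int(v) for v in fields[3:7])
    return lane, tile, x, y


def optical_duplicate_pairs(headers, dist_limit):
    """Return pairs of headers on the same lane and tile within dist_limit.

    dist_limit is a hypothetical parameter; Euclidean distance is an
    assumption made for this sketch.
    """
    by_tile = defaultdict(list)
    for h in headers:
        lane, tile, x, y = parse_header(h)  # each header is parsed only once
        by_tile[(lane, tile)].append((x, y, h))

    limit_sq = dist_limit * dist_limit
    pairs = []
    # Reads on different lanes or tiles are never compared.
    for group in by_tile.values():
        for (x1, y1, h1), (x2, y2, h2) in combinations(group, 2):
            if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= limit_sq:
                pairs.append((h1, h2))
    return pairs
```

This is still quadratic within a single tile, so the gain depends on how evenly the duplicates are spread across lanes and tiles.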