Hi,
I am working on some de novo assemblies with SOAPdenovo2 using the multi-kmer option, and wanted to know if anyone has experience getting these to run reasonably quickly.
Specifically, can SOAPdenovo2 split itself over multiple nodes on a cluster (via SLURM)? All the example sbatch scripts I have seen use only one node and 8 CPUs. I have access to more if it can use them.
For SOAPdenovo2, what's the best combination of resources to crank up? Nodes? ntasks (and correspondingly the -p parameter for my run command)? mem-per-cpu?
Details on what I'm running currently:
My data: 33M (x2) paired-end Illumina reads, 150 bp read length, from fragments in the 300-400 bp range, interleaved in a single *.fa file. Animal genome, estimated at ~100-300 Mb.
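For a sense of scale, here is a rough coverage estimate from those numbers. It is a back-of-the-envelope sketch only: it takes 33M as exactly 33,000,000 read pairs, ignores any trimming or filtering, and the variable names are just illustrative.

```shell
#!/bin/bash
# Rough expected coverage = total sequenced bases / genome size.
read_pairs=33000000      # 33M pairs (assumed exact for this estimate)
read_len=150             # bp per read
total_bases=$(( read_pairs * 2 * read_len ))

# Genome size bounds taken from the ~100-300 Mb estimate.
cov_high=$(( total_bases / 100000000 ))   # if the genome is ~100 Mb
cov_low=$(( total_bases / 300000000 ))    # if the genome is ~300 Mb

echo "total bases: ${total_bases}"
echo "coverage: ~${cov_low}x to ~${cov_high}x"
```

So the input should be somewhere around 33x to 99x, depending on the true genome size.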
Relevant SBATCH info:
Code:
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=40:00:00
#SBATCH --mem-per-cpu=4000
SOAPdenovo-63mer all -s MYCONFIG -K 63 -m 57 -R -o TESTRUN 1>TEST_ass.log 2>TEST_ass.err
MYCONFIG:
Code:
max_rd_len=150
[LIB]
# most options just commented out, assuming defaults are fine for my short paired data
avg_ins=350
asm_flags=3
rank=1
p=TESTDATA.fa
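To make the -p part of the question concrete, this is the kind of variant I have in mind: requesting CPUs for a single task and passing that count to -p, rather than leaving -p unset. It is only a sketch. The single-task, --cpus-per-task layout is my assumption about how a threaded program should be requested, the THREADS and CMD variables are just illustrative, and the script prints the command instead of running it.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=40:00:00
#SBATCH --mem-per-cpu=4000

# CPU count SLURM granted to this task; fall back to 8 when run outside a job.
THREADS="${SLURM_CPUS_PER_TASK:-8}"

# Same command as in my current script, with -p tied to the allocation
# (assumption: -p is a per-process thread count). Printed, not executed.
CMD="SOAPdenovo-63mer all -s MYCONFIG -K 63 -m 57 -R -p ${THREADS} -o TESTRUN"
echo "${CMD}"
```

If that is the right idea, changing --cpus-per-task would be the only line to edit when scaling up on a single node.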
Thank you!