Speeding Up Soapdenovo2

igwill

Junior Member

Join Date: Nov 2018

Posts: 4
#1

Speeding Up Soapdenovo2

11-03-2018, 11:50 AM

Hi,
I am working on some de novo assemblies with SOAPdenovo2 using the multi-kmer option, and wanted to know if anyone has experience with getting these crunched quickly-ish.
Specifically, can SOAP split itself over multiple nodes on a cluster (via SLURM)? All the example sbatch scripts I see use only one node and 8 CPUs. I have access to more if it can use them.

Details on what I'm running currently:

My data: 33M (x2) paired-end Illumina reads, 150 bp read length, from fragments in the 300-400 bp range, interleaved in a single *.fa file. Animal genome, estimated at ~100-300 Mb.
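For context on how much data this is, the figures above can be turned into a rough coverage estimate. The sketch below only does the arithmetic on the numbers quoted in the post; the variable and function names are made up for illustration, and the genome sizes are the poster's own lower and upper guesses.

```shell
#!/bin/sh
# Rough sequencing depth from the numbers in the post (illustrative names).
read_pairs=33000000   # 33M read pairs
read_len=150          # bp per read
total_bases=$(( read_pairs * 2 * read_len ))

# Integer coverage for a genome of the given size in bp.
coverage() {
  echo $(( total_bases / $1 ))
}

cov_small=$(coverage 100000000)   # if the genome is ~100 Mb
cov_large=$(coverage 300000000)   # if the genome is ~300 Mb

echo "total bases: ${total_bases}"
echo "coverage at 100 Mb: ${cov_small}x"
echo "coverage at 300 Mb: ${cov_large}x"
```

So the data set is roughly 9.9 Gb of sequence, or somewhere between about 33x and 99x depending on where the true genome size falls in the estimated range.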

Relevant SBATCH info:

Code:

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=40:00:00
#SBATCH --mem-per-cpu=4000

SOAPdenovo-63mer all -s MYCONFIG -K 63 -m 57 -R -o TESTRUN 1>TEST_ass.log 2>TEST_ass.err

MYCONFIG:

Code:

max_rd_len=150
[LIB]
# most options just commented out, assuming defaults are fine for my short paired data
avg_ins=350
asm_flags=3
rank=1
p=TESTDATA.fa

For SOAPdenovo2, what's the best combination of resources to crank up? Nodes? ntasks (and correspondingly the -p parameter for my run command)? mem-per-cpu?

Thank you!
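On the question of matching -p to the allocation, here is a minimal sketch of one way to keep the two in sync. It assumes SOAPdenovo2 runs as a single multithreaded process (it has a -p thread-count option but, as far as I know, no MPI mode), so the request is one task with several CPUs on one node. The --cpus-per-task and --mem values are placeholders, not tuned recommendations, and the dry-run guard at the end is only there so the script can be checked outside a real job.

```shell
#!/bin/sh
#SBATCH --nodes=1              # assumption: one multithreaded process, so one node
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8      # threads for the single SOAPdenovo process (placeholder)
#SBATCH --mem=32000            # MB for the whole job; same total as 8 x 4000 in the post
#SBATCH --time=40:00:00

# SLURM sets SLURM_CPUS_PER_TASK inside the job; fall back to 8 outside it.
threads="${SLURM_CPUS_PER_TASK:-8}"

# Same command as in the post, with -p added so it follows the allocation.
cmd="SOAPdenovo-63mer all -s MYCONFIG -K 63 -m 57 -R -p ${threads} -o TESTRUN"
echo "$cmd"

# Launch only when the binary is on PATH (i.e. inside the real job environment).
if command -v SOAPdenovo-63mer >/dev/null 2>&1; then
  $cmd 1>TEST_ass.log 2>TEST_ass.err
fi
```

With this layout, raising --cpus-per-task is the only change needed to give the assembler more threads, since -p is read from the environment rather than hard-coded.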
igwill

Junior Member

Join Date: Nov 2018

Posts: 4
#2

11-04-2018, 04:16 PM

Just for anyone wandering in later with a similar question, after getting some more info, we are trying this:

#SBATCH --nodes=4
#SBATCH --ntasks-per-node=8
#SBATCH --mem=128000
#SBATCH --time=168:00:00

Should be plenty to let SOAP do its thing as quickly as it can, hopefully.
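A quick way to sanity-check a request like this is to compare what is allocated with what a single process can reach. The sketch below assumes SOAPdenovo2 runs as one multithreaded process on the first node of the job (an assumption on my part, based on it having a -p thread option and no MPI launcher); under that assumption the CPUs on the other nodes would sit idle. Variable names are illustrative.

```shell
#!/bin/sh
# Values taken from the #SBATCH lines above.
nodes=4
ntasks_per_node=8
mem_per_node_mb=128000   # --mem is a per-node limit in MB

allocated_cpus=$(( nodes * ntasks_per_node ))

# Assumption: one shared-memory process, so it only sees one node's CPUs.
usable_cpus=$ntasks_per_node
idle_cpus=$(( allocated_cpus - usable_cpus ))
mem_per_thread_mb=$(( mem_per_node_mb / usable_cpus ))

echo "allocated CPUs: ${allocated_cpus}"
echo "usable by one process: ${usable_cpus}"
echo "idle CPUs: ${idle_cpus}"
echo "memory per thread: ${mem_per_thread_mb} MB"
```

If that assumption holds, the extra memory per node is the part of this request that actually helps, and asking for one node with more CPUs (plus a matching -p) would be the more direct way to add threads.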