SEQanswers

Old 01-10-2015, 12:42 PM   #1
TauOvermind
Member
 
Location: UK

Join Date: Jul 2012
Posts: 14
Running Blast+ on multiple nodes on a cluster -- what is the best way to do that?

Hi, I have been recently granted access to the HPC cluster of my university. I am going to run several blastx searches (Blast+ version, not legacy blast) there to identify potential virulence factors and toxins in Illumina metagenomic datasets.

The cluster I will be using has the following characteristics:
Nodes: 112 Dell R410 (quad-core Xeons, 8 threads) with 24 GB RAM each. I can use up to 6 nodes at once.
OS: RHEL v 5
Queuing system: Torque (PBS)

The problem is, I am a molecular biologist with no formal bioinformatics training and absolutely no previous experience with HPC clusters. I am also the first one to use this cluster for biology-related computations, and, as it has been used only by physicists and mathematicians so far, IT guys are unable to help me with my questions.

So I would like to ask people with more knowledge on that topic, what would be the best way to run my blast searches? As far as I understood from reading other posts (http://seqanswers.com/forums/showthread.php?t=29760 and http://seqanswers.com/forums/showthread.php?t=40048) and blast+ documentation, blast+ does support multithreading, but has no built-in means to parallelise runs on different CPUs/PCs/nodes. Should I split my fasta files, run 6 independent 8-threaded instances of blast search on 6 nodes, and combine blast outputs in the end?

On a side note, I would be very grateful if someone could recommend a short introduction to HPC for biologists, so that I don't have to bother busy people with newbie questions any longer.
Old 01-11-2015, 05:29 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

If you had an input query FASTA file of (for example) 1000 query sequences, then I would split this into several separate FASTA files (e.g. ten files of 100 sequences each), and submit them to the cluster as ten jobs, and then combine the BLAST output.

Each BLAST job could be set to ask for a single machine with 8 threads. Or, there is flexibility here: while BLAST does get faster when given more threads, the scaling is not perfect, so it might be faster overall to use four threads for each BLAST job (meaning on your cluster, there could be two BLAST jobs running at the same time - fine if you have enough RAM).
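
A minimal sketch of the splitting step described above, assuming plain (possibly multi-line) FASTA input. The function name split_fasta, the chunk prefix, and the demo sequences are hypothetical placeholders. Records are dealt out round-robin so that the chunks end up with similar numbers of sequences:

```shell
#!/bin/bash
# Split a FASTA file into N chunks, dealing records out round-robin.
# Usage: split_fasta INPUT.fa N PREFIX   ->   PREFIX0.fa ... PREFIX(N-1).fa
split_fasta() {
    local input="$1" n="$2" prefix="$3"
    awk -v n="$n" -v prefix="$prefix" '
        /^>/      { out = prefix ((count++) % n) ".fa" }  # a header line starts a new record
        out != "" { print > out }                         # every line follows its header
    ' "$input"
}

# Demo on a tiny made-up file (placeholder sequences, not real data)
tmp=$(mktemp -d)
printf '>s1\nACGT\n>s2\nGGCC\nTTAA\n>s3\nAAAA\n' > "$tmp/meta.fa"
split_fasta "$tmp/meta.fa" 2 "$tmp/meta_chunk_"
grep -c '^>' "$tmp"/meta_chunk_*.fa   # number of records per chunk
```

Round-robin assignment balances record counts, not total sequence length, so chunks can still differ in run time if the query lengths vary a lot.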
Old 01-11-2015, 05:53 AM   #3
TauOvermind
Member
 
Location: UK

Join Date: Jul 2012
Posts: 14

maubp, thank you for your suggestion, I will try it tomorrow.

Just wanted to clarify, when you mentioned two BLAST jobs running at the same time, did you mean that they would be running on the same node simultaneously? So, if I have 6 nodes with 8 threads on each and I submit 12 blast jobs with 4 threads for each, there would be 12 independent blast instances (jobs) running in parallel, assuming that the memory is not a problem?
Old 01-11-2015, 06:00 AM   #4
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Yes - assuming your limit is really six machines at once. I'm not familiar enough with Torque/PBS to really guess, but it could be you are limited to six active jobs at once?
Old 01-11-2015, 06:19 AM   #5
TauOvermind
Member
 
Location: UK

Join Date: Jul 2012
Posts: 14

Thank you for the clarification. You are probably right about the job limit, but I am not really sure about that. I was told that I can use 48 threads per job at most, so all the rest are just my guesses, and I might be completely wrong.
Old 01-11-2015, 09:35 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,048

With 24GB RAM per node you should not start a lot of threads. The size of the database you are going to search against will determine the outcome here. Blastx searches are compute-intensive as is.

I would recommend that you run one or two exploratory jobs (start with 4 and 8 threads, keeping all threads of a job on the same node) and allocate the maximum RAM you are allowed to use (with 24G physical RAM you can probably use 20-22G at most for the job, provided nothing else is running on the node), then see in the log how much RAM the job actually used. Depending on the results, you can then decide on the number of threads to use per node.
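
One way to read the memory figure back from an exploratory job, sketched under the assumption that the local Torque installation reports a resources_used.mem line in the output of 'qstat -f' (this is typical for Torque, but worth confirming on your cluster). The helper name job_mem_used is hypothetical:

```shell
#!/bin/bash
# Extract the resident memory figure from "qstat -f" style text on stdin.
# Assumes a line of the form "resources_used.mem = <value>" is present;
# the separate resources_used.vmem line is deliberately not matched.
job_mem_used() {
    awk -F' = ' '/resources_used\.mem/ { gsub(/^[ \t]+/, "", $2); print $2 }'
}

# Intended use while the test job is running (JOBID is a placeholder):
#   qstat -f "$JOBID" | job_mem_used
```

If the "-m abe" mail option is set, the end-of-job mail usually carries the same resource summary, so this is only needed for checking a job while it runs.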
Old 01-11-2015, 03:57 PM   #7
TauOvermind
Member
 
Location: UK

Join Date: Jul 2012
Posts: 14

Thank you for your suggestions as well, GenoMax. I made a script (virulence.sh) for Torque, which submits 6 blastx jobs, each using 4 threads on one node (24 threads in total):

Code:
#!/bin/bash


#Setting Torque parameters
#PBS -N vir_blast
#PBS -j oe
#PBS -m abe
#PBS -M my.mail@uni.edu
#PBS -q main_queue
#PBS -l mem=22000mb
#PBS -l nodes=1:ppn=4
#PBS -t 0-5

#Loading modules
module add shared
module add torque
module add blastx


#Executing commands

cd $PBS_O_WORKDIR

#Each blastx instance consists of one main thread and 'k' worker threads, whose number is set by the '-num_threads' parameter
#Thus, to use 4 CPU threads per node, '-num_threads' is set to 3 (1 main and 3 worker blastx threads will be created)

#BLAST+ uses '-evalue' and '-outfmt' (legacy blastall used '-e' and '-m')
blastx -db virDB -query ./meta_chunk_${PBS_ARRAYID}.fa -evalue 1e-5 -num_threads 3 -outfmt 6 -out ./results_chunk_${PBS_ARRAYID}.fm6

The script will be executed with the 'qsub virulence.sh' command.

Could someone take a look at the script and tell me if it looks fine?

I am still trying to comprehend how queueing actually works. Let's assume I have an idle cluster with 6 nodes, 8 cores on each, so 48 cores in total. If I request 12 independent jobs with 4 cores per job (#PBS -l nodes=1:ppn=4, #PBS -t 0-11), what would happen? Will all my jobs run simultaneously on the cluster, with 2 jobs running on each node, or will only 6 jobs be started with 6 others waiting in the queue?
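
For the final step, combining the per-chunk outputs once all array jobs have finished, a minimal sketch follows. It assumes tabular output (format 6, which has no header lines, so plain concatenation is enough) and the results_chunk_N.fm6 naming used in the script above. The function name merge_chunks and the merged file name are hypothetical:

```shell
#!/bin/bash
# Concatenate PREFIX0.fm6 ... PREFIX<LAST>.fm6 into one file, in chunk order.
# Refuses to write anything if an expected chunk result is missing.
# Usage: merge_chunks PREFIX LAST_INDEX OUTPUT
merge_chunks() {
    local prefix="$1" last="$2" out="$3" i=0
    # First pass: make sure every expected chunk result exists
    while [ "$i" -le "$last" ]; do
        if [ ! -f "${prefix}${i}.fm6" ]; then
            echo "missing result file: ${prefix}${i}.fm6" >&2
            return 1
        fi
        i=$((i + 1))
    done
    # Second pass: concatenate in chunk order
    : > "$out"
    i=0
    while [ "$i" -le "$last" ]; do
        cat "${prefix}${i}.fm6" >> "$out"
        i=$((i + 1))
    done
}

# For the 0-5 array above this would be called as:
#   merge_chunks ./results_chunk_ 5 ./results_all.fm6
```

An existing file does not prove that its job completed, and an empty file is a legitimate result (no hits), so the job logs are still worth checking before merging.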
Old 01-11-2015, 04:23 PM   #8
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,048

Quote:
Originally Posted by TauOvermind
I am still trying to comprehend how queueing actually works. Let's assume I have an idle cluster with 6 nodes, 8 cores on each, so 48 cores in total. If I request 12 independent jobs with 4 cores per job (#PBS -l nodes=1:ppn=4, #PBS -t 0-11), what would happen? Will all my jobs run simultaneously on the cluster, with 2 jobs running on each node, or will only 6 jobs be started with 6 others waiting in the queue?

If you only consider job slots, then technically your 12 independent jobs will start at the same time if all 48 cores are idle. But that ignores the other requests you are making: e.g. if you request 22G of memory for each job, then only one of the jobs can run on a node at a given time, considering your nodes have 24G of RAM. So a job scheduler takes into account a combination of what you request in terms of resources, then matches it with what you have access to/are allowed to use based on the local "fair share" policy, and ultimately what the current load status is for the cluster (how busy the nodes are, whether all job slots are full, etc.).
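
The arithmetic behind this can be sketched as follows. This is a simplified model that looks only at cores and memory on an otherwise idle node; it ignores fair-share policy and current load, and it treats the node's 24G as 24000 MB for illustration (the usable amount will be somewhat lower). The function name jobs_per_node is hypothetical:

```shell
#!/bin/bash
# How many identical jobs fit on one idle node, limited by cores and by memory?
# Usage: jobs_per_node NODE_CORES PPN NODE_MEM_MB JOB_MEM_MB
jobs_per_node() {
    local cores="$1" ppn="$2" node_mem_mb="$3" job_mem_mb="$4"
    local by_cores=$(( cores / ppn ))             # limit from requested cores
    local by_mem=$(( node_mem_mb / job_mem_mb ))  # limit from requested memory
    # The tighter of the two limits wins
    if [ "$by_cores" -lt "$by_mem" ]; then
        echo "$by_cores"
    else
        echo "$by_mem"
    fi
}

jobs_per_node 8 4 24000 22000   # prints 1: the 22000mb request is the bottleneck
jobs_per_node 8 4 24000 10000   # prints 2: two 4-core jobs now fit side by side
```

With the 22000mb request, 12 array jobs on 6 nodes would therefore run 6 at a time, with the other 6 waiting in the queue.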

Here is an example page of how PBS is used: http://arc.research.umich.edu/flux-a...ux/pbs-basics/

Last edited by GenoMax; 01-11-2015 at 04:28 PM.
Old 01-11-2015, 04:27 PM   #9
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,048

BTW: I don't know why you are referring to a main thread and working threads in your script. Here is an example of a PBS script that starts with "n" CPUs and an equal number of threads: http://swes.cals.arizona.edu/maier_l...ome/docs/blast
Old 01-11-2015, 04:45 PM   #10
TauOvermind
Member
 
Location: UK

Join Date: Jul 2012
Posts: 14

Thank you very much for both your explanations and the links you provided, GenoMax. I have spent a lot of time googling for a good example of a BLAST+ script for a cluster with Torque/PBS today, but haven't seen the second page. I read about main and working threads of BLAST+ here, but now I am confused:
https://wiki.hpcc.msu.edu/display/Bi...ple+Processors

I will try to run a test job tomorrow.
Old 01-11-2015, 04:54 PM   #11
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,048

I don't use PBS but it does appear that the information at the link you provided indicates that a core is needed for the "main" job. Try with n = CPU = threads first and see what happens.
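
To try "threads = CPUs" without hard-coding the number in two places, the thread count can be taken from the job environment. This is a sketch that assumes the Torque version in use exports PBS_NUM_PPN inside a job (many versions do, but older installations may not, so check with 'env' in a test job). The function name blast_threads and the fallback value are hypothetical:

```shell
#!/bin/bash
# Print the number of threads to hand to BLAST: the ppn value Torque
# allocated to this job if PBS_NUM_PPN is set, otherwise a fallback.
# Usage: blast_threads [FALLBACK]   (fallback defaults to 4, an arbitrary choice)
blast_threads() {
    local fallback="${1:-4}"
    echo "${PBS_NUM_PPN:-$fallback}"
}

# Inside the PBS script this would be used as:
#   blastx ... -num_threads "$(blast_threads)" ...
```

Comparing the wall time of this setting against "-num_threads" set to ppn minus one on the same chunk would show whether reserving a core for a main thread makes any difference.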
Tags
blast, blast+, cluster, parallel, torque
