03-31-2015, 04:54 PM   #1
TBLASTX with nt database and PBS PRO job scheduler

Hi, I'm wondering if someone might be able to help.

I'm attempting to run a TBLASTX search with 5 query sequences (each about 1 kb long) against the nt database from NCBI (28990570 sequences).

The nt database (renamed nt_fa in this example) has been split up into 22 x 1GB segments and has a corresponding .nal file.
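For anyone checking a similar setup, here is a rough sketch of what an alias file for a multi-volume database like this would be expected to contain. It assumes the volumes are named nt_fa.00 to nt_fa.21 (the usual two-digit suffix scheme), which may not match the actual files; make_nal is a hypothetical helper, not part of BLAST+.

```shell
#!/bin/bash
# Print the text of a BLAST alias (.nal) file for a multi-volume
# nucleotide database.
# Arguments: title, volume base name, number of volumes.
# Assumption: volumes are named <base>.00, <base>.01, ... (two-digit suffix).
make_nal() {
    local title=$1 base=$2 count=$3
    local dblist="" i
    for ((i = 0; i < count; i++)); do
        dblist+=" $(printf '%s.%02d' "$base" "$i")"
    done
    printf 'TITLE %s\n' "$title"
    printf 'DBLIST%s\n' "$dblist"
}

# Example: alias text for 22 volumes of nt_fa, printed to stdout.
make_nal nt_fa nt_fa 22
```

Comparing this output against the real nt_fa.nal is a quick way to spot a missing or misnamed volume.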

I'm using the PBS PRO job scheduler with a cluster at our university. I'm submitting the job as an array (which splits it up into 5 separate jobs).

The qsub command:

qsub -l select=1:ncpus=2:mem=8GB:NodeType=any -l walltime=72:00:00 -A sf-UQ -q workq -N BLAST /work1/xxzw/perl_blast_plus/output/job_submit.pbs

The PBS script (job_submit.pbs) is as follows:

#!/bin/bash -l
#PBS -S /bin/bash
#PBS -J 0-4

The BLAST searches are executed from five different bash scripts (one for each of the five query sequences, test_0.fa to test_4.fa), which have the general form:

/work1/xxzw/perl_blast_plus/ncbi-blast-2.2.30+/bin/tblastx -query /work1/xxzw/perl_blast_plus/temp/test_0.fa -num_descriptions 1 -num_alignments 1 -evalue 0.01 -db /work1/xxzw/perl_blast_plus/database/nt_fa -out /work1/xxzw/perl_blast_plus/output/test_0_fa_tblastx_nt_fa.blast -word_size 3 -num_threads 8
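As a rough sketch only, the five per-query scripts could probably be folded into the array job script itself by picking the query file from the array index. This assumes PBS Pro exports PBS_ARRAY_INDEX to each sub-job and NCPUS with the number of CPUs requested; build_cmd and RUN_BLAST are hypothetical names introduced for illustration, and the fallbacks are only there so the script can be dry-run outside the scheduler.

```shell
#!/bin/bash -l
#PBS -S /bin/bash
#PBS -J 0-4

# Paths copied from the post above.
BASE=/work1/xxzw/perl_blast_plus
TBLASTX=$BASE/ncbi-blast-2.2.30+/bin/tblastx

# Fill the global array CMD with the tblastx command for one array index.
# Arguments: array index, number of threads. (Hypothetical helper.)
build_cmd() {
    local idx=$1 threads=$2
    CMD=("$TBLASTX"
         -query "$BASE/temp/test_${idx}.fa"
         -num_descriptions 1 -num_alignments 1 -evalue 0.01
         -db "$BASE/database/nt_fa"
         -out "$BASE/output/test_${idx}_fa_tblastx_nt_fa.blast"
         -word_size 3
         -num_threads "$threads")
}

# Assumption: PBS Pro sets PBS_ARRAY_INDEX (0-4 here) and NCPUS in each sub-job.
# The defaults only apply when the script is run by hand.
idx=${PBS_ARRAY_INDEX:-0}
threads=${NCPUS:-2}

build_cmd "$idx" "$threads"

# Dry run by default: print the command. Set RUN_BLAST=1 to actually run it.
if [ "${RUN_BLAST:-0}" = "1" ]; then
    "${CMD[@]}"
else
    echo "${CMD[@]}"
fi
```

Taking -num_threads from NCPUS keeps the thread count in line with whatever ncpus value is given to qsub, rather than hard-coding it in each script.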

Everything works OK if I use a smaller database (~5 MB split into five 1 MB segments for testing purposes), but when I try to use the nt database I get five BLAST output files that contain only the line:

TBLASTX 2.2.30+

...and nothing else happens, even after many hours!

When I terminate the jobs, no errors or clues are reported in the STDOUT or STDERR files for each job.

I've checked the .nal file for the nt_fa database and everything looks fine. I've also remade the nt_fa database from the FASTA files, with the same result.

It seems the issue is to do with the size of the nt_fa database. I've tried increasing the number of processors to 8 and the memory to 22GB in the qsub statement with no effect.

Any ideas what could be the problem?

I'm still quite new to using PBS PRO. This is basically a small scale test for future blasting of several thousand query sequences.

Any help would be appreciated.
quokka