Hi, I'm wondering if someone might be able to help.
I'm attempting to run a TBLASTX search with 5 query sequences (each about 1 kb long) against the nt database from NCBI (28990570 sequences).
The nt database (renamed nt_fa in this example) has been split up into 22 x 1GB segments and has a corresponding .nal file.
I'm using the PBS PRO job scheduler with a cluster at our university. I'm submitting the job as an array (which splits it up into 5 separate jobs).
The qsub command:
qsub -l select=1:ncpus=2:mem=8GB:NodeType=any -l walltime=72:00:00 -A sf-UQ -q workq -N BLAST /work1/xxzw/perl_blast_plus/output/job_submit.pbs
The PBS script (job_submit.pbs) is as follows:
#!/bin/bash -l
#PBS -S /bin/bash
#PBS -J 0-4
/work1/xxzw/perl_blast_plus/temp/${PBS_ARRAY_INDEX}.sh
The BLAST searches are executed from five different bash scripts (0.sh to 4.sh, one for each of the query sequences test_0.fa to test_4.fa), which have the general form:
#!/bin/bash
/work1/xxzw/perl_blast_plus/ncbi-blast-2.2.30+/bin/tblastx -query /work1/xxzw/perl_blast_plus/temp/test_0.fa -num_descriptions 1 -num_alignments 1 -evalue 0.01 -db /work1/xxzw/perl_blast_plus/database/nt_fa -out /work1/xxzw/perl_blast_plus/output/test_0_fa_tblastx_nt_fa.blast -word_size 3 -num_threads 8
Everything works fine if I use a smaller database (about 5 MB, split into five 1 MB segments for testing purposes), but when I try to use the nt database I get five BLAST output files that contain only the line:
TBLASTX 2.2.30+
...and nothing else happens, even after many hours.
When I terminate the job, no errors or clues are reported in the STDOUT or STDERR files for any of the jobs.
I've checked the .nal file for the nt_fa database and everything looks fine. I've also rebuilt the nt_fa database from the FASTA files, with the same result.
The issue seems to be related to the size of the nt_fa database. I've tried increasing the number of processors to 8 and the memory to 22GB in the qsub statement, with no effect.
Any ideas what could be the problem?
I'm still quite new to using PBS PRO. This is basically a small-scale test ahead of running BLAST on several thousand query sequences.
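For that larger run, the per-query scripts (0.sh to 4.sh) could probably be collapsed into a single script that picks its query from the array index. Below is a minimal sketch of that idea, not something I have tested on the cluster. The names BASE and build_tblastx_cmd are made up for illustration, and I'm assuming the scheduler exports NCPUS inside the job (it falls back to 2, matching ncpus=2 in the qsub line above, whereas the current scripts hard-code -num_threads 8). It only prints the command as a dry run rather than running tblastx.

```shell
#!/bin/bash
# Sketch only: one script for every array index instead of 0.sh .. 4.sh.
# BASE and build_tblastx_cmd are illustrative names, not part of BLAST or PBS.
BASE=/work1/xxzw/perl_blast_plus

# Print the tblastx command line for one array index and thread count.
build_tblastx_cmd() {
    local idx="$1" threads="$2"
    printf '%s ' \
        "$BASE/ncbi-blast-2.2.30+/bin/tblastx" \
        -query "$BASE/temp/test_${idx}.fa" \
        -db "$BASE/database/nt_fa" \
        -out "$BASE/output/test_${idx}_fa_tblastx_nt_fa.blast" \
        -evalue 0.01 -word_size 3 \
        -num_descriptions 1 -num_alignments 1 \
        -num_threads "$threads"
    printf '\n'
}

# Dry run: show the command for this array element.
# PBS_ARRAY_INDEX is set for array jobs; NCPUS is an assumption (default 2).
build_tblastx_cmd "${PBS_ARRAY_INDEX:-0}" "${NCPUS:-2}"
```

Passing the thread count in as a parameter keeps -num_threads tied to whatever the job actually requested, so the two cannot drift apart when the qsub line changes.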
Any help would be appreciated.