Hi all:
I have a bash script which manages our analysis pipeline to do assembly of multiple sequences. It runs without trouble on a Mac workstation, but when I run it (with slight modification) on a cluster, using pbsdsh to distribute a unique copy to each of 16 processors on one node, an apparently random subset crashes with exit status 134 and the following in the log:
Code:
[bwa_seq_open] fail to open file '/scratch/user/path/data03/585M.txt.trim'. Abort!
/scratch/user/path/bin/analyze.pbsdsh3.sh: line 77: 20910 Aborted (core dumped) $BWA samse -r '@RG\tID:1\tSM:1\tPL:ILLUMINA' $FASTA $PREF.sai $PREF > $PREF.sam

The amount of memory available per processor (4 GB) is the same as the amount of memory on the Mac workstation. The processes that survive are not fully consistent between runs, but tend to be the ones assembling the smaller datasets on a node. The failure appears even when I distribute the 16 jobs to the even-numbered processors on two nodes, thus (I presume) doubling the available memory. No segmentation fault is reported.

Any thoughts on what might be going on?

Many thanks in advance,
Mark
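For reference, exit status 134 is the shell's convention of 128 plus the signal number, and SIGABRT is signal 6, so 134 means the process aborted itself rather than segfaulting, which matches the "Abort!" that bwa prints when bwa_seq_open fails. A minimal sketch reproducing the status (no bwa involved):

```shell
# Exit statuses above 128 mean "killed by signal (status - 128)".
# SIGABRT is signal 6, so an aborted process reports 128 + 6 = 134.
sh -c 'kill -ABRT $$'   # child shell sends SIGABRT to itself
echo $?                  # prints 134
```

This is why the log shows "Aborted (core dumped)" with no segmentation fault: bwa detects the unreadable input file and calls abort() deliberately.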