08-09-2016, 02:11 PM   #4
Magdoll

Without SGE, it will be slow anyway.

But if you think a single node can handle it, one way is to run multiple instances of `pbtranscript.py cluster` on the different bins.

For example, tofu_wrap.py always creates bins 0to1kb_part0, 1to2kb_part0, etc.

You can terminate tofu_wrap and keep the bins as they are, then call a separate instance of cluster in each bin:

pbtranscript.py cluster isoseq_flnc.fasta final.consensus.fa \
--nfl_fa isoseq_nfl.fasta -d cluster --ccs_fofn reads_of_insert.fofn \
--bas_fofn input.fofn --quiver --use_sge \
--max_sge_jobs 40 --unique_id 300 --blasr_nproc 24 --quiver_nproc 8

(In your case, you would remove the --use_sge and --max_sge_jobs options.)

(see cluster tutorial here: https://github.com/PacificBioscience...-and-Quiver%29)
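
If you go the no-SGE route, something like the loop below could drive it. This is only a rough sketch: the bin directory names, the per-bin input file names, and the --blasr_nproc/--quiver_nproc values are placeholders I made up, so adjust them to whatever tofu_wrap actually left in your bins and to your node's core count. The flags are otherwise copied from the command above.

for d in 0to1kb_part0 1to2kb_part0 2to3kb_part0; do
    (
      # each bin gets its own local instance, run in the background;
      # file names below are placeholders for that bin's own inputs
      cd "$d" && \
      pbtranscript.py cluster isoseq_flnc.fasta final.consensus.fa \
        --nfl_fa isoseq_nfl.fasta -d cluster --ccs_fofn reads_of_insert.fofn \
        --bas_fofn input.fofn --quiver --unique_id 300 \
        --blasr_nproc 8 --quiver_nproc 4 > cluster.log 2>&1
    ) &
done
wait   # block until every background instance has finished

If you do run several bins at once, keep the sum of the --blasr_nproc values at or below the number of cores on the node, otherwise the instances will just fight each other for CPU.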

My guess is this would make it a bit faster, but still relatively slow, since everything is running serially on the node instead of in parallel on a grid; still, it may be better than nothing...