#1

Member
Location: changsha
Join Date: Mar 2016
Posts: 46
Hi there. I have data from 7 SMRT Cells, and running tofu_wrap.py on it is really slow. I want to know whether I can use multiple CPUs or threads to speed it up. Thank you for any tips!
#2

Member
Location: Bay Area
Join Date: Aug 2011
Posts: 30
Hi,
Short answer is yes. There are multiple ways to hack tofu_wrap.py -- it will require additional monitoring of the parallel jobs.

But first I need to ask: what parameters did you use to call tofu_wrap.py? What are the size bins (i.e. what are the subdirectories in clusterOut/)? Do you have an SGE cluster? Do you have multiple nodes from which you can run parallel pbtranscript.py cluster jobs?

Also, please consider joining the Iso-Seq google group: https://groups.google.com/forum/#!forum/smrt_isoseq Since tofu_wrap.py is on the cutting edge (it's not officially supported in SMRTAnalysis 2.x but is on the agenda for SMRTAnalysis 3.x), the google group is better suited.

--Liz
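(A quick way to answer the size-bin question above: the bins are just subdirectories of the tofu_wrap.py output directory, so listing them shows what was created. A minimal check, assuming the output directory is clusterOut/ as in the command in the next post:)

Code:
# list the size-bin subdirectories created by tofu_wrap.py (names like 0to1kb_part0)
ls -d clusterOut/*/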
#3

Member
Location: changsha
Join Date: Mar 2016
Posts: 46
Code:
tofu_wrap.py --nfl_fa isoseq_nfl.fasta --ccs_fofn reads_of_insert.fofn \
    --bas_fofn input.fofn -d clusterOut --quiver \
    --bin_manual "(0,2,4,6,8,9,10,11,12,13,15,17,19,20,23)" \
    --gmap_db /zs32/data-analysis/liucy_group/llhuang/Reflib/gmapdb \
    --gmap_name gmapdb_h19 --output_seqid_prefix human \
    isoseq_flnc.fasta final.consensus.fa
I have no SGE cluster. Can I use multiple CPUs to run the command?
#4

Member
Location: Bay Area
Join Date: Aug 2011
Posts: 30
Without SGE, it will be slow anyway.
But if you think a single node can handle it, one way is to run multiple instances of pbtranscript.py cluster on the different bins. For example, tofu_wrap.py always creates bins 0to1kb_part0, 1to2kb_part0, etc. You can terminate tofu_wrap and keep the bins as they are. Then, separately in each bin, call a separate instance of cluster:

Code:
pbtranscript.py cluster isoseq_flnc.fasta final.consensus.fa \
    --nfl_fa isoseq_nfl.fasta -d cluster --ccs_fofn reads_of_insert.fofn \
    --bas_fofn input.fofn --quiver --use_sge \
    --max_sge_jobs 40 --unique_id 300 --blasr_nproc 24 --quiver_nproc 8

(For you, you would remove the --use_sge and --max_sge_jobs options.)

(See the cluster tutorial here: https://github.com/PacificBioscience...-and-Quiver%29)

My guess is this would make it a bit faster but still relatively slow, since everything is running in serial instead of in parallel, but it may be better than nothing...
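In case it helps, below is a rough sketch of how the per-bin jobs could be launched in parallel in the background on one multi-core machine. It is only illustrative: it assumes each bin directory under clusterOut/ contains its own isoseq_flnc.fasta (the remaining arguments simply mirror the example command above, with /path/to/ placeholders for your input files), and the --blasr_nproc/--quiver_nproc values should be scaled down so the combined jobs fit on your node.

Code:
#!/bin/bash
# Illustrative sketch only -- verify the bin layout and file paths before running.
# Launch one pbtranscript.py cluster job per size bin, each in the background.
for bin in clusterOut/*part*; do
    (
        cd "$bin" || exit 1
        pbtranscript.py cluster isoseq_flnc.fasta final.consensus.fa \
            --nfl_fa /path/to/isoseq_nfl.fasta -d cluster \
            --ccs_fofn /path/to/reads_of_insert.fofn \
            --bas_fofn /path/to/input.fofn --quiver \
            --unique_id 300 --blasr_nproc 4 --quiver_nproc 2 \
            > cluster.log 2>&1
    ) &
done
wait    # block until every background cluster job has finished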