Hello all,
I've been playing with the Agalma transcriptomics pipeline.
I'm attempting to run the genetree step in the pipeline, which involves running RAxML on a large set of homologs clustered by MCL. Agalma then takes the individual gene trees and 'prunes' out paralogs.
Here's the problem: the genetree script works, but instead of executing multiple runs simultaneously, RAxML is run on one homolog at a time and uses only a single thread, even though more cores are available. This isn't a huge issue on small datasets with few taxa, but it quickly becomes an insurmountable bottleneck on larger ones.
I'd assumed there was a simple fix (Agalma is just calling RAxML with bash) or some setting I'd overlooked, but after considerable effort I've yet to make any progress. The only thing I can think of doing is running RAxML externally on all 7000+ homologs and inserting the trees back into Agalma, but I've got a feeling the SQL database won't like that.
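A rough sketch of what that external run could look like, in Python. Everything specific here is an assumption rather than what Agalma does internally: the `raxmlHPC` binary name, the `PROTGAMMAWAG` model, the seed, and the helper names `build_raxml_command` and `run_all` are placeholders for illustration. It starts one single-threaded RAxML process per alignment and keeps a fixed number running at once. It does not address getting the trees back into Agalma's database.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_raxml_command(alignment, out_dir, model="PROTGAMMAWAG", seed=12345,
                        binary="raxmlHPC"):
    """Return the argument list for one single-threaded RAxML run.

    The model, seed and binary name are assumed defaults, not Agalma's settings.
    """
    alignment = Path(alignment)
    return [
        binary,
        "-s", str(alignment),                # input alignment file
        "-n", alignment.stem,                # run name, used in output file names
        "-m", model,                         # substitution model (placeholder)
        "-p", str(seed),                     # random seed for the starting tree
        "-w", str(Path(out_dir).resolve()),  # output directory as an absolute path
    ]


def run_all(alignments, out_dir, workers=4, runner=subprocess.run):
    """Run one RAxML process per alignment, at most `workers` at a time.

    Returns (alignment, return code) pairs in input order. Threads are
    enough here because each one only waits on an external process.
    """
    alignments = list(alignments)
    commands = [build_raxml_command(a, out_dir) for a in alignments]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(runner, commands))
    return [(str(a), r.returncode) for a, r in zip(alignments, results)]
```

Calling `run_all(sorted(Path("homologs").glob("*.phy")), "trees", workers=8)` would process every alignment in a hypothetical `homologs/` directory, provided `trees/` already exists. Setting `workers` to the number of cores is probably a better fit than RAxML's own `-T` threading for thousands of small alignments, since each gene tree is quick on its own.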
Has anyone encountered this issue, or have any suggestions on solutions? Thanks!
I'm using a Mac Pro running Mavericks, and I get the same issue with an Ubuntu Linux 14.04 virtual machine.