Hello,
I have been struggling for the last few weeks to get Breakdancer to run across some whole genome data. The data was sequenced on SOLiD machines and aligned using Bioscope.
I have been able to get Breakdancer to build a configuration file using the parameters for SOLiD (the -C color space option). The actual command looks like:
bam2cfg.pl -n 1000000 -g -h -C normal.bam tumor.bam > breakdancer.cfg
I am then able to run breakdancer_max using that config file like so:
breakdancer_max breakdancer.cfg -g output.GBrowse -d fast_q_evidence.o
This command runs... and runs... and runs... and finally runs out of either memory or computation time.
The last run I did ran for 100 hours, using 48GB of memory before the job was cancelled for running too long. The output of this was about 6.7 million "detected" structural variations. And it only just got up to chromosome 3!
This leads me to believe it would need about 1,000 hours of computation time (roughly 42 days) to run fully, which is not feasible at the moment. At that rate it would also find about 67 million SVs, which doesn't seem right!
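For anyone checking my numbers, here is a rough sketch of the arithmetic behind that estimate. The tenfold scale factor is only what my own 100 to 1,000 hour guess implies, not something I measured, and it assumes runtime and SV count grow linearly as Breakdancer moves through the genome. The function name is just for illustration.

```python
def extrapolate(hours_so_far, svs_so_far, scale):
    """Linearly scale a partial run up to a full-genome estimate.

    scale is an assumed factor (how many times larger the full run is
    than the part completed so far); it is a guess, not a measurement.
    """
    est_hours = hours_so_far * scale
    est_days = est_hours / 24
    est_svs = svs_so_far * scale
    return est_hours, est_days, est_svs


# Figures from the cancelled run: 100 hours, about 6.7 million SV calls.
# The factor of 10 is an assumption implied by the 1,000 hour estimate.
hours, days, svs = extrapolate(100, 6_700_000, 10)
print(f"{hours} hours, about {days:.1f} days, about {svs:,} SV calls")
```

If the real fraction of the genome covered by the cancelled run is different, only the `scale` argument needs to change.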
Is this in line with anyone else's experience?
The tumor and normal files are 120GB and 180GB respectively, so I don't expect it to be a fast process, but 40 days seems excessive.
I have also attempted to run Breakdancer in single-chromosome mode, but this fails immediately with a segmentation fault.
Has anyone been able to get the single-chromosome version to work, or does anyone know why it would segfault?
Thank you.