Hi,
I just started working with ChIP-seq and I am using MACS to predict binding sites in a fungal genome (12 Mb).
As mentioned here, I am running into the same problem as some previous users: I get an assertion error when MACS calculates negative peaks, and the only way I have managed to avoid it is by reducing the data to 75% of its original size; as soon as the read count reaches about 80% of the original amount, the error comes back.
I have a control file with 31M 36 bp single-end reads (4.1 GB) and a sample file with 28M 36 bp single-end reads (3.6 GB).
I tried it on both a laptop (4 GB RAM) and a server (16 GB RAM); in both cases MACS used about 1.6 GB of memory and behaved the same way. I also tried it on Cistrome, with the same result.
Changing the mfold parameter didn't help.
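For reference, the kind of command I mean is roughly the following (file and run names are placeholders, and I am showing the MACS 1.4-style options, where --mfold takes a lower,upper pair):

    # typical macs14 run on a small genome (names are placeholders)
    # -g is the effective genome size: 1.2e7 for a ~12 Mb genome
    # --mfold is the fold-enrichment range used to build the shifting model
    macs14 -t sample.bed -c control.bed -f BED \
           -g 1.2e7 -n fungal_chip --mfold 10,30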
With 75% of the data I did get sensible results, but I am not comfortable moving on after discarding 25% of a dataset... it just adds another layer of complexity to the analysis.
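To be clear about what "75% of the data" means: it is a simple random thinning of the read files, along these lines (assuming BED input with one read per line; the seed and fraction are arbitrary, and any equivalent random subsampling behaves the same):

    # randomly keep ~75% of reads; srand() fixes the seed for reproducibility
    awk 'BEGIN { srand(42) } rand() < 0.75' control.bed > control_75.bed
    awk 'BEGIN { srand(42) } rand() < 0.75' sample.bed  > sample_75.bed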
Does anybody have an idea what causes the problem, and why reducing the amount of data works?
Also, if anybody knows a valid alternative tool, feel free to suggest it.
Thanks!