Could anyone please provide a working config file for bfast.submit.pl to use as an example? We tried to create a config file, but it is sometimes hard to guess which parameter name in the schema corresponds to which option in the BFAST programs. Also, Eclipse reports that the provided XML schema is incorrect in some places. Despite our efforts, bfast.submit.pl exited without producing any output. (We're using the data from the latest BFAST version.) bfast.submit.pl looks like a valuable tool, and it would be very useful to get it running on our cluster.
In that context, I'd like to know what the most efficient way of running BFAST is. I can use a node with 16 CPUs and up to 128 GB RAM. The 10 indexes for the human genome are 12 GB each (120 GB in total), so it's probably impossible to load them all into memory and keep enough space for the rest, especially when using pipes. As far as I can tell, reading the indexes (one at a time, as done by default) is the most time-consuming part in our case. Instead of splitting the reads into many chunks and calling multiple instances of bfast match, each with all indexes, I think it would be better to process all reads in parallel against one index at a time.
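To put rough numbers on the memory question, here is a back-of-the-envelope sketch. The 16 GB reserve for reads, pipes and the OS is only my guess, not a measured figure, and the helper name is made up for illustration:

```python
# Rough memory budget for the node described above.
# reserve_gb is an assumed placeholder, not a measured value.

def indexes_that_fit(total_ram_gb, index_size_gb, reserve_gb):
    """Return how many indexes fit in RAM after setting aside reserve_gb."""
    usable_gb = total_ram_gb - reserve_gb
    if usable_gb <= 0:
        return 0
    return int(usable_gb // index_size_gb)

TOTAL_RAM_GB = 128    # node RAM, from the post
INDEX_SIZE_GB = 12    # size of one human genome index, from the post
NUM_INDEXES = 10      # number of indexes, from the post
RESERVE_GB = 16       # assumed headroom for reads, pipes and the OS

all_indexes_gb = NUM_INDEXES * INDEX_SIZE_GB       # 120 GB
headroom_gb = TOTAL_RAM_GB - all_indexes_gb        # 8 GB left if all are loaded
fit = indexes_that_fit(TOTAL_RAM_GB, INDEX_SIZE_GB, RESERVE_GB)

print(f"All indexes: {all_indexes_gb} GB, headroom: {headroom_gb} GB")
print(f"Indexes that fit with {RESERVE_GB} GB reserved: {fit}")
```

So loading everything would leave only about 8 GB for the rest, which is why I would rather go through the indexes one at a time over all reads.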
Thanks in advance for the help
Barbara