EDIT (July 5th, 2011): An alternate version of pBWA is now available that cleans up the workflow a bit. The user is no longer required to enter the number of reads in the FASTQ file, and SAM information is output to one file in parallel by all processors. There are also a few minor stability enhancements that should make pBWA compatible with MPICH. Performance appears to be similar to pBWA-r32. Thanks go to Rob Egan for the enhancements.
For my master's thesis in computer science, I developed pBWA, a parallel version of BWA built on the OpenMPI library. pBWA retains and improves upon the multithreading provided by BWA while adding efficient parallelization of its core alignment functions (aln, sampe, samse). Because pBWA can run on any number of nodes and/or cores simultaneously, its wall-time speedup is bounded only by the size of the parallel system. On a suitable system, pBWA can align billions of sequence reads within hours, making it more practical to analyze new generations of NGS data.
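To give a rough feel for how read-level parallelism of this kind can work, here is a minimal Python sketch of counting the reads in a FASTQ file and splitting them into contiguous blocks, one per process. This is only an illustration under stated assumptions, not pBWA's actual code: the function names `count_fastq_reads` and `partition_reads` are hypothetical, the FASTQ file is assumed to use the common four-lines-per-record layout, and the block-partitioning scheme is an assumption about one reasonable way to divide the work.

```python
def count_fastq_reads(path):
    """Count reads, assuming the common 4-lines-per-record FASTQ layout."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def partition_reads(n_reads, n_ranks):
    """Return one (start, count) pair per rank.

    Reads are split into contiguous blocks; any remainder is spread
    one read at a time over the lowest-numbered ranks.
    """
    base, extra = divmod(n_reads, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        count = base + (1 if rank < extra else 0)
        ranges.append((start, count))
        start += count
    return ranges
```

For example, `partition_reads(10, 3)` assigns four reads to the first process and three to each of the other two. Each process could then align only its own block of reads independently of the others.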
Note that the improvements pBWA makes to BWA's multithreading have been shown to Heng Li and will probably be included in a future release of BWA.
I have successfully tested pBWA on a couple of systems, namely SHARCNET (www.sharcnet.ca) and a school server with the most basic OpenMPI install.
If you have access to a cluster or parallel machine, you may want to give pBWA a try. Due to the nature of parallel computing, the optimal number of nodes/threads will vary greatly depending on factors such as available RAM and interconnect speed.
pBWA can be obtained by visiting
A manual page is located at
Thanks for your time!