Hi all,
After the successful release of SSPACE (http://seqanswers.com/forums/showthread.php?t=8350), we have developed a new tool, called GapFiller, for closing the gaps that remain after scaffolding.
GapFiller seeks to find reads that potentially fall within gaps by aligning paired reads with Bowtie or BWA(-sw). Per gap, it extends the sequence from both sides until a user-defined overlap is found. The gap is closed when the number of filled nucleotides corresponds to the initial number of gapped nucleotides in the scaffold (allowing a user-defined deviation).
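The closing rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not GapFiller's actual code: the function names and the parameters `min_overlap` and `max_deviation` are hypothetical stand-ins for the user-defined overlap and deviation.

```python
def merge_on_overlap(left, right, min_overlap):
    """Join two extensions if a suffix of `left` equals a prefix of `right`.

    Returns the merged sequence, or None when no overlap of at least
    `min_overlap` bases exists. The longest overlap is tried first.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    return None


def accept_fill(fill_len, estimated_gap, max_deviation):
    """Accept a fill only if its length is close to the estimated gap size."""
    return abs(fill_len - estimated_gap) <= max_deviation


# Hypothetical example: extensions grown from the left and right gap edges.
merged = merge_on_overlap("ACGTAC", "TACGG", min_overlap=3)
if merged is not None and accept_fill(len(merged), estimated_gap=10, max_deviation=2):
    print("gap closed with", merged)
```

Checking the fill length against the estimated gap size is what guards against joining the two sides through a spurious overlap, for example inside a repeat.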
The main features:
* Inputs are simple FASTA scaffold sequences as well as (multiple) FASTA/FASTQ paired-read data
* Multiple library input of paired-end and/or mate-pair datasets
* High-quality closing of gaps
* High reduction of the number of gaps, and the number of gapped nucleotides
* Detailed output of the gaps, e.g. number of reads used, number of nucleotides, remaining gapped nucleotides
* Detailed output of the gapclosing process.
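To check the reduction in gaps and gapped nucleotides yourself, you can count the runs of N in a scaffold before and after gap closing. The snippet below is a small sketch for that purpose; it assumes gaps are written as runs of `N` (or `n`), and `gap_stats` is a hypothetical helper, not part of GapFiller or its output.

```python
import re


def gap_stats(scaffold):
    """Return (number of gaps, total gapped nucleotides) for one sequence.

    A gap is assumed to be any uninterrupted run of N/n characters.
    """
    runs = [m.end() - m.start() for m in re.finditer(r"[Nn]+", scaffold)]
    return len(runs), sum(runs)


# Two gaps (4 and 2 bases) give 6 gapped nucleotides in total.
print(gap_stats("ACGTNNNNACGTnnAC"))
```

Running it on the scaffold sequences before and after closing gives the two numbers to compare.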
GapFiller has been tested and compared with various datasets (PE and MP), different gap-closure tools (IMAGE and SOAP's GapClosure) and different species. GapFiller was tested on four prokaryotes (E. coli, S. coelicolor, S. aureus, R. sphaeroides) and two eukaryotes (S. cerevisiae, human chromosome 14).
The results, using the quality metrics of GAGE (http://gage.cbcb.umd.edu/results/index.html), show that GapFiller closes gaps more accurately than IMAGE and SOAP's GapClosure.
Although GapFiller yields results similar to SOAP's GapClosure in terms of the number of gaps/nucleotides closed, the smaller error rate indicates that our tool is more appropriate for reliable gap filling.
Further details are provided in our paper in Genome Biology (http://genomebiology.com/2012/13/6/R56/abstract). The program can be obtained from our website (http://www.baseclear.com/bioinformatics-tools/) and is free for academic users.
We hope it is useful; any comments or questions are welcome.
Regards,
Marten Boetzer a.k.a. Boetsie