Syndicated from PubMed RSS Feeds
YOABS: Yet Other Aligner of Biological Sequences -- an efficient linearly scaling nucleotide aligner.
Bioinformatics. 2012 Mar 7;
Authors: Galinsky V
Abstract
MOTIVATION: Explosive growth of short-read sequencing technologies in recent years has resulted in the rapid development of many new alignment algorithms and programs. Most of them, however, are inefficient or not applicable for reads ≥ 200 bp, because these algorithms are specifically designed to process short queries with relatively low sequencing error rates. The current trend toward more reliable detection of structural variations in assembled genomes, together with the need to facilitate de novo sequencing, demands complementing high-throughput short-read platforms with long-read mapping. Thus, algorithms and programs for efficient mapping of longer reads are becoming crucial. However, the choice of long-read aligners that are effective in terms of both performance and memory is limited and includes only a handful of hash table (BLAT, SSAHA2) or trie (BWT-SW, BWA-SW) based algorithms.
RESULTS: A new O(n) algorithm that combines the advantages of both hash and trie based methods has been designed to effectively align long biological sequences (≥ 200 bp) against a large sequence database with a small memory footprint (e.g. ~2 GB for the human genome). The algorithm is accurate and significantly faster than BLAT or BWT-SW but, like BWT-SW, it can find all local alignments. It is as accurate as SSAHA2 or BWA-SW, but uses 3+ times less memory and is 10+ times faster than SSAHA2; at low error rates it is several times faster than BWA-SW and uses almost two times less memory.
AVAILABILITY: The local hit table binary and indices are available at ftp://styx.ucsd.edu.
CONTACT: [email protected].
PMID: 22402614 [PubMed - as supplied by publisher]
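The abstract contrasts hash table and trie based aligners without describing how either works, and it does not describe the YOABS algorithm itself. As a rough illustration of the hash table family only, the sketch below indexes every k-mer of a reference, looks up the k-mers of a query, and votes for the best-supported diagonal (reference position minus query position). All function names, the toy sequences, and the choice of k = 4 are assumptions made for this example; this is not the YOABS, BLAT, or SSAHA2 implementation.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map each k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def seed_hits(index, query, k):
    """Return (query_pos, ref_pos) pairs for every exact k-mer match."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for rpos in index.get(query[qpos:qpos + k], ()):
            hits.append((qpos, rpos))
    return hits


def best_diagonal(hits):
    """Group hits by diagonal (ref_pos - query_pos); return (diagonal, hit_count)."""
    counts = defaultdict(int)
    for qpos, rpos in hits:
        counts[rpos - qpos] += 1
    if not counts:
        return None
    return max(counts.items(), key=lambda item: item[1])


# Toy data (hypothetical): the query is an exact substring of the reference.
reference = "ACGTACGTTAGCCGATAGGCTTAACG"
query = "TAGCCGATAGG"
index = build_kmer_index(reference, k=4)
hits = seed_hits(index, query, k=4)
diagonal, support = best_diagonal(hits)
```

A real aligner would extend the candidate diagonal with a gapped alignment step and would need a far more compact index to stay within the memory budget the abstract mentions; the sketch only shows the seeding idea.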