SEQanswers

05-09-2011, 06:28 AM   #1
ilveroluca
Junior Member

Location: Italy

Join Date: May 2011
Posts: 2
Announcing Seal 0.1.0: BWA alignment on Hadoop

Hello everyone.

We've just released Seal (http://biodoop-seal.sourceforge.net/), a Hadoop-based
distributed toolkit for short read alignment and analysis. Seal currently
includes tools for read alignment (based on BWA), duplicate read removal,
and sorting read mappings. Seal scales well, easily handling terabytes of data.
If you're aligning read data sets larger than a couple of hundred MB and you
have a cluster of computers (anything from a small one of 4 or 5 nodes up to
hundreds of nodes), then Seal might be for you.

On a 16-node Hadoop cluster, with 8 cores and 16 GB of RAM per node, we have
measured throughputs of 13 Gbp / hour in map+rmdup mode and 19 Gbp / hour in
map-only mode. Scalability tests show that the throughput per node is
maintained as the number of nodes increases up to 128.
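
As a rough illustration of what these figures imply, the arithmetic can be sketched in Python. The function names are made up for this example and are not part of Seal. The estimate assumes that per-node throughput stays constant from 1 to 128 nodes, which extrapolates from the scalability tests mentioned above, so treat the result as a back-of-the-envelope figure and not as a benchmark.

```python
# Back-of-the-envelope estimate based on the figures quoted in this post.
# Assumption: per-node throughput is constant for 1..128 nodes.

MEASURED_NODES = 16  # size of the cluster used for the measurements
MEASURED_GBP_PER_HOUR = {
    "map+rmdup": 13.0,  # alignment followed by duplicate removal
    "map-only": 19.0,   # alignment only
}


def per_node_throughput(mode):
    """Return the measured throughput per node, in Gbp / hour."""
    return MEASURED_GBP_PER_HOUR[mode] / MEASURED_NODES


def estimated_hours(dataset_gbp, nodes, mode="map+rmdup"):
    """Estimate the hours needed to process dataset_gbp on a cluster of nodes."""
    if not 1 <= nodes <= 128:
        raise ValueError("no scaling data outside the range 1..128 nodes")
    return dataset_gbp / (per_node_throughput(mode) * nodes)


if __name__ == "__main__":
    for mode in MEASURED_GBP_PER_HOUR:
        hours = estimated_hours(500.0, 32, mode)
        print(f"{mode}: about {hours:.1f} hours for 500 Gbp on 32 nodes")
```

Real run times will also depend on the hardware, the reference genome, and the read length, none of which this sketch takes into account.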

We have been developing Seal to support the needs of the CRS4 Sequencing
laboratory, which operates 6 Illumina sequencing machines and thus generates
a lot of data to process. The previous workflow was being overwhelmed
despite the additional computers made available to it, and it was
regularly overloading our Lustre shared storage volume. Now all
data processing at the lab starts with Seal, with very positive results in
terms of speed and maintenance effort.

In case you were wondering, Hadoop (http://hadoop.apache.org/) is an open
source, distributed, and robust MapReduce framework for data-intensive
processing, providing a distributed computing system and a distributed file
system.
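
For readers who have not met MapReduce before, the programming model can be sketched in a few lines of plain Python. This is a single-process toy and uses neither the Hadoop API nor any Seal code. The record layout (read id, chromosome) and all function names are assumptions chosen for the example. It counts how many reads map to each chromosome.

```python
from collections import defaultdict


def count_mapper(record):
    """Map step: emit a (key, value) pair for one input record."""
    read_id, chromosome = record  # hypothetical record layout
    yield chromosome, 1


def sum_reducer(key, values):
    """Reduce step: combine all the values that share a key."""
    return sum(values)


def shuffle(pairs):
    """Group values by key; a real framework does this across the cluster."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def run_job(records, mapper=count_mapper, reducer=sum_reducer):
    """Run map, shuffle, and reduce in sequence within one process."""
    mapped = (pair for record in records for pair in mapper(record))
    groups = shuffle(mapped)
    return {key: reducer(key, values) for key, values in groups.items()}


if __name__ == "__main__":
    reads = [("r1", "chr1"), ("r2", "chr2"), ("r3", "chr1")]
    print(run_job(reads))
```

In a real MapReduce framework the map and reduce steps run in parallel on many nodes and the input is read from the distributed file system; only the shape of the computation is the same.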

We're eager to get people to try our new tool. Please visit the Seal web site
(http://biodoop-seal.sourceforge.net/) and feel free to contact me or the
other Seal authors if you have any questions or problems.

--
Luca Pireddu
CRS4 - Distributed Computing Group
Loc. Pixina Manna Edificio 1
Pula 09010 (CA), Italy
Tel: +39 0709250452
04-05-2012, 08:50 PM   #2
avt
Junior Member

Location: Davis, CA

Join Date: Dec 2009
Posts: 3
Workflow with Oozie

Hi Luca,
Thank you for sharing. Since this runs on a Hadoop cluster, can it be put into an Oozie workflow?
An Tat
04-06-2012, 01:13 AM   #3
ilveroluca
Junior Member

Location: Italy

Join Date: May 2011
Posts: 2

Hi An Tat,

Although we've never tried it, I don't see why it wouldn't work. If you do try to use the Seal tools with Oozie, I'd be quite interested in hearing about your experience. At the very least, it could be something we add to the documentation.

Are you already using Oozie?

Luca

P.S.: Although we haven't announced them here on SEQanswers, we've had several releases of Seal since 0.1.0 and have added several tools to the suite. See http://biodoop-seal.sourceforge.net/ for the details.
04-07-2012, 09:59 PM   #4
zee
NGS specialist

Location: Malaysia

Join Date: Apr 2008
Posts: 249

Hi Luca,

Great tool. Any plans to support Novoalign in Seal? We think it would be a great addition to your toolset, and we can make a good case for adding it as an alternative to BWA, especially for Illumina, Ion Torrent, and SOLiD reads.

Send me a private message for more details on how we can get you full access to the aligner available at www.novocraft.com.

Tags
bwa, hadoop, read alignment
