SEQanswers

Old 12-12-2007, 04:06 PM   #1
ECO
--Site Admin--
 
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,358
Speed up sequence alignments using your video card!

Schatz et al. have just published a paper in BMC Bioinformatics describing a novel sequence alignment algorithm designed and optimized to run on commonly available graphics processors.

Why use a graphics processor? My understanding is that rendering 3D scenes relies on specialized parallel processors that can perform certain types of calculations extremely fast. If code is written to take advantage of this architecture, it can perform those calculations faster than a standard CPU of the same price. That is about as far as a biologist like me can explain the advantages of a graphics processor.

They demonstrate up to a 10-fold improvement in alignment speed compared to a standard CPU. On the fastest commercial graphics card at the time, the nVidia 8800 GTX, the program ran 3.79x faster than on a single-core 3 GHz Xeon processor of about the same price.

A link to the paper PDF can be found here: http://www.biomedcentral.com/content...2105-8-474.pdf

A link to the project's SourceForge homepage can be found here: http://mummergpu.sourceforge.net
Old 02-27-2008, 11:46 AM   #2
apfejes
Senior Member
 
Location: Oakland, California

Join Date: Feb 2008
Posts: 236

I had a conversation with someone about this back in 2004. I don't think there was any question, then or now, that a processor specially designed for vector processing would be a lot faster than a general-purpose CPU at vector calculations. Even the article points to several earlier uses of GPUs for this kind of processing.

Still, the most interesting part of the article to me is that the improvement over a CPU shrinks as the sequences get longer. This is probably an artefact of having to cache small chunks of their suffix tree onto the GPU at a time: the larger the suffix tree, the more time is spent pre-caching suffix tree elements. (Just a guess... someone tell me if I'm wrong.) That tells me there is *probably* a dramatically better algorithm for this application than a suffix tree.

In the end, I'm just surprised they managed to get a speedup at all. Sequence alignment is not a vector application, so using a vector processor seems non-intuitive. If this were a molecular simulation, on the other hand... but then again, I believe that has been done before as well.
__________________
The more you know, the more you know you don't know. —Aristotle
Old 04-01-2008, 10:15 AM   #3
mschatz
Junior Member
 
Location: CSHL

Join Date: Apr 2008
Posts: 3

Maybe I can answer your questions for you. GPUs aren't exactly vector processors, and have a lot more flexibility than those. Instead think of them as single-board mini-grids containing many lightweight processors that all run the same program at the same time (SIMD, not vector architecture). The processors are optimized for the number crunching needed for rendering 3D graphics, but the programs they run can perform arbitrary computations using regular programming statements like loops and conditionals. This means that if you have a problem that requires the same computation for many different inputs, you can probably use a GPU to speed up your application.
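To make that concrete: here is a minimal CUDA sketch of the model (just an illustration, not MUMmerGPU code, and the kernel and variable names are made up). Every thread runs the same kernel body on a different element of the input array.

Code:
#include <cuda_runtime.h>
#include <vector>

// Every thread executes the same program on a different input element.
__global__ void squareAll(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
    if (i < n)                                       // guard the final partial block
        out[i] = in[i] * in[i];                      // same computation, different data
}

int main()
{
    const int n = 1 << 20;
    std::vector<float> h(n, 2.0f);

    float *d_in = 0, *d_out = 0;
    cudaMalloc((void **)&d_in,  n * sizeof(float));
    cudaMalloc((void **)&d_out, n * sizeof(float));
    cudaMemcpy(d_in, &h[0], n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;       // enough blocks to cover every input
    squareAll<<<blocks, threads>>>(d_in, d_out, n);  // launch ~a million lightweight threads

    cudaMemcpy(&h[0], d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}

The GPU schedules those threads across its processors for you; the more independent inputs you have, the better it can hide memory latency.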

Current GPUs cost only ~$500, but have up to 256 processors! As such, they are becoming really attractive platforms for high-throughput computation in many different fields (including molecular dynamics, meteorology, finance, cryptography, ...). Some applications that perform a lot of number crunching have achieved 100x speedups over the CPU. In contrast, MUMmerGPU performs very little number crunching but is very data intensive, so the processors on the GPU can't run at full speed and have to wait for data to move around on the board. Even so, MUMmerGPU gets ~10x speedup on the 8800 GTX with 128 processors for short reads. Over the last couple of months we have reworked how the data is organized and have managed to double that speed. Check the MUMmerGPU SourceForge page for a new release soon.

As for apfejes' comment about decreasing performance with longer reads, this is an artifact of how we organize the suffix tree on the board. The GPU has a very small cache, so we lay the tree out in a very specific way to get as much use out of the cache as possible (see the paper for all the gory details). It wasn't until recently that we fully understood the problem, but the way we place the tree on the board is sub-optimal for longer reads. We are actively working on this, and the next release should have much more consistent performance.
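Purely for intuition (this is not our real layout, which is described in the paper, and the types and function here are hypothetical): one generic way to improve cache reuse is to renumber the nodes so that nodes visited close together in time end up close together in memory. A toy host-side sketch:

Code:
#include <queue>
#include <vector>

// Hypothetical node: child indices for A/C/G/T, -1 if the child is absent.
struct Node {
    int children[4];
};

// Relabel nodes in breadth-first order from the root so that the top levels of
// the tree (which every query touches) occupy one contiguous, cache-friendly block.
std::vector<Node> reorderBreadthFirst(const std::vector<Node> &tree, int root)
{
    std::vector<int> newIndex(tree.size(), -1);
    std::vector<Node> out;
    out.reserve(tree.size());

    std::queue<int> q;
    newIndex[root] = 0;
    out.push_back(tree[root]);
    q.push(root);

    while (!q.empty()) {
        int oldIdx = q.front(); q.pop();
        int newIdx = newIndex[oldIdx];
        for (int c = 0; c < 4; ++c) {
            int child = tree[oldIdx].children[c];
            if (child >= 0) {
                newIndex[child] = (int)out.size();         // next free slot
                out[newIdx].children[c] = newIndex[child]; // rewrite edge to new numbering
                out.push_back(tree[child]);
                q.push(child);
            }
        }
    }
    return out;                                            // copy this array to the GPU
}

The real layout is considerably more involved (see the paper), but the principle is the same: arrange the data around the access pattern, not the other way around.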

If you have any more questions, feel free to post here or email me directly.

Thanks for your interest,

Michael Schatz
Old 04-01-2008, 10:58 AM   #4
apfejes
Senior Member
 
Location: Oakland, California

Join Date: Feb 2008
Posts: 236

Thanks for the reply - that was really helpful. I look forward to reading about the future releases!

Anthony
__________________
The more you know, the more you know you don't know. —Aristotle
Old 04-08-2008, 10:23 AM   #5
Chipper
Senior Member
 
Location: Sweden

Join Date: Mar 2008
Posts: 324

Michael,

Thanks for the explanation. What are the minimum requirements for the graphics card, and which matters most: memory size, bus, clock speed, or number of processors? Does it work in SLI with two cards? Also, have you done any speed comparisons with other short-read aligners?
Old 05-05-2008, 08:24 PM   #6
pfh
Junior Member
 
Location: Melbourne

Join Date: May 2008
Posts: 7

Sequence alignment is vectorizable, and there are various SIMD implementations. There is a brute force sequence aligner in the FASTA package that uses SIMD, for example.

If you want to align multiple sequences, it's even easier. I've been working on a brute-force aligner of short reads against a reference that runs on Cell processors, such as the one in the PlayStation 3; it is available here: http://savannah.nongnu.org/projects/myrialign/
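That read-level parallelism maps just as naturally onto a GPU. Purely as an illustration (this is neither myrialign nor MUMmerGPU, and every name here is hypothetical), a deliberately naive CUDA kernel that gives each read its own thread and brute-force scans for the best ungapped match could look like this:

Code:
// One thread per read: scan the whole reference for the offset with the fewest
// mismatches. Ungapped, mismatches only -- a toy sketch, not a practical aligner.
__global__ void bruteForceMap(const char *reference, int refLen,
                              const char *reads, int readLen, int numReads,
                              int *bestPos, int *bestMismatches)
{
    int r = blockIdx.x * blockDim.x + threadIdx.x;   // which read this thread owns
    if (r >= numReads) return;

    const char *read = reads + r * readLen;          // reads packed back to back
    int bestP = -1;
    int bestM = readLen + 1;

    for (int p = 0; p + readLen <= refLen; ++p) {    // every candidate offset
        int mism = 0;
        for (int i = 0; i < readLen && mism < bestM; ++i)
            mism += (reference[p + i] != read[i]);
        if (mism < bestM) { bestM = mism; bestP = p; }
    }
    bestPos[r] = bestP;
    bestMismatches[r] = bestM;
}

Every thread does the same work on a different read, which is exactly the shape of problem these chips like; the catch, as with MUMmerGPU, is keeping them fed with data.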

I am impressed that they've managed to do MUMmer on a GPU; it uses quite a different algorithm from the usual dynamic-programming sequence alignment, AFAIK.
Old 04-14-2009, 02:21 PM   #7
Chipper
Senior Member
 
Location: Sweden

Join Date: Mar 2008
Posts: 324

Quote:
Originally Posted by mschatz

Any work still going on in this field, or are the Bowtie-type aligners on the CPU superior?
Old 04-14-2009, 09:53 PM   #8
Cole Trapnell
Senior Member
 
Location: Boston, MA

Join Date: Nov 2008
Posts: 212

Quote:
Originally Posted by Chipper
Any work still going on in this field or are the bowtie-type aligners on cpu superior?
Mike and I submitted a second paper on MUMmerGPU a couple of months ago, but it's still under review. The paper describes a new GPGPU algorithm for translating suffix tree node coordinates into reference coordinates. It also contains a very detailed exploration of how seemingly orthogonal design decisions interact because of the peculiarities of the GPU architecture. The new paper is targeted more at the GPGPU community than at bioinformaticians.

Mike, Ben Langmead, and I have actually spent some time thinking about putting Bowtie on the GPU, but we're worried about the relatively long latency of the GPU's memory bus. The architecture is organized so that sucking down big streams of data (e.g. large textures) is fast, but other than the initial loading of the reads, that's not the access pattern of Burrows-Wheeler search. Bowtie's performance essentially comes down to waiting for small chunks of data to come in from the memory bus (i.e. cache misses). Since recent nVidia GPUs have a global memory latency that is substantially longer than that of your typical x86 cache miss, I worry that you'd wipe out all your gains from massively parallel processing in the longer per-read processing time.
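To illustrate why that access pattern hurts, here is a much-simplified sketch of exact-match backward search over an FM-index, one thread per read (this is not Bowtie's code; the table layout and names are hypothetical). Every iteration makes two data-dependent lookups into a large occurrence table, so each thread spends most of its time waiting on global memory rather than computing.

Code:
// C[c]      = count of reference characters lexicographically smaller than c
// occ[c][i] = occurrences of c in BWT[0..i), stored flattened in global memory
__device__ int occLookup(const int *occ, int bwtLen, int c, int i)
{
    return occ[c * (bwtLen + 1) + i];                // one uncoalesced random read
}

__global__ void exactBackwardSearch(const int *occ, const int *C, int bwtLen,
                                    const char *reads, int readLen, int numReads,
                                    int *loOut, int *hiOut)
{
    int r = blockIdx.x * blockDim.x + threadIdx.x;   // one thread per read
    if (r >= numReads) return;

    const char *read = reads + r * readLen;          // bases encoded as 0..3
    int lo = 0, hi = bwtLen;                         // current suffix-array interval

    for (int i = readLen - 1; i >= 0 && lo < hi; --i) {
        int c = read[i];
        lo = C[c] + occLookup(occ, bwtLen, c, lo);   // depends on the previous lo
        hi = C[c] + occLookup(occ, bwtLen, c, hi);   // depends on the previous hi
    }
    loOut[r] = lo;                                   // lo >= hi means no exact match
    hiOut[r] = hi;
}

The per-thread arithmetic is trivial; the loop is a chain of pointer-chasing reads, which is exactly where the GPU's long memory latency bites.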

That said, suffix tree traversal was supposed to be a bad fit for GPGPU for the same reasons, and the MUMmerGPU search kernel was substantially faster on the GPU than on the CPU. I doubt the three of us will get to putting Bowtie on the GPU, but if there's some brave soul out there willing to give it a try... nVidia makes cards now that have big enough memories to store the Bowtie index of the human genome.
Old 04-15-2009, 11:41 PM   #9
Chipper
Senior Member
 
Location: Sweden

Join Date: Mar 2008
Posts: 324

Thanks Cole. It would be fun, though, to see whether a set-up like http://fastra.ua.ac.be/en/index.html or http://www.asrock.com/news/pop/X58/index.htm could be used for sequence analysis.
Old 03-22-2010, 02:36 AM   #10
Berlinq
Junior Member
 
Location: Berlin, Germany

Join Date: Dec 2009
Posts: 7

Unfortunately, CUDA will not work with the Xen kernel, which is used, for instance, by RHEL 5.