-
Purdue is building a cluster with Xeons plus the Xeon Phi. The statistics are impressive, but all of them are for non-bioinformatics programs (unless you count modeling as bioinformatics). The Phi (for those who do not know) is basically a 1 GHz, 60-core, 4-thread-per-core x86 chip with 8 GB of extremely fast memory that is optimized for floating point. So if I had a program that could work with 240 threads then the Phi would be great. But I am racking my brain for a bioinformatics program that could actually use 240 threads without creating I/O bottlenecks. Any ideas?
Oh yes, going back to GenoMax's post, the presenters at the seminar I went to today did say that the same code will run on the 8-core Xeon chip and the 60-core Phi processor.
-
Going into specifics: each node has two 8-core Xeons with 64 GB memory plus two 60-core Xeon Phi coprocessors. The nodes are connected to a 1.6 PB "Lustre" scratch file space via QDR InfiniBand with 18 GB/sec throughput. If I got my information down correctly, that 18 GB/sec is not shared with other nodes. And yes, the brochure has the big 'B' implying bytes instead of bits -- Wikipedia says QDR at 12x is 96 Gbit/sec actual, uni-directional.
So it would take, oh, not that long to load in a HiSeq lane's worth of data, but then what would the Phi chip do with it? Need software!!
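As a rough back-of-the-envelope check on the figures above, the sketch below converts the quoted link rate from bits to bytes and estimates a load time. The 50 GB lane size is a made-up placeholder, not a number from the brochure, and the function names are hypothetical.

```python
# Back-of-the-envelope bandwidth arithmetic. The rates are the ones quoted in
# the post; LANE_GB is an ASSUMED placeholder size for one lane of reads.

def gbit_to_gbyte(gbit_per_s):
    """Convert a rate in gigabits/sec to gigabytes/sec (8 bits per byte)."""
    return gbit_per_s / 8.0

def load_time_seconds(data_gb, rate_gb_per_s):
    """Ideal transfer time, ignoring protocol and file system overhead."""
    return data_gb / rate_gb_per_s

QDR_12X_GBIT = 96.0   # Gbit/sec, the Wikipedia figure quoted in the post
BROCHURE_GB = 18.0    # GB/sec, the brochure figure quoted in the post
LANE_GB = 50.0        # ASSUMPTION: illustrative data size only

qdr_gb = gbit_to_gbyte(QDR_12X_GBIT)
print(f"QDR 12x in bytes: {qdr_gb:.1f} GB/sec")
print(f"Load at QDR 12x rate:  {load_time_seconds(LANE_GB, qdr_gb):.1f} s")
print(f"Load at brochure rate: {load_time_seconds(LANE_GB, BROCHURE_GB):.1f} s")
```

Under these assumptions the load takes seconds either way, so the open question remains what the Phi would then do with the data.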
-
The storage would likely be mounted through an InfiniBand switch, so an individual node would not see anywhere close to the 18 GB/s speed since that bandwidth would be shared.
Did you get any details about how the Xeon Phi processors show up on the node? Are they going to appear to the OS as "16 + 120" cores? If that is not the case, then is there some other piece of software that allows the onboard Xeons to talk with the co-processors?
As with the graphics cards we are going to be limited by the bandwidth of the PCI-E bus.
Last edited by GenoMax; 08-28-2013, 01:37 PM.
-
Originally posted by GenoMax: As with the graphics cards we are going to be limited by the bandwidth of the PCI-E bus.
-
This mostly seems like Intel's answer to the Tesla series from NVIDIA, which has been out for a while now, only you won't need to tweak your code as much. I know our engineering department has discussed installing a few of these in our cluster, but they already have a bunch of nodes that include Tesla chips and as far as I know the only people who use them are doing computational modeling.
The Phi seems to be another leap in hardware like the Tesla was, but the software just isn't there yet to take full advantage of it. My suspicion is that by the time a compelling piece of software comes out to use that much processing firepower, the sequencing realm will have advanced to the point that it might not matter.
-
I keep forgetting that Phi is ultimately a co-processor and as such is going to be limited to an extent.
-
Originally posted by rhinoceros: Wouldn't this be great for things like Blast?
These chips will ultimately find a use in bioinformatics, but I don't expect anything amazing. The problem with bioinformatics isn't really a lack of processing power, but a lack of well-designed software and algorithms that can easily and readily take advantage of that power to do something better and more useful than previous options.
-
Back from an extended vacation weekend. In response to GenoMax's comment about InfiniBand (IB) speeds, this is what I am getting from the technical people. They talk about our old cluster and our new cluster. The major differences between the two are (1) the Phi co-processors and (2) all nodes now have 64 GB memory instead of a mixture of 32 GB and 64 GB.
The old cluster's IB fabric is connected at 56 Gbps, but is oversubscribed at 2:1 - if every node is using every IB port in the old cluster at full rate, they'll only get 50% of the maximum bandwidth.
The new cluster's peak is only 40 Gbps, but is not oversubscribed - meaning that every node could fully use its IB connection and still get line rate.
This is on the IB fabric in general including access to the storage space. For access to storage, however, the effective performance will be limited by the performance of the underlying storage. The new cluster's scratch storage space goes up to 20 GB/sec in aggregate.
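To make the comparison concrete, here is a minimal sketch of the oversubscription arithmetic using the figures quoted above. The number of simultaneously active nodes is a hypothetical input, the function names are invented for illustration, and encoding and protocol overheads are ignored.

```python
# Worst-case per-node bandwidth under oversubscription, using the quoted
# figures. ACTIVE_NODES is an ASSUMED value chosen only for illustration.

def worst_case_gbps(link_gbps, oversubscription):
    """Per-node rate when every node drives its link at once."""
    return link_gbps / oversubscription

def per_node_storage_gb_s(aggregate_gb_s, active_nodes, link_gbps):
    """Storage rate one node can expect: the smaller of its own link rate
    and an even share of the aggregate storage throughput."""
    link_gb_s = link_gbps / 8.0
    return min(link_gb_s, aggregate_gb_s / active_nodes)

old = worst_case_gbps(56.0, 2.0)   # old cluster: 56 Gbps, 2:1 oversubscribed
new = worst_case_gbps(40.0, 1.0)   # new cluster: 40 Gbps, not oversubscribed
print(f"old cluster worst case: {old:.0f} Gbps")
print(f"new cluster worst case: {new:.0f} Gbps")

ACTIVE_NODES = 10                  # ASSUMPTION: nodes hitting scratch at once
share = per_node_storage_gb_s(20.0, ACTIVE_NODES, 40.0)
print(f"per-node scratch rate with {ACTIVE_NODES} active nodes: {share:.1f} GB/sec")
```

With these inputs the new fabric wins under full load despite its lower peak, and the shared scratch space, not the IB link, becomes the limit once more than a handful of nodes read at the same time.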
We still need a good bioinformatics use case for the Phi co-processors. I am not sure if Blast is it, though, since it seems like it would be I/O-limited or memory-limited.
Really, what we need to consider is which bioinformatics programs are compute-bound, or where we can trade off I/O and memory for compute requirements. As mcnelson.phd said, bioinformatics is generally not limited by CPU power.
-
And just to reinforce what the Phi needs in terms of programming, here is a quote from a Dr. Dobb's article.
Succinctly put, the single key concept to understand about Intel Xeon Phi is that the program must express sufficient parallelism and vector capability to achieve high performance. Measurements presented in this tutorial suggest that the application or offload region must use at least 120 concurrent threads of execution.
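One way to see why "sufficient parallelism" matters so much at this thread count is Amdahl's law. It is not mentioned in the article quote and is brought in here only as an illustration. The sketch below assumes every thread runs at the same speed, which flatters the Phi given its slower cores, and the parallel fractions are arbitrary example values.

```python
# Amdahl's law: the upper bound on speedup when only part of a program
# can run in parallel. The fractions below are illustrative, not measured.

def amdahl_speedup(parallel_fraction, threads):
    """Ideal speedup over one thread for a given parallelizable fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)

THREADS = 240  # 60 cores x 4 threads per core, as described in the thread

for fraction in (0.50, 0.90, 0.99, 0.999):
    speedup = amdahl_speedup(fraction, THREADS)
    print(f"parallel fraction {fraction:.3f}: at most {speedup:.1f}x on {THREADS} threads")
```

Even a program that is 90% parallel tops out below 10x on 240 threads by this bound, which is why a tool with a substantial serial or I/O-bound portion is unlikely to benefit much.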
-
You should still see a nice boost to your normal workload once the new cluster becomes operational.
On a different note if you have not used HiSeq Analysis Software (HAS) before then this could be a good test for your cluster. HAS is designed to basically take over all hardware resources on a node (note: a minimum of 48 GB RAM is needed) and run WGS/Enrichment workflows at maximum speed possible on the hardware in question (HAS will generate tens (perhaps hundreds in your case) of alignment threads automatically).
If your admins are willing to install HAS (currently supplied as an RPM on iCom) and you have access to a flowcell with human samples (the ISAAC aligner only has hg19 indexes; I have not tried to build others yet), give it a whirl. We have got it working on LSF, though Illumina does not officially support it.
PS: I am not sure if the Phi processors can access the RAM on the node (i.e. not the local RAM). If they can't then this may not be worth pursuing.
Last edited by GenoMax; 09-04-2013, 06:44 AM.
-
Originally posted by GenoMax: You should still see a nice boost to your normal workload once the new cluster becomes operational.
If I could make the argument that the new cluster would increase compute speeds by a significant amount via using the Phi, then the purchasing decision would be much easier. As it is, we will probably wait until 2014, when we hope that Central IT (the people running the clusters) decides that a slower yet large-memory cluster would be a Good Thing To Have. Of course, who gets into the "let's have some bragging rights" supercomputing top 100 list that way? :-(
Another way to get into the new cluster would be to create an internal grant proposal to port some bioinformatics software to the Phi coprocessors. But I am just not getting a good idea of what, if any, software could be profitably ported.
On a different note if you have not used HiSeq Analysis Software (HAS) before then this could be a good test for your cluster. ...
PS: I am not sure if the Phi processors can access the RAM on the node (i.e. not the local RAM). If they can't then this may not be worth pursuing.
-
Originally posted by westerman: Correct me if I am wrong, but it appears that HAS is only good for hg19. With only one human sample coming through our lab in the last 5 or so years, I always look for the magical "un-characterized plant and animal genomes supported" sticker on software before getting excited about the software.