Overcoming the Transition to Multicore
Posted by David Rich in Multi-Core and GPU on May 8th, 2009
Much has been written about the shift to multi-core processor architectures, and if you’re reading this blog you likely know quite a bit about it. But for the last few years I’ve found one particular model to be very helpful when talking about this transition and maybe it will help you when you are educating others on the issue.
I use a modified form of the Kübler-Ross model. You might not know it by name, but this is the model that describes the stages that people go through when dealing with loss. The KR model has five stages, but I’ve seen four cover most situations: Denial, Anger, Sadness, Acceptance. The loss in this case is free single-thread performance growth, with no code changes required.
I suppose there might be a fifth pre-stage of ignorance, but somebody would need some thick blinders to have missed the appearance of multicore processors by this point.
From my perspective, the most interesting question is what should be done to adapt to the new processor architectures, but you can’t have this conversation unless your colleague has moved through to accepting the new reality. This is where the model helps, because it lets you respond in the way that best moves them forward.
If they are in denial (“I thought 8 GHz processors were coming later this year.”), then a science-based response is often the best strategy. Explain the power consumption limits at higher frequencies, the diminishing returns of superscalar architectures and other similar items. Point them at public statements and roadmaps from Intel and AMD.
“Don’t the processor vendors know how expensive it is for us to change our code!” “How are we going to get the next generation simulation done in time?” There is likely to be anger. There isn’t much you can do here except help them get past it, so perhaps provoking more anger is the best strategy. “Remember that thermal bug causing shutdowns during long runs?” Or maybe a trip to professional wrestling?
Anger vented, sadness is sure to follow. “My boss is going to fire me when I tell him we need to double the size of the data center.” “You know, everybody that’s ever worked on this code has already retired. What are we going to do?” and other ‘woe is me’ statements. Commiseration is the right reaction here. Tissues are always appropriate, but you can also point out that they are not alone. In fact, anybody and everybody concerned with performance is going through exactly the same thing.
Finally, acceptance is reached and there is an understanding that this industry-wide shift means that software changes will be needed. Now the real work can begin.
And that’s when it is time to learn about Star-P.
Which came first, the eye or the brain? Can Zigbee fly without HPC?
Posted by David Rich in A Look to the Future on March 19th, 2009
Some recent conversations with potential partners and customers on the topic of large-scale automated knowledge discovery led me to wonder if we could learn something from biological evolution. What is the relationship between the ability to gather information and the ability to make use of it?
This question is relevant for us as we have recently started a consulting practice to bring high-end knowledge discovery tools to bear on customer problems, but the technology is so new that we doubt the general public, or even the technology-aware, realize the value to be gained by collecting very large amounts of unstructured data.
Take the eye, a very complicated and effective sensor. There’s an interesting article in Wikipedia on the evolution of the eye, but from an IT point of view, the article seems to miss an essential point. Certainly it is amazing to see the development of the optics and nerves which convert light into some kind of electro-chemical pulses, but what about the processing? What good is the lens without an image processing capability to discern objects as well as some higher order processing to assign meaning and decide on responses?
The principles of evolution would lead us to believe that the eye’s high-end optics would not have developed unless the ability to recognize food or danger, and to act on it, was developing at roughly the same time. There would be little use for 12-megapixel cameras if disk drives were still measured in megabytes.
If you haven’t heard of it, Zigbee is a standard for low-power wireless networks that is on the cusp of entering our lives. The first wide-scale application is likely to be next-generation power meters in homes which also serve as Zigbee base stations. These meters will, for example, be able to tell your air conditioner to slow down because power is in short supply. Or they could tell your dishwasher when to start based on the cost of electricity.
As with the eye, these Zigbee-equipped power meters can only survive if there is a network and controlling intelligence to make use of the information and the ability to control devices in a way that produces positive value.
Let’s assume that smart meters, or some other application, support the proliferation of connected Zigbee base stations. That might provoke an explosion in the number of wireless sensors and controllers in the environment – so long as there was an ability to use the information profitably. But the volume of data will be huge and just like the first organism with the ability to sense light, we won’t initially understand the information or how to use it.
Happily, another species of technology is developing right beside low-cost, low-power sensor networks. Knowledge Discovery refers to a set of mathematical techniques for exploring large data sets. Steve Reinhardt wrote a nice overview article on the topic in Scientific Computing World. The combination of new mathematical techniques, low-cost HPC hardware and the scalability and ease of use of Star-P should allow our civilization to gain valuable and unexpected insight from the mass of data which is becoming available.
What could we figure out if our HPC systems could collect data from every device consuming energy? Like the first species able to hunt by sight, would an economy able to accurately tune energy consumption in real time have a competitive advantage?
With low-cost, pervasive wireless networks, perhaps all houses will be equipped with simple network-connected weather stations (hobbyists are already playing with this!) such that we can do a better job of predicting tornadoes, thunderstorms and floods. If the climatologists are to be believed, such a capability to predict extreme weather events may be required for survival in a few decades.
Our goal here at ISC is to evolve the capability, scalability and ease of use of knowledge discovery fast enough to keep pace with the growth of sensor networks.
Put more simply, if you are swimming in a sea of data and are not sure what it means, give us a call!
Parallel Processing People
Posted by David Rich in Parallel Programming on February 27th, 2009
Steve Apiki writes the following in Dr. Dobb’s:
“…the people with domain or legacy application knowledge are unlikely to also have deep threading experience…”
The article Perils of Going Parallel discusses in detail the challenge of putting together the right kind of team to multi-thread a previously serial application. The high-level message is that this is not easy and even building a team with the right balance of skills takes some expertise.
Of course we agree. All the more reason to let domain experts use a tool like Star-P to quickly meet their performance goals — and with Star-P, you can always bring in performance experts later to hand tune computational kernels if need be.
XO Supercomputing
Posted by Andy Greenwell in Cloud Computing on November 25th, 2008
Like many other Americans, I will be traveling home to see my family for Thanksgiving this week. As is usually the case when I go home for Thanksgiving, I will be asking subtle questions of my family members about what they might enjoy finding under the Christmas tree in a few weeks’ time…except for my 5-year-old niece. I hope she does not read this blog entry in the near future, because this year I have already purchased her Christmas present. And if you are reading this post, then please forgive me, because I already opened your present.
Last week, the One Laptop Per Child Foundation restarted its “Give One, Get One” program, distributing their XO laptops through Amazon.com (http://www.amazon.com/xo). Having previously installed the XO’s Sugar operating system on my own laptop, I realized that my niece would really enjoy Sugar’s many “Activity” programs, such as Etoys, TamTamJam, Memorize, Paint, and especially Record (given her interest in digital photography). Upon receiving the XO in the mail last Thursday, my curiosity got the best of me and I decided to open the box (ok…that was really my plan all along, but I digress)…
Since the XO comes with its own pre-installed Python editor, Pippy, and is pre-loaded with IPython (an enhanced, interactive Python shell) and NumPy (a numerical computing package for Python), I thought that it would be an interesting exercise to try installing the Star-P Python Client on the XO. Since Sugar is currently based on the Fedora Linux operating system, it also has SSH capabilities built-in, so remote authentication should not be a problem.
Upon arriving at the office on Friday, I immediately sat down at my desk, flipped up the ears on my XO (I mean her XO), connected to our wireless network (I was also finding potential wireless connections with the XO that I never see with my other laptops), opened up the Terminal Activity, and connected via SSH to one of our servers. After scp-ing a copy of a Star-P Linux client tarball onto my XO (yes…I mean her XO), I was able to untar the package and install the Star-P Python client. With the Star-P client installed, I just needed to test whether I could import the starp module within a Python session and establish a connection to a Star-P server. As you can see below, the connection was a success.
This experience brought to my mind the following article by Andy Greenberg (slightly ironic name) that I read in Forbes last December about IBM’s donation of a Blue Gene to the Center for High Performance Computing in Cape Town, South Africa. While the author himself makes the comparison between IBM’s donation and the OLPC project, the portions that stuck in my mind were a couple of quotes from Horst Simon of UC Berkeley and Johan Eksteen of the Meraka Institute:
Simon says the donation could work together with programs like One Laptop Per Child as a complementary initiative to bring IT to Africa. “The OLPC program builds computer literacy from the ground up and gets a large number of mathematically inclined individuals involved,” he says. “In fact, we need both of these approaches.”
Comparing the supercomputer to [Nicholas] Negroponte’s initiative, the Meraka Institute’s Eksteen agrees that the two programs should work in tandem. “There must be projects that develop applications for ordinary citizens,” he says. “But if you don’t have the scale of projects that we’re talking about here, then there’s nothing to take advantage of the kind of progress that One Laptop Per Child could create. If we can educate, increase scientific capacity and also build infrastructure, then we can create something that’s truly sustainable.”
While I doubt that there will be very many 5-year-olds clamoring for a Star-P Activity on their XOs in the near future, Star-P’s client-server architecture does provide the capability for the world’s least powerful (and most portable) computers to harness the computational capacity of the world’s most powerful (and least portable) computers. For example, one could envision medical diagnostic or imaging applications written in Python that collect input data at a remote field clinic using a device such as the XO, process that data on an HPC system in the “Cloud”, and return the results to the portable, remote device. And because Star-P’s programming model for Python tracks the interface provided in NumPy, it’s a small hop for any programmer who understands NumPy to program applications with the starp module.
Now as for that Christmas gift…I was debating as to whether, in addition to giving my niece my (I mean her) XO, she would also like a package of hours for Star-P On-Demand so that she can get started writing parallel programs in Pippy right away. On second thought…I think I will just keep this XO for myself, and order another (unopened) XO for her.


Is an NVIDIA Based Top500 Machine News?
Posted by David Rich in Multi-Core and GPU on November 17th, 2008
This round’s TOP500 has an NVIDIA-based machine, a first. Here’s a Gigaom posting on the topic. But is this news? I’m sure the folks at NVIDIA are happy about it.
I’m writing this posting after the opening evening of Supercomputing 2008. As usual, the TOP500 list has a way of grabbing attention and headlines, but I think the real news is the growing number of users that are actively considering production use of accelerators.
We are watching this carefully. Star-P is an excellent way for an organization to get started with accelerators as it provides a level of abstraction for the programmer. As one customer pointed out to me, ‘the fastest accelerator crown is going to change often and may vary by application, we’re going to want an easy way to port codes as the hardware changes.’ What better way than to write the code in a very high-level language like MATLAB or Python and then place the accelerator-specific code in function libraries?
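The idea of keeping application code high-level and isolating accelerator-specific code in function libraries can be sketched in a few lines of Python. This is only an illustration of the pattern, not Star-P's API: the `register` decorator, the backend names "cpu" and "gpu", and the `dot` kernel are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical registry: backend name -> implementation of one kernel.
# A real function library would hold many kernels per backend.
_BACKENDS: Dict[str, Callable[[Sequence[float], Sequence[float]], float]] = {}


def register(name: str):
    """Decorator that files an implementation under a backend name."""
    def wrap(fn):
        _BACKENDS[name] = fn
        return fn
    return wrap


@register("cpu")
def _dot_cpu(x, y):
    """Portable reference implementation in plain Python."""
    return sum(a * b for a, b in zip(x, y))


def dot(x, y, prefer: Tuple[str, ...] = ("gpu", "cpu")):
    """Application-facing call: use the first registered backend in order."""
    for name in prefer:
        fn = _BACKENDS.get(name)
        if fn is not None:
            return fn(x, y)
    raise RuntimeError("no backend registered for dot")
```

Application code only ever calls `dot(x, y)`. When the fastest accelerator changes, a new implementation is registered under its own name and the calling code stays as it is.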
HPC in the Sky
Posted by David Rich in Cloud Computing on November 7th, 2008
Star-P On Demand was mentioned yesterday in an HPCWire article: Increasing Clouds by Michael Feldman. Certainly we are seeing a lot of interest in clouds, but there is an additional reason why we, as an ISV, have some interest in their success.
It turns out that many of our customers are new to HPC and have little experience buying, installing or administering clusters. Oftentimes our support engineers get involved in debugging cluster administration problems. For some of these users, a pre-configured compute resource managed by somebody else would save quite a bit of time and effort.
Over time, clusters will become easier to install and manage and cloud usage models will become better understood and supported. As usual, there is an interesting race to watch.
Is there enough inherent parallelism in applications?
Posted by Steve Reinhardt in Applications on November 4th, 2008
Ed Sperling notes that many applications, especially commercial ones, don’t have enough parallelism to keep the current and planned multicore processors busy. Depending on the context, this is both true and not true.
- Applications can have parallelism at many different levels (e.g., for a payroll, calculating FICA and 401(k) deductions in parallel for a given employee, or calculating paychecks for different employees in parallel). As Sperling notes, there are many apps that have absurd levels of parallelism in them (e.g., search). If there is truly no parallelism, then multicore won’t help. But many apps do have significant parallelism, when viewed at the right level.
- Applications depend to varying degrees on parallel infrastructure. For instance, if the application is rendering a frame on a screen, the output is typically completely independent, and the potentially hard parts that Sperling notes of splitting the task up and putting it back together are not very hard. By contrast, handling numerous ATM transactions in parallel depends on having an infrastructure, often a database, that accepts simultaneous requests for data that is truly independent, and appropriately mediates access to data for requests that are not completely independent. For these latter apps, an organization will be highly dependent on the infrastructure vendor to enable enough internal parallelism to run the application at scale; this is often difficult for the vendor.
- Applications whose compute time scales with the amount of data received will often be good candidates for parallelism. For instance, the resolution of numerous sensors (e.g., computed tomography scanners, mass spectrometers, and video cameras) is growing very quickly, and often the algorithms to process the data are highly parallel. These applications will readily consume huge numbers of cores.
- In my experience, organizations don’t tend to rewrite existing applications for new architectures. Rather, they write new applications with new architectures in mind. The impact in the current world is that, if you’re writing an application that you want to be useful for 10 years, you will probably be able to buy 1,000 times more cores than you do today, for the same price. (A separate question is whether your application needs 1,000 times more compute power, or whether it needs (say) 100 times more compute power at 1/10th today’s price.) So you need to structure the program consistently with those needs. If there’s not enough parallelism, expect not to have performance improvements over time. If there is enough parallelism in the abstract problem, you need tools that let you expose the parallelism fully. (Insert shameless plug for our own Star-P product here.)
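The payroll example from the first bullet can be sketched as follows, at the coarser of the two levels mentioned there: one independent task per employee. The deduction rates and the `net_pay` and `run_payroll` names are made up for illustration, and real payroll rules are far more involved. A thread pool is used here only to keep the sketch simple. For CPU-bound work in CPython, a process pool with the same structure is the more usual choice.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative rates only (assumptions for this sketch, not real tax rules).
FICA_RATE = 0.0765
RETIREMENT_RATE = 0.05


def net_pay(gross):
    """Compute one employee's net pay; it depends on no other employee."""
    fica = gross * FICA_RATE
    retirement = gross * RETIREMENT_RATE
    return round(gross - fica - retirement, 2)


def run_payroll(gross_by_employee, workers=4):
    """Compute every paycheck, submitting one independent task per employee."""
    names = list(gross_by_employee)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as its inputs.
        pays = pool.map(net_pay, (gross_by_employee[n] for n in names))
        return dict(zip(names, pays))
```

Because no paycheck depends on any other, the splitting and recombining steps are trivial: the work is split by employee and recombined by zipping names with results.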
Sperling is exactly right, of course, that most of today’s existing applications have not been designed with parallelism in mind, and will need substantial rework to perform well in a many-core world. And while tools can help that rework, don’t expect it to be a push-button transformation of today’s existing code base. Instead, expect tools that help programmers express parallelism at an appropriate level, but probably in different languages/approaches than programmers are used to. In my view, this is unavoidable. The change to parallel-everywhere is fundamental and cannot be papered over.
How do scientists develop their own software?
Posted by Steve Reinhardt in A Look to the Future on November 1st, 2008
Greg Wilson, a software productivity researcher and long-time acquaintance now at UToronto, forwarded me the attached invitation, which I thought you might want to accept as well.
A group of us are running a survey to find out how scientists actually use computers in their day-to-day work. The blurb we’re sending out is included below, and I’d be happy to provide more information. We’ve promised first crack at the results to “American Scientist”, but will be making the data generally available. We’d be very grateful if you could spread the word through mailing lists, your blog, your web site, or whatever — we’d like to get as many people to respond as possible.
Thanks,
Greg
----------
Computers are as important to modern scientists as test tubes, but we know surprisingly little about how scientists develop and use software in their research. To find out, the University of Toronto, Simula Research Laboratory, and the National Research Council of Canada have launched an online survey in conjunction with “American Scientist” magazine. If you have 20 minutes to take part, please go to:
http://softwareresearch.ca/seg/SCS/scientific-computing-survey.html
Thanks in advance for your help!
Jo Hannay (Simula Research Laboratory)
Hans Petter Langtangen (Simula Research Laboratory)
Dietmar Pfahl (Simula Research Laboratory)
Janice Singer (National Research Council of Canada)
Greg Wilson (University of Toronto)
New Materials Need New Simulation Tools
Posted by David Rich in A Look to the Future on October 29th, 2008
We spotted this very optimistic piece of news yesterday: “New solar cell material achieves almost 100% efficiency, could solve world-wide energy problems.” It reminded me of something we’ve been noticing recently about a class of Star-P users. Very often it seems like our users are inventing brand new materials, processes or other technology. Of course they want to use computer simulation to do their work, but existing tools are not sufficient for their needs.
And that’s what brings them to Star-P: a fast way to implement new simulations with good speed and efficiency.
New Intro to Star-P Webinar; Including Knowledge Discovery
Posted by David Rich in Happenings on October 9th, 2008
Just a quick note: Viral and I did a webinar for the SGI developer’s program. It’s the first time I’ve ever used a movie poster in a presentation. Of course, it being a webinar, it’s hard to know if the movie poster was appreciated!
You can see a replay of the webinar on SGI’s website. It has a basic overview of Star-P and then a discussion of how Star-P has been used recently for knowledge discovery.

