The dramatic increase in the deployment and utilization of supercomputing owes much to microprocessor innovation and the attendant improvement in price/performance typically associated with improved transistor density. But the "simple" idea of Moore's Law, characterized as the doubling of transistor density per unit time, has its limits. Just as one cannot fold a piece of paper in half and in half again without reaching a fundamental limit within just a few folds, neither is it reasonable to think that ever-greater transistor density can be pursued indefinitely without reaching some rather material physical limits.
One reaction to this reality has been the emergence of multi-core CPUs, in which each core runs at a modest (but still significant) frequency and the cores are orchestrated to work in concert on the computational problem at hand. This approach is meant to finesse the design limitations of forever-faster, single-core CPUs while still attending to the insatiable need for compute power in all market segments. And the supercomputing market segment, having more need for speed than any other, is aggressive in its pursuit and acceptance of this innovation.
The implications of multi-core designs for supercomputing are profound in terms of overall benefit. The fastest supercomputer in the world today, the Blue Gene/L system from IBM installed at Lawrence Livermore National Laboratory,1 is an innovative, multi-core system-on-a-chip design (with each core running at less than 1 GHz) that is roughly three times faster than the next fastest machine on the TOP500 list and roughly eight times faster than the NEC Earth Simulator (which was the last non-IBM system to lead the TOP500 list).
It is instructive to note that, measured from an electrical power consumption perspective, the Blue Gene system is rated on one measure at 112 MFlop/watt and the Earth Simulator at 3 MFlop/watt.3 Even allowing for some ambiguity in measurement technique, this is a stunning difference in the power consumption characteristics of contemporary supercomputer designs and a benefit related directly to the multi-core design attributes of the Blue Gene system. Given that the fastest systems in the world today routinely draw a megawatt or more of power, this kind of efficiency translates into very significant operational cost savings.
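To make the arithmetic behind that comparison concrete, here is a minimal sketch in Python. The two efficiency ratings are the ones quoted above; the 100 TFlop/s workload and the electricity price of $0.10 per kWh are illustrative assumptions, not figures from either installation, and the function names are invented for this example.

```python
# Efficiency ratings quoted in the text, in MFlop/s per watt.
BLUE_GENE_MFLOPS_PER_WATT = 112.0
EARTH_SIMULATOR_MFLOPS_PER_WATT = 3.0

HOURS_PER_YEAR = 24 * 365


def power_megawatts(sustained_tflops, mflops_per_watt):
    """Power (MW) needed to sustain a given TFlop/s rate at a given efficiency."""
    mflops = sustained_tflops * 1e6      # 1 TFlop/s = 1e6 MFlop/s
    watts = mflops / mflops_per_watt
    return watts / 1e6                   # watts -> megawatts


def annual_energy_cost(megawatts, dollars_per_kwh):
    """Yearly electricity cost of drawing `megawatts` continuously."""
    kilowatts = megawatts * 1000.0
    return kilowatts * HOURS_PER_YEAR * dollars_per_kwh


# How many times more work per watt the more efficient design delivers.
efficiency_ratio = BLUE_GENE_MFLOPS_PER_WATT / EARTH_SIMULATOR_MFLOPS_PER_WATT

if __name__ == "__main__":
    workload_tflops = 100.0   # assumed workload, for illustration only
    price_per_kwh = 0.10      # assumed electricity price, for illustration only

    for name, rating in [
        ("Blue Gene-class", BLUE_GENE_MFLOPS_PER_WATT),
        ("Earth Simulator-class", EARTH_SIMULATOR_MFLOPS_PER_WATT),
    ]:
        mw = power_megawatts(workload_tflops, rating)
        cost = annual_energy_cost(mw, price_per_kwh)
        print(f"{name}: {mw:.2f} MW, about ${cost:,.0f} per year")

    print(f"Efficiency ratio: {efficiency_ratio:.1f}x")
```

On the quoted ratings alone, the ratio is roughly 37 to 1. The sketch also separates power (megawatts, an instantaneous rate) from energy (kilowatt-hours accumulated over a year), since it is the latter that appears on the electricity bill.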
Perhaps one of the most ambitious multi-core designs currently available is the heterogeneous, multi-core Cell Broadband Engine (Cell BE), designed by a partnership of Sony, Toshiba and IBM and found in today's Sony PlayStation 3. This chip is currently one of the key technical components of the Roadrunner supercomputer system that IBM is building in collaboration with the Department of Energy's Los Alamos National Laboratory. This system, when complete, will be capable of sustained performance of one petaflop.
The architecture of the Roadrunner system is an example of what we expect to be a plethora of hybrid supercomputer designs that amalgamate a variety of technologies to achieve maximum performance for given applications.






