Finally, the most prominent architectural approach to low-power supercomputing is IBM Blue Gene/L, which debuted nine months ago on the Top500 Supercomputer List1 as the fastest supercomputer in the world, relative to the Linpack benchmark. For an overview of the IBM Blue Gene/L architecture and system software, see the respective notes.17,18 Initial performance evaluations of IBM Blue Gene/L can also be found elsewhere.19,20,21 In short, IBM Blue Gene/L is a very large-scale supercomputer with low power consumption for its size. Its 65,536 PowerPC 440 CPUs are organized into 64 racks of 1,024 CPUs each, where each rack consumes only 28.14 kW, resulting in an aggregate power consumption of approximately 1.8 MW.
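As a quick sanity check, the aggregate and per-CPU power implied by the stated rack count and per-rack consumption can be computed directly; the sketch below uses only the figures quoted above:

```python
# Sketch: Blue Gene/L power figures as stated in the text.
racks = 64
kw_per_rack = 28.14
cpus = 65536

total_kw = racks * kw_per_rack
print(f"{total_kw:.0f} kW total")             # 1801 kW total, i.e., ~1.8 MW
print(f"{total_kw * 1000 / cpus:.1f} W/CPU")  # 27.5 W/CPU
```

The roughly 27.5 W per PowerPC 440 (including its share of memory, interconnect, and rack overhead) underscores how far below server-processor power budgets Blue Gene/L operates.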
Given that the only program that has been run across all of the aforementioned systems is the Linpack benchmark, Table 3 presents the same evaluation metrics as Table 2 but for the Linpack benchmark.22 And as in Table 2, Table 3 highlights the leader for a given metric.23 One of the most striking aspects of this table is that IBM Blue Gene/L uses neither the most space nor the most power despite having the most CPUs. Its resulting performance-space and performance-power ratios are consequently astounding, at least relative to Linpack. As an additional reference point, the Japanese Earth Simulator, which has been argued to be the most powerful supercomputer in the world relative to executing real applications, reaches 35,860 Gflops for Linpack while occupying 17,222 ft2 and consuming 7,000 kW. This translates to performance-space and performance-power ratios of 2,082 Mflops/ft2 and 5.12 Mflops/W, respectively.
| Metric \ HPC System | ASCI Red | ASCI White | Green Destiny | MegaProto | Orion DS-96 | IBM Blue Gene/L |
|---|---|---|---|---|---|---|
| Year | 1996 | 2000 | 2002 | 2004 | 2005 | 2005 |
| Performance (Gflops) | 2379 | 7226 | 101 | 5.62 | 110 | 136800 |
| Space (ft2) | 1600 | 9920 | 5 | 3.52 | 2.95 | 2500 |
| Power (kW) | 1200 | 2000 | 5 | 0.33 | 1.58 | 1800 |
| DRAM (GB) | 585 | 6200 | 150 | 4 | 96 | 32768 |
| Disk (TB) | 2.0 | 160.0 | 4.8 | n/a | 7.68 | n/a |
| DRAM Density (MB/ft2) | 366 | 625 | 30000 | 1136 | 32542 | 13107 |
| Disk Density (GB/ft2) | 1 | 16 | 960 | n/a | 2603 | n/a |
| Perf/Space (Mflops/ft2) | 1487 | 728 | 20202 | 1597 | 37228 | 54720 |
| Perf/Power (Mflops/W) | 2 | 4 | 20 | 17 | 70 | 76 |
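The derived rows of Table 3 can be reproduced from the raw performance, space, and power rows. The sketch below assumes the units shown in the table (Gflops, ft2, kW); small discrepancies against the printed table are rounding artifacts of the published figures:

```python
# Sketch: recomputing the derived ratios in Table 3 from the raw rows.
systems = {
    # name: (performance in Gflops, space in ft2, power in kW)
    "ASCI Red":        (2379,   1600, 1200),
    "ASCI White":      (7226,   9920, 2000),
    "Green Destiny":   (101,    5,    5),
    "MegaProto":       (5.62,   3.52, 0.33),
    "Orion DS-96":     (110,    2.95, 1.58),
    "IBM Blue Gene/L": (136800, 2500, 1800),
}

for name, (gflops, ft2, kw) in systems.items():
    perf_space = gflops * 1000 / ft2          # Mflops/ft2
    perf_power = gflops * 1000 / (kw * 1000)  # Mflops/W
    print(f"{name}: {perf_space:.0f} Mflops/ft2, {perf_power:.1f} Mflops/W")
```

For example, Blue Gene/L's 136,800 Gflops over 2,500 ft2 and 1,800 kW yields the 54,720 Mflops/ft2 and 76 Mflops/W figures in the table.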
Despite the performance of HPC systems such as Green Destiny, MegaProto, the Orion Multisystems DT-12 and DS-96, and IBM Blue Gene/L, many HPC researchers gripe about their low raw performance per compute node, which requires additional compute nodes to compensate. This, of course, is in contrast to using fewer but more powerful and more power-hungry server processors, e.g., the Power5 in ASC Purple, which is slated to require 7.5 MW to power and cool its 12,000+ CPU system. The full system is expected to generate more than 16,000,000 BTU/h in heat, thus requiring new air-handling designs and specifications. Furthermore, none of the above solutions relies entirely on commodity technologies, and hence, they may not be cost-effective. For instance, Blue Gene/L is built around a stripped-down, 700-MHz PowerPC 440 embedded CPU, while Green Destiny relies on a customized, high-performance version of Transmeta's code-morphing software (CMS).24 MegaProto, in contrast, runs its Transmeta CPUs without the high-performance code-morphing software (HP-CMS); consequently, its 16 CPUs achieve only 5.62 Gflops on Linpack. To address the criticisms with respect to non-commodity parts and low performance, the next section proposes an alternative approach for reducing power consumption, one that is largely architecture-independent and based on high-end commodity hardware.
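The ASC Purple heat figure quoted above can be related back to electrical power using the standard conversion of roughly 3.412 BTU/h per watt; a minimal sketch (the conversion factor is standard, the heat figure is from the text):

```python
# Sketch: relating ASC Purple's quoted heat output to electrical power,
# using the standard conversion 1 W ~= 3.412 BTU/h.
BTU_PER_HOUR_PER_WATT = 3.412

heat_btu_per_hour = 16_000_000  # >16,000,000 BTU/h quoted in the text
heat_watts = heat_btu_per_hour / BTU_PER_HOUR_PER_WATT
print(f"{heat_watts / 1e6:.1f} MW of heat")  # 4.7 MW of heat
```

The implied heat load of roughly 4.7 MW sits below the quoted 7.5 MW because the latter covers both powering and cooling the system.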