2. Ranking using machine metrics

Ideally we would be able to perfectly rank all machines in an application-independent manner, using only machine characteristics. Although this may be impossible, we can still measure how close any given ranking comes to being perfect. In this section we describe a metric for evaluating rankings based on thresholded inversions. We then describe the optimal ranking under that metric for our combination of machines and applications, and explore how close we can come to optimal using machine characteristics, both individually and in combination.

2.1 Measuring the quality of a ranking

We evaluate the quality of a ranking by the number of thresholded inversions that it contains. For example, consider a list of n systems m1, m2, ..., mn, sorted from highest to lowest interprocessor network bandwidth, so that i < j means that the bandwidth achievable on system mi is at least as good as the bandwidth achievable on system mj. Given an application A and times t(1), t(2), ..., t(n), where t(k) is the measured running time of A on machine mk, we would normally say that machines mi and mj are inverted if i < j but t(i) > t(j). Intuitively, this says that, all other things being equal, application A should run faster on the machine with the higher network bandwidth; if it does not, we call that an inversion. The number of inversions in the ranking is then the number of pairs of machines that are inverted [15]; in our example, this is the number of pairs of machines for which interprocessor network bandwidth incorrectly predicts which machine runs application A faster. Clearly this number is at least zero and no larger than n(n - 1)/2.
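As a concrete illustration, the count can be computed directly from the definition. The following Python sketch is ours, not the authors' code; the function name and example values are hypothetical. It assumes the runtimes are listed in the order the machines are ranked, best-ranked first, and simply compares every pair.

# Illustrative sketch only; the function name and example values are ours.
# `times` holds t(1), ..., t(n) in ranked order (best-ranked machine first).
def count_inversions(times):
    n = len(times)
    inversions = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Machine i is ranked ahead of machine j, so it is expected to be
            # at least as fast; a larger measured runtime is an inversion.
            if times[i] > times[j]:
                inversions += 1
    return inversions

# Three machines ranked by bandwidth: the first two are both slower than the
# third, so 2 of the n(n-1)/2 = 3 pairs are inverted.
print(count_inversions([12.0, 15.0, 11.0]))  # prints 2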

In practice, to allow for variance in the measured runtimes, we count only thresholded inversions. Given a threshold α satisfying 0 ≤ α ≤ 1, machines mi and mj are considered inverted only if mi is predicted to be faster than mj and t(i) > (1 + α)t(j). If α = 0, this is equivalent to the number of inversions described in the previous paragraph. Choosing a small, nonzero α, however, allows us to tolerate variations in timing measurements, which can arise for reasons including those discussed in [16].

Furthermore, to allow for variance in the benchmark timings that give the machine metrics, we use a second threshold parameter β, where 0 ≤ β ≤ 1. Even if two machines mi and mj would be considered inverted by the criterion above because t(i) > (1 + α)t(j), we count the inversion only if the benchmark results (e.g., network bandwidth) also differ by more than the relative threshold β. In general, β should be chosen smaller than α because one expects less variance in benchmark times than in full application runtimes. In this paper we use α = 0.01, which means a difference of up to 1% in application runtimes is considered insignificant, and β = 0.001.
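Putting the two thresholds together, the thresholded count can be sketched as follows. This is again our own illustration, with hypothetical names and values, not the authors' code: it takes the benchmark values used for the ranking together with the measured runtimes, and counts a pair only when the benchmark difference exceeds β and the runtime difference exceeds α.

# Illustrative sketch only; names and example values are ours.
# `bench` holds the benchmark values b(1), ..., b(n) used to rank the machines
# (e.g., network bandwidth, larger is better) and `times` holds the runtimes
# t(1), ..., t(n), both in the same machine order.
def thresholded_inversions(bench, times, alpha=0.01, beta=0.001):
    n = len(times)
    inversions = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Machine i is predicted to be faster than machine j only when its
            # benchmark value is better by more than the relative threshold beta.
            predicted_faster = bench[i] > (1 + beta) * bench[j]
            # The pair is counted only if the measured runtime is also worse by
            # more than the relative threshold alpha.
            measured_slower = times[i] > (1 + alpha) * times[j]
            if predicted_faster and measured_slower:
                inversions += 1
    return inversions

# The machine with the highest bandwidth has the slowest runtime here, so all
# three pairs are counted as inverted.
print(thresholded_inversions([5.0, 9.0, 4.0], [12.0, 15.0, 11.0]))  # prints 3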

Note that a metric based on thresholded inversions is practical because it is monotonic, in the sense that adding another machine mk and its associated runtime t(k) cannot decrease the number of inversions in a ranking. In our context of large parallel applications this is useful because we have only partial runtime data: not all of the applications in Table 4 were run, at all possible processor counts, on all of the machines in Table 5. Some of these gaps are unavoidable because some systems do not have enough compute processors to run every application at every processor count. Furthermore, in practice it is not unusual for timings collected at different times, on different machines, by different people to be incomplete in this way.
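When the runtime table has gaps, the same count can simply be restricted to pairs of machines for which both a benchmark value and a runtime were measured; with that restriction, adding a new timing can only leave the count unchanged or increase it. A sketch of this variant (again ours, with hypothetical names):

# Illustrative sketch only; names are ours. `bench` and `runtimes` map machine
# names to benchmark values and measured runtimes; machines that were never
# timed are simply absent from `runtimes` and are skipped.
def thresholded_inversions_partial(bench, runtimes, alpha=0.01, beta=0.001):
    measured = [m for m in bench if m in runtimes]
    inversions = 0
    for mi in measured:
        for mj in measured:
            if mi == mj:
                continue
            if (bench[mi] > (1 + beta) * bench[mj]
                    and runtimes[mi] > (1 + alpha) * runtimes[mj]):
                inversions += 1
    return inversions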


