CTWatch
November 2006 A
High Productivity Computing Systems and the Path Towards Usable Petascale Computing
Susan Squires, Sun Microsystems Inc.
Michael L. Van De Vanter, Sun Microsystems Inc.
Lawrence G. Votta, Sun Microsystems Inc.

4.2 Example: HPC Workflow

Two methods were chosen to collect data: one quantitative, one qualitative. Quantitative information on programmer activity (time-on-task details) was collected using HackyStat, an in-process software engineering measurement and analysis tool [28]. HackyStat recorded hours of event traces from development tools (for example, “open file” and “build”) while HPC professionals developed code. Additional qualitative data was collected in the form of real-time, time-stamped journals written by professional code developers who agreed to record a personal narrative of their work.

By combining HackyStat telemetry data, which measured activity, with the programmer journals, the team was able to corroborate, validate, and interpret results. For example, the significance of expertise gaps and bottlenecks to the HPC productivity problem became apparent when studying individual professionals and their time usage; the journal entries provided real-time accounts of code development from the programmer’s perspective. The patterns that emerged from the combined data were used to define a typical HPC development workflow, identify where in the workflow the most effort is expended, characterize the expertise profiles associated with workflow tasks, and draw conclusions about productivity bottlenecks and their root causes.
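As a rough illustration of how time-stamped tool events might be lined up with journal entries, the following minimal sketch labels each event with the most recent journal entry and then totals active time per journaled activity. It is not the study's actual analysis and does not use HackyStat's real data format or API: the record layouts, the sample data, the function names `label_events` and `time_per_activity`, and the five-minute idle threshold are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical tool events: (timestamp, event name). Not HackyStat's real format.
EVENTS = [
    (datetime(2006, 11, 1, 9, 0), "open file"),
    (datetime(2006, 11, 1, 9, 2), "build"),
    (datetime(2006, 11, 1, 9, 4), "build"),
    (datetime(2006, 11, 1, 9, 30), "open file"),
    (datetime(2006, 11, 1, 9, 33), "build"),
    (datetime(2006, 11, 1, 9, 34), "run"),
]

# Hypothetical journal entries: (timestamp, activity label, free-text note).
JOURNAL = [
    (datetime(2006, 11, 1, 8, 55), "serial coding", "Writing the solver loop"),
    (datetime(2006, 11, 1, 9, 25), "parallelization", "Adding message passing"),
]


def label_events(events, journal):
    """Attach to each event the activity of the latest journal entry at or before it."""
    journal = sorted(journal)
    entry_times = [ts for ts, _, _ in journal]
    labeled = []
    for ts, name in sorted(events):
        i = bisect_right(entry_times, ts) - 1  # index of the latest entry <= ts
        activity = journal[i][1] if i >= 0 else "unlabeled"
        labeled.append((ts, name, activity))
    return labeled


def time_per_activity(labeled, max_gap=timedelta(minutes=5)):
    """Sum gaps between consecutive events, credited to the earlier event's activity.

    Gaps longer than max_gap (an assumed idle threshold) are ignored.
    """
    totals = defaultdict(timedelta)
    for (t0, _, activity), (t1, _, _) in zip(labeled, labeled[1:]):
        gap = t1 - t0
        if gap <= max_gap:
            totals[activity] += gap
    return dict(totals)


if __name__ == "__main__":
    for activity, spent in time_per_activity(label_events(EVENTS, JOURNAL)).items():
        print(f"{activity}: {spent}")
```

In a sketch like this, the journal supplies the meaning (what the developer was trying to do) while the event trace supplies the measurement (how long tool activity lasted), which mirrors the corroboration role the two data sources play in the study.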

Figure 2. Work Model of HPC Programmers.

These results are summarized in Figure 2, which illustrates the typical workflow that developers go through in creating and optimizing HPC applications along with the skill sets required to perform each activity. A typical workflow includes understanding the problem that needs to be solved, formulating an initial computational solution, empirically evaluating the proposed solution through prototyping or experimentation, coding for sequential execution, evaluating the overall computational approach, then coding and optimizing the results for a parallel platform.

The activities that consume the greatest proportion of resources, effort, time, and expertise within the overall programming effort were also identified:

  • Developing correct scientific programs: activities associated with translating an understanding of the scientific problem that must be solved (e.g., a predictive weather model) into code.
  • Code optimization and tuning: activities associated with refining a serial version of the code to ensure correctness and achieve desired levels of accuracy and efficiency.
  • Code parallelization and optimization: activities associated with parallelizing the code and tuning to achieve high machine utilization and rapid execution.
  • Porting: where a solution exists, this comprises the activities associated with translating the existing solution to a representation appropriate for a new computing platform.

Once the expertise problem was understood and the workflow activities that consumed most of those experts' time and effort were pinpointed, the question arose of how widely these findings could be applied. Full validation required mapping the extent of the expertise gap, which called for a large quantitative survey of the sort suited to the next stage of the research framework.


Reference this article
Squires, S., Van De Vanter, M. L., Votta, L. G. "Software Productivity Research In High Performance Computing," CTWatch Quarterly, Volume 2, Number 4A, November 2006 A. http://www.ctwatch.org/quarterly/articles/2006/11/software-productivity-research-in-high-performance-computing/
