CTWatch Quarterly » Experiments to Understand HPC Time to Development

Experiments to Understand HPC Time to Development

Lorin Hochstein, University of Nebraska
Taiga Nakamura, University of Maryland, College Park
Victor R. Basili, University of Maryland, College Park; Fraunhofer Center, Maryland
Sima Asgari, University of Maryland, College Park
Marvin V. Zelkowitz, University of Maryland, College Park; Fraunhofer Center, Maryland
Jeffrey K. Hollingsworth, University of Maryland, College Park
Forrest Shull, Fraunhofer Center, Maryland
Jeffrey Carver, Mississippi State University
Martin Voelp, University of Maryland, College Park
Nico Zazworka, University of Maryland, College Park
Philip Johnson, University of Hawaii

Defect studies

As part of our effort to understand development issues, our classroom experiments have moved beyond effort analysis and have started to look at the impact of defects (e.g., incorrect or excessive synchronization, incorrect data decomposition) on the development process. Understanding how and when defects appear in HPC codes, and what kinds of defects they are, will allow tools and techniques to be developed that mitigate these risks and improve the overall workflow. As we have shown,² automatically determining workflow is not precise, so we are combining process activity data (e.g., coding, compiling, executing) with source code analysis techniques. The process of defect analysis we are building consists of the following main activities:

Analysis:

Analyze successive versions of the developing code, looking for patterns of change between versions (e.g., defect discovery, defect repair, addition of new functionality).
Record the identified changes.
Develop a classification scheme and hypotheses.

For example, a small increase in source code that follows a failed execution, which in turn follows a large code insertion, could represent the pattern of the programmer adding new functionality, testing it, and then correcting a defect. Syntactic tools that find specific defects can be used to aid the human-based heuristic search for defects.
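The pattern described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' tooling: the `Snapshot` record, the thresholds `LARGE` and `SMALL`, and the label names are all hypothetical and would need calibration against real data.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical thresholds, in lines of code; not taken from the study.
LARGE = 20
SMALL = 5


@dataclass
class Snapshot:
    """One captured version of the code and the outcome of the run that followed it."""
    loc: int                 # lines of code in this version
    run_ok: Optional[bool]   # True or False if the version was executed, None otherwise


def label_changes(history: List[Snapshot]) -> List[str]:
    """Guess an activity label for each transition between successive snapshots."""
    labels = []
    for prev, cur in zip(history, history[1:]):
        delta = cur.loc - prev.loc
        if delta >= LARGE:
            labels.append("new functionality")
        elif 0 <= delta <= SMALL and prev.run_ok is False:
            # A small edit right after a failed run is a candidate defect fix.
            labels.append("defect correction")
        else:
            labels.append("unclassified")
    return labels


history = [
    Snapshot(loc=100, run_ok=True),
    Snapshot(loc=140, run_ok=False),  # large insertion, then a failed execution
    Snapshot(loc=143, run_ok=True),   # small change after the failure
]
print(label_changes(history))  # ['new functionality', 'defect correction']
```

A heuristic like this only proposes candidates; as the text notes, a human analyst still has to confirm whether each flagged change is really a defect correction.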

Verification:
We then need to analyze these results at various levels. Verification consists of the following steps, among others:

If the “true” defect sets can be obtained, compare the analysis results directly against them to evaluate the analysis quantitatively.
Have multiple analysts independently analyze the source code and record the defects they identify.
Examine individual instances of defects to check whether each defect is correctly captured and documented.
Give analysts defect instances and have them classify each one into one of the given defect types. This can be used to check the consistency of the classification scheme.
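Two of these checks lend themselves to simple measures. The sketch below is an assumption on our part, since the text does not say which measures are used: it computes precision and recall against a “true” defect set, and Cohen's kappa as one common measure of agreement between two analysts' classifications. The defect identifiers and type labels are made up for the example.

```python
from collections import Counter
from typing import Dict, Sequence, Set


def precision_recall(found: Set[str], truth: Set[str]) -> Dict[str, float]:
    """Compare the defects an analysis reported against a known defect set."""
    hits = len(found & truth)
    return {
        "precision": hits / len(found) if found else 0.0,
        "recall": hits / len(truth) if truth else 0.0,
    }


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Chance-corrected agreement between two analysts labeling the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("both analysts must label the same non-empty item list")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both analysts used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical defect identifiers and type labels.
truth = {"d1", "d2", "d3", "d4"}
found = {"d1", "d2", "d5"}
print(precision_recall(found, truth))

analyst_1 = ["sync", "sync", "decomp", "decomp"]
analyst_2 = ["sync", "decomp", "decomp", "decomp"]
print(cohen_kappa(analyst_1, analyst_2))
```

Low recall suggests the analysis misses defects, while low kappa suggests the defect types are ambiguous enough that analysts apply them differently.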

Nakamura et al. explore this defect methodology in greater detail.⁵

3. Results

Our early results served to validate the process: we needed to verify that students could indeed produce good HPC codes and that we could measure the resulting performance improvement. Table 2 shows one such data set, in which students achieved mean speedups of approximately three to seven on an 8-processor HPC machine. In the data set labels, CxAy denotes class number x, assignment number y; this coding preserves the anonymity of the student population.

Data set	Programming model	Mean speedup on 8 processors	SD	n
Speedup measured relative to serial version:
C1A1	MPI	4.74	1.97	2
C3A3	MPI	2.8	1.9	3
C3A3	OpenMP	6.7	9.1	2
Speedup measured relative to parallel version run on 1 processor:
C0A1	MPI	5.0	2.1	13
C1A1	MPI	4.8	2.0	3
C3A3	MPI	5.6	2.5	5
C3A3	OpenMP	5.7	3.0	4

Table 2. Mean, standard deviation, and number of subjects for computing speedup on Game of Life program.
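The two baselines in Table 2 give different speedup figures for the same parallel run, which a short calculation makes concrete. The sketch below uses the standard definition of speedup (baseline time divided by parallel time); the timings are invented for illustration and are not data from the study, and the use of the sample standard deviation is an assumption, since the table does not say which form was used.

```python
import statistics


def speedup(baseline_seconds: float, parallel_seconds: float) -> float:
    """Speedup of a parallel run relative to a chosen baseline run."""
    if baseline_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / parallel_seconds


def efficiency(speedup_value: float, processors: int) -> float:
    """Fraction of ideal (linear) speedup actually achieved."""
    return speedup_value / processors


# Hypothetical timings for one student's program, in seconds.
serial_time = 80.0          # separate serial implementation
parallel_time_1p = 100.0    # parallel code on 1 processor (includes overhead)
parallel_time_8p = 20.0     # parallel code on 8 processors

vs_serial = speedup(serial_time, parallel_time_8p)
vs_parallel_1p = speedup(parallel_time_1p, parallel_time_8p)
print(vs_serial, vs_parallel_1p, efficiency(vs_parallel_1p, 8))

# Summarizing several hypothetical students' speedups, as in the table columns.
speedups = [4.0, 5.0, 6.0]
print(statistics.mean(speedups), statistics.stdev(speedups), len(speedups))
```

Because parallel code run on one processor usually carries overhead that a serial version lacks, speedup measured against it tends to look larger, so the two halves of the table should not be compared directly.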

Figure 3. HPC productivity.
