As part of our effort to understand development issues, our classroom experiments have moved beyond effort analysis and have started to look at the impact of defects (e.g., incorrect or excessive synchronization, incorrect data decomposition) on the development process. By understanding how and when defects appear in HPC codes, and which kinds appear, we can develop tools and techniques that mitigate these risks and improve the overall workflow. As we have shown [2], automatically determining workflow is not precise, so we are combining process activity data (e.g., coding, compiling, executing) with source code analysis techniques. The defect analysis process we are building consists of the following main activities:
Analysis:
- Analyze successive versions of the developing code looking for patterns of changes represented by successive code versions (e.g., defect discovery, defect repair, addition of new functionality).
- Record the identified changes.
- Develop a classification scheme and hypotheses.
For example, a small increase in source code that follows a failed execution, which in turn follows a large code insertion, could represent the pattern of the programmer adding new functionality, testing it, and then correcting a defect. Syntactic tools that find specific defects can be used to aid the human-based heuristic search for defects.
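The pattern described above can be sketched as a simple rule over successive code snapshots. This is only an illustrative heuristic, not the classification scheme used in the study: the `Snapshot` fields, the label names, and both line-count thresholds are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    """One captured version of the code (hypothetical record layout)."""
    lines_added: int            # net lines added relative to the previous snapshot
    run_failed: Optional[bool]  # outcome of the run made just before this snapshot;
                                # None if no run was made


# Assumed thresholds; real values would have to be calibrated on collected data.
LARGE_INSERT = 50  # lines
SMALL_CHANGE = 10  # lines


def label_changes(history: List[Snapshot]) -> List[str]:
    """Assign a tentative label to each snapshot in chronological order."""
    labels: List[str] = []
    for i, snap in enumerate(history):
        prev_label = labels[i - 1] if i > 0 else None
        if snap.lines_added >= LARGE_INSERT:
            # A large insertion is taken as new functionality.
            labels.append("new_functionality")
        elif (0 < snap.lines_added <= SMALL_CHANGE
              and snap.run_failed
              and prev_label == "new_functionality"):
            # Small change after a failed run that followed a large insertion.
            labels.append("defect_repair")
        else:
            # Left for a human analyst to inspect.
            labels.append("unclassified")
    return labels


# Hypothetical history: big insertion, failed test, small fix, then a passing run.
history = [Snapshot(120, None), Snapshot(4, True), Snapshot(3, False)]
labels = label_changes(history)
```

Anything the rule cannot place is left as `unclassified`, which keeps the human analyst in the loop rather than forcing a label.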
Verification:
We then need to analyze these results at various levels. Verification consists of the following steps, among others:
- If the “true” defect sets can be obtained, we can compare our analysis results against them directly and evaluate the analysis quantitatively.
- Multiple analysts can independently analyze the source code and record identified defects.
- Examine individual instances of defects to check if each defect is correctly captured and documented.
- Provide analysts with defect instances and have them classify each one into one of the given defect types. This can be used to check the consistency of the classification scheme.
Nakamura et al. explore this defect methodology in greater detail [5].
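Two of the verification steps above lend themselves to simple measures. The sketch below is an assumption about how they could be quantified, not the procedure used in the study: it scores an analyst's findings against a known defect set with precision and recall, and measures agreement between two analysts' classifications with Cohen's kappa, a standard agreement statistic chosen here for illustration. All defect identifiers and type names are hypothetical.

```python
from typing import Dict, Set, Tuple


def precision_recall(found: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Compare the defects an analysis reported against the 'true' defect set."""
    hits = len(found & truth)
    precision = hits / len(found) if found else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall


def cohen_kappa(a: Dict[str, str], b: Dict[str, str]) -> float:
    """Chance-corrected agreement between two analysts' defect-type labels.

    Only defects labeled by both analysts are considered.
    """
    shared = sorted(a.keys() & b.keys())
    if not shared:
        raise ValueError("the analysts have no defects in common")
    n = len(shared)
    observed = sum(a[k] == b[k] for k in shared) / n
    categories = {a[k] for k in shared} | {b[k] for k in shared}
    expected = sum(
        (sum(a[k] == c for k in shared) / n) * (sum(b[k] == c for k in shared) / n)
        for c in categories
    )
    if expected == 1.0:
        return 1.0  # both analysts used a single, identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical defect IDs and labels, for illustration only.
p, r = precision_recall({"d1", "d2", "d3"}, {"d1", "d2", "d4", "d5"})
analyst_a = {"d1": "sync", "d2": "sync", "d3": "decomp", "d4": "decomp"}
analyst_b = {"d1": "sync", "d2": "decomp", "d3": "decomp", "d4": "decomp"}
kappa = cohen_kappa(analyst_a, analyst_b)
```

A low kappa on a shared set of defect instances would suggest that the defect types are ambiguous and that the classification scheme needs revision.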
To validate our process, we first needed to verify that students could indeed produce good HPC codes and that we could measure their performance gains. Table 2 shows one set of data in which students achieved mean speedups of approximately three to seven on an 8-processor HPC machine. In the table, CxAy denotes class number x, assignment number y; this coding preserves the anonymity of the student population.
| Data set | Programming model | Speedup measured relative to | Speedup on 8 processors |
|---|---|---|---|
| C1A1 | MPI | Serial version | mean 4.74, sd 1.97, n=2 |
| C3A3 | MPI | Serial version | mean 2.8, sd 1.9, n=3 |
| C3A3 | OpenMP | Serial version | mean 6.7, sd 9.1, n=2 |
| C0A1 | MPI | Parallel version run on 1 processor | mean 5.0, sd 2.1, n=13 |
| C1A1 | MPI | Parallel version run on 1 processor | mean 4.8, sd 2.0, n=3 |
| C3A3 | MPI | Parallel version run on 1 processor | mean 5.6, sd 2.5, n=5 |
| C3A3 | OpenMP | Parallel version run on 1 processor | mean 5.7, sd 3.0, n=4 |
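The two baselines in the table give different speedup figures for the same parallel run, as the following minimal sketch illustrates. The timings are invented for the example and are not taken from the study. The use of the sample standard deviation is also an assumption, since the source does not say which form was reported.

```python
from statistics import mean, stdev
from typing import List, Tuple


def speedup(baseline_seconds: float, parallel_seconds: float) -> float:
    """Speedup = baseline run time divided by parallel run time."""
    if baseline_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / parallel_seconds


def summarize(speedups: List[float]) -> Tuple[float, float, int]:
    """Return (mean, sample standard deviation, n) for a list of speedups."""
    n = len(speedups)
    sd = stdev(speedups) if n > 1 else 0.0  # stdev needs at least two values
    return mean(speedups), sd, n


# Hypothetical timings in seconds per student submission:
# (serial version, parallel version on 1 processor, parallel version on 8 processors)
timings = [(100.0, 120.0, 20.0), (80.0, 90.0, 20.0)]

vs_serial = [speedup(serial, p8) for serial, _, p8 in timings]
vs_one_proc = [speedup(p1, p8) for _, p1, p8 in timings]

serial_summary = summarize(vs_serial)
one_proc_summary = summarize(vs_one_proc)
```

Because a parallel code run on one processor usually carries some overhead compared with the serial version, speedup measured against that baseline tends to look larger than speedup measured against the serial code.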
