CTWatch Quarterly » Software Productivity Research In High Performance Computing

Software Productivity Research In High Performance Computing

Susan Squires, Sun Microsystems Inc.
Michael L. Van De Vanter, Sun Microsystems Inc.
Lawrence G. Votta, Sun Microsystems Inc.

4. Definition Research, Stage 2: Combining Qualitative with Quantitative Measurement

Definition Research tests an idea or hypothesis by focusing on more detailed questions: how something is used, which features are used, and what meaning participants attach to their activities. It also helps define work models and workflow. The methods used during Definition Research differ from those of Discovery because of what has already been learned during Discovery: the parameters of the topic, the concepts, and something about the individuals. Definition Research usually starts with a series of questions based on a grounded hypothesis generated during Discovery. For example, researchers investigating programming techniques can compare existing codes in context to make inferences about how proposed changes might affect both the programming work and the results.

Definition Research concentrates on details, so any interviews conducted in this stage are more structured than those conducted during Discovery. These interviews follow a set of well-understood rules; for example, the interviewer builds rapport in the first segment and subsequently seeks deeper information. Details are summarized at the conclusion of the interview in order to confirm the data with the respondent. Verbal and written statements are validated through observed actions. In order to fully understand the specifics of the context, researchers listen for native language: the words, terms, and descriptions that respondents themselves use.

4.1 HPCS Stage 2 Methods

An advantage of both Discovery and Definition Research is that relatively few cases are needed to discern relevant cultural patterns and to learn about shared understandings and behaviors in a group. This approach has powerful implications for quantifying human behavior patterns in ways never imagined a few years ago. Notre Dame mathematician Albert-Laszlo Barabasi recently wrote in Science that our growing ability to collect data about human actions combined “with the sophisticated tool of network theory, . . . (provides) a glimpse of an unprecedented opportunity to quantify human dynamics.”²⁴ Social network analysis can take millions of bits of data and construct reliable and predictive human patterns. But such an approach can begin with small data sets as well.

Mathematician Duncan Watts points out that “the world that we live in is not at all random. We are very much constrained by our socioeconomic status, our geographical location, our background, our education and our profession, our interests and hobbies. All these things make our circle of acquaintances highly nonrandom.”²⁵ Watts and fellow mathematician Steven Strogatz are among a growing number of researchers who have been examining highly structured social networks in order to understand them and to use mathematical formulations to predict how connected their members are. Their work has been inspired by, and extends the work of, the theoretical mathematician Paul Erdős, who has been indirectly responsible for popularizing the idea of six degrees of separation: the idea that a chain of only six other individuals links any one person, through their social networks, to almost everybody else in the world.

Although Watts and Strogatz are most interested in the interconnectedness of social networks, many others such as Borgatti and Everett²⁶ (see also Mathematical Social Sciences and Social Networks) focus on the highly nonrandom nature of social networks. Their goal is to devise statistically reliable mathematical formulae that predict the number of individuals who need to be interviewed in order to capture the shared characteristics of social networks: shared beliefs, values and behaviors. The analysis of Handwerker and Wozniak suggests the surprisingly low number of seven,²⁷ although this number is reliable only when certain criteria are met:

The information gathered is about shared or core understandings within the social network. This is not about group variation.
The cognitive domain of the social network is internally consistent and ordered.
Information is gathered from key members of the social network: cultural experts.
Informants are interviewed independently of others in their social network. Informants must not be allowed to confound information by checking or comparing notes with other members.
There are no known divisions within the social network: no sub-groupings that might have distinct sets of core knowledge and behaviors.²⁷
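
Why a handful of informants can suffice may be illustrated with the Spearman-Brown formula from psychometrics, which estimates the reliability of the pooled answers of n informants whose average agreement is r as n·r / (1 + (n − 1)·r). The sketch below is only an illustration under that assumption. It is not necessarily the calculation behind the figure of seven cited above, and the function names, the agreement values, and the 0.9 reliability target are hypothetical.

```python
def pooled_reliability(n_informants: int, mean_agreement: float) -> float:
    """Spearman-Brown reliability of the pooled answers of n informants.

    mean_agreement is the assumed average agreement (0..1) between
    informants on shared, core knowledge. Hypothetical helper, used here
    for illustration only.
    """
    if n_informants < 1:
        raise ValueError("need at least one informant")
    if not 0.0 <= mean_agreement <= 1.0:
        raise ValueError("mean_agreement must lie in [0, 1]")
    n, r = n_informants, mean_agreement
    return (n * r) / (1.0 + (n - 1) * r)


def informants_needed(mean_agreement: float,
                      target: float = 0.9,
                      max_informants: int = 1000) -> int:
    """Smallest number of informants whose pooled reliability reaches target."""
    for n in range(1, max_informants + 1):
        if pooled_reliability(n, mean_agreement) >= target:
            return n
    raise ValueError("target not reachable within max_informants")
```

Under these assumptions, high agreement among informants keeps the required number small, while low agreement (for example, because the group has undetected sub-groupings) pushes it up quickly. This is consistent with the criteria listed above.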

One way to determine if a social network is bounded is to analyze patterns that emerge from the data. Highly redundant information about the membership of a group and about its members' perceptions of cultural norms is strong evidence that there is consensus about who is in the group and what they consider appropriate. However, if any of the criteria are not met, then another individual must be interviewed. Once a consensus is identified, it is recorded as a discovery and the researcher moves on. In our Discovery research we were able to identify these shared work patterns with a high level of confidence, making it possible to identify potential participants for the following research stage.
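
The redundancy check described above can be sketched as a simple agreement test over coded interview answers. This is a minimal illustration, not the procedure used in the HPCS study. The data layout (one dictionary of question-to-answer per informant, all covering the same questions), the function names, and the 0.75 threshold are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

Responses = List[Dict[str, str]]  # one dict of question -> coded answer per informant


def modal_answers(responses: Responses) -> Dict[str, Tuple[str, float]]:
    """For each question, return the most common answer and the share
    of informants who gave it. Assumes every informant answered the
    same set of questions."""
    result = {}
    for question in responses[0]:
        counts = Counter(r[question] for r in responses)
        answer, count = counts.most_common(1)[0]
        result[question] = (answer, count / len(responses))
    return result


def has_consensus(responses: Responses, min_share: float = 0.75) -> bool:
    """True when every question has a modal answer shared by at least
    min_share of the informants (hypothetical threshold)."""
    if not responses:
        return False
    return all(share >= min_share
               for _, share in modal_answers(responses).values())
```

A researcher following this sketch would add one informant at a time and call has_consensus after each interview, continuing to interview while it returns False.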

One of the HPCS goals during the Definition Research stage was to understand the workflow of HPC programmers and to identify how and where current HPC development paradigms limit code development. For example, a hypothesis from the Discovery stage was that one limiting factor is the level and range of expertise needed. Definition Research was designed to validate the hypothesis and, if validated, shed additional light on the development process: where specific expertise was deployed and where programmers encountered significant time and effort bottlenecks.
