We used a comprehensive framework to collect data in our 2006 classroom experiment. We will describe our 2006 classroom experiment in full detail in a separate paper.10 Briefly, the programmers programmed the Game of Life on a large grid, which would not fit in the memory of a typical desktop computer. The Game of Life is played on a two dimensional grid. Cells may be alive or dead; their state evolves according to set of simple rules based on the state of their neighbors. Half the programmers used C/MPI, while the other half used UPC.
Our data collection process gathers enough data at compile time and run time so that programmer experience can be accurately recreated offline. This allows us to replay the sequence of events (every compile and run) and collect specific data that may be required by the modelling process but was not captured while the experiment was in progress. We used such replays very effectively for our 2004 classroom experiment.9 Our replay capabilities were not perfect then, and we had to use reasonable proxies for run time parameters. We refined our techniques in light of the lessons learned, and achieved perfect replay capabilities for the 2006 classroom experiment.
The programmers were provided with a basic harness that included build infrastructure, data generators, test cases and specific performance targets. We captured timestamps for every compile and run, stored every code snapshot at compile time, and recorded run time information such as number of processors, inputs, correctness and compute time. Due to a glitch in the run time data collection, some of the run time information we needed for the modeling process was missing. However, we were able to gather the missing run time information by replaying every single compile and run for every programmer. The replays took 10 hours to compile, and 950 hours of processor time for all runs.
In our 2004 classroom experiment, the programmers wrote a parallel sorting code using C++/MPI. Since the model based on timed Markov processes was proposed after the 2004 classroom experiment, we used replays to gather the required data and fit it to a timed Markov model. Our experience with data gathering and modeling for the 2004 and 2006 classroom experiments has led us to believe that the replay is the most important facet of the data gathering process.