We now describe in more detail the use of compiler analyses and source-level code instrumentation to extract as much information about the program behavior as possible and with the least amount of run-time overhead.
Figure 2 depicts the overall system architecture. We use the Open64 infrastructure for program internal representation and transformation.12 Given a source file, the compiler performs a series of analyses and generates a set of static information files. These files have some of the information about the program structure such as basic blocks and loop organization, and about the program structure the compiler can determine statically. Next, the compiler instruments the source code with library function calls to capture at run-time the missing pieces of information. At run-time the execution collects these missing pieces of information in internal data structures to be output to selected files when the program executes with real input data. These instrumented files are then compiled with the native compiler to generate an executable that is linked against the instrumentation library.
We have implemented the following basic compiler analyses in Open64 that form the basis for our instrumentation:
- Basic Block Recognition: The analysis uses the basic control flow information in the Open64 intermediate representation (Whirl) to recognize basic blocks and loop structure in addition to call graph;
- Induction Variable Recognition: We derive induction variables, in some cases capturing induction variables across multiple control flow branches, and as always focusing on scalar variables only;
- Loop Invariants and Symbolic Constant Loop Bounds: Using induction variables and loop index variables (in Fortran) we recognize which bounds of the inner loops are symbolically constant, i.e., their value is unknown at compile time but does not change through the execution of the loop they control;
- Data Access Patterns in Array Expressions: We focus on affine index function recognition to determine, either statically or with the help of run-time information, the stride of the array accesses for a given array reference.
Using the basic information from these analyses the compiler instruments the code to determine the following program execution properties:
- Basic Block Execution Frequency: the absolute number of times a given basic block is executed. In many cases with symbolically constant expression, we hoist the expression that evaluates the number of executions outside the loop that encloses the corresponding basic block.
- Loop Bounds: The actual values of the loop bounds and thus the number of loop iterations across all invocations of a given loop construct.
- Virtual Array Address Ranges. For each array reference, the instrumentation keeps track of the range of addresses accessed for each invocation of the loop.
- Array stride Information: Determine stride of consecutive accesses for each given reference measured in terms of array index function and not the virtual address accessed.