CTWatch Quarterly » Symbolic Performance Modeling of HPCS Applications

Symbolic Performance Modeling of HPCS Applications

Sadaf R. Alam, Oak Ridge National Laboratory
Nikhil Bhatia, Oak Ridge National Laboratory
Jeffrey S. Vetter, Oak Ridge National Laboratory

3. Construction and Validation of Symbolic Models

As a motivating example, we demonstrate MA on the NAS CG benchmark. NAS CG computes an approximation to the smallest eigenvalue of a large, sparse, symmetric positive definite matrix, which is characteristic of unstructured grid computations. The main subroutine is conj_grad, which is called niter times. The benchmark reports the time in seconds for the time-step loop (do it = 1, niter) shown in Figure 2. We therefore began constructing the model in the main program, starting from conj_loop, as shown in Figure 2.

The first step was to identify the key input parameters: na, nonzer, niter, and nprocs (the number of MPI tasks). Then, using the ma_def_variable_assign_int function, we declared the essential derived values, which later MA annotations use to simplify the model representation. Figure 2 shows only the MA annotations applied to the NAS CG main loop. It does not show the annotations in the conj_grad subroutine, which are similar except for additional ma_subroutine_start and ma_subroutine_end calls.

The annotations shown in Figure 2 capture the overall flow of the application at different levels of the hierarchy: loops (e.g., conj_loop, mpi), subroutines (e.g., conj_grad), basic loop blocks, etc. They also define variables (e.g., na, nonzer, niter, nprocs) that matter to the user because they determine the problem resolution and its workload requirements. At the lowest level is the norm_loop shown in Figure 2. The ma_loop_start and ma_loop_end annotations specify that the enclosed loop executes a given number of iterations (e.g., l2npcols) with a typical number of MPI send and receive operations. More specifically, the ma_loop_start routine captures an annotation name (i.e., norm_loop) and a symbolic expression (i.e., l2npcols = log2(num_proc_cols)) that defines the number of iterations in terms of an earlier defined MA variable (i.e., num_proc_cols).
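To make the nesting concrete, the following is a minimal Python sketch of how annotations of this kind could be recorded as a tree and later evaluated for a given set of parameter values. It is not the MA library or its API: the class ToyModel, its method names, and the use of Python expression strings are hypothetical. The values chosen for niter, l2npcols, and dp (assumed here to be the size in bytes of a double-precision value) are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One annotated region: a loop with a symbolic iteration count."""
    name: str
    iters_expr: str                                   # e.g. "niter"
    recv_exprs: list = field(default_factory=list)    # symbolic message sizes
    children: list = field(default_factory=list)      # nested regions


def evaluate(expr: str, env: dict) -> int:
    # Evaluate a symbolic expression against concrete parameter values.
    # Builtins are disabled so only names from env and arithmetic are usable.
    return eval(expr, {"__builtins__": {}}, dict(env))


class ToyModel:
    """Hypothetical recorder that mimics nested start/end annotations."""

    def __init__(self):
        self.root = Node("root", "1")
        self._stack = [self.root]

    def loop_start(self, name: str, iters_expr: str) -> None:
        node = Node(name, iters_expr)
        self._stack[-1].children.append(node)
        self._stack.append(node)

    def mpi_irecv(self, size_expr: str) -> None:
        # Attach one receive of symbolic size to the innermost open region.
        self._stack[-1].recv_exprs.append(size_expr)

    def loop_end(self, name: str) -> None:
        if len(self._stack) == 1 or self._stack[-1].name != name:
            raise ValueError(f"unmatched loop_end for {name!r}")
        self._stack.pop()

    def total_recv_bytes(self, env: dict) -> int:
        return self._bytes(self.root, env)

    def _bytes(self, node: Node, env: dict) -> int:
        own = sum(evaluate(e, env) for e in node.recv_exprs)
        nested = sum(self._bytes(child, env) for child in node.children)
        return evaluate(node.iters_expr, env) * (own + nested)


# Mirror the two loop levels discussed in the text.
model = ToyModel()
model.loop_start("conj_loop", "niter")
model.loop_start("norm_loop", "l2npcols")
model.mpi_irecv("dp*2")
model.loop_end("norm_loop")
model.loop_end("conj_loop")

# Illustrative parameter values, not measurements.
total = model.total_recv_bytes({"niter": 15, "l2npcols": 2, "dp": 8})
print(total)
```

Keeping the expressions symbolic until evaluation time is the point of the design: the same recorded tree can be re-evaluated for a different niter or process count without re-running the application.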

call maf_def_variable_int('na', na)
call maf_def_variable_assign_int('num_proc_cols', '2^ceil(log(nprocs)/(2*log(2)))', num_proc_cols)
call maf_def_variable_assign_int( 'l2npcols'
call maf_loop_start('conj_loop', 'niter', niter)
do it = 1, niter
   call conj_grad ( colidx, reduce_recv_lengths )........
   call maf_loop_start('norm_loop', 'l2npcols', l2npcols)
   do i = 1, l2npcols
      call maf_mpi_irecv('nrecv', 'dp*2', dp*2, 1)
      call mpi_irecv( norm_temp2, 2, dp_type ........
   call maf_loop_end('norm_loop', i-1)

Figure 2. Annotation of the CG benchmark with MA API calls.
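The derived values declared in Figure 2 can be checked by evaluating their symbolic expressions for a few process counts. The sketch below is a standalone Python illustration, not part of MA. The function name derived_values is hypothetical, and it uses math.log2(nprocs)/2, which is mathematically the same as log(nprocs)/(2*log(2)) but avoids floating-point rounding for power-of-two inputs. The expression for l2npcols is taken from the text (log2(num_proc_cols)), since the listing truncates that annotation.

```python
import math


def derived_values(nprocs: int) -> dict:
    """Evaluate the derived model variables for a given number of MPI tasks."""
    if nprocs < 1:
        raise ValueError("nprocs must be a positive integer")
    # num_proc_cols = 2^ceil(log(nprocs) / (2*log(2)))
    num_proc_cols = 2 ** math.ceil(math.log2(nprocs) / 2)
    # l2npcols = log2(num_proc_cols); exact because num_proc_cols is a power of two
    l2npcols = num_proc_cols.bit_length() - 1
    return {"num_proc_cols": num_proc_cols, "l2npcols": l2npcols}


# Tabulate the derived values for a few power-of-two process counts.
for nprocs in (1, 2, 4, 8, 16, 32, 64):
    print(nprocs, derived_values(nprocs))
```

Because l2npcols is the iteration count of norm_loop, this table also shows how slowly the number of norm_loop iterations grows with the number of MPI tasks.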

