As a motivating example, we demonstrate MA on the NAS CG benchmark. NAS CG computes an approximation to the smallest eigenvalue of a large, sparse, symmetric positive definite matrix, a computation that is characteristic of unstructured grid computations. The main subroutine is conj_grad, which is called niter times. The benchmark reports the time in seconds for the time-step loop (do it = 1, niter) shown in Figure 2. Hence, we began constructing the model in the main program, starting from conj_loop, as shown in Figure 2. The first step was to identify the key input parameters: na, nonzer, niter, and nprocs (the number of MPI tasks). Then, using the ma_def_variable_assign_int function, we declared the essential derived values, which are used later in the MA annotations to simplify the model representation.

Figure 2 shows only the MA annotations applied to the NAS CG main loop. It does not show the annotations in the conj_grad subroutine, which are similar except for additional ma_subroutine_start and ma_subroutine_end calls. The annotations shown in Figure 2 capture the overall flow of the application at different levels of the hierarchy: loops (e.g., conj_loop, mpi), subroutines (e.g., conj_grad), basic loop blocks, etc. These annotations also define variables (e.g., na, nonzer, niter, nprocs) that matter to the user because they determine the problem resolution and its workload requirements.

At the lowest level is the norm_loop, as shown in Figure 2. The ma_loop_start and ma_loop_end annotations specify that the enclosed loop executes a number of iterations (e.g., l2npcols) with a typical number of MPI send and receive operations. More specifically, the ma_loop_start routine captures an annotation name (i.e., norm_loop) and a symbolic expression (i.e., l2npcols = log2(num_proc_cols)) that defines the number of iterations in terms of an earlier defined MA variable (i.e., num_proc_cols).
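
To make the derived values concrete, the following minimal Python sketch evaluates the two expressions used in the annotations of Figure 2, num_proc_cols = 2^ceil(log(nprocs)/(2*log(2))) and l2npcols = log2(num_proc_cols), for a few task counts. It is an illustration only and is not part of the MA library. The function name derived_values is hypothetical, and the sketch assumes that nprocs is a power of two so that exact integer arithmetic can stand in for the floating-point logarithms.

```python
def derived_values(nprocs):
    """Return (num_proc_cols, l2npcols) for a power-of-two task count.

    Hypothetical helper for illustration; not an MA library routine.
    """
    if nprocs < 1 or nprocs & (nprocs - 1):
        raise ValueError("this sketch assumes nprocs is a power of two")
    log2_nprocs = nprocs.bit_length() - 1   # exact log2(nprocs)
    l2npcols = (log2_nprocs + 1) // 2       # ceil(log2(nprocs) / 2)
    num_proc_cols = 2 ** l2npcols           # 2^ceil(log(nprocs)/(2*log(2)))
    return num_proc_cols, l2npcols


# Tabulate the derived values for a few sample task counts.
for nprocs in (1, 4, 8, 16, 32):
    cols, l2 = derived_values(nprocs)
    print(f"nprocs={nprocs:3d}  num_proc_cols={cols}  l2npcols={l2}")
```

For example, both nprocs = 8 and nprocs = 16 give num_proc_cols = 4, so norm_loop would be modeled with l2npcols = 2 iterations in either case.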
call maf_def_variable_int('na',na)
call maf_def_variable_assign_int('num_proc_cols', '2^ceil(log(nprocs)/(2*log(2)))', num_proc_cols)
call maf_def_variable_assign_int('l2npcols', ........
call maf_loop_start('conj_loop', 'niter', niter)
do it = 1, niter
call conj_grad ( colidx, reduce_recv_lengths )........
call maf_loop_start('norm_loop', 'l2npcols', l2npcols)
do i = 1, l2npcols
call maf_mpi_irecv('nrecv','dp*2', dp*2,1)
call mpi_irecv( norm_temp2, 2, dp_type ........
call maf_loop_end('norm_loop',i-1)
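
The nesting of the annotations in Figure 2 implies a simple count, sketched below in Python. Assuming that norm_loop sits inside conj_loop, as the ordering of the listing suggests, conj_loop runs niter times, each pass runs norm_loop for l2npcols iterations, and each iteration posts one receive annotated as 'nrecv' with symbolic size dp*2. This is an illustration and not MA output. The function name, the assumption that dp is 8 bytes (one double-precision value), and the sample inputs are all assumptions made for the example.

```python
def norm_loop_recv_totals(niter, l2npcols, dp_bytes=8):
    """Return (receives, bytes) posted by norm_loop per task over the run.

    Hypothetical helper; dp_bytes=8 assumes 'dp' is the size of a double.
    """
    recvs_per_pass = l2npcols        # one maf_mpi_irecv per norm_loop iteration
    bytes_per_recv = dp_bytes * 2    # symbolic message size 'dp*2'
    total_recvs = niter * recvs_per_pass
    return total_recvs, total_recvs * bytes_per_recv


# Sample inputs chosen only for illustration.
total_recvs, total_bytes = norm_loop_recv_totals(niter=15, l2npcols=2)
print(f"receives={total_recvs}  bytes={total_bytes}")
```

Because the iteration counts are kept as symbolic expressions in the annotations, the same product can be re-evaluated for other values of niter and nprocs without rerunning the application.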