|Title||The Case for Directive Programming for Accelerator Autotuner Optimization|
|Publication Type||Tech Report|
|Year of Publication||2017|
|Authors||Fayad, D., J. Kurzak, P. Luszczek, P. Wu, and J. Dongarra|
|Technical Report Series Title||Innovative Computing Laboratory Technical Report|
|Institution||University of Tennessee|
In this work, we present the use of compiler pragma directives for parallelizing the autotuning of specialized compute kernels for hardware accelerators. We describe a set of constructs that parallelize the source code responsible for pruning a generated search space, subject to a large number of constraints, within an autotuning infrastructure. To improve performance, we studied optimizations aimed at minimizing the run time. We also studied the parallel load balance and the speedup on four different machines: x86, Xeon Phi, ARMv8, and POWER8.
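As a rough illustration of the idea in the abstract, the sketch below prunes a small search space of kernel configurations with a directive-parallelized loop nest. It is not the report's code. The abstract does not name a directive standard, so the use of OpenMP is an assumption, and the parameter names (`dim_m`, `dim_n`, `blk_k`), the hardware limits, and the three constraints are hypothetical.

```c
/* Hypothetical hardware limits, chosen only for illustration. */
enum { MAX_THREADS = 1024, WARP_SIZE = 32, MAX_SHMEM_BYTES = 49152 };

/* Returns 1 if a candidate configuration passes every constraint, else 0.
 * The constraints are invented examples, not those of the report. */
static int is_valid(int dim_m, int dim_n, int blk_k)
{
    int threads = dim_m * dim_n;
    long shmem = (long)(dim_m + dim_n) * blk_k * (long)sizeof(double);

    if (threads > MAX_THREADS) return 0;      /* too many threads per block */
    if (threads % WARP_SIZE != 0) return 0;   /* not a whole number of warps */
    if (shmem > MAX_SHMEM_BYTES) return 0;    /* shared memory exceeded */
    return 1;
}

/* Counts the configurations that survive pruning.
 * The pragma (an assumed OpenMP directive) splits the two outer loops
 * across threads; dynamic scheduling is one way to even out the load when
 * some regions of the space are rejected faster than others. */
long count_valid_configs(int max_dim, int max_blk)
{
    long count = 0;

    #pragma omp parallel for collapse(2) schedule(dynamic) reduction(+:count)
    for (int m = 1; m <= max_dim; m++)
        for (int n = 1; n <= max_dim; n++)
            for (int k = 1; k <= max_blk; k++)
                count += is_valid(m, n, k);

    return count;
}
```

Calling `count_valid_configs(8, 1)` returns 3, because only (4, 8), (8, 4), and (8, 8) give a thread count that is a multiple of 32. If the compiler is run without OpenMP support, the pragma is ignored and the same loop runs serially with the same result, which is one practical argument for the directive approach.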