On the performance and energy efficiency of sparse linear algebra on GPUs

Hartwig Anzt; Stanimire Tomov; Jack Dongarra

Title	On the performance and energy efficiency of sparse linear algebra on GPUs
Publication Type	Journal Article
Year of Publication	2016
Authors	Anzt, H., S. Tomov, and J. Dongarra
Journal	International Journal of High Performance Computing Applications
Date Published	2016-10
Abstract	In this paper we unveil some performance and energy efficiency frontiers for sparse computations on GPU-based supercomputers. We compare the resource efficiency of different sparse matrix–vector products (SpMV) taken from libraries such as cuSPARSE and MAGMA for GPU and Intel’s MKL for multicore CPUs, and develop a GPU sparse matrix–matrix product (SpMM) implementation that handles the simultaneous multiplication of a sparse matrix with a set of vectors in block-wise fashion. While a typical sparse computation such as the SpMV reaches only a fraction of the peak of current GPUs, we show that the SpMM succeeds in exceeding the memory-bound limitations of the SpMV. We integrate this kernel into a GPU-accelerated Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) eigensolver. LOBPCG is chosen as a benchmark algorithm for this study as it combines an interesting mix of sparse and dense linear algebra operations that is typical for complex simulation applications, and allows for hardware-aware optimizations. In a detailed analysis we compare the performance and energy efficiency against a multi-threaded CPU counterpart. The reported performance and energy efficiency results are indicative of sparse computations on supercomputers.
URL	http://hpc.sagepub.com/content/early/2016/10/05/1094342016672081.abstract
DOI	10.1177/1094342016672081
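
To illustrate why the abstract's block-wise SpMM can exceed the memory-bound limit of the SpMV, the following is a minimal sketch in plain Python. It is not the kernel from the paper or from MAGMA or cuSPARSE; the CSR storage layout, the function names `csr_spmv` and `csr_spmm`, and the small test matrix are assumptions chosen for illustration. The point it shows is that in the SpMM each nonzero of the sparse matrix is loaded once and reused for all k vectors, whereas k separate SpMV calls would load it k times.

```python
def csr_spmv(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for A stored in CSR format (illustrative layout)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Every nonzero of row i is read once for this single vector.
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[p] * x[col_idx[p]]
        y[i] = acc
    return y


def csr_spmm(row_ptr, col_idx, vals, X):
    """Compute Y = A @ X, where X holds k vectors as an n-by-k row-major block."""
    n_rows = len(row_ptr) - 1
    k = len(X[0])
    Y = [[0.0] * k for _ in range(n_rows)]
    for i in range(n_rows):
        out_row = Y[i]
        for p in range(row_ptr[i], row_ptr[i + 1]):
            a = vals[p]              # nonzero loaded once ...
            x_row = X[col_idx[p]]
            for j in range(k):       # ... and reused for all k vectors
                out_row[j] += a * x_row[j]
    return Y


# Hypothetical 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [4.0, 1.0, 3.0, 2.0, 5.0]

# Block of k = 2 vectors, one row per matrix column index.
X = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
Y = csr_spmm(row_ptr, col_idx, vals, X)
```

Each column of `Y` equals the result of `csr_spmv` applied to the corresponding column of `X`; the difference is only in how often the matrix data is traversed, which is what matters for a memory-bound operation.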

File:

icl-utk-1370-2016.pdf
