Variable-Size Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioning on Graphics Processors

Hartwig Anzt; Jack Dongarra; Goran Flegar; Enrique S. Quintana-Orti

Submitted by claxton on Fri, 08/17/2018 - 03:20

Title	Variable-Size Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioning on Graphics Processors
Publication Type	Journal Article
Year of Publication	2019
Authors	Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti
Journal	Parallel Computing
Volume	81
Pagination	131-146
Date Published	2019-01
Keywords	Batched algorithms, Block-Jacobi, Gauss–Jordan elimination, Graphics processor, matrix inversion, sparse linear systems
Abstract	In this work, we address the efficient realization of block-Jacobi preconditioning on graphics processing units (GPUs). This task requires the solution of a collection of small and independent linear systems. To fully realize this implementation, we develop a variable-size batched matrix inversion kernel that uses Gauss-Jordan elimination (GJE) along with a variable-size batched matrix–vector multiplication kernel that transforms the linear systems’ right-hand sides into the solution vectors. Our kernels make heavy use of the increased register count and the warp-local communication associated with newer GPU architectures. Moreover, in the matrix inversion, we employ an implicit pivoting strategy that migrates the workload (i.e., operations) to the place where the data resides instead of moving the data to the executing cores. We complement the matrix inversion with extraction and insertion strategies that allow the block-Jacobi preconditioner to be set up rapidly. The experiments on NVIDIA’s K40 and P100 architectures reveal that our variable-size batched matrix inversion routine outperforms the CUDA basic linear algebra subroutine (cuBLAS) library functions that provide the same (or even less) functionality. We also show that the preconditioner setup and preconditioner application cost can be somewhat offset by the faster convergence of the iterative solver.
DOI	10.1016/j.parco.2017.12.006
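
Illustrative sketch (not taken from the paper): the abstract describes inverting many small diagonal blocks of differing sizes with Gauss-Jordan elimination under implicit pivoting, then applying the preconditioner as a batched matrix-vector product. The plain-Python sketch below shows that idea on the CPU under simplifying assumptions. It works on an augmented system [A | I] rather than in place, it processes the blocks one after another rather than in parallel, and the function names `gje_inverse_implicit_pivot` and `apply_block_jacobi` are hypothetical. It does not reproduce the register-level or warp-level GPU kernels.

```python
def gje_inverse_implicit_pivot(a):
    """Invert one small dense block with Gauss-Jordan elimination.

    Pivoting is implicit: rows are never swapped during elimination.
    We only record which row served as the pivot for each column and
    apply the resulting permutation once, when reading out the inverse.
    """
    n = len(a)
    # Augmented system [A | I]; every row keeps its position throughout.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    pivot_row = [0] * n      # pivot_row[k] = row chosen as pivot for column k
    used = [False] * n       # rows that have already served as a pivot
    for k in range(n):
        # Partial pivoting restricted to rows not yet used as a pivot.
        p = max((i for i in range(n) if not used[i]),
                key=lambda i: abs(aug[i][k]))
        if aug[p][k] == 0.0:
            raise ValueError("block is singular")
        used[p] = True
        pivot_row[k] = p
        d = aug[p][k]
        aug[p] = [v / d for v in aug[p]]
        # Eliminate column k from every other row, wherever that row lives.
        for i in range(n):
            if i != p and aug[i][k] != 0.0:
                f = aug[i][k]
                aug[i] = [vi - f * vp for vi, vp in zip(aug[i], aug[p])]
    # Row pivot_row[k] now holds the unit vector e_k on the left,
    # so its right half is row k of the inverse.
    return [aug[pivot_row[k]][n:] for k in range(n)]


def apply_block_jacobi(inv_blocks, x):
    """Apply the preconditioner: one small matrix-vector product per block."""
    y, start = [], 0
    for inv in inv_blocks:
        n = len(inv)
        seg = x[start:start + n]
        y.extend(sum(inv[i][j] * seg[j] for j in range(n)) for i in range(n))
        start += n
    return y


# Variable-size "batch": a 1x1 block and a 2x2 block that requires pivoting.
blocks = [[[2.0]], [[0.0, 1.0], [2.0, 0.0]]]
inv_blocks = [gje_inverse_implicit_pivot(b) for b in blocks]
y = apply_block_jacobi(inv_blocks, [4.0, 2.0, 6.0])
```

In this sketch the setup phase (inversion) runs once, and each later preconditioner application costs only the small matrix-vector products, which mirrors the setup/application split mentioned in the abstract.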

Project Tags:

magma

File:

icl-utk-1068-2018.pdf