GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure

Title: GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure
Publication Type: Conference Paper
Year of Publication: 2023
Authors: Abdelfattah, A., S. Tomov, P. Luszczek, H. Anzt, and J. Dongarra
Conference Name: SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis
Date Published: 2023-11
Conference Location: Denver, CO
ISBN Number: 9798400707858

This paper presents a portable and performance-efficient approach to solving a batch of linear systems of equations using Graphics Processing Units (GPUs). Each system is represented by a band matrix, whose nonzero entries are confined to a band above and/or below the diagonal. Each matrix is factorized using LU factorization with partial pivoting for numerical stability. Subsequently, the factors are used to find the solution for as many right-hand sides as needed. The width of the band is often small enough that performing a fully dense LU factorization results in poor performance. We follow the standard LAPACK specifications for addressing this type of problem and develop a dedicated solver that runs efficiently on GPUs. No similar solver is currently available in the vendors' software stacks, so performance results are shown on both NVIDIA and AMD GPUs relative to a parallel CPU solution utilizing OpenMP for thread-level parallelization.
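
To make the factorize-then-solve workflow concrete, the following is a minimal CPU sketch in plain Python, not the authors' GPU implementation. It assumes each matrix is held as a dense list of rows rather than in LAPACK's compact band storage, and the names `band_lu_factor`, `band_lu_solve`, and `batched_band_solve` are hypothetical. With `kl` sub-diagonals and `ku` super-diagonals, the pivot search only needs to look at the `kl` rows below the diagonal, and row interchanges can widen the upper band of U to at most `kl + ku`. As in the LAPACK band routines, the stored multipliers are not swapped, so the solve replays the interchanges step by step.

```python
def band_lu_factor(A, kl, ku):
    """LU with partial pivoting that only touches entries inside the band.

    A is a dense n x n list of rows (an assumption made for readability).
    Returns (lu, piv): multipliers of L below the diagonal, U on and above
    it, and the row chosen as pivot at each step.
    """
    n = len(A)
    lu = [row[:] for row in A]  # work on a copy, leave the input intact
    piv = [0] * n
    for k in range(n):
        row_end = min(n, k + kl + 1)       # only kl rows below the diagonal
        col_end = min(n, k + kl + ku + 1)  # fill-in widens U to kl + ku
        p = max(range(k, row_end), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular at step %d" % k)
        piv[k] = p
        if p != k:
            # Swap only columns k and beyond; earlier multipliers stay put.
            for j in range(k, col_end):
                lu[k][j], lu[p][j] = lu[p][j], lu[k][j]
        for i in range(k + 1, row_end):
            lu[i][k] /= lu[k][k]           # multiplier stored in place
            m = lu[i][k]
            for j in range(k + 1, col_end):
                lu[i][j] -= m * lu[k][j]
    return lu, piv


def band_lu_solve(lu, piv, kl, ku, b):
    """Solve A x = b for one right-hand side, reusing the factors."""
    n = len(lu)
    x = list(b)
    # Forward phase: replay each interchange, then apply its multipliers.
    for k in range(n):
        p = piv[k]
        if p != k:
            x[k], x[p] = x[p], x[k]
        for i in range(k + 1, min(n, k + kl + 1)):
            x[i] -= lu[i][k] * x[k]
    # Back substitution with U, whose upper bandwidth is at most kl + ku.
    for i in range(n - 1, -1, -1):
        s = x[i]
        for j in range(i + 1, min(n, i + kl + ku + 1)):
            s -= lu[i][j] * x[j]
        x[i] = s / lu[i][i]
    return x


def batched_band_solve(batch_A, batch_b, kl, ku):
    """Factor and solve each system of the batch independently."""
    solutions = []
    for A, b in zip(batch_A, batch_b):
        lu, piv = band_lu_factor(A, kl, ku)
        solutions.append(band_lu_solve(lu, piv, kl, ku, b))
    return solutions


if __name__ == "__main__":
    # Tridiagonal example (kl = ku = 1); the large sub-diagonal forces pivoting.
    A = [[1.0, 2.0, 0.0, 0.0],
         [4.0, 1.0, 2.0, 0.0],
         [0.0, 4.0, 1.0, 2.0],
         [0.0, 0.0, 4.0, 1.0]]
    b = [5.0, 12.0, 19.0, 16.0]
    print(batched_band_solve([A, A], [b, b], 1, 1))
```

Because the systems in the batch never interact, the loop in `batched_band_solve` is the part that a batched GPU solver would be expected to run in parallel. Keeping the factorization separate from the solve lets the same factors serve any number of right-hand sides.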
