Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices

Title	Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices
Publication Type	Conference Paper
Year of Publication	2017
Authors	Dong, T., A. Haidar, S. Tomov, and J. Dongarra
Conference Name	International Conference on Computational Science (ICCS 2017)
Date Published	2017-06
Publisher	Procedia Computer Science
Conference Location	Zurich, Switzerland
Abstract	A challenging class of problems arising in many GPU applications, called batched problems, involves linear algebra operations on many small matrices. We designed batched BLAS (Basic Linear Algebra Subprograms) routines, in particular the Level-2 BLAS GEMV and the Level-3 BLAS GEMM routines, to solve them. Our batched BLAS design uses device functions and big-tile settings, and we adopted auto-tuning to optimize different instances of the GEMV routines. We illustrated the approach by progressively optimizing batched bidiagonalization on a K40c GPU. The optimization techniques in this paper also apply to the other two-sided factorizations.
URL	http://www.sciencedirect.com/science/article/pii/S1877050917308645
DOI	10.1016/j.procs.2017.05.237
