Title | Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices |
Publication Type | Conference Paper |
Year of Publication | 2017 |
Authors | Dong, T., A. Haidar, S. Tomov, and J. Dongarra |
Conference Name | International Conference on Computational Science (ICCS 2017) |
Date Published | 2017-06 |
Publisher | Procedia Computer Science |
Conference Location | Zurich, Switzerland |
Abstract | A challenging class of problems arising in many GPU applications, called batched problems, involves linear algebra operations on many small-sized matrices. We designed batched BLAS (Basic Linear Algebra Subprograms) routines, and in particular the Level-2 BLAS GEMV and the Level-3 BLAS GEMM routines, to solve them. We proposed device functions and big-tile settings in our batched BLAS design. We adopted auto-tuning to optimize different instances of GEMV routines. We illustrated our batched BLAS approach to optimize batched bidiagonalization progressively on a K40c GPU. The optimization techniques in this paper are applicable to the other two-sided factorizations as well. |
URL | http://www.sciencedirect.com/science/article/pii/S1877050917308645 |
DOI | 10.1016/j.procs.2017.05.237 |