| Initialize/finalize | |
| ▼Utilities | |
| Allocate GPU device memory | |
| Allocate CPU host memory | |
| Allocate pinned CPU host memory | |
| ►Communication CPU <=> GPU | |
| copyvector: GPU => GPU | |
| getvector: GPU => CPU | |
| setvector: CPU => GPU | |
| copymatrix: GPU => GPU | |
| getmatrix: GPU => CPU | |
| setmatrix: CPU => GPU | |
| getmatrix_transpose: GPU => CPU | |
| setmatrix_transpose: CPU => GPU | |
| getmatrix_bcyclic: multi-GPU => CPU | |
| setmatrix_bcyclic: CPU => multi-GPU | |
| ►Constants and converters | |
| Map LAPACK => MAGMA | |
| Map MAGMA => LAPACK | |
| Map CBLAS => MAGMA | |
| Map clBLAS => MAGMA | |
| Map cuBLAS => MAGMA | |
| Device management | |
| Queue management | |
| Event management | |
| ►Error handling | |
| MAGMA error codes | |
| ►Miscellaneous utilities | |
| make_lwork: Round lwork for float | |
| Print matrix | |
| Timer | |
| Tuning (get_nb, etc.) | |
| ▼Dense LA | Designed to operate on large, dense matrices |
| ►Linear system solvers | Solves \(Ax = b\) |
| ►General matrices: LU | Solves \(Ax = b\) using LU factorization for general matrices |
| gesv: Solves Ax = b using LU factorization (driver) | |
| getrf: LU factorization | |
| getrs: LU forward and back solves | |
| getri: LU inverse | |
| ►Auxiliary routines | |
| getf2: LU panel factorization | |
| laswp: Swap rows | |
| ►No pivoting variant | |
| gesv: Solves Ax = b using LU factorization - no pivoting (driver) | |
| getf2: LU panel factorization - no pivoting | |
| getrf: LU factorization - no pivoting | |
| getrs: LU forward and back solves - no pivoting | |
| gerfs: Refine solution - no pivoting | |
| ►General matrices: RBT + no pivoting LU | |
| gesv_rbt: Solves Ax = b using RBT + LU factorization (driver) | |
| gerbt: Apply random butterfly transformation (RBT) | |
| ►Auxiliary routines | |
| prbt | |
| prbt_mv | |
| prbt_mtv | |
| ►General matrices: least squares | |
| gels: Least squares solves Ax = b using QR factorization (driver) | |
| gglse: Least squares solves Ax = b subject to Bx = d (driver) | |
| geqrsv: Solves Ax = b using QR factorization (driver) | |
| ►Symmetric/Hermitian positive definite: Cholesky | |
| posv: Solves Ax = b using Cholesky factorization (driver) | |
| potrf: Cholesky factorization | |
| potrs: Cholesky forward and back solves | |
| potri: Cholesky inverse | |
| ►Auxiliary routines | |
| potf2: Cholesky panel factorization | |
| lauum: Multiply triangular matrices; used in potri | |
| ►Symmetric/Hermitian indefinite | |
| sy/hesv: Solves Ax = b using symmetric/Hermitian indefinite factorization (driver) | |
| sy/hetrf: Symmetric/Hermitian indefinite factorization (Bunch-Kaufman pivoting) | |
| sy/hetrf: Symmetric/Hermitian indefinite factorization (Aasen) | |
| ►Auxiliary routines | |
| lahef: Partial factorization; used by hetrf | |
| laswp_sym: Swap rows/cols | |
| ►No pivoting variant | |
| sy/hesv: Solves Ax = b using symmetric/Hermitian indefinite factorization - no pivoting (driver) | |
| sy/hetrf: Symmetric/Hermitian indefinite factorization - no pivoting | |
| sy/hetrs: Symmetric/Hermitian indefinite forward and back solves - no pivoting | |
| ►Symmetric indefinite | |
| ►No pivoting variant | |
| sysv: Solves Ax = b using symmetric indefinite factorization - no pivoting (driver) | |
| sytrf: Symmetric indefinite factorization - no pivoting | |
| sytrs: Symmetric indefinite forward and back solves - no pivoting | |
| ►Orthogonal/unitary factorizations | Factor \(A\) using \(QR, RQ, QL, LQ\) |
| ►QR factorization | Factor \(A = QR\) |
| geqrf: QR factorization | |
| geqp3: QR factorization with column pivoting | |
| gegqr: QR factorization and generate Q | |
| or/unmqr: Multiply by Q from QR factorization | |
| or/ungqr: Generate Q from QR factorization | |
| geqrs: Least squares solve using QR factorization from geqrf | |
| ►Auxiliary routines | |
| geqr2: QR panel factorization | |
| laqps: Partial factorization; used by geqp3 | |
| nrm2_adjust: Auxiliary for geqp3 | |
| nrm2_check: Auxiliary for geqp3 | |
| nrm2_cols: Auxiliary for geqp3 | |
| nrm2_row_check_adjust: Auxiliary for geqp3 | |
| ►RQ factorization | |
| gerqf: RQ factorization | |
| ggrqf: Generalized RQ factorization of an M-by-N matrix A and a P-by-N matrix B | |
| or/unmrq: Multiply by Q from RQ factorization | |
| or/ungrq: Generate Q from RQ factorization | |
| ►QL factorization | |
| geqlf: QL factorization | |
| or/unmql: Multiply by Q from QL factorization | |
| or/ungql: Generate Q from QL factorization | |
| ►LQ factorization | |
| gelqf: LQ factorization | |
| or/unmlq: Multiply by Q from LQ factorization | |
| or/unglq: Generate Q from LQ factorization | |
| ►Eigenvalues | Solves \(Ax = \lambda x\) |
| ►Non-symmetric eigenvalues | Solves \(Ax = \lambda x\) where \(A\) is general |
| geev: Non-symmetric eigenvalues (driver) | |
| gehrd: Hessenberg reduction | |
| or/unghr: Generate Q from Hessenberg reduction | |
| ►Auxiliary routines | |
| lahr2: Partial factorization; used by gehrd | |
| lahru: Partial factorization; used by gehrd | |
| trevc: Compute eigenvectors; used by geev | |
| latrsd: Triangular solve with modified diagonal; used by trevc | |
| laqtrsd: Quasi-triangular solve with modified diagonal; used by trevc | |
| laln2: Solve 2x2 system; used by trevc | |
| ►Symmetric/Hermitian eigenvalues | |
| sy/heevx: Solves using QR iteration (expert) | |
| sy/heevd: Solves using divide-and-conquer (driver) | |
| sy/heevdx: Solves using divide-and-conquer (expert) | |
| sy/heevr: Solves using MRRR (driver) | |
| sy/hetrd: Tridiagonal reduction | |
| or/unmtr: Multiply by Q from tridiagonal reduction | |
| or/ungtr: Generate Q from tridiagonal reduction | |
| ►Auxiliary routines | |
| latrd: Partial factorization; used by hetrd | |
| stedx: Eigenvalues & vectors of tridiagonal using D&C | |
| laex0: Eigenvalues & vectors of tridiagonal using D&C | |
| laex1: Update eigensystem after rank-1 update | |
| laex3: Roots of secular equation | |
| ►2-stage variant | |
| he2hb: 1st stage, full to band | |
| sy2sb: 1st stage, full to band | |
| hb2st: 2nd stage, band to tridiagonal | |
| sb2st: 2nd stage, band to tridiagonal | |
| hbtype1cb | |
| hbtype2cb | |
| hbtype3cb | |
| ►Generalized Symmetric/Hermitian eigenvalues | |
| sy/hegvx: Solves using QR iteration (expert) | |
| sy/hegvd: Solves using divide-and-conquer (driver) | |
| sy/hegvdx: Solves using divide-and-conquer (expert) | |
| sy/hegvr: Solves using MRRR (driver) | |
| ►Auxiliary routines | |
| hegst: Reduce generalized problem to standard problem | |
| ►Singular Value Decomposition (SVD) | Factor \(A = U \Sigma V^T\) |
| gesvd: SVD using QR iteration | |
| gesdd: SVD using divide-and-conquer | |
| gebrd: Bidiagonal reduction | |
| or/unmbr: Multiply by Q or P from bidiagonal reduction | |
| or/ungbr: Generate Q or P from bidiagonal reduction | |
| ►Auxiliary routines | |
| labrd: Partial factorization; used by gebrd | |
| ►MAGMA BLAS and Auxiliary | BLAS and Auxiliary functions |
| ►Math functions (sqrt, etc.), O(1) work | |
| ceil(x/y) and ceil(x/y)*y | |
| sqrt | |
| NAN and INF checks | |
| complex number support | |
| ►Level 1: vector operations, O(n) work | |
| asum: Sum vector | \(\sum_i \lvert x_i \rvert\) |
| axpy: Add vectors | \(y = \alpha x + y\) |
| copy: Copy vector | \(y = x\) |
| dot: Dot (inner) product | \(x^T y\) or \(x^H y\) |
| iamax: Find max element | \(\text{argmax}_i\; \lvert x_i \rvert\) |
| iamin: Find min element | \(\text{argmin}_i\; \lvert x_i \rvert\) |
| nrm2: Vector 2 norm | \(\lVert x \rVert_2\) |
| rot: Apply Givens rotation | |
| rotg: Generate Givens rotation | |
| rotm: Apply modified Givens rotation | |
| rotmg: Generate modified Givens rotation | |
| scal: Scale vector | \(x = \alpha x\) |
| swap: Swap vectors | \(x \leftrightarrow y\) |
| ►Level 2: matrix-vector operations, O(n^2) work | |
| geadd: Add matrices | \(B = \alpha A + \beta B\) |
| gemv: General matrix-vector multiply | \(y = \alpha Ax + \beta y\) |
| ger: General matrix rank 1 update | \(A = \alpha xy^T + A\) |
| hemv: Hermitian matrix-vector multiply | \(y = \alpha Ax + \beta y\) |
| her: Hermitian rank 1 update | \(A = \alpha xx^H + A\) |
| her2: Hermitian rank 2 update | \(A = \alpha xy^H + \bar{\alpha} yx^H + A\) |
| symv: Symmetric matrix-vector multiply | \(y = \alpha Ax + \beta y\) |
| syr: Symmetric rank 1 update | \(A = \alpha xx^T + A\) |
| syr2: Symmetric rank 2 update | \(A = \alpha xy^T + \alpha yx^T + A\) |
| trmv: Triangular matrix-vector multiply | \(x = Ax\) |
| trsv: Triangular matrix-vector solve | \(x = op(A)^{-1}\; b\) |
| swapblk: Swap several rows | |
| swapdblk: Swap diagonal blocks | |
| symmetrize: Symmetrize matrix | \(\text{upper}(A) = \text{lower}(A)^T\) or \(\text{lower}(A) = \text{upper}(A)^T\) |
| transpose: Transpose matrix | \(B = A^T\) or \(B = A^H\) |
| lacgv: Conjugate vector | \(x = conj(x)\) |
| lacpy: Copy matrix | \(B = A\) |
| lascl: Scale matrix by scalar | \(A = \alpha A\) |
| lascl_diag: Scale matrix by diagonal | \(A = D A\) |
| lascl_2x2: Scale matrix by 2-by-2 pivot | \(A = D A\) |
| laset: Set matrix to constants | \(A_{ij} =\) diag if \(i=j\); \(A_{ij} =\) offdiag otherwise |
| laset_band: Set band of matrix to constants | \(A_{ij} =\) diag if \(i=j\); \(A_{ij} =\) offdiag otherwise |
| ►Level 3: matrix-matrix operations, O(n^3) work | |
| gemm: General matrix multiply: C = AB + C | \(C = \alpha \;op(A) \;op(B) + \beta C\) |
| hemm: Hermitian matrix multiply | \(C = \alpha A B + \beta C\) or \(C = \alpha B A + \beta C\) where \(A\) is Hermitian |
| herk: Hermitian rank k update | \(C = \alpha A A^H + \beta C\) where \(C\) is Hermitian |
| her2k: Hermitian rank 2k update | \(C = \alpha A B^H + \bar{\alpha} B A^H + \beta C\) where \(C\) is Hermitian |
| symm: Symmetric matrix multiply | \(C = \alpha A B + \beta C\) or \(C = \alpha B A + \beta C\) where \(A\) is symmetric |
| syrk: Symmetric rank k update | \(C = \alpha A A^T + \beta C\) where \(C\) is symmetric |
| syr2k: Symmetric rank 2k update | \(C = \alpha A B^T + \alpha B A^T + \beta C\) where \(C\) is symmetric |
| trmm: Triangular matrix multiply | \(B = \alpha \;op(A)\; B\) or \(B = \alpha B \;op(A) \) where \(A\) is triangular |
| trsm: Triangular solve matrix | \(C = op(A)^{-1} B \) or \(C = B \;op(A)^{-1}\) where \(A\) is triangular |
| trtri: Triangular inverse; used in getri, potri | \(A = A^{-1}\) where \(A\) is triangular |
| trtri_diag: Invert diagonal blocks of triangular matrix; used in trsm | |
| ►Householder reflectors | |
| larfy: Apply Householder reflector to symmetric/Hermitian matrix | |
| larfg: Generate Householder reflector | |
| larfb: Apply block of Householder reflectors (Level 3) | |
| ►Precision conversion | |
| lag2: Converts general matrix between single and double | |
| lat2: Converts triangular matrix between single and double | |
| ►Matrix norms | |
| lange: General matrix norm | 1, Frobenius, or Infinity norm; or largest element |
| lansy/he: Symmetric/Hermitian matrix norm | 1, Frobenius, or Infinity norm; or largest element |
| ▼Batched LA | Batched functions operate on a large set of small matrices in parallel, for instance, 10000 matrices of size 100 x 100 |
| ►Linear system solvers | Solves \(Ax = b\) |
| ►General matrices: LU | Solves \(Ax = b\) using LU factorization for general matrices |
| gesv: Solves Ax = b using LU factorization (driver) | |
| getrf: LU factorization | |
| getrs: LU forward and back solves | |
| getri: LU inverse | |
| gerfs: Refine solution | |
| ►Auxiliary routines | |
| getf2: LU panel factorization | |
| laswp: Swap rows | |
| ►No pivoting variant | |
| gesv: Solves Ax = b using LU factorization - no pivoting (driver) | |
| getf2: LU panel factorization - no pivoting | |
| getrf: LU factorization - no pivoting | |
| getrs: LU forward and back solves - no pivoting | |
| ►General matrices: RBT + no pivoting LU | |
| gesv_rbt: Solves Ax = b using RBT + LU factorization (driver) | |
| ►Auxiliary routines | |
| gerbt: Apply random butterfly transformation (RBT) | |
| prbt | |
| prbt_mv | |
| prbt_mtv | |
| ►General matrices: least squares | |
| gels: Least squares solves Ax = b using QR factorization (driver) | |
| ►Symmetric/Hermitian positive definite: Cholesky | |
| posv: Solves Ax = b using Cholesky factorization (driver) | |
| potrf: Cholesky factorization | |
| potrs: Cholesky forward and back solves | |
| potri: Cholesky inverse | |
| porfs: Refine solution | |
| ►Auxiliary routines | |
| potf2: Cholesky panel factorization | |
| lauum: Multiply triangular matrices; used in potri | |
| ►Hermitian indefinite | |
| hesv: Solves Ax = b using Hermitian indefinite factorization (driver) | |
| hesv: Solves Ax = b using Hermitian indefinite factorization - no pivoting (driver) | |
| hetrf: Hermitian indefinite factorization | |
| hetrs: Hermitian indefinite forward and back solves | |
| hetri: Hermitian indefinite inverse | |
| herfs: Refine solution | |
| ►Auxiliary routines | |
| lahef: Partial factorization; used by hetrf | |
| ►Symmetric indefinite | |
| sysv: Solves Ax = b using symmetric indefinite factorization (driver) | |
| sysv: Solves Ax = b using symmetric indefinite factorization - no pivoting (driver) | |
| sytrf: Symmetric indefinite factorization | |
| sytrs: Symmetric indefinite forward and back solves | |
| sytri: Symmetric indefinite inverse | |
| syrfs: Refine solution | |
| ►Auxiliary routines | |
| lasyf: Partial factorization; used by sytrf | |
| ►Orthogonal/unitary factorizations | Factor \(A\) using \(QR, RQ, QL, LQ\) |
| ►QR factorization | Factor \(A = QR\) |
| geqrf: QR factorization | |
| or/unmqr: Multiply by Q from QR factorization | |
| or/ungqr: Generate Q from QR factorization | |
| ►Auxiliary routines | |
| geqr2: QR panel factorization | |
| copy V to R | |
| ►RQ factorization | |
| gerqf: RQ factorization | |
| or/unmrq: Multiply by Q from RQ factorization | |
| or/ungrq: Generate Q from RQ factorization | |
| ►QL factorization | |
| geqlf: QL factorization | |
| or/unmql: Multiply by Q from QL factorization | |
| or/ungql: Generate Q from QL factorization | |
| ►LQ factorization | |
| gelqf: LQ factorization | |
| or/unmlq: Multiply by Q from LQ factorization | |
| or/unglq: Generate Q from LQ factorization | |
| ►MAGMA BLAS and Auxiliary | Batched BLAS and Auxiliary functions |
| ►Level 1: vector operations, O(n) work | Vector operations that perform \(O(n)\) work on \(O(n)\) data |
| asum: Sum vector | \(\sum_i \lvert x_i \rvert\) |
| axpy: Add vectors | \(y = \alpha x + y\) |
| copy: Copy vector | \(y = x\) |
| dot: Dot (inner) product | \(x^T y\) or \(x^H y\) |
| iamax: Find max element | \(\text{argmax}_i\; \lvert x_i \rvert\) |
| iamin: Find min element | \(\text{argmin}_i\; \lvert x_i \rvert\) |
| nrm2: Vector 2 norm | \(\lVert x \rVert_2\) |
| rot: Apply Givens rotation | |
| rotg: Generate Givens rotation | |
| rotm: Apply modified Givens rotation | |
| rotmg: Generate modified Givens rotation | |
| scal: Scale vector | \(x = \alpha x\) |
| swap: Swap vectors | \(x \leftrightarrow y\) |
| ►Level 2: matrix-vector operations, O(n^2) work | |
| geadd: Add matrices | \(B = \alpha A + \beta B\) |
| gemv: General matrix-vector multiply | \(y = \alpha Ax + \beta y\) |
| ger: General matrix rank 1 update | \(A = \alpha xy^T + A\) |
| hemv: Hermitian matrix-vector multiply | \(y = \alpha Ax + \beta y\) |
| her: Hermitian rank 1 update | \(A = \alpha xx^H + A\) |
| her2: Hermitian rank 2 update | \(A = \alpha xy^H + \bar{\alpha} yx^H + A\) |
| symv: Symmetric matrix-vector multiply | \(y = \alpha Ax + \beta y\) |
| syr: Symmetric rank 1 update | \(A = \alpha xx^T + A\) |
| syr2: Symmetric rank 2 update | \(A = \alpha xy^T + \alpha yx^T + A\) |
| trmv: Triangular matrix-vector multiply | \(x = Ax\) |
| trsv: Triangular matrix-vector solve | \(x = op(A)^{-1}\; b\) |
| swapblk: Swap several rows | |
| swapdblk: Swap diagonal blocks | |
| symmetrize: Symmetrize matrix | \(\text{upper}(A) = \text{lower}(A)^T\) or \(\text{lower}(A) = \text{upper}(A)^T\) |
| transpose: Transpose matrix | \(B = A^T\) or \(B = A^H\) |
| lacgv: Conjugate vector | \(x = conj(x)\) |
| lacpy: Copy matrix | \(B = A\) |
| lascl: Scale matrix by scalar | \(A = \alpha A\) |
| lascl2: Scale matrix by diagonal | \(A = D A\) |
| laset: Set matrix to constants | \(A_{ij} =\) diag if \(i=j\); \(A_{ij} =\) offdiag otherwise |
| ►Level 3: matrix-matrix operations, O(n^3) work | |
| gemm: General matrix multiply: C = AB + C | \(C = \alpha \;op(A) \;op(B) + \beta C\) |
| hemm: Hermitian matrix multiply | \(C = \alpha A B + \beta C\) or \(C = \alpha B A + \beta C\) where \(A\) is Hermitian |
| herk: Hermitian rank k update | \(C = \alpha A A^H + \beta C\) where \(C\) is Hermitian |
| her2k: Hermitian rank 2k update | \(C = \alpha A B^H + \bar{\alpha} B A^H + \beta C\) where \(C\) is Hermitian |
| symm: Symmetric matrix multiply | \(C = \alpha A B + \beta C\) or \(C = \alpha B A + \beta C\) where \(A\) is symmetric |
| syrk: Symmetric rank k update | \(C = \alpha A A^T + \beta C\) where \(C\) is symmetric |
| syr2k: Symmetric rank 2k update | \(C = \alpha A B^T + \alpha B A^T + \beta C\) where \(C\) is symmetric |
| trmm: Triangular matrix multiply | \(B = \alpha \;op(A)\; B\) or \(B = \alpha B \;op(A) \) where \(A\) is triangular |
| trsm: Triangular solve matrix | \(C = op(A)^{-1} B \) or \(C = B \;op(A)^{-1}\) where \(A\) is triangular |
| trtri: Triangular inverse; used in getri, potri | \(A = A^{-1}\) where \(A\) is triangular |
| trtri_diag: Invert diagonal blocks of triangular matrix; used in trsm | |
| ►Householder reflectors | |
| larf: Apply Householder reflector to general matrix | |
| larfy: Apply Householder reflector to symmetric/Hermitian matrix | |
| larfg: Generate Householder reflector | |
| larfb: Apply block of Householder reflectors (Level 3) | |
| larft: Generate T matrix for block of Householder reflectors | |
| ►Precision conversion | |
| lag2: Converts general matrix between single and double | |
| lat2: Converts triangular matrix between single and double | |
| ►Matrix norms | |
| lange: General matrix norm | 1, Frobenius, or Infinity norm; or largest element |
| lansy/he: Symmetric/Hermitian matrix norm | 1, Frobenius, or Infinity norm; or largest element |
| lantr: Triangular matrix norm | 1, Frobenius, or Infinity norm; or largest element |
| ▼Sparse LA | Routines for sparse linear algebra |
| ►Sparse linear systems | Solve \(Ax = b\) |
| ►General matrices | Solve \(Ax = b\), for general \(A\) |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ►Symmetric/Hermitian positive definite | |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ►Sparse eigenvalues | Solve \(Ax = \lambda x\) |
| ►Symmetric/Hermitian eigenvalues | Solve \(Ax = \lambda x\) for symmetric/Hermitian \(A\) |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ►Sparse preconditioners | Preconditioner for solving \(Ax = \lambda x\) |
| ►General matrix preconditioner | Preconditioners for non-symmetric \(A\) |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ►Symmetric/Hermitian preconditioner | |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ►Sparse GPU kernels | |
| ►GPU kernels for non-symmetric sparse LA | |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ►GPU kernels for symmetric/Hermitian sparse LA | |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ►Sparse BLAS | |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ►Sparse auxiliary | |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ►Sparse ** unfiled ** | |
| single precision | |
| double precision | |
| single-complex precision | |
| double-complex precision | |
| ▼Internal routines | |
| Error handling | |
| Testing routines | |
| Thread management | |
| Timer utilities | |
| QR panel to q, q to panel | |
| GPU Kernels | |
| ▼Deprecated | Deprecated routines will be removed in the next MAGMA release |
| Deprecated MAGMA v1 interface | |
| Deprecated MAGMA-sparse | |
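
The Level 3 `gemm` entries in the table give the operation \(C = \alpha \;op(A) \;op(B) + \beta C\). As a reading aid, the following is a minimal reference sketch of that formula in plain C for the case \(op(A) = A\), \(op(B) = B\). It assumes column-major storage with leading dimensions, as in LAPACK-style interfaces. The function name `ref_dgemm_nn` and its signature are hypothetical and are not part of the MAGMA API; the library's own routines run on the GPU and take additional arguments.

```c
#include <stddef.h>

/* Hypothetical reference routine (not a MAGMA function):
 * computes C = alpha*A*B + beta*C with no transposes.
 * All matrices are column-major: element (i, j) of A is A[i + j*lda].
 * A is m-by-k, B is k-by-n, C is m-by-n. */
void ref_dgemm_nn(size_t m, size_t n, size_t k,
                  double alpha, const double *A, size_t lda,
                  const double *B, size_t ldb,
                  double beta, double *C, size_t ldc)
{
    for (size_t j = 0; j < n; ++j) {          /* column of C */
        for (size_t i = 0; i < m; ++i) {      /* row of C */
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p) {  /* inner (dot-product) index */
                sum += A[i + p * lda] * B[p + j * ldb];
            }
            C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
        }
    }
}
```

The triple loop makes the \(O(n^3)\) work of the Level 3 group explicit: each of the \(mn\) entries of \(C\) needs a length-\(k\) dot product.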