MAGMA provides implementations for CUDA, Intel Xeon Phi, and OpenCL. The latest releases are MAGMA 2.2, MAGMA MIC 1.4.0, and clMAGMA 1.3, respectively. The libraries available for download are listed below in the order of their release dates.

MAGMA 2.2 release

MAGMA 2.2.0 is now released. Updates include:

  • Added variable size batched Cholesky factorization magma_[sdcz]potrf_vbatched.
  • Added new fixed size batched BLAS routines {hemm, symm, hemv, symv, trmm}_batched.
  • Added new variable size batched BLAS routines {hemm, symm, hemv, symv, trmm, trsm}_vbatched.
  • Fixed memory leaks in {sy,he}evdx_2stage and getri_outofplace_batched.
  • Fixed bug for small matrices in {symm, hemm}_mgpu and updated tester.
  • Fixed libraries in examples for MKL with gcc.
  • More robust error checking for Batched BLAS routines.
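
As an illustration of what "variable size batched" means for the routines listed above, here is a minimal CPU reference sketch in plain Python. It is not the MAGMA API: the names cholesky_lower and potrf_vbatched_reference are hypothetical, and the per-matrix status code is assumed to follow the LAPACK potrf convention (0 on success, k > 0 if the leading minor of order k is not positive definite). The point is only that each matrix in the batch carries its own dimension and its own status.

```python
import math


def cholesky_lower(a):
    """Return (L, info) for a symmetric matrix given as a list of rows.

    info == 0 on success; info == k > 0 means the leading minor of
    order k is not positive definite (assumed LAPACK-style convention).
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: a[j][j] minus the squares already stored in row j.
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            return l, j + 1
        l[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l, 0


def potrf_vbatched_reference(batch):
    """Factor every matrix in the batch; sizes may differ per entry."""
    factors, infos = [], []
    for a in batch:
        l, info = cholesky_lower(a)
        factors.append(l)
        infos.append(info)
    return factors, infos


batch = [
    [[4.0]],                       # 1 x 1
    [[4.0, 2.0], [2.0, 5.0]],      # 2 x 2
    [[1.0, 2.0], [2.0, 1.0]],      # 2 x 2, not positive definite
]
factors, infos = potrf_vbatched_reference(batch)
```

A fixed size batched routine would instead require every matrix in the batch to share one dimension. A GPU implementation would process the batch entries in parallel rather than in a Python loop.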


  • Added Incomplete Sparse Approximate Inverse (ISAI) Preconditioner for sparse triangular solves, including batched generation.
  • Added Block-Jacobi triangular solves, including variable blocksize (based on supervariable amalgamation).
  • Added ParILUT, a parallel threshold ILU based on OpenMP.
  • Added CSR5 format and CSR5 SpMV kernel, a sparse matrix vector product often outperforming the cuSPARSE SpMV CSR and HYB.
magma-2.2.0.tar.gz   Download View License
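
For orientation on the sparse matrix vector product mentioned in the list above, the following sketch shows the plain CSR baseline that such kernels are compared against. It is a pure Python reference, not the CSR5 kernel and not MAGMA or cuSPARSE code; the function name csr_spmv and the array names are assumptions chosen for illustration.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A * x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


# A = [[10, 0, 2],
#      [ 0, 3, 0],
#      [ 1, 0, 4]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [10.0, 2.0, 3.0, 1.0, 4.0]
y = csr_spmv(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
```

Because rows can hold very different numbers of nonzeros, a one-row-per-thread mapping of this loop can be unevenly loaded on a GPU, which is one reason alternative storage formats are of interest.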


MAGMA MIC 1.4.0

MAGMA MIC 1.4.0 is now available. This release provides implementations of MAGMA's one-sided (LU, QR, and Cholesky) and two-sided (Hessenberg, bidiagonal, and tridiagonal reductions) dense matrix factorizations, as well as linear and eigenproblem solvers, for Intel Xeon Phi coprocessors. More information on the approach is given in this presentation.

magmamic-1.4.0.tar.gz   Download View License

clMAGMA 1.3

clMAGMA is an OpenCL port of MAGMA that supports AMD GPUs. The clMAGMA library dependencies, in particular optimized GPU OpenCL BLAS and CPU-optimized BLAS and LAPACK for AMD hardware, can be found in the AMD clMath Libraries (formerly APPML).

Included in the clMAGMA 1.3 release are routines for the following algorithms:

  • LU, QR, and Cholesky factorizations in both real and complex arithmetic (single and double);
  • Linear and least squares solvers based on the LU/Cholesky and QR factorizations, respectively, in both real and complex arithmetic (single and double);
  • Reductions to Hessenberg, bidiagonal, and tridiagonal forms using orthogonal similarity transformations in both real and complex arithmetic (single and double);
  • Eigenvalue and singular value problem solvers in both real and complex arithmetic (single and double);
  • Orthogonal transformation routines.
clmagma-1.3.0.tar.gz   Download View License

MAGMA 2.1 release

MAGMA 2.1 for CUDA is now available. New features and updates include:

  • Variable size batched routines (gemm, gemv, syrk, syr2k).
  • Improved SVD performance for tall (m >> n) or wide (m << n) matrices.
  • Preconditioned QMR.
  • Expanded doxygen documentation.
  • For MAGMA v1 compatibility, the default queue for each GPU is now initialized on first use, instead of in magma_init.

Please take this survey to help improve MAGMA, LAPACK, and other dense linear algebra libraries. We estimate that it takes about 10 minutes to complete. Thank you very much.

magma-2.1.0.tar.gz   Download View License


