Summary of Current Features

  • Solution of dense systems of linear equations and least square problems in real space and complex space, using single precision and double precision, via the Cholesky, LU, QR and LQ factorizations

  • Solution of standard and generalized dense symmetric Eigenvalue Problem in real space and complex space, using single precision and double precision, via two-stage tridiagonal reduction followed by QR iteration (eigenvalues only, no eigenvectors yet)

  • Dense Singular Value Decomposition in real space and complex space, using single precision and double precision, via two-stage bidiagonal reduction followed by QR iteration (singular values only, no singular vectors yet)

  • Solution of dense linear systems of equations in real space and complex space using the mixed-precision algorithm based on the Cholesky, LU, QR and LQ factorizations

  • Tree-based QR and LQ factorizations and Q matrix generation and application (“tall and skinny”)

  • Tree-based bidiagonal reduction (“tall and skinny”)

  • Explicit matrix inversion based on Cholesky factorization (symmetric positive definite)

  • Parallel and cache-efficient in-place layout translations (Gustavson et at.)

  • Complete set of Level 3 BLAS routines for matrices stored in tile layout

  • Simple LAPACK-like interface for greater productivity and advanced (tile) interface for full control and maximum performance; Routines for conversion between LAPACK matrix layout and PLASMA’s tile layout

  • Dynamic scheduler QUARK (QUeuing And Runtime for Kernels) and dynamically scheduled versions of all computational routines (alongside statically scheduled ones)

  • Asynchronous interface for launching dynamically scheduled routines in a non-blocking mode. Sequence and request constructs for controlling progress and checking errors

  • Automatic handling of workspace allocation; A set of auxiliary functions to assist the user with workspace allocation

  • A simple set of "sanity" tests for all numerical routines including Level 3 BLAS routines for matrices in tile layout

  • An advanced testing suite for exhaustive numerical testing of all the routines in all precisions (based on the testing suite of the LAPACK library)

  • Basic timing suite for most of the routines in all precisions

  • Thread safety

  • Support for Make and CMake build systems

  • LAPACK-style comments in the source code using the Doxygen system

  • Native support for Microsoft Windows using WinThreads through a thin OS interaction layer

  • Installer capable of downloading from Netlib and installing missing components of PLASMA’s software stack (BLAS, CBLAS, LAPACK, LAPACKE C API)

  • Extensive documentation including Installation Guide, Users' Guide, Reference Manual and an HTML code browser, a guide on running PLASMA with the TAU package, Contributors' Guide, a README and Release Notes.

  • A comprehensive set of usage examples

New Features by Release

2.4.0, June 6th, 2011

  • Tree-based QR and LQ factorizations: routines for application of the Q matrix support all combinations of input parameters: Left/Right, NoTrans/Trans/ConjTrans

  • Symmetric Engenvalue Problem using: tile reduction to band-tridiagonal form, reduction to "standard" tridiagonal form by bulge chasing, finding eigenvalues using the QR algorithm (eigenvectors currently not supported)

  • Singular Value Decomposition using: tile reduction to band-bidiagonal form, reduction to “standard” bidiagonal form by bulge chasing, finding singular values using the QR algorithm (singular vectors currently no supported)

  • Gaussian Elimination with partial pivoting (as opposed to the incremental pivoting in the tile LU factorization) and parallel panel (using Quark extensions for nested parallelism) WARNING: Following the integration of this new feature, the interface to call LU factorization has changed. Now, PLASMA_zgetrf follows the LAPACK interface and corresponds to the new partial pivoting. Old interface related to LU factorization with incremental pivoting is now renamed PLASMA_zgetrf_incpiv.

2.3.1, November 30th, 2010

2.3.0, November 15th, 2010

  • Parallel and cache-efficient in-place layout translations (Gustavson et al.)

  • Tree-based QR factorization and Q matrix generation (“tall and skinny”)

  • Explicit matrix inversion based on Cholesky factorization (symmetric positive definite)

  • Replacement of LAPACK C Wrapper with LAPACKE C API by Intel

2.2.0, July 9th, 2010

  • Dynamic scheduler QUARK (QUeuing And Runtime for Kernels) and dynamically scheduled versions of all computational routines (alongside statically scheduled ones)

  • Asynchronous interface for launching dynamically scheduled routines in a non-blocking mode. Sequence and request constructs for controlling progress and checking errors

  • Removal of CBLAS and pieces of LAPACK from PLASMA’s source tree. BLAS, CBLAS, LAPACK and Netlib LAPACK C Wrapper become PLASMA’s software dependencies required prior to the installation of PLASMA

  • Installer capable of downloading from Netlib and installing missing components of PLASMA’s software stack (BLAS, CBLAS, LAPACK, LAPACK C Wrapper)

  • Complete set of Level 3 BLAS routines for matrices stored in tile layout

2.1.0, November 15th, 2009

  • Native support for Microsoft Windows using WinThreads

  • Support for Make and CMake build systems

  • Performance-optimized mixed-precision routine for the solution of linear systems of equations using the LU factorization

  • Initial timing code (PLASMA_dgesv only)

  • Release notes

2.0.0, July 4th, 2008

  • Support for real and complex arithmetic in single and double precision

  • Generation and application of the Q matrix from the QR and LQ factorizations

  • Prototype of mixed-precision routine for the solution of linear systems of equations using the LU factorization (not optimized for performance)

  • Simple interface and native interface

  • Major code cleanup and restructuring

  • Redesigned workspace allocation

  • LAPACK testing

  • Examples

  • Thread safety

  • Python installer

  • Documentation: Installation Guide, Users' Guide with routine reference and an HTML code browser, a guide on running PLASMA with the TAU package, initial draft of Contributors' Guide, a README file and a LICENSE file

1.0.0, November 15th, 2008

  • Double precision routines for the solution of linear systems of equations and least square problems using Cholesky, LU, QR and LQ factorizations