MAGMA 2.0.2
Matrix Algebra for GPU and Multicore Architectures
Functions | |
magma_int_t | magma_spotrf (magma_uplo_t uplo, magma_int_t n, float *A, magma_int_t lda, magma_int_t *info) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix A.
magma_int_t | magma_spotrf3_mgpu (magma_int_t ngpu, magma_uplo_t uplo, magma_int_t m, magma_int_t n, magma_int_t off_i, magma_int_t off_j, magma_int_t nb, magmaFloat_ptr d_lA[], magma_int_t ldda, magmaFloat_ptr d_lP[], magma_int_t lddp, float *A, magma_int_t lda, magma_int_t h, magma_queue_t queues[][3], magma_event_t events[][5], magma_int_t *info) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
magma_int_t | magma_spotrf_lg_batched (magma_uplo_t uplo, magma_int_t n, float **dA_array, magma_int_t ldda, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
magma_int_t | magma_spotrf_gpu (magma_uplo_t uplo, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magma_int_t *info) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
magma_int_t | magma_spotrf_m (magma_int_t ngpu, magma_uplo_t uplo, magma_int_t n, float *A, magma_int_t lda, magma_int_t *info) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix A.
magma_int_t | magma_spotrf_mgpu (magma_int_t ngpu, magma_uplo_t uplo, magma_int_t n, magmaFloat_ptr d_lA[], magma_int_t ldda, magma_int_t *info) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
magma_int_t | magma_spotrf_mgpu_right (magma_int_t ngpu, magma_uplo_t uplo, magma_int_t n, magmaFloat_ptr d_lA[], magma_int_t ldda, magma_int_t *info) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
magma_int_t | magma_spotri (magma_uplo_t uplo, magma_int_t n, float *A, magma_int_t lda, magma_int_t *info) |
SPOTRI computes the inverse of a real symmetric positive definite matrix A using the Cholesky factorization A = U**T*U or A = L*L**T computed by SPOTRF.
magma_int_t | magma_spotri_gpu (magma_uplo_t uplo, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magma_int_t *info) |
SPOTRI computes the inverse of a real symmetric positive definite matrix A using the Cholesky factorization A = U**T*U or A = L*L**T computed by SPOTRF.
magma_int_t | magma_spotrs_batched (magma_uplo_t uplo, magma_int_t n, magma_int_t nrhs, float **dA_array, magma_int_t ldda, float **dB_array, magma_int_t lddb, magma_int_t batchCount, magma_queue_t queue) |
SPOTRS solves a system of linear equations A*X = B with a symmetric positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by SPOTRF.
magma_int_t | magma_spotrs_gpu (magma_uplo_t uplo, magma_int_t n, magma_int_t nrhs, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dB, magma_int_t lddb, magma_int_t *info) |
SPOTRS solves a system of linear equations A*X = B with a symmetric positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by SPOTRF.
magma_int_t magma_spotrf | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
float * | A, | ||
magma_int_t | lda, | ||
magma_int_t * | info | ||
) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix A.
This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.
The factorization has the form A = U**H * U, if uplo = MagmaUpper, or A = L * L**H, if uplo = MagmaLower, where U is an upper triangular matrix and L is lower triangular.
This is the block version of the algorithm, calling Level 3 BLAS.
This uses multiple queues to overlap communication and computation.
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of A is stored; = MagmaLower: Lower triangle of A is stored. |
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | A | REAL array, dimension (LDA,N) On entry, the symmetric matrix A. If uplo = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If uplo = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if INFO = 0, the factor U or L from the Cholesky factorization A = U**H * U or A = L * L**H. Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,N). |
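[out] | info | INTEGER = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value; > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed. |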
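The factorization A = L * L**H and the role of INFO can be illustrated with a minimal, unblocked CPU sketch. This is not MAGMA's blocked implementation: the names `ref_spotrf_lower` and `sketch_sqrtf` are hypothetical, only the lower-triangular case is handled, and the return codes follow the usual LAPACK-style convention (0 on success, a negative value for an illegal argument of this sketch's own argument list, a positive value i when the leading minor of order i is not positive definite), which is assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Newton iteration for sqrt(x), x > 0. A stand-in for sqrtf so that this
   sketch has no link-time dependency on libm; adequate for moderate x. */
static float sketch_sqrtf(float x)
{
    float g = x > 1.0f ? x : 1.0f;   /* start at or above sqrt(x) */
    for (int it = 0; it < 60; ++it)
        g = 0.5f * (g + x / g);
    return g;
}

/* Unblocked lower Cholesky on a column-major array: A(i,j) is A[i + j*lda].
   Only the lower triangle is read and written. Hypothetical reference
   routine, not part of MAGMA. */
static int ref_spotrf_lower(int n, float *A, int lda)
{
    if (n < 0) return -1;
    assert(A != NULL || n == 0);
    if (lda < (n > 1 ? n : 1)) return -3;

    for (int j = 0; j < n; ++j) {
        /* Diagonal entry: d = A(j,j) - sum_k L(j,k)^2 */
        float d = A[j + j * lda];
        for (int k = 0; k < j; ++k)
            d -= A[j + k * lda] * A[j + k * lda];
        if (!(d > 0.0f))
            return j + 1;            /* leading minor of order j+1 is not PD */
        float ljj = sketch_sqrtf(d);
        A[j + j * lda] = ljj;

        /* Column j below the diagonal */
        for (int i = j + 1; i < n; ++i) {
            float s = A[i + j * lda];
            for (int k = 0; k < j; ++k)
                s -= A[i + k * lda] * A[j + k * lda];
            A[i + j * lda] = s / ljj;
        }
    }
    return 0;
}
```

For the 2-by-2 matrix with rows (4, 2) and (2, 5), the sketch overwrites the lower triangle with L having rows (2, 0) and (1, 2) and leaves the strictly upper entry untouched.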
magma_int_t magma_spotrf3_mgpu | ( | magma_int_t | ngpu, |
magma_uplo_t | uplo, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | off_i, | ||
magma_int_t | off_j, | ||
magma_int_t | nb, | ||
magmaFloat_ptr | d_lA[], | ||
magma_int_t | ldda, | ||
magmaFloat_ptr | d_lP[], | ||
magma_int_t | lddp, | ||
float * | A, | ||
magma_int_t | lda, | ||
magma_int_t | h, | ||
magma_queue_t | queues[][3], | ||
magma_event_t | events[][5], | ||
magma_int_t * | info | ||
) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
Auxiliary subroutine for spotrf2_ooc. It is a multi-GPU interface that computes the Cholesky factorization of a "rectangular" matrix.
The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.
This is the block version of the algorithm, calling Level 3 BLAS.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of dA is stored; = MagmaLower: Lower triangle of dA is stored. |
[in] | m | INTEGER The number of rows of the submatrix to be factorized. |
[in] | n | INTEGER The number of columns of the submatrix to be factorized. |
[in] | off_i | INTEGER The first row index of the submatrix to be factorized. |
[in] | off_j | INTEGER The first column index of the submatrix to be factorized. |
[in] | nb | INTEGER The block size used for the factorization and distribution. |
[in,out] | d_lA | REAL array of pointers on the GPU, dimension (ngpu). On entry, the symmetric matrix dA distributed over GPUs (d_lA[d] points to the local matrix on the d-th GPU). If UPLO = MagmaLower or MagmaUpper, it respectively uses a 1D block column or row cyclic format (with the block size nb), and each local matrix is stored by column. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced. On exit, if INFO = 0, the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H. |
[in,out] | d_lP | REAL array of pointers on the GPU, dimension (ngpu). d_lP[d] points to workspace of size h*lddp*nb on the d-th GPU. |
[in] | lddp | INTEGER The leading dimension of the array dP. LDDP >= max(1,N). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[in,out] | A | REAL array on the CPU, dimension (LDA,H*NB). On exit, the panel is copied back to the CPU. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,N). |
[in] | h | INTEGER It specifies the size of the CPU workspace, A. |
[in] | queues | magma_queue_t queues is of dimension (ngpu,3) and contains the queues used for the partial factorization. |
[in] | events | magma_event_t events is of dimension (ngpu,5) and contains the events used for the partial factorization. |
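[out] | info | INTEGER = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value; > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed. |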
magma_int_t magma_spotrf_gpu | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
magma_int_t * | info | ||
) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.
This is the block version of the algorithm, calling Level 3 BLAS.
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of dA is stored; = MagmaLower: Lower triangle of dA is stored. |
[in] | n | INTEGER The order of the matrix dA. N >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (LDDA,N) On entry, the symmetric matrix dA. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced. On exit, if INFO = 0, the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
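[out] | info | INTEGER = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value; > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed. |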
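The LDDA requirements above (LDDA >= max(1,N), divisible by 16 for coalesced accesses) amount to padding each column of the device array. The following is a small host-side sketch of that arithmetic; `round_up`, `padded_ldda` and `elem_offset` are hypothetical helpers written for this illustration and are not MAGMA functions.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest multiple of `multiple` that is >= x (hypothetical helper). */
static int round_up(int x, int multiple)
{
    assert(multiple > 0 && x >= 0);
    return ((x + multiple - 1) / multiple) * multiple;
}

/* A leading dimension that satisfies LDDA >= max(1,N) and LDDA % 16 == 0. */
static int padded_ldda(int n)
{
    int ld = round_up(n, 16);
    return ld > 0 ? ld : 16;         /* n == 0 still needs LDDA >= 1 */
}

/* Offset, in elements, of entry (i,j) in a column-major array whose
   columns are ldda elements apart. */
static size_t elem_offset(int i, int j, int ldda)
{
    return (size_t)i + (size_t)j * (size_t)ldda;
}
```

With N = 1000 this gives LDDA = 1008, so a device buffer for dA would need at least 1008 * 1000 floats, and entry (3,2) sits at offset 3 + 2 * 1008.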
magma_int_t magma_spotrf_lg_batched | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
float ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.
This is the block version of the algorithm, calling Level 3 BLAS.
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of dA is stored; = MagmaLower: Lower triangle of dA is stored. |
[in] | n | INTEGER The order of the matrix dA. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N) On entry, each pointer is a symmetric matrix dA. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced. On exit, if corresponding entry in info_array = 0, each pointer is the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H. |
[in] | ldda | INTEGER The leading dimension of each array dA. LDDA >= max(1,N). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
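[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value; > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed. |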
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
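The batched interface takes an array of batchCount pointers, one per matrix, plus one info entry per matrix. A common way to build such an array is to carve the matrices out of one contiguous slab. The sketch below shows only the pointer arithmetic and uses host memory as a stand-in; in an actual call the matrices reside on the GPU. `set_batch_pointers` and `count_failures` are hypothetical helpers, not MAGMA functions.

```c
#include <assert.h>
#include <stddef.h>

/* Point ptrs[k] at the k-th matrix inside one contiguous slab that holds
   batchCount column-major matrices of ldda*n floats each. */
static void set_batch_pointers(float **ptrs, float *slab,
                               int ldda, int n, int batchCount)
{
    assert(ptrs != NULL && slab != NULL);
    assert(n >= 0 && ldda >= (n > 1 ? n : 1));
    size_t stride = (size_t)ldda * (size_t)n;   /* elements per matrix */
    for (int k = 0; k < batchCount; ++k)
        ptrs[k] = slab + (size_t)k * stride;
}

/* Number of matrices whose info entry reports a problem (non-zero). */
static int count_failures(const int *info_array, int batchCount)
{
    int bad = 0;
    for (int k = 0; k < batchCount; ++k)
        if (info_array[k] != 0)
            ++bad;
    return bad;
}
```

Checking every entry of info_array after the call matters because one non-positive-definite matrix in the batch does not invalidate the factorizations of the others.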
magma_int_t magma_spotrf_m | ( | magma_int_t | ngpu, |
magma_uplo_t | uplo, | ||
magma_int_t | n, | ||
float * | A, | ||
magma_int_t | lda, | ||
magma_int_t * | info | ||
) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix A.
This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine. The matrix A may exceed the GPU memory.
The factorization has the form A = U**H * U, if UPLO = MagmaUpper, or A = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.
This is the block version of the algorithm, calling Level 3 BLAS.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of A is stored; = MagmaLower: Lower triangle of A is stored. |
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | A | REAL array, dimension (LDA,N) On entry, the symmetric matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if INFO = 0, the factor U or L from the Cholesky factorization A = U**H * U or A = L * L**H. Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,N). |
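[out] | info | INTEGER = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value; > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed. |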
magma_int_t magma_spotrf_mgpu | ( | magma_int_t | ngpu, |
magma_uplo_t | uplo, | ||
magma_int_t | n, | ||
magmaFloat_ptr | d_lA[], | ||
magma_int_t | ldda, | ||
magma_int_t * | info | ||
) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.
This is the block version of the algorithm, calling Level 3 BLAS.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of dA is stored; = MagmaLower: Lower triangle of dA is stored. |
[in] | n | INTEGER The order of the matrix dA. N >= 0. |
[in,out] | d_lA | REAL array of pointers on the GPU, dimension (ngpu) On entry, the symmetric matrix dA distributed over GPUs (d_lA[d] points to the local matrix on the d-th GPU). It is distributed in 1D block column or row cyclic (with the block size of nb) if UPLO = MagmaUpper or MagmaLower, respectively. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced. On exit, if INFO = 0, the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H. |
[in] | ldda | INTEGER The leading dimension of the array d_lA. LDDA >= max(1,N). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
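[out] | info | INTEGER = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value; > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed. |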
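The 1D block column cyclic layout mentioned for d_lA can be made concrete with a small index-mapping sketch. It assumes the conventional scheme in which column blocks of width nb are dealt to the GPUs round-robin starting from GPU 0; that starting point, the type `col_owner_t` and both function names are assumptions made for this illustration. The row cyclic case is the same mapping applied to row indices.

```c
#include <assert.h>

/* Which GPU holds a global column, and where it sits locally. */
typedef struct {
    int gpu;         /* owning GPU, 0 <= gpu < ngpu */
    int local_col;   /* column index inside that GPU's local matrix */
} col_owner_t;

/* Map global column j to its owner under a 1D block-cyclic layout with
   block width nb over ngpu devices (block 0 assumed to go to GPU 0). */
static col_owner_t owner_of_column(int j, int nb, int ngpu)
{
    assert(j >= 0 && nb > 0 && ngpu > 0);
    int block = j / nb;                       /* global block index */
    col_owner_t o;
    o.gpu = block % ngpu;                     /* round-robin over GPUs */
    o.local_col = (block / ngpu) * nb + (j % nb);
    return o;
}

/* Number of columns of an n-column matrix that land on GPU d. */
static int local_column_count(int n, int nb, int ngpu, int d)
{
    assert(n >= 0 && nb > 0 && ngpu > 0 && d >= 0 && d < ngpu);
    int full_blocks = n / nb;
    int count = (full_blocks / ngpu) * nb;    /* complete rounds */
    int rem_blocks = full_blocks % ngpu;      /* blocks in the last round */
    if (d < rem_blocks)
        count += nb;                          /* one more full block */
    else if (d == rem_blocks)
        count += n % nb;                      /* trailing partial block */
    return count;
}
```

For n = 10, nb = 2 and ngpu = 3, GPUs 0 and 1 each hold four columns and GPU 2 holds two; global column 7 is local column 3 on GPU 0.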
magma_int_t magma_spotrf_mgpu_right | ( | magma_int_t | ngpu, |
magma_uplo_t | uplo, | ||
magma_int_t | n, | ||
magmaFloat_ptr | d_lA[], | ||
magma_int_t | ldda, | ||
magma_int_t * | info | ||
) |
SPOTRF computes the Cholesky factorization of a real symmetric positive definite matrix dA.
The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.
This is the block version of the algorithm, calling Level 3 BLAS.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of dA is stored; = MagmaLower: Lower triangle of dA is stored. |
[in] | n | INTEGER The order of the matrix dA. N >= 0. |
[in,out] | d_lA | REAL array of pointers on the GPU, dimension (ngpu) On entry, the symmetric matrix dA distributed over GPUs (d_lA[d] points to the local matrix on the d-th GPU). It is distributed in 1D block column or row cyclic (with the block size of nb) if UPLO = MagmaUpper or MagmaLower, respectively. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced. On exit, if INFO = 0, the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
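[out] | info | INTEGER = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value; > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed. |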
magma_int_t magma_spotri | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
float * | A, | ||
magma_int_t | lda, | ||
magma_int_t * | info | ||
) |
SPOTRI computes the inverse of a real symmetric positive definite matrix A using the Cholesky factorization A = U**T*U or A = L*L**T computed by SPOTRF.
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of A is stored; = MagmaLower: Lower triangle of A is stored. |
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | A | REAL array, dimension (LDA,N) On entry, the triangular factor U or L from the Cholesky factorization A = U**T*U or A = L*L**T, as computed by SPOTRF. On exit, the upper or lower triangle of the (symmetric) inverse of A, overwriting the input factor U or L. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,N). |
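[out] | info | INTEGER = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value; > 0: if INFO = i, the (i,i) element of the factor U or L is zero, and the inverse could not be computed. |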
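How the inverse is obtained from the factor can be sketched in two in-place steps for the lower case: first overwrite L with its inverse M, then overwrite the lower triangle with M**T * M, which equals the inverse of A = L * L**T. This is a plain CPU illustration under assumed LAPACK-style return codes, not MAGMA's implementation, and `ref_spotri_lower` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* In-place inverse of A = L * L^T from its lower Cholesky factor L,
   stored column-major with leading dimension lda. On exit the lower
   triangle holds the lower triangle of inverse(A). Hypothetical
   reference routine, not part of MAGMA. */
static int ref_spotri_lower(int n, float *A, int lda)
{
    if (n < 0) return -1;
    assert(A != NULL || n == 0);
    if (lda < (n > 1 ? n : 1)) return -3;

    /* A zero on the diagonal of L means the inverse does not exist. */
    for (int i = 0; i < n; ++i)
        if (A[i + i * lda] == 0.0f)
            return i + 1;

    /* Step 1: overwrite L with M = inverse(L), column by column. */
    for (int j = 0; j < n; ++j) {
        A[j + j * lda] = 1.0f / A[j + j * lda];
        for (int i = j + 1; i < n; ++i) {
            float s = 0.0f;
            for (int k = j; k < i; ++k)
                s += A[i + k * lda] * A[k + j * lda];   /* L(i,k) * M(k,j) */
            A[i + j * lda] = -s / A[i + i * lda];
        }
    }

    /* Step 2: overwrite the lower triangle with M^T * M. Entry (i,j),
       i >= j, only needs rows k >= i of M, which are still intact. */
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j <= i; ++j) {
            float s = 0.0f;
            for (int k = i; k < n; ++k)
                s += A[k + i * lda] * A[k + j * lda];
            A[i + j * lda] = s;
        }
    }
    return 0;
}
```

For L with rows (2, 0) and (1, 2), that is A with rows (4, 2) and (2, 5), the lower triangle becomes 0.3125, -0.125 and 0.25, matching the inverse of A.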
magma_int_t magma_spotri_gpu | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
magma_int_t * | info | ||
) |
SPOTRI computes the inverse of a real symmetric positive definite matrix A using the Cholesky factorization A = U**T*U or A = L*L**T computed by SPOTRF.
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of A is stored; = MagmaLower: Lower triangle of A is stored. |
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (LDDA,N) On entry, the triangular factor U or L from the Cholesky factorization A = U**T*U or A = L*L**T, as computed by SPOTRF. On exit, the upper or lower triangle of the (symmetric) inverse of A, overwriting the input factor U or L. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,N). |
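[out] | info | INTEGER = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value; > 0: if INFO = i, the (i,i) element of the factor U or L is zero, and the inverse could not be computed. |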
magma_int_t magma_spotrs_batched | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
magma_int_t | nrhs, | ||
float ** | dA_array, | ||
magma_int_t | ldda, | ||
float ** | dB_array, | ||
magma_int_t | lddb, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
SPOTRS solves a system of linear equations A*X = B with a symmetric positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by SPOTRF.
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of A is stored; = MagmaLower: Lower triangle of A is stored. |
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in] | nrhs | INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0. |
[in] | dA_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N) The triangular factor U or L from the Cholesky factorization A = U**H*U or A = L*L**H, as computed by SPOTRF. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,N). |
[in,out] | dB_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDB,NRHS) On entry, each pointer is a right hand side matrix B. On exit, the corresponding solution matrix X. |
[in] | lddb | INTEGER The leading dimension of each array B. LDDB >= max(1,N). |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_spotrs_gpu | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
magma_int_t | nrhs, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloat_ptr | dB, | ||
magma_int_t | lddb, | ||
magma_int_t * | info | ||
) |
SPOTRS solves a system of linear equations A*X = B with a symmetric positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by SPOTRF.
[in] | uplo | magma_uplo_t = MagmaUpper: Upper triangle of A is stored; = MagmaLower: Lower triangle of A is stored. |
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in] | nrhs | INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0. |
[in] | dA | REAL array on the GPU, dimension (LDDA,N) The triangular factor U or L from the Cholesky factorization A = U**H*U or A = L*L**H, as computed by SPOTRF. |
[in] | ldda | INTEGER The leading dimension of the array A. LDDA >= max(1,N). |
[in,out] | dB | REAL array on the GPU, dimension (LDDB,NRHS) On entry, the right hand side matrix B. On exit, the solution matrix X. |
[in] | lddb | INTEGER The leading dimension of the array B. LDDB >= max(1,N). |
[out] | info | INTEGER = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value. |
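Solving A*X = B from the factor reduces to two triangular solves per right-hand side: L * Y = B by forward substitution, then L**T * X = Y by back substitution. The sketch below shows this on the CPU for the lower case with column-major storage. `ref_spotrs_lower` is a hypothetical name, and the negative return codes refer to this sketch's own argument positions rather than to those of magma_spotrs_gpu.

```c
#include <assert.h>
#include <stddef.h>

/* Solve A*X = B in place, given the lower Cholesky factor L of A.
   L is n-by-n with leading dimension ldl; B is n-by-nrhs with leading
   dimension ldb and is overwritten by X. Hypothetical reference
   routine, not part of MAGMA. */
static int ref_spotrs_lower(int n, int nrhs, const float *L, int ldl,
                            float *B, int ldb)
{
    int min_ld = n > 1 ? n : 1;
    if (n < 0) return -1;
    if (nrhs < 0) return -2;
    if (ldl < min_ld) return -4;
    if (ldb < min_ld) return -6;
    assert((L != NULL && B != NULL) || n == 0 || nrhs == 0);

    for (int c = 0; c < nrhs; ++c) {
        float *b = B + (size_t)c * (size_t)ldb;   /* current column of B */

        /* Forward substitution: L * y = b */
        for (int i = 0; i < n; ++i) {
            float s = b[i];
            for (int k = 0; k < i; ++k)
                s -= L[i + k * ldl] * b[k];
            b[i] = s / L[i + i * ldl];
        }

        /* Back substitution: L^T * x = y, where L^T(i,k) = L(k,i) */
        for (int i = n - 1; i >= 0; --i) {
            float s = b[i];
            for (int k = i + 1; k < n; ++k)
                s -= L[k + i * ldl] * b[k];
            b[i] = s / L[i + i * ldl];
        }
    }
    return 0;
}
```

With L having rows (2, 0) and (1, 2) and the right-hand side (8, 12), the forward solve gives (4, 4) and the back solve gives the solution (1, 2).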