MAGMA  2.0.0
Matrix Algebra for GPU and Multicore Architectures
single-complex precision

Functions

magma_int_t magma_cpotrf (magma_uplo_t uplo, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magma_int_t *info)
 CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix A.
 
magma_int_t magma_cpotrf3_mgpu (magma_int_t ngpu, magma_uplo_t uplo, magma_int_t m, magma_int_t n, magma_int_t off_i, magma_int_t off_j, magma_int_t nb, magmaFloatComplex_ptr d_lA[], magma_int_t ldda, magmaFloatComplex_ptr d_lP[], magma_int_t lddp, magmaFloatComplex *A, magma_int_t lda, magma_int_t h, magma_queue_t queues[][3], magma_event_t events[][5], magma_int_t *info)
 CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.
 
magma_int_t magma_cpotrf_lg_batched (magma_uplo_t uplo, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.
 
magma_int_t magma_cpotrf_gpu (magma_uplo_t uplo, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magma_int_t *info)
 CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.
 
magma_int_t magma_cpotrf_m (magma_int_t ngpu, magma_uplo_t uplo, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magma_int_t *info)
 CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix A.
 
magma_int_t magma_cpotrf_mgpu (magma_int_t ngpu, magma_uplo_t uplo, magma_int_t n, magmaFloatComplex_ptr d_lA[], magma_int_t ldda, magma_int_t *info)
 CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.
 
magma_int_t magma_cpotrf_mgpu_right (magma_int_t ngpu, magma_uplo_t uplo, magma_int_t n, magmaFloatComplex_ptr d_lA[], magma_int_t ldda, magma_int_t *info)
 CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.
 
magma_int_t magma_cpotri (magma_uplo_t uplo, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magma_int_t *info)
 CPOTRI computes the inverse of a complex Hermitian positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by CPOTRF.
 
magma_int_t magma_cpotri_gpu (magma_uplo_t uplo, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magma_int_t *info)
 CPOTRI computes the inverse of a complex Hermitian positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by CPOTRF.
 
magma_int_t magma_cpotrs_batched (magma_uplo_t uplo, magma_int_t n, magma_int_t nrhs, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dB_array, magma_int_t lddb, magma_int_t batchCount, magma_queue_t queue)
 CPOTRS solves a system of linear equations A*X = B with a Hermitian positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by CPOTRF.
 
magma_int_t magma_cpotrs_gpu (magma_uplo_t uplo, magma_int_t n, magma_int_t nrhs, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dB, magma_int_t lddb, magma_int_t *info)
 CPOTRS solves a system of linear equations A*X = B with a Hermitian positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by CPOTRF.
 

Detailed Description

Function Documentation

magma_int_t magma_cpotrf ( magma_uplo_t  uplo,
magma_int_t  n,
magmaFloatComplex *  A,
magma_int_t  lda,
magma_int_t *  info 
)

CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix A.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.

The factorization has the form A = U**H * U, if uplo = MagmaUpper, or A = L * L**H, if uplo = MagmaLower, where U is an upper triangular matrix and L is lower triangular.

This is the block version of the algorithm, calling Level 3 BLAS.

This uses multiple queues to overlap communication and computation.
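
As an illustration of what "block version" means, one step of a blocked Cholesky factorization for uplo = MagmaLower can be written as below. The partition names (A_11 of size nb-by-nb, A_21, A_22) are illustrative, and this shows the textbook right-looking variant; the variant and scheduling MAGMA actually uses may differ.

```latex
A = \begin{pmatrix} A_{11} & A_{21}^{H} \\ A_{21} & A_{22} \end{pmatrix},
\qquad
\begin{aligned}
L_{11} L_{11}^{H} &= A_{11}
  && \text{(unblocked Cholesky of the diagonal block)} \\
L_{21} &= A_{21} L_{11}^{-H}
  && \text{(triangular solve, Level 3 BLAS TRSM)} \\
A_{22} &\leftarrow A_{22} - L_{21} L_{21}^{H}
  && \text{(Hermitian rank-}n_b\text{ update, Level 3 BLAS HERK)}
\end{aligned}
```

The same three steps are then applied to the updated trailing block A_22 until the whole matrix is factored.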

Parameters
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of A is stored;
  • = MagmaLower: Lower triangle of A is stored.
[in]  n  INTEGER. The order of the matrix A. N >= 0.
[in,out]  A  COMPLEX array, dimension (LDA,N). On entry, the Hermitian matrix A. If uplo = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If uplo = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced.
On exit, if INFO = 0, the factor U or L from the Cholesky factorization A = U**H * U or A = L * L**H.
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]  lda  INTEGER. The leading dimension of the array A. LDA >= max(1,N).
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
  • > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed.
magma_int_t magma_cpotrf3_mgpu ( magma_int_t  ngpu,
magma_uplo_t  uplo,
magma_int_t  m,
magma_int_t  n,
magma_int_t  off_i,
magma_int_t  off_j,
magma_int_t  nb,
magmaFloatComplex_ptr  d_lA[],
magma_int_t  ldda,
magmaFloatComplex_ptr  d_lP[],
magma_int_t  lddp,
magmaFloatComplex *  A,
magma_int_t  lda,
magma_int_t  h,
magma_queue_t  queues[][3],
magma_event_t  events[][5],
magma_int_t *  info 
)

CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.

Auxiliary subroutine for cpotrf2_ooc. It is a multi-GPU interface that computes the Cholesky factorization of a "rectangular" matrix.

The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.

This is the block version of the algorithm, calling Level 3 BLAS.

Parameters
[in]  ngpu  INTEGER. Number of GPUs to use. ngpu > 0.
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of dA is stored;
  • = MagmaLower: Lower triangle of dA is stored.
[in]  m  INTEGER. The number of rows of the submatrix to be factorized.
[in]  n  INTEGER. The number of columns of the submatrix to be factorized.
[in]  off_i  INTEGER. The first row index of the submatrix to be factorized.
[in]  off_j  INTEGER. The first column index of the submatrix to be factorized.
[in]  nb  INTEGER. The block size used for the factorization and distribution.
[in,out]  d_lA  COMPLEX array of pointers on the GPU, dimension (ngpu). On entry, the Hermitian matrix dA distributed over the GPUs (d_lA[d] points to the local matrix on the d-th GPU). If UPLO = MagmaLower or MagmaUpper, it respectively uses a 1D block column or row cyclic format (with the block size nb), and each local matrix is stored by column. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced.
On exit, if INFO = 0, the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H.
[in,out]  d_lP  COMPLEX array of pointers on the GPU, dimension (ngpu). d_lP[d] points to workspace of size h*lddp*nb on the d-th GPU.
[in]  lddp  INTEGER. The leading dimension of the array dP. LDDP >= max(1,N).
[in]  ldda  INTEGER. The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[in,out]  A  COMPLEX array on the CPU, dimension (LDA,H*NB). On exit, the panel is copied back to the CPU.
[in]  lda  INTEGER. The leading dimension of the array A. LDA >= max(1,N).
[in]  h  INTEGER. It specifies the size of the CPU workspace, A.
[in]  queues  magma_queue_t. queues is of dimension (ngpu,3) and contains the queues used for the partial factorization.
[in]  events  magma_event_t. events is of dimension (ngpu,5) and contains the events used for the partial factorization.
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
  • > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed.
magma_int_t magma_cpotrf_gpu ( magma_uplo_t  uplo,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magma_int_t *  info 
)

CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.

The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.

This is the block version of the algorithm, calling Level 3 BLAS.

Parameters
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of dA is stored;
  • = MagmaLower: Lower triangle of dA is stored.
[in]  n  INTEGER. The order of the matrix dA. N >= 0.
[in,out]  dA  COMPLEX array on the GPU, dimension (LDDA,N). On entry, the Hermitian matrix dA. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced.
On exit, if INFO = 0, the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H.
[in]  ldda  INTEGER. The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
  • > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed.
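
The LDDA requirements (at least max(1,N), ideally divisible by 16) can be met by padding the row count. The following is a sketch with hypothetical helper names (`round_up`, `padded_ldda`, `colmajor_index`); none of them are part of the MAGMA API described on this page.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of `multiple` (hypothetical helper). */
int round_up(int n, int multiple)
{
    assert(multiple > 0 && n >= 0);
    return ((n + multiple - 1) / multiple) * multiple;
}

/* Leading dimension for an n-row matrix: divisible by 16 and >= max(1,n). */
int padded_ldda(int n)
{
    int ld = round_up(n, 16);
    return ld > 0 ? ld : 16;   /* n == 0 still needs ldda >= 1 */
}

/* Offset of element (i,j) in a column-major array with leading dimension ldda. */
size_t colmajor_index(int i, int j, int ldda)
{
    return (size_t)i + (size_t)j * (size_t)ldda;
}
```

With n = 1000 this gives ldda = 1008, so the device buffer would need ldda * n = 1,008,000 elements, and rows 1000 to 1007 of each column are unused padding.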
magma_int_t magma_cpotrf_lg_batched ( magma_uplo_t  uplo,
magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  ldda,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.

The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.

This is the block version of the algorithm, calling Level 3 BLAS.

Parameters
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of dA is stored;
  • = MagmaLower: Lower triangle of dA is stored.
[in]  n  INTEGER. The order of the matrix dA. N >= 0.
[in,out]  dA_array  Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer points to a Hermitian matrix dA. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced.
On exit, if the corresponding entry in info_array = 0, each pointer points to the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H.
[in]  ldda  INTEGER. The leading dimension of each array dA. LDDA >= max(1,N). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]  info_array  Array of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
  • > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed.
[in]  batchCount  INTEGER. The number of matrices to operate on.
[in]  queue  magma_queue_t. Queue to execute in.
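
The batched interface takes one pointer per matrix and returns one info value per matrix. The sketch below shows that layout using host memory only. The helper names are hypothetical, and packing all matrices back to back in a single buffer is an assumption made for the example; in real use both the pointer array and the matrices reside on the GPU.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Point array[i] at the i-th ldda-by-n matrix inside one contiguous buffer.
 * Assumes the matrices are stored back to back, ldda*n elements apart. */
void set_batch_pointers(float complex **array, float complex *buffer,
                        int ldda, int n, int batch_count)
{
    assert(batch_count >= 0 && ldda >= 1 && n >= 0);
    for (int i = 0; i < batch_count; ++i)
        array[i] = buffer + (size_t)i * (size_t)ldda * (size_t)n;
}

/* Return the index of the first matrix whose info entry is nonzero,
 * or -1 if every factorization in the batch succeeded. */
int first_failed(const int *info_array, int batch_count)
{
    for (int i = 0; i < batch_count; ++i)
        if (info_array[i] != 0)
            return i;
    return -1;
}
```

Because each matrix reports its own status, a caller would typically scan info_array after the batched call instead of relying on a single return value.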
magma_int_t magma_cpotrf_m ( magma_int_t  ngpu,
magma_uplo_t  uplo,
magma_int_t  n,
magmaFloatComplex *  A,
magma_int_t  lda,
magma_int_t *  info 
)

CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix A.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine. The matrix A may exceed the GPU memory.

The factorization has the form A = U**H * U, if UPLO = MagmaUpper, or A = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.

This is the block version of the algorithm, calling Level 3 BLAS.

Parameters
[in]  ngpu  INTEGER. Number of GPUs to use. ngpu > 0.
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of A is stored;
  • = MagmaLower: Lower triangle of A is stored.
[in]  n  INTEGER. The order of the matrix A. N >= 0.
[in,out]  A  COMPLEX array, dimension (LDA,N). On entry, the Hermitian matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced.
On exit, if INFO = 0, the factor U or L from the Cholesky factorization A = U**H * U or A = L * L**H.
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]  lda  INTEGER. The leading dimension of the array A. LDA >= max(1,N).
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
  • > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed.
magma_int_t magma_cpotrf_mgpu ( magma_int_t  ngpu,
magma_uplo_t  uplo,
magma_int_t  n,
magmaFloatComplex_ptr  d_lA[],
magma_int_t  ldda,
magma_int_t *  info 
)

CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.

The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.

This is the block version of the algorithm, calling Level 3 BLAS.

Parameters
[in]  ngpu  INTEGER. Number of GPUs to use. ngpu > 0.
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of dA is stored;
  • = MagmaLower: Lower triangle of dA is stored.
[in]  n  INTEGER. The order of the matrix dA. N >= 0.
[in,out]  d_lA  COMPLEX array of pointers on the GPU, dimension (ngpu). On entry, the Hermitian matrix dA distributed over GPUs (d_lA[d] points to the local matrix on the d-th GPU). It is distributed in 1D block column or row cyclic (with the block size of nb) if UPLO = MagmaUpper or MagmaLower, respectively. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced.
On exit, if INFO = 0, the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H.
[in]  ldda  INTEGER. The leading dimension of the array d_lA. LDDA >= max(1,N). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
  • > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed.
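
The 1D block-cyclic distribution mentioned for d_lA can be sketched as index arithmetic. The example below assumes the conventional mapping in which block k (of width nb) is assigned to GPU k mod ngpu. That mapping and the helper names are assumptions for illustration, not taken from the MAGMA source. The same arithmetic applies to rows in the row-cyclic case.

```c
#include <assert.h>

/* GPU that owns global column j under a 1D block-column cyclic layout. */
int owner_gpu(int j, int nb, int ngpu)
{
    assert(nb > 0 && ngpu > 0 && j >= 0);
    return (j / nb) % ngpu;
}

/* Index of global column j within its owner's local matrix. */
int local_col(int j, int nb, int ngpu)
{
    assert(nb > 0 && ngpu > 0 && j >= 0);
    return (j / nb / ngpu) * nb + j % nb;
}

/* Number of columns of an n-column matrix stored on GPU d. */
int local_ncols(int n, int nb, int ngpu, int d)
{
    assert(nb > 0 && ngpu > 0 && n >= 0 && d >= 0 && d < ngpu);
    int nblocks = (n + nb - 1) / nb;  /* last block may be partial */
    int mine = nblocks / ngpu + (d < nblocks % ngpu ? 1 : 0);
    int cols = mine * nb;
    /* Trim the excess if this GPU owns the partial last block. */
    if (nblocks > 0 && (nblocks - 1) % ngpu == d)
        cols -= nblocks * nb - n;
    return cols;
}
```

For n = 10, nb = 4, ngpu = 2, columns 0 to 3 and 8 to 9 land on GPU 0 (6 local columns) and columns 4 to 7 on GPU 1 (4 local columns).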
magma_int_t magma_cpotrf_mgpu_right ( magma_int_t  ngpu,
magma_uplo_t  uplo,
magma_int_t  n,
magmaFloatComplex_ptr  d_lA[],
magma_int_t  ldda,
magma_int_t *  info 
)

CPOTRF computes the Cholesky factorization of a complex Hermitian positive definite matrix dA.

The factorization has the form dA = U**H * U, if UPLO = MagmaUpper, or dA = L * L**H, if UPLO = MagmaLower, where U is an upper triangular matrix and L is lower triangular.

This is the block version of the algorithm, calling Level 3 BLAS.

Parameters
[in]  ngpu  INTEGER. Number of GPUs to use. ngpu > 0.
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of dA is stored;
  • = MagmaLower: Lower triangle of dA is stored.
[in]  n  INTEGER. The order of the matrix dA. N >= 0.
[in,out]  d_lA  COMPLEX array of pointers on the GPU, dimension (ngpu). On entry, the Hermitian matrix dA distributed over GPUs (d_lA[d] points to the local matrix on the d-th GPU). It is distributed in 1D block column or row cyclic (with the block size of nb) if UPLO = MagmaUpper or MagmaLower, respectively. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of dA contains the upper triangular part of the matrix dA, and the strictly lower triangular part of dA is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of dA contains the lower triangular part of the matrix dA, and the strictly upper triangular part of dA is not referenced.
On exit, if INFO = 0, the factor U or L from the Cholesky factorization dA = U**H * U or dA = L * L**H.
[in]  ldda  INTEGER. The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
  • > 0: if INFO = i, the leading minor of order i is not positive definite, and the factorization could not be completed.
magma_int_t magma_cpotri ( magma_uplo_t  uplo,
magma_int_t  n,
magmaFloatComplex *  A,
magma_int_t  lda,
magma_int_t *  info 
)

CPOTRI computes the inverse of a complex Hermitian positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by CPOTRF.

Parameters
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of A is stored;
  • = MagmaLower: Lower triangle of A is stored.
[in]  n  INTEGER. The order of the matrix A. N >= 0.
[in,out]  A  COMPLEX array, dimension (LDA,N). On entry, the triangular factor U or L from the Cholesky factorization A = U**H*U or A = L*L**H, as computed by CPOTRF. On exit, the upper or lower triangle of the (Hermitian) inverse of A, overwriting the input factor U or L.
[in]  lda  INTEGER. The leading dimension of the array A. LDA >= max(1,N).
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
  • > 0: if INFO = i, the (i,i) element of the factor U or L is zero, and the inverse could not be computed.
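
The relation between the factor and the inverse is inv(A) = inv(L)**H * inv(L) when A = L * L**H. The following CPU-only sketch makes the in-place overwrite concrete for the lower-triangle case. `ref_cpotri_lower` is a hypothetical name, the algorithm is a plain unblocked one rather than MAGMA's, and negative return values are numbered by this function's own arguments.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

#define IDX(i, j, ld) ((size_t)(i) + (size_t)(j) * (size_t)(ld))

/* On entry the lower triangle of a holds L with A = L * L**H.
 * On exit it holds the lower triangle of inv(A) = inv(L)**H * inv(L).
 * Returns 0, -k for invalid argument k (n = 1, a = 2, lda = 3),
 * or i > 0 if L(i,i) is zero (1-based, as in the info convention). */
int ref_cpotri_lower(int n, float complex *a, int lda)
{
    if (n < 0) return -1;
    if (lda < (n > 1 ? n : 1)) return -3;
    assert(a != NULL || n == 0);

    for (int i = 0; i < n; ++i)
        if (a[IDX(i, i, lda)] == 0.0f) return i + 1;

    /* Step 1: overwrite L with M = inv(L), one column at a time.
     * Columns to the right of j still hold L while column j is rewritten. */
    for (int j = 0; j < n; ++j) {
        a[IDX(j, j, lda)] = 1.0f / a[IDX(j, j, lda)];
        for (int i = j + 1; i < n; ++i) {
            float complex s = 0.0f;
            for (int k = j; k < i; ++k)
                s += a[IDX(i, k, lda)] * a[IDX(k, j, lda)];
            a[IDX(i, j, lda)] = -s / a[IDX(i, i, lda)];
        }
    }

    /* Step 2: overwrite M with the lower triangle of M**H * M. */
    for (int j = 0; j < n; ++j)
        for (int i = j; i < n; ++i) {
            float complex s = 0.0f;
            for (int k = i; k < n; ++k)
                s += conjf(a[IDX(k, i, lda)]) * a[IDX(k, j, lda)];
            a[IDX(i, j, lda)] = s;
        }
    return 0;
}
```

For L(0,0) = 2, L(1,0) = 1+i, L(1,1) = 2 (the factor of the matrix with entries 4, 2+2i, 6), the result is 0.375, -0.125-0.125i, 0.25 in the lower triangle.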
magma_int_t magma_cpotri_gpu ( magma_uplo_t  uplo,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magma_int_t *  info 
)

CPOTRI computes the inverse of a complex Hermitian positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by CPOTRF.

Parameters
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of A is stored;
  • = MagmaLower: Lower triangle of A is stored.
[in]  n  INTEGER. The order of the matrix A. N >= 0.
[in,out]  dA  COMPLEX array on the GPU, dimension (LDDA,N). On entry, the triangular factor U or L from the Cholesky factorization A = U**H*U or A = L*L**H, as computed by CPOTRF. On exit, the upper or lower triangle of the (Hermitian) inverse of A, overwriting the input factor U or L.
[in]  ldda  INTEGER. The leading dimension of the array dA. LDDA >= max(1,N).
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
  • > 0: if INFO = i, the (i,i) element of the factor U or L is zero, and the inverse could not be computed.
magma_int_t magma_cpotrs_batched ( magma_uplo_t  uplo,
magma_int_t  n,
magma_int_t  nrhs,
magmaFloatComplex **  dA_array,
magma_int_t  ldda,
magmaFloatComplex **  dB_array,
magma_int_t  lddb,
magma_int_t  batchCount,
magma_queue_t  queue 
)

CPOTRS solves a system of linear equations A*X = B with a Hermitian positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by CPOTRF.

Parameters
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of A is stored;
  • = MagmaLower: Lower triangle of A is stored.
[in]  n  INTEGER. The order of the matrix A. N >= 0.
[in]  nrhs  INTEGER. The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0.
[in]  dA_array  Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). The triangular factor U or L from the Cholesky factorization A = U**H*U or A = L*L**H, as computed by CPOTRF.
[in]  ldda  INTEGER. The leading dimension of each array A. LDDA >= max(1,N).
[in,out]  dB_array  Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDB,NRHS). On entry, each pointer is a right hand side matrix B. On exit, the corresponding solution matrix X.
[in]  lddb  INTEGER. The leading dimension of each array B. LDDB >= max(1,N).
[in]  batchCount  INTEGER. The number of matrices to operate on.
[in]  queue  magma_queue_t. Queue to execute in.
magma_int_t magma_cpotrs_gpu ( magma_uplo_t  uplo,
magma_int_t  n,
magma_int_t  nrhs,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex_ptr  dB,
magma_int_t  lddb,
magma_int_t *  info 
)

CPOTRS solves a system of linear equations A*X = B with a Hermitian positive definite matrix A using the Cholesky factorization A = U**H*U or A = L*L**H computed by CPOTRF.

Parameters
[in]  uplo  magma_uplo_t
  • = MagmaUpper: Upper triangle of A is stored;
  • = MagmaLower: Lower triangle of A is stored.
[in]  n  INTEGER. The order of the matrix A. N >= 0.
[in]  nrhs  INTEGER. The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0.
[in]  dA  COMPLEX array on the GPU, dimension (LDDA,N). The triangular factor U or L from the Cholesky factorization A = U**H*U or A = L*L**H, as computed by CPOTRF.
[in]  ldda  INTEGER. The leading dimension of the array A. LDDA >= max(1,N).
[in,out]  dB  COMPLEX array on the GPU, dimension (LDDB,NRHS). On entry, the right hand side matrix B. On exit, the solution matrix X.
[in]  lddb  INTEGER. The leading dimension of the array B. LDDB >= max(1,N).
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
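
Solving A*X = B from the factor amounts to two triangular solves per right hand side: L*Y = B followed by L**H*X = Y. The CPU-only sketch below shows this for the lower-triangle case. `ref_cpotrs_lower` is a hypothetical name, the code is an unblocked illustration rather than MAGMA's GPU implementation, and negative return values are numbered by this function's own arguments.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

#define IDX(i, j, ld) ((size_t)(i) + (size_t)(j) * (size_t)(ld))

/* Solve A*X = B given the lower factor L of A = L * L**H.
 * b is overwritten by the solution, column by column.
 * Returns 0, or -k if argument k is invalid
 * (n = 1, nrhs = 2, l = 3, ldl = 4, b = 5, ldb = 6). */
int ref_cpotrs_lower(int n, int nrhs, const float complex *l, int ldl,
                     float complex *b, int ldb)
{
    int min_ld = n > 1 ? n : 1;
    if (n < 0) return -1;
    if (nrhs < 0) return -2;
    if (ldl < min_ld) return -4;
    if (ldb < min_ld) return -6;
    assert((l != NULL && b != NULL) || n == 0 || nrhs == 0);

    for (int c = 0; c < nrhs; ++c) {
        float complex *x = b + (size_t)c * (size_t)ldb;

        /* Forward substitution: L * y = b. */
        for (int i = 0; i < n; ++i) {
            float complex s = x[i];
            for (int k = 0; k < i; ++k)
                s -= l[IDX(i, k, ldl)] * x[k];
            x[i] = s / l[IDX(i, i, ldl)];
        }

        /* Back substitution: L**H * x = y, where (L**H)(i,k) = conj(L(k,i)). */
        for (int i = n - 1; i >= 0; --i) {
            float complex s = x[i];
            for (int k = i + 1; k < n; ++k)
                s -= conjf(l[IDX(k, i, ldl)]) * x[k];
            x[i] = s / conjf(l[IDX(i, i, ldl)]);
        }
    }
    return 0;
}
```

With L(0,0) = 2, L(1,0) = 1+i, L(1,1) = 2 and right hand side (6+2i, 2+8i), the solution is (1, i).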