![]() |
MAGMA
2.0.2
Matrix Algebra for GPU and Multicore Architectures
|
Functions | |
magma_int_t | magma_cgerbt_batched (magma_bool_t gen, magma_int_t n, magma_int_t nrhs, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dB_array, magma_int_t lddb, magmaFloatComplex *U, magmaFloatComplex *V, magma_int_t *info, magma_int_t batchCount, magma_queue_t queue) |
CGERBT solves a system of linear equations A * X = B where A is a general n-by-n matrix and X and B are n-by-nrhs matrices. More... | |
magma_int_t | magma_cgerbt_gpu (magma_bool_t gen, magma_int_t n, magma_int_t nrhs, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dB, magma_int_t lddb, magmaFloatComplex *U, magmaFloatComplex *V, magma_int_t *info) |
CGERBT solves a system of linear equations A * X = B where A is a general n-by-n matrix and X and B are n-by-nrhs matrices. More... | |
magma_int_t | magma_cgerfs_nopiv_gpu (magma_trans_t trans, magma_int_t n, magma_int_t nrhs, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dB, magma_int_t lddb, magmaFloatComplex_ptr dX, magma_int_t lddx, magmaFloatComplex_ptr dworkd, magmaFloatComplex_ptr dAF, magma_int_t *iter, magma_int_t *info) |
CGERFS improves the computed solution to a system of linear equations. More... | |
magma_int_t | magma_cgetrf (magma_int_t m, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magma_int_t *ipiv, magma_int_t *info) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More... | |
magma_int_t | magma_cgetrf2_mgpu (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, magma_int_t offset, magmaFloatComplex_ptr d_lAT[], magma_int_t lddat, magma_int_t *ipiv, magmaFloatComplex_ptr d_lAP[], magmaFloatComplex *W, magma_int_t ldw, magma_queue_t queues[][2], magma_int_t *info) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More... | |
magma_int_t | magma_cgetrf_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More... | |
magma_int_t | magma_cgetrf_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magma_int_t *ipiv, magma_int_t *info) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More... | |
magma_int_t | magma_cgetrf_m (magma_int_t ngpu, magma_int_t m, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magma_int_t *ipiv, magma_int_t *info) |
CGETRF_m computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More... | |
magma_int_t | magma_cgetrf_mgpu (magma_int_t ngpu, magma_int_t m, magma_int_t n, magmaFloatComplex_ptr d_lA[], magma_int_t ldda, magma_int_t *ipiv, magma_int_t *info) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More... | |
magma_int_t | magma_cgetrf_nopiv (magma_int_t m, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magma_int_t *info) |
CGETRF_NOPIV computes an LU factorization of a general M-by-N matrix A without pivoting. More... | |
magma_int_t | magma_cgetrf_nopiv_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
CGETRF computes an LU factorization of a general M-by-N matrix A without pivoting. More... | |
magma_int_t | magma_cgetrf_nopiv_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magma_int_t *info) |
CGETRF_NOPIV_GPU computes an LU factorization of a general M-by-N matrix A without any pivoting. More... | |
magma_int_t | magma_cgetrf_recpanel_batched (magma_int_t m, magma_int_t n, magma_int_t min_recpnb, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t **dipiv_array, magma_int_t **dpivinfo_array, magmaFloatComplex **dX_array, magma_int_t dX_length, magmaFloatComplex **dinvA_array, magma_int_t dinvA_length, magmaFloatComplex **dW1_displ, magmaFloatComplex **dW2_displ, magmaFloatComplex **dW3_displ, magmaFloatComplex **dW4_displ, magmaFloatComplex **dW5_displ, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue) |
This is an internal routine that might have many assumption. More... | |
magma_int_t | magma_cgetrf_panel_nopiv_batched (magma_int_t m, magma_int_t nb, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dX_array, magma_int_t dX_length, magmaFloatComplex **dinvA_array, magma_int_t dinvA_length, magmaFloatComplex **dW0_displ, magmaFloatComplex **dW1_displ, magmaFloatComplex **dW2_displ, magmaFloatComplex **dW3_displ, magmaFloatComplex **dW4_displ, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue) |
this is an internal routine that might have many assumption. More... | |
magma_int_t | magma_cgetri_gpu (magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magma_int_t *ipiv, magmaFloatComplex_ptr dwork, magma_int_t lwork, magma_int_t *info) |
CGETRI computes the inverse of a matrix using the LU factorization computed by CGETRF. More... | |
magma_int_t | magma_cgetri_outofplace_batched (magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t **dipiv_array, magmaFloatComplex **dinvA_array, magma_int_t lddia, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
CGETRI computes the inverse of a matrix using the LU factorization computed by CGETRF. More... | |
magma_int_t | magma_cgetrs_batched (magma_trans_t trans, magma_int_t n, magma_int_t nrhs, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t **dipiv_array, magmaFloatComplex **dB_array, magma_int_t lddb, magma_int_t batchCount, magma_queue_t queue) |
CGETRS solves a system of linear equations A * X = B, A**T * X = B, or A**H * X = B with a general N-by-N matrix A using the LU factorization computed by CGETRF. More... | |
magma_int_t | magma_cgetrs_gpu (magma_trans_t trans, magma_int_t n, magma_int_t nrhs, magmaFloatComplex_ptr dA, magma_int_t ldda, magma_int_t *ipiv, magmaFloatComplex_ptr dB, magma_int_t lddb, magma_int_t *info) |
CGETRS solves a system of linear equations A * X = B, A**T * X = B, or A**H * X = B with a general N-by-N matrix A using the LU factorization computed by CGETRF_GPU. More... | |
magma_int_t | magma_cgetrs_nopiv_batched (magma_trans_t trans, magma_int_t n, magma_int_t nrhs, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dB_array, magma_int_t lddb, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
CGETRS solves a system of linear equations A * X = B, A**T * X = B, or A**H * X = B with a general N-by-N matrix A using the LU factorization without pivoting computed by CGETRF_NOPIV. More... | |
magma_int_t | magma_cgetrs_nopiv_gpu (magma_trans_t trans, magma_int_t n, magma_int_t nrhs, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dB, magma_int_t lddb, magma_int_t *info) |
CGETRS solves a system of linear equations A * X = B, A**T * X = B, or A**H * X = B with a general N-by-N matrix A using the LU factorization computed by CGETRF_NOPIV_GPU. More... | |
magma_int_t | magma_ctrtri (magma_uplo_t uplo, magma_diag_t diag, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magma_int_t *info) |
CTRTRI computes the inverse of a real upper or lower triangular matrix A. More... | |
magma_int_t | magma_ctrtri_gpu (magma_uplo_t uplo, magma_diag_t diag, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magma_int_t *info) |
CTRTRI computes the inverse of a real upper or lower triangular matrix dA. More... | |
magma_int_t magma_cgerbt_batched | ( | magma_bool_t | gen, |
magma_int_t | n, | ||
magma_int_t | nrhs, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaFloatComplex ** | dB_array, | ||
magma_int_t | lddb, | ||
magmaFloatComplex * | U, | ||
magmaFloatComplex * | V, | ||
magma_int_t * | info, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
CGERBT solves a system of linear equations A * X = B where A is a general n-by-n matrix and X and B are n-by-nrhs matrices.
Random Butterfly Tranformation is applied on A and B, then the LU decomposition with no pivoting is used to factor A as A = L * U, where L is unit lower triangular, and U is upper triangular. The factored form of A is then used to solve the system of equations A * X = B.
This is a batched version that solves batchCount matrices in parallel. dA, dB, and info become arrays with one entry per matrix.
[in] | gen | magma_bool_t
|
[in] | n | INTEGER The order of the matrix A. n >= 0. |
[in] | nrhs | INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. nrhs >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[in,out] | dB_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDB,N). On entry, each pointer is an right hand side matrix B. On exit, each pointer is the solution matrix X. |
[in] | lddb | INTEGER The leading dimension of the array B. LDB >= max(1,N). |
[in,out] | U | COMPLEX array, dimension (2,n) Random butterfly matrix, if gen = MagmaTrue U is generated and returned as output; else we use U given as input. CPU memory |
[in,out] | V | COMPLEX array, dimension (2,n) Random butterfly matrix, if gen = MagmaTrue V is generated and returned as output; else we use U given as input. CPU memory |
[out] | info | INTEGER
|
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgerbt_gpu | ( | magma_bool_t | gen, |
magma_int_t | n, | ||
magma_int_t | nrhs, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloatComplex_ptr | dB, | ||
magma_int_t | lddb, | ||
magmaFloatComplex * | U, | ||
magmaFloatComplex * | V, | ||
magma_int_t * | info | ||
) |
CGERBT solves a system of linear equations A * X = B where A is a general n-by-n matrix and X and B are n-by-nrhs matrices.
Random Butterfly Tranformation is applied on A and B, then the LU decomposition with no pivoting is used to factor A as A = L * U, where L is unit lower triangular, and U is upper triangular. The factored form of A is then used to solve the system of equations A * X = B.
[in] | gen | magma_bool_t
|
[in] | n | INTEGER The order of the matrix A. n >= 0. |
[in] | nrhs | INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. nrhs >= 0. |
[in,out] | dA | COMPLEX array, dimension (LDDA,n). On entry, the M-by-n matrix to be factored. On exit, the factors L and U from the factorization A = L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of the array A. LDDA >= max(1,n). |
[in,out] | dB | COMPLEX array, dimension (LDDB,nrhs) On entry, the right hand side matrix B. On exit, the solution matrix X. |
[in] | lddb | INTEGER The leading dimension of the array B. LDDB >= max(1,n). |
[in,out] | U | COMPLEX array, dimension (2,n) Random butterfly matrix, if gen = MagmaTrue U is generated and returned as output; else we use U given as input. CPU memory |
[in,out] | V | COMPLEX array, dimension (2,n) Random butterfly matrix, if gen = MagmaTrue V is generated and returned as output; else we use U given as input. CPU memory |
[out] | info | INTEGER
|
magma_int_t magma_cgerfs_nopiv_gpu | ( | magma_trans_t | trans, |
magma_int_t | n, | ||
magma_int_t | nrhs, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloatComplex_ptr | dB, | ||
magma_int_t | lddb, | ||
magmaFloatComplex_ptr | dX, | ||
magma_int_t | lddx, | ||
magmaFloatComplex_ptr | dworkd, | ||
magmaFloatComplex_ptr | dAF, | ||
magma_int_t * | iter, | ||
magma_int_t * | info | ||
) |
CGERFS improves the computed solution to a system of linear equations.
The iterative refinement process is stopped if ITER > ITERMAX or for all the RHS we have: RNRM < SQRT(n)*XNRM*ANRM*EPS*BWDMAX where o ITER is the number of the current iteration in the iterative refinement process o RNRM is the infinity-norm of the residual o XNRM is the infinity-norm of the solution o ANRM is the infinity-operator-norm of the matrix A o EPS is the machine epsilon returned by SLAMCH('Epsilon') The value ITERMAX and BWDMAX are fixed to 30 and 1.0D+00 respectively.
[in] | trans | magma_trans_t Specifies the form of the system of equations:
|
[in] | n | INTEGER The number of linear equations, i.e., the order of the matrix A. N >= 0. |
[in] | nrhs | INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0. |
[in] | dA | COMPLEX array on the GPU, dimension (ldda,N) the N-by-N coefficient matrix A. |
[in] | ldda | INTEGER The leading dimension of the array dA. ldda >= max(1,N). |
[in] | dB | COMPLEX array on the GPU, dimension (lddb,NRHS) The N-by-NRHS right hand side matrix B. |
[in] | lddb | INTEGER The leading dimension of the array dB. lddb >= max(1,N). |
[in,out] | dX | COMPLEX array on the GPU, dimension (lddx,NRHS) On entry, the solution matrix X, as computed by CGETRS_NOPIV. On exit, the improved solution matrix X. |
[in] | lddx | INTEGER The leading dimension of the array dX. lddx >= max(1,N). |
dworkd | (workspace) COMPLEX array on the GPU, dimension (N*NRHS) This array is used to hold the residual vectors. | |
dAF | COMPLEX array on the GPU, dimension (ldda,n) The factors L and U from the factorization A = L*U as computed by CGETRF_NOPIV. | |
[out] | iter | INTEGER
|
[out] | info | INTEGER
|
magma_int_t magma_cgetrf | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloatComplex * | A, | ||
magma_int_t | lda, | ||
magma_int_t * | ipiv, | ||
magma_int_t * | info | ||
) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
It uses 2 queues to overlap communication and computation.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | A | COMPLEX array, dimension (LDA,N) On entry, the M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | ipiv | INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info | INTEGER
|
magma_int_t magma_cgetrf2_mgpu | ( | magma_int_t | ngpu, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | nb, | ||
magma_int_t | offset, | ||
magmaFloatComplex_ptr | d_lAT[], | ||
magma_int_t | lddat, | ||
magma_int_t * | ipiv, | ||
magmaFloatComplex_ptr | d_lAP[], | ||
magmaFloatComplex * | W, | ||
magma_int_t | ldw, | ||
magma_queue_t | queues[][2], | ||
magma_int_t * | info | ||
) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm. Use two buffer to send panels.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in] | nb | INTEGER The block size used for the matrix distribution. |
[in] | offset | INTEGER The first row and column indices of the submatrix that this routine will factorize. |
[in,out] | d_lAT | COMPLEX array of pointers on the GPU, dimension (ngpu). On entry, the M-by-N matrix A distributed over GPUs (d_lAT[d] points to the local matrix on d-th GPU). It uses a 1D block column cyclic format (with the block size nb), and each local matrix is stored by row. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | lddat | INTEGER The leading dimension of the array d_lAT[d]. LDDA >= max(1,M). |
[out] | ipiv | INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[in] | d_lAP | COMPLEX array of pointers on the GPU, dimension (ngpu). d_lAP[d] is the workspace on d-th GPU. Each local workspace must be of size (3+ngpu)*nb*maxm, where maxm is m rounded up to a multiple of 32 and nb is the block size. |
[in] | W | COMPLEX array, dimension (ngpu*nb*maxm). It is used to store panel on CPU. |
[in] | ldw | INTEGER The leading dimension of the workspace w. |
[in] | queues | magma_queue_t queues[d] points to the queues for the d-th GPU to execute in. Each GPU require two queues. |
[out] | info | INTEGER
|
magma_int_t magma_cgetrf_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t ** | ipiv_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of each matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[out] | ipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices.
|
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgetrf_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magma_int_t * | ipiv, | ||
magma_int_t * | info | ||
) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | COMPLEX array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of the array A. LDDA >= max(1,M). |
[out] | ipiv | INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info | INTEGER
|
magma_int_t magma_cgetrf_m | ( | magma_int_t | ngpu, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaFloatComplex * | A, | ||
magma_int_t | lda, | ||
magma_int_t * | ipiv, | ||
magma_int_t * | info | ||
) |
CGETRF_m computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine. The matrix may exceed the GPU memory.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
Note: The factorization of big panel is done calling multiple-gpu-interface. Pivots are applied on GPU within the big panel.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | A | COMPLEX array, dimension (LDA,N) On entry, the M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | ipiv | INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info | INTEGER
|
magma_int_t magma_cgetrf_mgpu | ( | magma_int_t | ngpu, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaFloatComplex_ptr | d_lA[], | ||
magma_int_t | ldda, | ||
magma_int_t * | ipiv, | ||
magma_int_t * | info | ||
) |
CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | d_lA | COMPLEX array of pointers on the GPU, dimension (ngpu). On entry, the M-by-N matrix A distributed over GPUs (d_lA[d] points to the local matrix on d-th GPU). It uses 1D block column cyclic format with the block size of nb, and each local matrix is stored by column. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of the array d_lA. LDDA >= max(1,M). |
[out] | ipiv | INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info | INTEGER
|
magma_int_t magma_cgetrf_nopiv | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloatComplex * | A, | ||
magma_int_t | lda, | ||
magma_int_t * | info | ||
) |
CGETRF_NOPIV computes an LU factorization of a general M-by-N matrix A without pivoting.
The factorization has the form A = L * U where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a CPU-only (not accelerated) version.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | A | COMPLEX array, dimension (LDA,N) On entry, the M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | info | INTEGER
|
magma_int_t magma_cgetrf_nopiv_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
CGETRF computes an LU factorization of a general M-by-N matrix A without pivoting.
The factorization has the form A = L * U where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, and info become arrays with one entry per matrix.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of each matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices.
|
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgetrf_nopiv_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magma_int_t * | info | ||
) |
CGETRF_NOPIV_GPU computes an LU factorization of a general M-by-N matrix A without any pivoting.
The factorization has the form A = L * U where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | COMPLEX array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of the array A. LDDA >= max(1,M). |
[out] | info | INTEGER
|
magma_int_t magma_cgetrf_panel_nopiv_batched | ( | magma_int_t | m, |
magma_int_t | nb, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaFloatComplex ** | dX_array, | ||
magma_int_t | dX_length, | ||
magmaFloatComplex ** | dinvA_array, | ||
magma_int_t | dinvA_length, | ||
magmaFloatComplex ** | dW0_displ, | ||
magmaFloatComplex ** | dW1_displ, | ||
magmaFloatComplex ** | dW2_displ, | ||
magmaFloatComplex ** | dW3_displ, | ||
magmaFloatComplex ** | dW4_displ, | ||
magma_int_t * | info_array, | ||
magma_int_t | gbstep, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
this is an internal routine that might have many assumption.
No documentation is available today.
magma_int_t magma_cgetrf_recpanel_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | min_recpnb, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t ** | dipiv_array, | ||
magma_int_t ** | dpivinfo_array, | ||
magmaFloatComplex ** | dX_array, | ||
magma_int_t | dX_length, | ||
magmaFloatComplex ** | dinvA_array, | ||
magma_int_t | dinvA_length, | ||
magmaFloatComplex ** | dW1_displ, | ||
magmaFloatComplex ** | dW2_displ, | ||
magmaFloatComplex ** | dW3_displ, | ||
magmaFloatComplex ** | dW4_displ, | ||
magmaFloatComplex ** | dW5_displ, | ||
magma_int_t * | info_array, | ||
magma_int_t | gbstep, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
This is an internal routine that might have many assumption.
Documentation is not fully completed
CGETRF_PANEL computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of each matrix A. N >= 0. |
[in] | min_recpnb | INTEGER. Internal use. The recursive nb |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[out] | dipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | dpivinfo_array | Array of pointers, dimension (batchCount), for internal use. |
[in,out] | dX_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array X of dimension ( lddx, n ). On entry, should be set to 0 On exit, the solution matrix X |
[in] | dX_length | INTEGER. The size of each workspace matrix dX |
[in,out] | dinvA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array dinvA, a workspace on device. If side == MagmaLeft, dinvA must be of size >= ceil(m/TRI_NB)*TRI_NB*TRI_NB, If side == MagmaRight, dinvA must be of size >= ceil(n/TRI_NB)*TRI_NB*TRI_NB, where TRI_NB = 128. |
[in] | dinvA_length | INTEGER The size of each workspace matrix dinvA |
[in] | dW1_displ | Workspace array of pointers, for internal use. |
[in] | dW2_displ | Workspace array of pointers, for internal use. |
[in] | dW3_displ | Workspace array of pointers, for internal use. |
[in] | dW4_displ | Workspace array of pointers, for internal use. |
[in] | dW5_displ | Workspace array of pointers, for internal use. |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices.
|
[in] | gbstep | INTEGER internal use. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgetri_gpu | ( | magma_int_t | n, |
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magma_int_t * | ipiv, | ||
magmaFloatComplex_ptr | dwork, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
CGETRI computes the inverse of a matrix using the LU factorization computed by CGETRF.
This method inverts U and then computes inv(A) by solving the system inv(A)*L = inv(U) for inv(A).
Note that it is generally both faster and more accurate to use CGESV, or CGETRF and CGETRS, to solve the system AX = B, rather than inverting the matrix and multiplying to form X = inv(A)*B. Only in special instances should an explicit inverse be computed with this routine.
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | dA | COMPLEX array on the GPU, dimension (LDDA,N) On entry, the factors L and U from the factorization A = P*L*U as computed by CGETRF_GPU. On exit, if INFO = 0, the inverse of the original matrix A. |
[in] | ldda | INTEGER The leading dimension of the array A. LDDA >= max(1,N). |
[in] | ipiv | INTEGER array, dimension (N) The pivot indices from CGETRF; for 1 <= i <= N, row i of the matrix was interchanged with row IPIV(i). |
[out] | dwork | (workspace) COMPLEX array on the GPU, dimension (MAX(1,LWORK)) |
[in] | lwork | INTEGER The dimension of the array DWORK. LWORK >= N*NB, where NB is the optimal blocksize returned by magma_get_cgetri_nb(n). Unlike LAPACK, this version does not currently support a workspace query, because the workspace is on the GPU. |
[out] | info | INTEGER
|
magma_int_t magma_cgetri_outofplace_batched | ( | magma_int_t | n, |
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t ** | dipiv_array, | ||
magmaFloatComplex ** | dinvA_array, | ||
magma_int_t | lddia, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
CGETRI computes the inverse of a matrix using the LU factorization computed by CGETRF.
This method inverts U and then computes inv(A) by solving the system inv(A)*L = inv(U) for inv(A).
Note that it is generally both faster and more accurate to use CGESV, or CGETRF and CGETRS, to solve the system AX = B, rather than inverting the matrix and multiplying to form X = inv(A)*B. Only in special instances should an explicit inverse be computed with this routine.
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N) On entry, the factors L and U from the factorization A = P*L*U as computed by CGETRF_GPU. On exit, if INFO = 0, the inverse of the original matrix A. |
[in] | ldda | INTEGER The leading dimension of the array A. LDDA >= max(1,N). |
[in] | dipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (N) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | dinvA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDIA,N) It contains the inverse of the matrix |
[in] | lddia | INTEGER The leading dimension of the array invA_array. LDDIA >= max(1,N). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices.
|
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgetrs_batched | ( | magma_trans_t | trans, |
magma_int_t | n, | ||
magma_int_t | nrhs, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t ** | dipiv_array, | ||
magmaFloatComplex ** | dB_array, | ||
magma_int_t | lddb, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
CGETRS solves a system of linear equations A * X = B, A**T * X = B, or A**H * X = B with a general N-by-N matrix A using the LU factorization computed by CGETRF.
This is a batched version that solves batchCount N-by-N matrices in parallel. dA, dB, and ipiv become arrays with one entry per matrix.
[in] | trans | magma_trans_t Specifies the form of the system of equations:
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in] | nrhs | INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[out] | dipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[in,out] | dB_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDB,N). On entry, each pointer is an right hand side matrix B. On exit, each pointer is the solution matrix X. |
[in] | lddb | INTEGER The leading dimension of the array B. LDB >= max(1,N). |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgetrs_gpu | ( | magma_trans_t | trans, |
magma_int_t | n, | ||
magma_int_t | nrhs, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magma_int_t * | ipiv, | ||
magmaFloatComplex_ptr | dB, | ||
magma_int_t | lddb, | ||
magma_int_t * | info | ||
) |
CGETRS solves a system of linear equations A * X = B, A**T * X = B, or A**H * X = B with a general N-by-N matrix A using the LU factorization computed by CGETRF_GPU.
[in] | trans | magma_trans_t Specifies the form of the system of equations:
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in] | nrhs | INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0. |
[in] | dA | COMPLEX array on the GPU, dimension (LDDA,N) The factors L and U from the factorization A = P*L*U as computed by CGETRF_GPU. |
[in] | ldda | INTEGER The leading dimension of the array A. LDDA >= max(1,N). |
[in] | ipiv | INTEGER array, dimension (N) The pivot indices from CGETRF; for 1 <= i <= N, row i of the matrix was interchanged with row IPIV(i). |
[in,out] | dB | COMPLEX array on the GPU, dimension (LDDB,NRHS) On entry, the right hand side matrix B. On exit, the solution matrix X. |
[in] | lddb | INTEGER The leading dimension of the array B. LDDB >= max(1,N). |
[out] | info | INTEGER
|
magma_int_t magma_cgetrs_nopiv_batched | ( | magma_trans_t | trans, |
magma_int_t | n, | ||
magma_int_t | nrhs, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaFloatComplex ** | dB_array, | ||
magma_int_t | lddb, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
CGETRS solves a system of linear equations A * X = B, A**T * X = B, or A**H * X = B with a general N-by-N matrix A using the LU factorization without pivoting computed by CGETRF_NOPIV.
This is a batched version that solves batchCount N-by-N matrices in parallel. dA, dB, become arrays with one entry per matrix.
[in] | trans | magma_trans_t Specifies the form of the system of equations:
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in] | nrhs | INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[in,out] | dB_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDB,N). On entry, each pointer is an right hand side matrix B. On exit, each pointer is the solution matrix X. |
[in] | lddb | INTEGER The leading dimension of the array B. LDB >= max(1,N). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices.
|
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgetrs_nopiv_gpu | ( | magma_trans_t | trans, |
magma_int_t | n, | ||
magma_int_t | nrhs, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloatComplex_ptr | dB, | ||
magma_int_t | lddb, | ||
magma_int_t * | info | ||
) |
CGETRS solves a system of linear equations A * X = B, A**T * X = B, or A**H * X = B with a general N-by-N matrix A using the LU factorization computed by CGETRF_NOPIV_GPU.
[in] | trans | magma_trans_t Specifies the form of the system of equations:
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in] | nrhs | INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0. |
[in] | dA | COMPLEX array on the GPU, dimension (LDDA,N) The factors L and U from the factorization A = P*L*U as computed by CGETRF_GPU. |
[in] | ldda | INTEGER The leading dimension of the array A. LDDA >= max(1,N). |
[in,out] | dB | COMPLEX array on the GPU, dimension (LDDB,NRHS) On entry, the right hand side matrix B. On exit, the solution matrix X. |
[in] | lddb | INTEGER The leading dimension of the array B. LDDB >= max(1,N). |
[out] | info | INTEGER
|
magma_int_t magma_ctrtri | ( | magma_uplo_t | uplo, |
magma_diag_t | diag, | ||
magma_int_t | n, | ||
magmaFloatComplex * | A, | ||
magma_int_t | lda, | ||
magma_int_t * | info | ||
) |
CTRTRI computes the inverse of a real upper or lower triangular matrix A.
This is the Level 3 BLAS version of the algorithm.
[in] | uplo | magma_uplo_t
|
[in] | diag | magma_diag_t
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | A | COMPLEX array, dimension (LDA,N) On entry, the triangular matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of the array A contains the upper triangular matrix, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of the array A contains the lower triangular matrix, and the strictly upper triangular part of A is not referenced. If DIAG = MagmaUnit, the diagonal elements of A are also not referenced and are assumed to be 1. On exit, the (triangular) inverse of the original matrix, in the same storage format. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,N). |
[out] | info | INTEGER
|
magma_int_t magma_ctrtri_gpu | ( | magma_uplo_t | uplo, |
magma_diag_t | diag, | ||
magma_int_t | n, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magma_int_t * | info | ||
) |
CTRTRI computes the inverse of a real upper or lower triangular matrix dA.
This is the Level 3 BLAS version of the algorithm.
[in] | uplo | magma_uplo_t
|
[in] | diag | magma_diag_t
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | dA | COMPLEX array ON THE GPU, dimension (LDDA,N) On entry, the triangular matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of the array dA contains the upper triangular matrix, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of the array dA contains the lower triangular matrix, and the strictly upper triangular part of A is not referenced. If DIAG = MagmaUnit, the diagonal elements of A are also not referenced and are assumed to be 1. On exit, the (triangular) inverse of the original matrix, in the same storage format. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,N). |
[out] | info | INTEGER
|