MAGMA
2.2.0
Matrix Algebra for GPU and Multicore Architectures
Functions | |
magma_int_t | magma_cgetf2_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dW0_displ, magmaFloatComplex **dW1_displ, magmaFloatComplex **dW2_displ, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue) |
CGETF2 computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. | |
magma_int_t | magma_dgetf2_batched (magma_int_t m, magma_int_t n, double **dA_array, magma_int_t ldda, double **dW0_displ, double **dW1_displ, double **dW2_displ, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue) |
DGETF2 computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. | |
magma_int_t | magma_sgetf2_batched (magma_int_t m, magma_int_t n, float **dA_array, magma_int_t ldda, float **dW0_displ, float **dW1_displ, float **dW2_displ, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue) |
SGETF2 computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. | |
magma_int_t | magma_zgetf2_batched (magma_int_t m, magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t ldda, magmaDoubleComplex **dW0_displ, magmaDoubleComplex **dW1_displ, magmaDoubleComplex **dW2_displ, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue) |
ZGETF2 computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. | |
void | magma_cgetf2trsm_batched (magma_int_t ib, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t step, magma_int_t ldda, magma_int_t batchCount, magma_queue_t queue) |
cgetf2trsm computes B = C^-1 * B on the GPU. | |
magma_int_t | magma_cgetf2_sm_batched (magma_int_t m, magma_int_t ib, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
CGETF2_SM computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. | |
void | magma_dgetf2trsm_batched (magma_int_t ib, magma_int_t n, double **dA_array, magma_int_t step, magma_int_t ldda, magma_int_t batchCount, magma_queue_t queue) |
dgetf2trsm computes B = C^-1 * B on the GPU. | |
magma_int_t | magma_dgetf2_sm_batched (magma_int_t m, magma_int_t ib, double **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
DGETF2_SM computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. | |
void | magma_sgetf2trsm_batched (magma_int_t ib, magma_int_t n, float **dA_array, magma_int_t step, magma_int_t ldda, magma_int_t batchCount, magma_queue_t queue) |
sgetf2trsm computes B = C^-1 * B on the GPU. | |
magma_int_t | magma_sgetf2_sm_batched (magma_int_t m, magma_int_t ib, float **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
SGETF2_SM computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. | |
void | magma_zgetf2trsm_batched (magma_int_t ib, magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t step, magma_int_t ldda, magma_int_t batchCount, magma_queue_t queue) |
zgetf2trsm computes B = C^-1 * B on the GPU. | |
magma_int_t | magma_zgetf2_sm_batched (magma_int_t m, magma_int_t ib, magmaDoubleComplex **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
ZGETF2_SM computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. | |
magma_int_t magma_cgetf2_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaFloatComplex ** | dW0_displ, | ||
magmaFloatComplex ** | dW1_displ, | ||
magmaFloatComplex ** | dW2_displ, | ||
magma_int_t ** | ipiv_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | gbstep, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
CGETF2 computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of each matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
dW0_displ | (workspace) Array of pointers, dimension (batchCount). | |
dW1_displ | (workspace) Array of pointers, dimension (batchCount). | |
dW2_displ | (workspace) Array of pointers, dimension (batchCount). | |
[out] | ipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)). The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | gbstep | INTEGER internal use. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
This is an internal routine that may rely on many assumptions.
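The packed output layout described for dA_array (L below the diagonal with an implied unit diagonal, U on and above it) can be illustrated with a small host-side sketch. The helper name `unpack_lu` is hypothetical and not part of MAGMA; it assumes one factored matrix has already been copied from the GPU into a column-major host array.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper, not part of MAGMA: split a factored m-by-n
 * column-major array A (leading dimension lda) into explicit factors
 * L (m-by-k, leading dimension m) and U (k-by-n, leading dimension k),
 * where k = min(m, n). The unit diagonal of L is written explicitly. */
static void unpack_lu(int m, int n, const float complex *A, int lda,
                      float complex *L, float complex *U)
{
    int k = (m < n) ? m : n;
    assert(lda >= (m > 1 ? m : 1));   /* LDDA >= max(1,M) */

    /* L: strictly lower part of A, ones on the diagonal, zeros above. */
    for (int j = 0; j < k; ++j)
        for (int i = 0; i < m; ++i)
            L[i + (size_t)j * m] = (i > j) ? A[i + (size_t)j * lda]
                                 : (i == j) ? 1.0f : 0.0f;

    /* U: upper part of A including the diagonal, zeros below. */
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < k; ++i)
            U[i + (size_t)j * k] = (i <= j) ? A[i + (size_t)j * lda] : 0.0f;
}
```

For m > n this yields a lower trapezoidal L, and for m < n an upper trapezoidal U, matching the description of the factorization.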
magma_int_t magma_dgetf2_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
double ** | dA_array, | ||
magma_int_t | ldda, | ||
double ** | dW0_displ, | ||
double ** | dW1_displ, | ||
double ** | dW2_displ, | ||
magma_int_t ** | ipiv_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | gbstep, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
DGETF2 computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of each matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
dW0_displ | (workspace) Array of pointers, dimension (batchCount). | |
dW1_displ | (workspace) Array of pointers, dimension (batchCount). | |
dW2_displ | (workspace) Array of pointers, dimension (batchCount). | |
[out] | ipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)). The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | gbstep | INTEGER internal use. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
This is an internal routine that may rely on many assumptions.
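As a rough illustration of what the batched factorization computes, the following is a minimal CPU reference in double precision. It is a sketch, not the MAGMA implementation: the name `ref_getf2_batched` is hypothetical, all arrays live in host memory, the workspace and gbstep arguments are omitted, and the info convention (index of the first zero pivot, 1-based) is assumed from LAPACK rather than taken from this page.

```c
#include <assert.h>
#include <stddef.h>

static double abs_d(double x) { return x < 0.0 ? -x : x; }

/* Hypothetical CPU reference, not part of MAGMA: unblocked LU with partial
 * pivoting applied to every matrix of a batch. Matrices are column-major
 * with leading dimension ldda; pivot indices are 1-based. */
static void ref_getf2_batched(int m, int n, double **A_array, int ldda,
                              int **ipiv_array, int *info_array, int batchCount)
{
    int kmax = (m < n) ? m : n;
    assert(ldda >= (m > 1 ? m : 1));   /* LDDA >= max(1,M) */

    for (int b = 0; b < batchCount; ++b) {
        double *A = A_array[b];
        int *ipiv = ipiv_array[b];
        info_array[b] = 0;
        for (int k = 0; k < kmax; ++k) {
            /* Pick the largest-magnitude entry in column k, rows k..m-1. */
            int p = k;
            for (int i = k + 1; i < m; ++i)
                if (abs_d(A[i + (size_t)k * ldda]) > abs_d(A[p + (size_t)k * ldda]))
                    p = i;
            ipiv[k] = p + 1;
            if (A[p + (size_t)k * ldda] == 0.0) {
                /* Assumed LAPACK-style convention: remember the first zero pivot. */
                if (info_array[b] == 0) info_array[b] = k + 1;
                continue;
            }
            /* Interchange rows k and p across all n columns. */
            if (p != k)
                for (int j = 0; j < n; ++j) {
                    double t = A[k + (size_t)j * ldda];
                    A[k + (size_t)j * ldda] = A[p + (size_t)j * ldda];
                    A[p + (size_t)j * ldda] = t;
                }
            /* Scale the column below the pivot to form the multipliers of L. */
            for (int i = k + 1; i < m; ++i)
                A[i + (size_t)k * ldda] /= A[k + (size_t)k * ldda];
            /* Rank-1 update of the trailing submatrix. */
            for (int j = k + 1; j < n; ++j)
                for (int i = k + 1; i < m; ++i)
                    A[i + (size_t)j * ldda] -=
                        A[i + (size_t)k * ldda] * A[k + (size_t)j * ldda];
        }
    }
}
```

Each batch entry is independent, which is what lets the GPU routine process all batchCount matrices in parallel.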
magma_int_t magma_sgetf2_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
float ** | dA_array, | ||
magma_int_t | ldda, | ||
float ** | dW0_displ, | ||
float ** | dW1_displ, | ||
float ** | dW2_displ, | ||
magma_int_t ** | ipiv_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | gbstep, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
SGETF2 computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of each matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
dW0_displ | (workspace) Array of pointers, dimension (batchCount). | |
dW1_displ | (workspace) Array of pointers, dimension (batchCount). | |
dW2_displ | (workspace) Array of pointers, dimension (batchCount). | |
[out] | ipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)). The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | gbstep | INTEGER internal use. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
This is an internal routine that may rely on many assumptions.
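The pivot vector records a sequence of row interchanges rather than a permutation in one-line form. The sketch below shows one way to replay those interchanges on a host vector. The helper `apply_pivots` is hypothetical, and it assumes the pivots for one matrix have been copied to the host and follow the 1-based convention stated for ipiv_array.

```c
#include <assert.h>

/* Hypothetical helper, not part of MAGMA: replay the row interchanges
 * recorded in a 1-based pivot vector on a host vector x of length m.
 * Interchanges are applied in order i = 1..k, as the factorization did. */
static void apply_pivots(int m, int k, const int *ipiv, float *x)
{
    for (int i = 0; i < k; ++i) {
        int p = ipiv[i] - 1;       /* convert to a 0-based row index */
        assert(p >= i && p < m);   /* partial pivoting selects a row at or below i */
        if (p != i) {
            float t = x[i];
            x[i] = x[p];
            x[p] = t;
        }
    }
}
```

Applying the same routine to every column of the original matrix gives the row-permuted matrix that equals L * U.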
magma_int_t magma_zgetf2_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex ** | dW0_displ, | ||
magmaDoubleComplex ** | dW1_displ, | ||
magmaDoubleComplex ** | dW2_displ, | ||
magma_int_t ** | ipiv_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | gbstep, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
ZGETF2 computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of each matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
dW0_displ | (workspace) Array of pointers, dimension (batchCount). | |
dW1_displ | (workspace) Array of pointers, dimension (batchCount). | |
dW2_displ | (workspace) Array of pointers, dimension (batchCount). | |
[out] | ipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)). The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | gbstep | INTEGER internal use. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
This is an internal routine that may rely on many assumptions.
void magma_cgetf2trsm_batched | ( | magma_int_t | ib, |
magma_int_t | n, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | step, | ||
magma_int_t | ldda, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
cgetf2trsm solves the matrix equation on the GPU
B = C^-1 * B
where C and B are parts of the matrix A in dA_array.
This version loads C and B into shared memory, solves the system there, and copies the result back to GPU device memory. This is an internal routine that may rely on many assumptions.
[in] | ib | INTEGER The number of rows/columns of each matrix C, and rows of B. ib >= 0. |
[in] | n | INTEGER The number of columns of each matrix B. n >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[in] | step | INTEGER The starting address of matrix C in A. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgetf2_sm_batched | ( | magma_int_t | m, |
magma_int_t | ib, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t ** | ipiv_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
CGETF2_SM computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
This version loads the entire matrix (m*ib) into shared memory, factorizes it with pivoting, and copies the result back to GPU device memory.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | ib | INTEGER The number of columns of each matrix A. ib >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[out] | ipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)). The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
void magma_dgetf2trsm_batched | ( | magma_int_t | ib, |
magma_int_t | n, | ||
double ** | dA_array, | ||
magma_int_t | step, | ||
magma_int_t | ldda, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
dgetf2trsm solves the matrix equation on the GPU
B = C^-1 * B
where C and B are parts of the matrix A in dA_array.
This version loads C and B into shared memory, solves the system there, and copies the result back to GPU device memory. This is an internal routine that may rely on many assumptions.
[in] | ib | INTEGER The number of rows/columns of each matrix C, and rows of B. ib >= 0. |
[in] | n | INTEGER The number of columns of each matrix B. n >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[in] | step | INTEGER The starting address of matrix C in A. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
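The update B = C^-1 * B can be sketched on the CPU as a forward substitution. This is only an illustration under stated assumptions: the name `ref_getf2trsm` is hypothetical, C is assumed to be ib-by-ib unit lower triangular (as the L factor of a panel would be), and C and B are passed as separate pointers instead of being located inside A through step, since this page does not spell out that layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU sketch, not the MAGMA kernel: overwrite the ib-by-n block
 * B with C^-1 * B by forward substitution. Assumes C is unit lower
 * triangular (diagonal implied, entries on and above it are never read) and
 * that both blocks are column-major with the same leading dimension ldda. */
static void ref_getf2trsm(int ib, int n, const double *C, double *B, int ldda)
{
    assert(ib >= 0 && n >= 0 && ldda >= (ib > 1 ? ib : 1));
    for (int j = 0; j < n; ++j)              /* each column of B is independent */
        for (int k = 0; k < ib; ++k)         /* B(k,j) is final at this point */
            for (int i = k + 1; i < ib; ++i) /* eliminate it from the rows below */
                B[i + (size_t)j * ldda] -=
                    C[i + (size_t)k * ldda] * B[k + (size_t)j * ldda];
}
```

Because the columns of B do not depend on each other, a GPU kernel can assign them to different threads once C and B are staged in shared memory.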
magma_int_t magma_dgetf2_sm_batched | ( | magma_int_t | m, |
magma_int_t | ib, | ||
double ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t ** | ipiv_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
DGETF2_SM computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
This version loads the entire matrix (m*ib) into shared memory, factorizes it with pivoting, and copies the result back to GPU device memory.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | ib | INTEGER The number of columns of each matrix A. ib >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[out] | ipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)). The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
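Since the _sm variant stages the whole m-by-ib panel in shared memory, the panel size bounds which problems it can handle. The sketch below estimates that footprint. Both helpers are hypothetical, and the shared-memory budget is a caller-supplied assumption; the actual kernel may need additional shared memory beyond the panel itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of MAGMA: bytes needed to hold one
 * m-by-ib panel of elements of size elem_size. */
static size_t panel_bytes(int m, int ib, size_t elem_size)
{
    assert(m >= 0 && ib >= 0);
    return (size_t)m * (size_t)ib * elem_size;
}

/* Returns 1 if the panel alone fits within shmem_limit bytes, else 0.
 * shmem_limit is an assumed per-block budget chosen by the caller. */
static int panel_fits(int m, int ib, size_t elem_size, size_t shmem_limit)
{
    return panel_bytes(m, ib, elem_size) <= shmem_limit;
}
```

For example, a 512-by-8 double-precision panel needs 32768 bytes, which fits an assumed 48 KiB budget, while a 1024-by-8 panel does not.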
void magma_sgetf2trsm_batched | ( | magma_int_t | ib, |
magma_int_t | n, | ||
float ** | dA_array, | ||
magma_int_t | step, | ||
magma_int_t | ldda, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
sgetf2trsm solves the matrix equation on the GPU
B = C^-1 * B
where C and B are parts of the matrix A in dA_array.
This version loads C and B into shared memory, solves the system there, and copies the result back to GPU device memory. This is an internal routine that may rely on many assumptions.
[in] | ib | INTEGER The number of rows/columns of each matrix C, and rows of B. ib >= 0. |
[in] | n | INTEGER The number of columns of each matrix B. n >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[in] | step | INTEGER The starting address of matrix C in A. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sgetf2_sm_batched | ( | magma_int_t | m, |
magma_int_t | ib, | ||
float ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t ** | ipiv_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
SGETF2_SM computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
This version loads the entire matrix (m*ib) into shared memory, factorizes it with pivoting, and copies the result back to GPU device memory.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | ib | INTEGER The number of columns of each matrix A. ib >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[out] | ipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)). The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
void magma_zgetf2trsm_batched | ( | magma_int_t | ib, |
magma_int_t | n, | ||
magmaDoubleComplex ** | dA_array, | ||
magma_int_t | step, | ||
magma_int_t | ldda, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
zgetf2trsm solves the matrix equation on the GPU
B = C^-1 * B
where C and B are parts of the matrix A in dA_array.
This version loads C and B into shared memory, solves the system there, and copies the result back to GPU device memory. This is an internal routine that may rely on many assumptions.
[in] | ib | INTEGER The number of rows/columns of each matrix C, and rows of B. ib >= 0. |
[in] | n | INTEGER The number of columns of each matrix B. n >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[in] | step | INTEGER The starting address of matrix C in A. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zgetf2_sm_batched | ( | magma_int_t | m, |
magma_int_t | ib, | ||
magmaDoubleComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magma_int_t ** | ipiv_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
ZGETF2_SM computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.
The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
This is the right-looking Level 3 BLAS version of the algorithm.
This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.
This version loads the entire matrix (m*ib) into shared memory, factorizes it with pivoting, and copies the result back to GPU device memory.
[in] | m | INTEGER The number of rows of each matrix A. M >= 0. |
[in] | ib | INTEGER The number of columns of each matrix A. ib >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored. |
[in] | ldda | INTEGER The leading dimension of each array A. LDDA >= max(1,M). |
[out] | ipiv_array | Array of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)). The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |