![]() |
MAGMA
2.0.0
Matrix Algebra for GPU and Multicore Architectures
|
Functions | |
magma_int_t | magma_zgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dwork, magmaDoubleComplex *work, magma_int_t *info) |
ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: More... | |
magma_int_t | magma_zgeqrf (magma_int_t m, magma_int_t n, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info) |
ZGEQRF computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R. More... | |
magma_int_t | magma_zgeqrf2_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex *tau, magma_int_t *info) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More... | |
magma_int_t | magma_zgeqrf3_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex_ptr dT, magma_int_t *info) |
ZGEQRF3 computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More... | |
magma_int_t | magma_zgeqrf_batched (magma_int_t m, magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t ldda, magmaDoubleComplex **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More... | |
magma_int_t | magma_zgeqrf_expert_batched (magma_int_t m, magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t ldda, magmaDoubleComplex **dR_array, magma_int_t lddr, magmaDoubleComplex **dT_array, magma_int_t lddt, magmaDoubleComplex **dtau_array, magma_int_t provide_RT, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More... | |
magma_int_t | magma_zgeqrf_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex_ptr dT, magma_int_t *info) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More... | |
magma_int_t | magma_zgeqrf_m (magma_int_t ngpu, magma_int_t m, magma_int_t n, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info) |
ZGEQRF computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R using multiple GPUs. More... | |
magma_int_t | magma_zgeqrf2_mgpu (magma_int_t ngpu, magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dlA[], magma_int_t ldda, magmaDoubleComplex *tau, magma_int_t *info) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More... | |
magma_int_t | magma_zgeqrf_ooc (magma_int_t m, magma_int_t n, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info) |
ZGEQRF_OOC computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R. More... | |
magma_int_t | magma_zungqr (magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex_ptr dT, magma_int_t nb, magma_int_t *info) |
ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. More... | |
magma_int_t | magma_zungqr2 (magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magma_int_t *info) |
ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. More... | |
magma_int_t | magma_zungqr_gpu (magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex_ptr dT, magma_int_t nb, magma_int_t *info) |
ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. More... | |
magma_int_t | magma_zungqr_m (magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *T, magma_int_t nb, magma_int_t *info) |
ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. More... | |
magma_int_t | magma_zunmqr (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *C, magma_int_t ldc, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info) |
ZUNMQR overwrites the general complex M-by-N matrix C with. More... | |
magma_int_t | magma_zunmqr2_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex_ptr dC, magma_int_t lddc, const magmaDoubleComplex *wA, magma_int_t ldwa, magma_int_t *info) |
ZUNMQR overwrites the general complex M-by-N matrix C with. More... | |
magma_int_t | magma_zunmqr_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex const *tau, magmaDoubleComplex_ptr dC, magma_int_t lddc, magmaDoubleComplex *hwork, magma_int_t lwork, magmaDoubleComplex_ptr dT, magma_int_t nb, magma_int_t *info) |
ZUNMQR_GPU overwrites the general complex M-by-N matrix C with. More... | |
magma_int_t | magma_zunmqr_m (magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *C, magma_int_t ldc, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info) |
ZUNMQR overwrites the general complex M-by-N matrix C with. More... | |
magma_int_t magma_zgegqr_gpu | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDoubleComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex_ptr | dwork, | ||
magmaDoubleComplex * | work, | ||
magma_int_t * | info | ||
) |
ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is given in work (on the CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
[in] | ikind | INTEGER Several versions are implemented indiceted by the ikind value: 1: This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: This version uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_zgeqr2x3_gpu) and magma_zungqr 3: Modified Gram-Schmidt (MGS)
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | COMPLEX_16 array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalescent memory accesses LDDA must be divisible by 16. |
dwork | (GPU workspace) COMPLEX_16 array, dimension: n^2 for ikind = 1 3 n^2 + min(m, n) + 2 for ikind = 2 0 (not used) for ikind = 3 n^2 for ikind = 4 | |
[out] | work | (CPU workspace) COMPLEX_16 array, dimension 3 n^2. On exit, work(1:n^2) holds the rectangular matrix R. Preferably, for higher performance, work should be in pinned memory. |
[out] | info | INTEGER
|
magma_int_t magma_zgeqrf | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex * | A, | ||
magma_int_t | lda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
ZGEQRF computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R.
This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.
This uses 2 queues to overlap communication and computation.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | A | COMPLEX_16 array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | tau | COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | work | (workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lwork | INTEGER The dimension of the array WORK. LWORK >= max( N*NB, 2*NB*NB ), where NB can be obtained through magma_get_zgeqrf_nb( M, N ). If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf2_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex * | tau, | ||
magma_int_t * | info | ||
) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
This version has LAPACK-complaint arguments.
Other versions (magma_zgeqrf_gpu and magma_zgeqrf3_gpu) store the intermediate T matrices.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | COMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16. |
[out] | tau | COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v^H
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf2_mgpu | ( | magma_int_t | ngpu, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDoubleComplex_ptr | dlA[], | ||
magma_int_t | ldda, | ||
magmaDoubleComplex * | tau, | ||
magma_int_t * | info | ||
) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
This is a GPU interface of the routine.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dlA | COMPLEX_16 array of pointers on the GPU, dimension (ngpu). On entry, the M-by-N matrix A distributed over GPUs (d_lA[d] points to the local matrix on d-th GPU). It uses 1D block column cyclic format with the block size of nb, and each local matrix is stored by column. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16. |
[out] | tau | COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf3_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex_ptr | dT, | ||
magma_int_t * | info | ||
) |
ZGEQRF3 computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them. The corresponding parts of the upper triangular R are stored separately in dT.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | COMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16. |
[out] | tau | COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | dT | (workspace) COMPLEX_16 array on the GPU, dimension (2*MIN(M, N) + ceil(N/32)*32 )*NB, where NB can be obtained through magma_get_zgeqrf_nb( M, N ). It starts with a MIN(M,N)*NB block that stores the triangular T matrices, followed by a MIN(M,N)*NB block that stores the diagonal blocks of the R matrix. The rest of the array is used as workspace. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v^H
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex ** | dtau_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices.
|
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf_expert_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex ** | dR_array, | ||
magma_int_t | lddr, | ||
magmaDoubleComplex ** | dT_array, | ||
magma_int_t | lddt, | ||
magmaDoubleComplex ** | dtau_array, | ||
magma_int_t | provide_RT, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16. |
[in,out] | dR_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDR, N/NB) dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0. |
[in] | lddr | INTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1 otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16. |
[in,out] | dT_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDT, N/NB) dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0. |
[in] | lddt | INTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[in] | provide_RT | INTEGER provide_RT = 0 no R and no T in output. dR and dT are used as local workspace to store the R and T of each step. provide_RT = 1 the whole R of size (min(M,N), N) and the nbxnb block of T are provided in output. provide_RT = 2 the nbxnb diag block of R and of T are provided in output. |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices.
|
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex_ptr | dT, | ||
magma_int_t * | info | ||
) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them. The corresponding parts of the upper triangular R are inverted and stored separately in dT.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | COMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16. |
[out] | tau | COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | dT | (workspace) COMPLEX_16 array on the GPU, dimension (2*MIN(M, N) + ceil(N/32)*32 )*NB, where NB can be obtained through magma_get_zgeqrf_nb( M, N ). It starts with a MIN(M,N)*NB block that stores the triangular T matrices, followed by a MIN(M,N)*NB block that stores inverses of the diagonal blocks of the R matrix. The rest of the array is used as workspace. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v^H
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf_m | ( | magma_int_t | ngpu, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDoubleComplex * | A, | ||
magma_int_t | lda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
ZGEQRF computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R using multiple GPUs.
This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | A | COMPLEX_16 array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | tau | COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | work | (workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lwork | INTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_zgeqrf_nb( M, N ). If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf_ooc | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex * | A, | ||
magma_int_t | lda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
ZGEQRF_OOC computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R.
This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine. This is an out-of-core (ooc) version that is similar to magma_zgeqrf but the difference is that this version can use a GPU even if the matrix does not fit into the GPU memory at once.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | A | COMPLEX_16 array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | tau | COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | work | (workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lwork | INTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_zgeqrf_nb( M, N ). If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zungqr | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaDoubleComplex * | A, | ||
magma_int_t | lda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex_ptr | dT, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
Q = H(1) H(2) . . . H(k)
as returned by ZGEQRF.
[in] | m | INTEGER The number of rows of the matrix Q. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix Q. M >= N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0. |
[in,out] | A | COMPLEX_16 array A, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q. |
[in] | lda | INTEGER The first dimension of the array A. LDA >= max(1,M). |
[in] | tau | COMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF_GPU. |
[in] | dT | COMPLEX_16 array on the GPU device. DT contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_zgeqrf_gpu. |
[in] | nb | INTEGER This is the block size used in ZGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT. |
[out] | info | INTEGER
|
magma_int_t magma_zungqr2 | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaDoubleComplex * | A, | ||
magma_int_t | lda, | ||
magmaDoubleComplex * | tau, | ||
magma_int_t * | info | ||
) |
ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
Q = H(1) H(2) . . . H(k)
as returned by ZGEQRF.
This version recomputes the T matrices on the CPU and sends them to the GPU.
[in] | m | INTEGER The number of rows of the matrix Q. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix Q. M >= N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0. |
[in,out] | A | COMPLEX_16 array A, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q. |
[in] | lda | INTEGER The first dimension of the array A. LDA >= max(1,M). |
[in] | tau | COMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF_GPU. |
[out] | info | INTEGER
|
magma_int_t magma_zungqr_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaDoubleComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex_ptr | dT, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
Q = H(1) H(2) . . . H(k)
as returned by ZGEQRF_GPU.
[in] | m | INTEGER The number of rows of the matrix Q. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix Q. M >= N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0. |
[in,out] | dA | COMPLEX_16 array A on the GPU, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q. |
[in] | ldda | INTEGER The first dimension of the array A. LDDA >= max(1,M). |
[in] | tau | COMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF_GPU. |
[in] | dT | (workspace) COMPLEX_16 work space array on the GPU, dimension (2*MIN(M, N) + ceil(N/32)*32 )*NB. This must be the 6th argument of magma_zgeqrf_gpu [ note that if N here is bigger than N in magma_zgeqrf_gpu, the workspace requirement DT in magma_zgeqrf_gpu must be as specified in this routine ]. |
[in] | nb | INTEGER This is the block size used in ZGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT. |
[out] | info | INTEGER
|
magma_int_t magma_zungqr_m | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaDoubleComplex * | A, | ||
magma_int_t | lda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex * | T, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
Q = H(1) H(2) . . . H(k)
as returned by ZGEQRF.
[in] | m | INTEGER The number of rows of the matrix Q. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix Q. M >= N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0. |
[in,out] | A | COMPLEX_16 array A, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q. |
[in] | lda | INTEGER The first dimension of the array A. LDA >= max(1,M). |
[in] | tau | COMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF_GPU. |
[in] | T | COMPLEX_16 array, dimension (NB, min(M,N)). T contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_zgeqrf_gpu (except stored on the CPU, not the GPU). |
[in] | nb | INTEGER This is the block size used in ZGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in T. |
[out] | info | INTEGER
|
magma_int_t magma_zunmqr | ( | magma_side_t | side, |
magma_trans_t | trans, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaDoubleComplex * | A, | ||
magma_int_t | lda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex * | C, | ||
magma_int_t | ldc, | ||
magmaDoubleComplex * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
ZUNMQR overwrites the general complex M-by-N matrix C with.
SIDE = MagmaLeft SIDE = MagmaRight TRANS = MagmaNoTrans: Q * C C * Q TRANS = Magma_ConjTrans: Q**H * C C * Q**H
where Q is a complex unitary matrix defined as the product of k elementary reflectors
Q = H(1) H(2) . . . H(k)
as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
[in] | side | magma_side_t
|
[in] | trans | magma_trans_t
|
[in] | m | INTEGER The number of rows of the matrix C. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix C. N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0. |
[in] | A | COMPLEX_16 array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument A. A is modified by the routine but restored on exit. |
[in] | lda | INTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N). |
[in] | tau | COMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF. |
[in,out] | C | COMPLEX_16 array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q. |
[in] | ldc | INTEGER The leading dimension of the array C. LDC >= max(1,M). |
[out] | work | (workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. |
[in] | lwork | INTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance if SIDE = MagmaLeft, LWORK >= N*NB; if SIDE = MagmaRight, LWORK >= M*NB, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA. |
[out] | info | INTEGER
|
magma_int_t magma_zunmqr2_gpu | ( | magma_side_t | side, |
magma_trans_t | trans, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaDoubleComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex_ptr | dC, | ||
magma_int_t | lddc, | ||
const magmaDoubleComplex * | wA, | ||
magma_int_t | ldwa, | ||
magma_int_t * | info | ||
) |
ZUNMQR overwrites the general complex M-by-N matrix C with.
SIDE = MagmaLeft SIDE = MagmaRight TRANS = MagmaNoTrans: Q * C C * Q TRANS = Magma_ConjTrans: Q**H * C C * Q**H
where Q is a complex unitary matrix defined as the product of k elementary reflectors
Q = H(1) H(2) . . . H(k)
as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
[in] | side | magma_side_t
|
[in] | trans | magma_trans_t
|
[in] | m | INTEGER The number of rows of the matrix C. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix C. N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0. |
[in,out] | dA | COMPLEX_16 array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument dA. The diagonal and the upper part are destroyed, the reflectors are not modified. |
[in] | ldda | INTEGER The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N). |
[in] | tau | COMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF. |
[in,out] | dC | COMPLEX_16 array on the GPU, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q). |
[in] | lddc | INTEGER The leading dimension of the array dC. LDDC >= max(1,M). |
[in] | wA | COMPLEX_16 array, dimension (LDWA,M) if SIDE = MagmaLeft (LDWA,N) if SIDE = MagmaRight The vectors which define the elementary reflectors, as returned by ZHETRD_GPU. (A copy of the upper or lower part of dA, on the host.) |
[in] | ldwa | INTEGER The leading dimension of the array wA. If SIDE = MagmaLeft, LDWA >= max(1,M); if SIDE = MagmaRight, LDWA >= max(1,N). |
[out] | info | INTEGER
|
magma_int_t magma_zunmqr_gpu | ( | magma_side_t | side, |
magma_trans_t | trans, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaDoubleComplex_const_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex const * | tau, | ||
magmaDoubleComplex_ptr | dC, | ||
magma_int_t | lddc, | ||
magmaDoubleComplex * | hwork, | ||
magma_int_t | lwork, | ||
magmaDoubleComplex_ptr | dT, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
ZUNMQR_GPU overwrites the general complex M-by-N matrix C with.
SIDE = MagmaLeft SIDE = MagmaRight TRANS = MagmaNoTrans: Q * C C * Q TRANS = Magma_ConjTrans: Q**H * C C * Q**H
where Q is a complex unitary matrix defined as the product of k elementary reflectors
Q = H(1) H(2) . . . H(k)
as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
[in] | side | magma_side_t
|
[in] | trans | magma_trans_t
|
[in] | m | INTEGER The number of rows of the matrix C. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix C. N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0. |
[in] | dA | COMPLEX_16 array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument dA. dA is modified by the routine but restored on exit. |
[in] | ldda | INTEGER The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N). |
[in] | tau | COMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF. |
[in,out] | dC | COMPLEX_16 array on the GPU, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q). |
[in] | lddc | INTEGER The leading dimension of the array DC. LDDC >= max(1,M). |
[out] | hwork | (workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) Currently, zgetrs_gpu assumes that on exit, hwork contains the last block of A and C. This will change and should not be relied on! |
[in] | lwork | INTEGER The dimension of the array HWORK. LWORK >= (M-K+NB)*(N+NB) + N*NB if SIDE = MagmaLeft, and LWORK >= (N-K+NB)*(M+NB) + M*NB if SIDE = MagmaRight, where NB is the given blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the HWORK array, returns this value as the first entry of the HWORK array, and no error message related to LWORK is issued by XERBLA. |
[in,out] | dT | COMPLEX_16 array on the GPU that is the output (the 9th argument) of magma_zgeqrf_gpu. Part used as workspace. |
[in] | nb | INTEGER This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_zgeqrf_gpu. |
[out] | info | INTEGER
|
magma_int_t magma_zunmqr_m | ( | magma_int_t | ngpu, |
magma_side_t | side, | ||
magma_trans_t | trans, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaDoubleComplex * | A, | ||
magma_int_t | lda, | ||
magmaDoubleComplex * | tau, | ||
magmaDoubleComplex * | C, | ||
magma_int_t | ldc, | ||
magmaDoubleComplex * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
ZUNMQR overwrites the general complex M-by-N matrix C with.
SIDE = MagmaLeft SIDE = MagmaRight TRANS = MagmaNoTrans: Q * C C * Q TRANS = Magma_ConjTrans: Q**H * C C * Q**H
where Q is a complex unitary matrix defined as the product of k elementary reflectors
Q = H(1) H(2) . . . H(k)
as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | side | magma_side_t
|
[in] | trans | magma_trans_t
|
[in] | m | INTEGER The number of rows of the matrix C. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix C. N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0. |
[in] | A | COMPLEX_16 array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument A. |
[in] | lda | INTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N). |
[in] | tau | COMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF. |
[in,out] | C | COMPLEX_16 array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q. |
[in] | ldc | INTEGER The leading dimension of the array C. LDC >= max(1,M). |
[out] | work | (workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. |
[in] | lwork | INTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA. |
[out] | info | INTEGER
|