single-complex precision
[QR factorization: computational]

Functions

magma_int_t magma_cgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dwork, magmaFloatComplex *work, magma_int_t *info)
 CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: A = Q * R.
magma_int_t magma_cgeqr2x2_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dtau, magmaFloatComplex_ptr dT, magmaFloatComplex_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info)
 CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t magma_cgeqr2x3_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dtau, magmaFloatComplex_ptr dT, magmaFloatComplex_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info)
 CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t magma_cgeqr2x_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dtau, magmaFloatComplex_ptr dT, magmaFloatComplex_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info)
 CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t magma_cgeqrf4 (magma_int_t ngpu, magma_int_t m, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *work, magma_int_t lwork, magma_int_t *info)
 CGEQRF4 computes a QR factorization of a COMPLEX M-by-N matrix A: A = Q * R using multiple GPUs.
magma_int_t magma_cgeqrf (magma_int_t m, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *work, magma_int_t lwork, magma_int_t *info)
 CGEQRF computes a QR factorization of a COMPLEX M-by-N matrix A: A = Q * R.
magma_int_t magma_cgeqrf2_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex *tau, magma_int_t *info)
 CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
magma_int_t magma_cgeqrf3_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex *tau, magmaFloatComplex_ptr dT, magma_int_t *info)
 CGEQRF3 computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
magma_int_t magma_cgeqrf_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
magma_int_t magma_cgeqrf_expert_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dR_array, magma_int_t lddr, magmaFloatComplex **dT_array, magma_int_t lddt, magmaFloatComplex **dtau_array, magma_int_t provide_RT, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
magma_int_t magma_cgeqrf_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex *tau, magmaFloatComplex_ptr dT, magma_int_t *info)
 CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
magma_int_t magma_cgeqrf2_mgpu (magma_int_t ngpu, magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dlA[], magma_int_t ldda, magmaFloatComplex *tau, magma_int_t *info)
 CGEQRF2_MGPU computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
magma_int_t magma_cgeqrf_ooc (magma_int_t m, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *work, magma_int_t lwork, magma_int_t *info)
 CGEQRF_OOC computes a QR factorization of a COMPLEX M-by-N matrix A: A = Q * R.
magma_int_t magma_cungqr (magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex_ptr dT, magma_int_t nb, magma_int_t *info)
 CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
magma_int_t magma_cungqr2 (magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magma_int_t *info)
 CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
magma_int_t magma_cungqr_gpu (magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex *tau, magmaFloatComplex_ptr dT, magma_int_t nb, magma_int_t *info)
 CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
magma_int_t magma_cungqr_m (magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *T, magma_int_t nb, magma_int_t *info)
 CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
magma_int_t magma_cunmqr (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *C, magma_int_t ldc, magmaFloatComplex *work, magma_int_t lwork, magma_int_t *info)
 CUNMQR overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H, depending on side and trans.
magma_int_t magma_cunmqr2_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex *tau, magmaFloatComplex_ptr dC, magma_int_t lddc, magmaFloatComplex *wA, magma_int_t ldwa, magma_int_t *info)
 CUNMQR overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H, depending on side and trans.
magma_int_t magma_cunmqr_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex *tau, magmaFloatComplex_ptr dC, magma_int_t lddc, magmaFloatComplex *hwork, magma_int_t lwork, magmaFloatComplex_ptr dT, magma_int_t nb, magma_int_t *info)
 CUNMQR_GPU overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H, depending on side and trans.
magma_int_t magma_cunmqr_m (magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *C, magma_int_t ldc, magmaFloatComplex *work, magma_int_t lwork, magma_int_t *info)
 CUNMQR overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H, depending on side and trans.
magma_int_t magma_cgeqr2_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t lda, magmaFloatComplex **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t magma_cgeqr2x4_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dtau, magmaFloatComplex_ptr dT, magmaFloatComplex_ptr ddA, magmaFloat_ptr dwork, magma_queue_t queue, magma_int_t *info)
 CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

Function Documentation

magma_int_t magma_cgegqr_gpu ( magma_int_t  ikind,
magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex_ptr  dwork,
magmaFloatComplex *  work,
magma_int_t *  info 
)

CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.

Parameters:
[in] ikind INTEGER Several versions are implemented, indicated by the ikind value:

  • 1: This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
  • 2: This version uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_cgeqr2x3_gpu) and magma_cungqr.
  • 3: Modified Gram-Schmidt (MGS).
  • 4: Cholesky QR. Note: this method uses the normal equations, which squares the condition number of A; therefore ||I - Q'Q|| < O(eps cond(A)^2).
[in] m INTEGER The number of rows of the matrix A. m >= n >= 0.
[in] n INTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out] dA COMPLEX array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses LDDA must be divisible by 16.
dwork (GPU workspace) COMPLEX array, dimension:

  • n^2 for ikind = 1
  • 3 n^2 + min(m, n) + 2 for ikind = 2
  • 0 (not used) for ikind = 3
  • n^2 for ikind = 4
[out] work (CPU workspace) COMPLEX array, dimension 3 n^2. On exit, work(1:n^2) holds the rectangular matrix R. Preferably, for higher performance, work should be in pinned memory.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation.
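
The workspace sizes listed for dwork and work can be computed before allocation. The following is a minimal sketch, not part of MAGMA: the helper names cgegqr_dwork_elems and cgegqr_work_elems are hypothetical, and the sizes are counted in complex elements exactly as stated in the parameter descriptions above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a MAGMA function): number of complex elements
 * required in the GPU workspace dwork for a given ikind, following the
 * dwork description above. */
static size_t cgegqr_dwork_elems(int ikind, size_t m, size_t n)
{
    assert(ikind >= 1 && ikind <= 4);
    assert(m >= n && n <= 128);          /* documented constraints */
    size_t min_mn = (m < n) ? m : n;
    switch (ikind) {
        case 1:  return n * n;
        case 2:  return 3 * n * n + min_mn + 2;
        case 3:  return 0;               /* dwork is not used */
        default: return n * n;           /* ikind == 4 */
    }
}

/* Hypothetical helper: number of complex elements in the CPU workspace
 * work, documented as 3 n^2. */
static size_t cgegqr_work_elems(size_t n)
{
    return 3 * n * n;
}
```

A caller would multiply these counts by sizeof(magmaFloatComplex) when allocating raw bytes.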
magma_int_t magma_cgeqr2_batched ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  lda,
magmaFloatComplex **  dtau_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

This expert routine requires two more arguments than the standard cgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's cgeqr2 routine (see below).

The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal nxn blocks of A, i.e., the diagonal submatrices of R.

This version implements the right-looking QR without blocking.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA COMPLEX array, dimension (LDA,N) On entry, the m by n matrix A. On exit, the unitary matrix Q as a product of elementary reflectors (see Further Details).
The elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details).
[in] lda INTEGER The leading dimension of the array A. LDA >= max(1,M).
[out] dtau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] dT COMPLEX array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0.
dwork (workspace) COMPLEX array, dimension (N) * ( sizeof(float) + sizeof(magmaFloatComplex))
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
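
To make the reflector form concrete, the sketch below applies a single H = I - tau * v * v' to a vector on the CPU using C99 complex arithmetic. It is an illustration only, not MAGMA code: the function name apply_reflector is hypothetical, and v is passed in full, with the caller assumed to have already placed the leading zeros and the unit entry.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical illustration (not a MAGMA function): apply
 * H = I - tau * v * v^H in place to a vector x of length m.
 * v is given in full; the caller is assumed to have set the leading
 * entries to 0 and the pivot entry to 1, as in the convention above. */
static void apply_reflector(int m, float complex tau,
                            const float complex *v, float complex *x)
{
    assert(m >= 0);
    float complex w = 0.0f;              /* w = v^H * x */
    for (int j = 0; j < m; ++j)
        w += conjf(v[j]) * x[j];
    for (int j = 0; j < m; ++j)          /* x = x - tau * w * v */
        x[j] -= tau * w * v[j];
}
```

With v = (1, 1) and tau = 1, for example, H swaps and negates the two entries, so x = (3, 4) becomes (-4, -3) and its 2-norm is preserved.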

magma_int_t magma_cgeqr2x2_gpu ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex_ptr  dtau,
magmaFloatComplex_ptr  dT,
magmaFloatComplex_ptr  ddA,
magmaFloat_ptr  dwork,
magma_int_t *  info 
)

CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

This expert routine requires two more arguments than the standard cgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's cgeqr2 routine (see below).

The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal nxn blocks of A, i.e., the diagonal submatrices of R. This routine implements the left looking QR.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA COMPLEX array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the unitary matrix Q as a product of elementary reflectors (see Further Details).
The elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M).
[out] dtau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] dT COMPLEX array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0.
[out] ddA COMPLEX array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal.
dwork (workspace) REAL array, dimension (3 N)
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqr2x3_gpu ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex_ptr  dtau,
magmaFloatComplex_ptr  dT,
magmaFloatComplex_ptr  ddA,
magmaFloat_ptr  dwork,
magma_int_t *  info 
)

CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

This expert routine requires two more arguments than the standard cgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's cgeqr2 routine (see below).

The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal nxn blocks of A, i.e., the diagonal submatrices of R. This routine implements the left looking QR.

This version adds internal blocking.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA COMPLEX array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the unitary matrix Q as a product of elementary reflectors (see Further Details).
The elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M).
[out] dtau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] dT COMPLEX array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0.
[out] ddA COMPLEX array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal.
dwork (workspace) REAL array, dimension (3 N)
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqr2x4_gpu ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex_ptr  dtau,
magmaFloatComplex_ptr  dT,
magmaFloatComplex_ptr  ddA,
magmaFloat_ptr  dwork,
magma_queue_t  queue,
magma_int_t *  info 
)

CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

This expert routine requires two more arguments than the standard cgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's cgeqr2 routine (see below).

The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal nxn blocks of A, i.e., the diagonal submatrices of R. This routine implements the left looking QR.

This version adds internal blocking.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA COMPLEX array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the unitary matrix Q as a product of elementary reflectors (see Further Details).
The elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M).
[out] dtau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] dT COMPLEX array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0.
[out] ddA COMPLEX array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal.
dwork (workspace) REAL array, dimension (3 N)
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v**H

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqr2x_gpu ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex_ptr  dtau,
magmaFloatComplex_ptr  dT,
magmaFloatComplex_ptr  ddA,
magmaFloat_ptr  dwork,
magma_int_t *  info 
)

CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

This expert routine requires two more arguments than the standard cgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's cgeqr2 routine (see below).

The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal nxn blocks of A, i.e., the diagonal submatrices of R.

This version implements the right-looking QR. A hard-coded requirement for N is to be <= min(M, 128). For larger N one should use a blocking QR version.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. 0 <= N <= min(M, 128).
[in,out] dA COMPLEX array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the unitary matrix Q as a product of elementary reflectors (see Further Details).
The elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M).
[out] dtau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] dT COMPLEX array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0.
[out] ddA COMPLEX array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal.
dwork (workspace) COMPLEX array, dimension (N)
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqrf ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex *  A,
magma_int_t  lda,
magmaFloatComplex *  tau,
magmaFloatComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

CGEQRF computes a QR factorization of a COMPLEX M-by-N matrix A: A = Q * R.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.

If the current stream is NULL, this version replaces it with a new stream to overlap computation with communication.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] A COMPLEX array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in] lda INTEGER The leading dimension of the array A. LDA >= max(1,M).
[out] tau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] work (workspace) COMPLEX array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in] lwork INTEGER The dimension of the array WORK. LWORK >= max( N*NB, 2*NB*NB ), where NB can be obtained through magma_get_cgeqrf_nb(M).
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation.

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
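
The minimum LWORK stated above can be computed ahead of the call. The sketch below is a hypothetical helper, not a MAGMA function; it assumes the block size nb has already been obtained (for example through magma_get_cgeqrf_nb, as the lwork description mentions) and simply evaluates max( N*NB, 2*NB*NB ).

```c
#include <assert.h>

/* Hypothetical helper (not a MAGMA function): minimum LWORK for
 * magma_cgeqrf according to the documented bound max(N*NB, 2*NB*NB).
 * nb is assumed to have been obtained from the library beforehand. */
static long cgeqrf_min_lwork(long n, long nb)
{
    assert(n >= 0 && nb > 0);
    long by_columns = n * nb;            /* N*NB    */
    long by_block   = 2 * nb * nb;       /* 2*NB*NB */
    return (by_columns > by_block) ? by_columns : by_block;
}
```

Alternatively, passing LWORK = -1 performs a workspace query, in which case the optimal size is returned in the first entry of WORK and no factorization is computed.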

magma_int_t magma_cgeqrf2_gpu ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex *  tau,
magma_int_t *  info 
)

CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This version has LAPACK-compliant arguments.

If the current stream is NULL, this version replaces it with a new stream to overlap computation with communication.

Other versions (magma_cgeqrf_gpu and magma_cgeqrf3_gpu) store the intermediate T matrices.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA COMPLEX array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses LDDA must be divisible by 16.
[out] tau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation.

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqrf2_mgpu ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dlA[],
magma_int_t  ldda,
magmaFloatComplex *  tau,
magma_int_t *  info 
)

CGEQRF2_MGPU computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This is a GPU interface of the routine.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA COMPLEX array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix dA. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses LDDA must be divisible by 16.
[out] tau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation.

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqrf3_gpu ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex *  tau,
magmaFloatComplex_ptr  dT,
magma_int_t *  info 
)

CGEQRF3 computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them and the corresponding parts of the upper triangular R are stored separately in dT.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA COMPLEX array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses LDDA must be divisible by 16.
[out] tau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] dT (workspace) COMPLEX array on the GPU, dimension (2*MIN(M, N) + ceil(N/32)*32 )*NB, where NB can be obtained through magma_get_cgeqrf_nb(M). It starts with a MIN(M,N)*NB block that stores the triangular T matrices, followed by a MIN(M,N)*NB block that stores the diagonal blocks of the R matrix. The rest of the array is used as workspace.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation.

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
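
The dT size formula and layout described above can be evaluated as follows. This is a sketch under stated assumptions, not MAGMA code: the helper names cgeqrf3_dT_elems and cgeqrf3_dR_offset are hypothetical, nb is assumed to have been obtained from the library, and the offset reflects only the documented ordering (T blocks first, then the diagonal blocks of R).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a MAGMA function): number of complex elements
 * in dT, i.e. (2*min(M,N) + ceil(N/32)*32) * NB. */
static size_t cgeqrf3_dT_elems(size_t m, size_t n, size_t nb)
{
    assert(nb > 0);
    size_t min_mn = (m < n) ? m : n;
    size_t n_pad  = ((n + 31) / 32) * 32;    /* ceil(N/32)*32 */
    return (2 * min_mn + n_pad) * nb;
}

/* Hypothetical helper: element offset in dT at which the block holding
 * the diagonal blocks of R begins, directly after the min(M,N)*NB
 * block of T matrices. */
static size_t cgeqrf3_dR_offset(size_t m, size_t n, size_t nb)
{
    assert(nb > 0);
    size_t min_mn = (m < n) ? m : n;
    return min_mn * nb;
}
```

For M = 100, N = 50, and NB = 32, this gives (100 + 64) * 32 = 5248 elements, with the R blocks starting at element 1600.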

magma_int_t magma_cgeqrf4 ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magmaFloatComplex *  A,
magma_int_t  lda,
magmaFloatComplex *  tau,
magmaFloatComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

CGEQRF4 computes a QR factorization of a COMPLEX M-by-N matrix A: A = Q * R using multiple GPUs.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.

Parameters:
[in] ngpu INTEGER Number of GPUs to use. ngpu > 0.
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] A COMPLEX array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in] lda INTEGER The leading dimension of the array A. LDA >= max(1,M).
[out] tau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] work (workspace) COMPLEX array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in] lwork INTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_cgeqrf_nb(M).
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation.

Further Details:

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
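
Because R and the reflector vectors share the array A on exit, a caller that needs R alone has to copy out the upper trapezoid. The sketch below shows one way to do this on the host for a column-major array; extract_R is a hypothetical helper, not a MAGMA function, and it assumes the factored A has already been returned to CPU memory.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical helper (not a MAGMA function): copy the min(m,n)-by-n
 * upper trapezoid R out of a column-major factored array A (leading
 * dimension lda) into R (leading dimension ldr), writing zeros below
 * the diagonal where A holds the reflector vectors. */
static void extract_R(int m, int n, const float complex *A, int lda,
                      float complex *R, int ldr)
{
    int k = (m < n) ? m : n;
    assert(lda >= (m > 1 ? m : 1));
    assert(ldr >= (k > 1 ? k : 1));
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < k; ++i)
            R[i + j * ldr] = (i <= j) ? A[i + j * lda] : 0.0f;
}
```

The entries below the diagonal of A are left untouched, so they remain available, together with tau, for routines that apply or generate Q.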

magma_int_t magma_cgeqrf_batched ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  ldda,
magmaFloatComplex **  dtau_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA_array Array of pointers, one per matrix in the batch. Each points to a COMPLEX array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA should be divisible by 16.
[out] dtau_array Array of pointers, one per matrix in the batch. Each points to a COMPLEX array TAU, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] info_array INTEGER array, one entry per matrix in the batch

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
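
The two conditions on LDDA (at least max(1,M), and a multiple of 16 for coalesced access) can be met by rounding up. The sketch below is a plain C illustration; the helper name padded_ldda is hypothetical and not a MAGMA function.

```c
#include <assert.h>

/* Smallest multiple of `align` that is >= max(1, m).
 * Hypothetical helper, not part of MAGMA. */
int padded_ldda(int m, int align)
{
    assert(m >= 0 && align > 0);
    int rows = (m > 1) ? m : 1;                   /* max(1, M) */
    return ((rows + align - 1) / align) * align;  /* round up */
}
```

For example, padded_ldda(100, 16) gives 112, so a 100-row matrix would be allocated with 112 rows per column under this scheme.
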

Further Details:
The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
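
The batched interface takes an array of pointers (dA_array) rather than a single matrix. The sketch below shows, on the host only, one way such a pointer array could be laid out over a single contiguous buffer. The name set_batch_pointers is hypothetical; in actual use the matrices are in GPU memory, and this example assumes nothing about how MAGMA itself builds the array.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Fill ptrs[b] with the address of the b-th ldda-by-n matrix stored
 * back to back in one buffer. Host-side illustration only;
 * hypothetical helper, not part of MAGMA. */
void set_batch_pointers(float complex **ptrs, float complex *base,
                        int ldda, int n, int batchCount)
{
    assert(ldda >= 1 && n >= 0 && batchCount >= 0);
    size_t stride = (size_t)ldda * (size_t)n;   /* elements per matrix */
    for (int b = 0; b < batchCount; ++b)
        ptrs[b] = base + (size_t)b * stride;
}
```

A fixed stride of ldda*n elements keeps every matrix aligned the same way, so a padded LDDA benefits all matrices in the batch.
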

magma_int_t magma_cgeqrf_expert_batched ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  ldda,
magmaFloatComplex **  dR_array,
magma_int_t  lddr,
magmaFloatComplex **  dT_array,
magma_int_t  lddt,
magmaFloatComplex **  dtau_array,
magma_int_t  provide_RT,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA_array Array of pointers, one per matrix in the batch. Each points to a COMPLEX array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA should be divisible by 16.
[in,out] dR_array COMPLEX array on the GPU, dimension (LDDR, N/NB). dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDR, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0.
[in] lddr INTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1; otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDR should be divisible by 16.
[in,out] dT_array COMPLEX array on the GPU, dimension (LDDT, N/NB). dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0.
[in] lddt INTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDT should be divisible by 16.
[out] dtau_array Array of pointers, one per matrix in the batch. Each points to a COMPLEX array TAU, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[in] provide_RT INTEGER

  • = 0: no R and no T in output. dR and dT are used as local workspace to store the R and T of each step.
  • = 1: the whole R of size (min(M,N), N) and the NB-by-NB block of T are provided in output.
  • = 2: the NB-by-NB diagonal blocks of R and of T are provided in output.
[out] info_array INTEGER array, one entry per matrix in the batch

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.

Further Details:
The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqrf_gpu ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex *  tau,
magmaFloatComplex_ptr  dT,
magma_int_t *  info 
)

CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them. The corresponding parts of the upper triangular R are inverted and stored separately in dT.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] dA COMPLEX array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA should be divisible by 16.
[out] tau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] dT (workspace) COMPLEX array on the GPU, dimension (2*MIN(M, N) + ceil(N/32)*32 )*NB, where NB can be obtained through magma_get_cgeqrf_nb(M). It starts with a MIN(M,N)*NB block that stores the triangular T matrices, followed by a MIN(M,N)*NB block that holds the inverted diagonal blocks of the R matrix. The rest of the array is used as workspace.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
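
The dT dimension formula above, (2*MIN(M,N) + ceil(N/32)*32)*NB, is easy to get wrong by hand. The following sketch evaluates it in plain C. The helper name cgeqrf_gpu_dt_elements is hypothetical, and nb is assumed to be the value the caller obtained from magma_get_cgeqrf_nb(M).

```c
#include <assert.h>
#include <stddef.h>

/* Number of COMPLEX elements in dT according to the documented formula
 * (2*min(m,n) + ceil(n/32)*32) * nb. Hypothetical helper, not part of MAGMA. */
size_t cgeqrf_gpu_dt_elements(int m, int n, int nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    size_t min_mn = (size_t)((m < n) ? m : n);
    size_t n_pad  = (((size_t)n + 31) / 32) * 32;   /* ceil(n/32)*32 */
    return (2 * min_mn + n_pad) * (size_t)nb;
}
```

For M = 100, N = 50 and NB = 32 this gives (100 + 64) * 32 = 5248 elements. The result counts elements, so a byte size still needs a multiplication by the element size.
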

Further Details:
The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqrf_ooc ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex *  A,
magma_int_t  lda,
magmaFloatComplex *  tau,
magmaFloatComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

CGEQRF_OOC computes a QR factorization of a COMPLEX M-by-N matrix A: A = Q * R.

This version does not require GPU workspace to be passed as input; GPU memory is allocated inside the routine. It is an out-of-core (OOC) variant of magma_cgeqrf: it can use a GPU even when the matrix does not fit into GPU memory all at once.

Parameters:
[in] m INTEGER The number of rows of the matrix A. M >= 0.
[in] n INTEGER The number of columns of the matrix A. N >= 0.
[in,out] A COMPLEX array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in] lda INTEGER The leading dimension of the array A. LDA >= max(1,M).
[out] tau COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out] work (workspace) COMPLEX array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in] lwork INTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_cgeqrf_nb(M).
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.

Further Details:
The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cungqr ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaFloatComplex *  A,
magma_int_t  lda,
magmaFloatComplex *  tau,
magmaFloatComplex_ptr  dT,
magma_int_t  nb,
magma_int_t *  info 
)

CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q = H(1) H(2) . . . H(k)

as returned by CGEQRF.

Parameters:
[in] m INTEGER The number of rows of the matrix Q. M >= 0.
[in] n INTEGER The number of columns of the matrix Q. M >= N >= 0.
[in] k INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out] A COMPLEX array A, dimension (LDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in] lda INTEGER The first dimension of the array A. LDA >= max(1,M).
[in] tau COMPLEX array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF_GPU.
[in] dT COMPLEX array on the GPU device. DT contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_cgeqrf_gpu.
[in] nb INTEGER This is the block size used in CGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument has an illegal value

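
The constraints on m, n, k and lda listed above, together with the INFO = -i convention, can be sketched as a standalone argument check. This is an illustration under the assumption that arguments are tested in order and the first failing position is reported; the names check_cungqr_args and require_valid_cungqr_args are hypothetical, and MAGMA's own checks may differ in detail.

```c
#include <assert.h>

/* Return 0 if the arguments satisfy the documented constraints, or -i
 * where i is the 1-based position of the first offending argument
 * (m = 1, n = 2, k = 3, A = 4, lda = 5). Hypothetical helper. */
int check_cungqr_args(int m, int n, int k, int lda)
{
    if (m < 0)                  return -1;  /* M >= 0 */
    if (n < 0 || n > m)         return -2;  /* M >= N >= 0 */
    if (k < 0 || k > n)         return -3;  /* N >= K >= 0 */
    if (lda < (m > 1 ? m : 1))  return -5;  /* LDA >= max(1,M) */
    return 0;
}

/* Debug-build guard that aborts on invalid arguments. */
void require_valid_cungqr_args(int m, int n, int k, int lda)
{
    assert(check_cungqr_args(m, n, k, lda) == 0);
}
```

Checking before the call makes it easier to tell a caller-side mistake from a failure inside the routine.
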
magma_int_t magma_cungqr2 ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaFloatComplex *  A,
magma_int_t  lda,
magmaFloatComplex *  tau,
magma_int_t *  info 
)

CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q = H(1) H(2) . . . H(k)

as returned by CGEQRF.

This version recomputes the T matrices on the CPU and sends them to the GPU.

Parameters:
[in] m INTEGER The number of rows of the matrix Q. M >= 0.
[in] n INTEGER The number of columns of the matrix Q. M >= N >= 0.
[in] k INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out] A COMPLEX array A, dimension (LDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in] lda INTEGER The first dimension of the array A. LDA >= max(1,M).
[in] tau COMPLEX array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF_GPU.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument has an illegal value

magma_int_t magma_cungqr_gpu ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex *  tau,
magmaFloatComplex_ptr  dT,
magma_int_t  nb,
magma_int_t *  info 
)

CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q = H(1) H(2) . . . H(k)

as returned by CGEQRF_GPU.

Parameters:
[in] m INTEGER The number of rows of the matrix Q. M >= 0.
[in] n INTEGER The number of columns of the matrix Q. M >= N >= 0.
[in] k INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out] dA COMPLEX array A on the GPU, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in] ldda INTEGER The first dimension of the array A. LDDA >= max(1,M).
[in] tau COMPLEX array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF_GPU.
[in] dT (workspace) COMPLEX work space array on the GPU, dimension (2*MIN(M, N) + ceil(N/32)*32 )*NB. This must be the 6th argument of magma_cgeqrf_gpu [ note that if N here is bigger than N in magma_cgeqrf_gpu, the workspace requirement DT in magma_cgeqrf_gpu must be as specified in this routine ].
[in] nb INTEGER This is the block size used in CGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument has an illegal value

magma_int_t magma_cungqr_m ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaFloatComplex *  A,
magma_int_t  lda,
magmaFloatComplex *  tau,
magmaFloatComplex *  T,
magma_int_t  nb,
magma_int_t *  info 
)

CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q = H(1) H(2) . . . H(k)

as returned by CGEQRF.

Parameters:
[in] m INTEGER The number of rows of the matrix Q. M >= 0.
[in] n INTEGER The number of columns of the matrix Q. M >= N >= 0.
[in] k INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out] A COMPLEX array A, dimension (LDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in] lda INTEGER The first dimension of the array A. LDA >= max(1,M).
[in] tau COMPLEX array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF_GPU.
[in] T COMPLEX array, dimension (NB, min(M,N)). T contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_cgeqrf_gpu (except stored on the CPU, not the GPU).
[in] nb INTEGER This is the block size used in CGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in T.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument has an illegal value

magma_int_t magma_cunmqr ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaFloatComplex *  A,
magma_int_t  lda,
magmaFloatComplex *  tau,
magmaFloatComplex *  C,
magma_int_t  ldc,
magmaFloatComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

CUNMQR overwrites the general complex M-by-N matrix C with:

                              SIDE = MagmaLeft   SIDE = MagmaRight
    TRANS = MagmaNoTrans:     Q * C              C * Q
    TRANS = Magma_ConjTrans:  Q**H * C           C * Q**H
    

where Q is a complex unitary matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)

as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
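
Since Q is a product of reflectors, the order in which they are applied differs between Q * C and Q**H * C. The host-side sketch below shows this for a single vector and SIDE = MagmaLeft. The names reflect and apply_q are hypothetical, indices are 0-based, and the reflector storage follows the CGEQRF convention (unit diagonal entry implicit, tail of v below the diagonal).

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Apply one reflector H = I - t * v * v^H to x (length m), where v has an
 * implicit 1 at index i and its tail is stored in col[i+1..m-1]. */
static void reflect(int m, int i, float complex t,
                    const float complex *col, float complex *x)
{
    float complex w = x[i];                 /* w = v^H * x */
    for (int r = i + 1; r < m; ++r)
        w += conjf(col[r]) * x[r];
    x[i] -= t * w;                          /* x := x - t * w * v */
    for (int r = i + 1; r < m; ++r)
        x[r] -= t * w * col[r];
}

/* x := Q * x     if conj_trans == 0,
 * x := Q**H * x  otherwise,
 * with Q = H(1) H(2) ... H(k) stored column by column in A (column-major).
 * Hypothetical helper, not part of MAGMA. */
void apply_q(int conj_trans, int m, int k, const float complex *A, int lda,
             const float complex *tau, float complex *x)
{
    assert(0 <= k && k <= m && lda >= (m > 1 ? m : 1));
    if (conj_trans) {
        /* Q**H = H(k)**H ... H(1)**H, so H(1)**H acts first. */
        for (int i = 0; i < k; ++i)
            reflect(m, i, conjf(tau[i]), A + (size_t)i * lda, x);
    } else {
        /* Q = H(1) ... H(k), so H(k) acts first. */
        for (int i = k - 1; i >= 0; --i)
            reflect(m, i, tau[i], A + (size_t)i * lda, x);
    }
}
```

Applying apply_q with conj_trans = 0 and then with conj_trans = 1 restores the original vector whenever the reflectors are unitary.
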

Parameters:
[in] side magma_side_t

  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in] trans magma_trans_t

  • = MagmaNoTrans: No transpose, apply Q;
  • = Magma_ConjTrans: Conjugate transpose, apply Q**H.
[in] m INTEGER The number of rows of the matrix C. M >= 0.
[in] n INTEGER The number of columns of the matrix C. N >= 0.
[in] k INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in] A COMPLEX array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument A. A is modified by the routine but restored on exit.
[in] lda INTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
[in] tau COMPLEX array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF.
[in,out] C COMPLEX array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
[in] ldc INTEGER The leading dimension of the array C. LDC >= max(1,M).
[out] work (workspace) COMPLEX array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
[in] lwork INTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance if SIDE = MagmaLeft, LWORK >= N*NB; if SIDE = MagmaRight, LWORK >= M*NB, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

magma_int_t magma_cunmqr2_gpu ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex *  tau,
magmaFloatComplex_ptr  dC,
magma_int_t  lddc,
magmaFloatComplex *  wA,
magma_int_t  ldwa,
magma_int_t *  info 
)

CUNMQR overwrites the general complex M-by-N matrix C with:

                               SIDE = MagmaLeft    SIDE = MagmaRight
    TRANS = MagmaNoTrans:      Q * C               C * Q
    TRANS = Magma_ConjTrans:   Q**H * C            C * Q**H
    

where Q is a complex unitary matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)

as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters:
[in] side magma_side_t

  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in] trans magma_trans_t

  • = MagmaNoTrans: No transpose, apply Q;
  • = Magma_ConjTrans: Conjugate transpose, apply Q**H.
[in] m INTEGER The number of rows of the matrix C. M >= 0.
[in] n INTEGER The number of columns of the matrix C. N >= 0.
[in] k INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in] dA COMPLEX array, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument A. The diagonal and the upper part are destroyed; the reflectors are not modified.
[in] ldda INTEGER The leading dimension of the array DA. LDDA >= max(1,M) if SIDE = MagmaLeft; LDDA >= max(1,N) if SIDE = MagmaRight.
[in] tau COMPLEX array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF.
[in,out] dC COMPLEX array, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q).
[in] lddc INTEGER The leading dimension of the array C. LDDC >= max(1,M).
[in] wA (workspace) COMPLEX array, dimension (LDWA,M) if SIDE = MagmaLeft (LDWA,N) if SIDE = MagmaRight The vectors which define the elementary reflectors, as returned by CHETRD_GPU.
[in] ldwa INTEGER The leading dimension of the array wA. LDWA >= max(1,M) if SIDE = MagmaLeft; LDWA >= max(1,N) if SIDE = MagmaRight.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

magma_int_t magma_cunmqr_gpu ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex *  tau,
magmaFloatComplex_ptr  dC,
magma_int_t  lddc,
magmaFloatComplex *  hwork,
magma_int_t  lwork,
magmaFloatComplex_ptr  dT,
magma_int_t  nb,
magma_int_t *  info 
)

CUNMQR_GPU overwrites the general complex M-by-N matrix C with:

                               SIDE = MagmaLeft    SIDE = MagmaRight
    TRANS = MagmaNoTrans:      Q * C               C * Q
    TRANS = Magma_ConjTrans:   Q**H * C            C * Q**H
    

where Q is a complex unitary matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)

as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters:
[in] side magma_side_t

  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in] trans magma_trans_t

  • = MagmaNoTrans: No transpose, apply Q;
  • = Magma_ConjTrans: Conjugate transpose, apply Q**H.
[in] m INTEGER The number of rows of the matrix C. M >= 0.
[in] n INTEGER The number of columns of the matrix C. N >= 0.
[in] k INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in] dA COMPLEX array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument DA. DA is modified by the routine but restored on exit.
[in] ldda INTEGER The leading dimension of the array DA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
[in] tau COMPLEX array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF.
[in,out] dC COMPLEX array on the GPU, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
[in] lddc INTEGER The leading dimension of the array DC. LDDC >= max(1,M).
[out] hwork (workspace) COMPLEX array, dimension (MAX(1,LWORK))
Currently, cgetrs_gpu assumes that on exit, hwork contains the last block of A and C. This will change and *should not be relied on*!
[in] lwork INTEGER The dimension of the array HWORK. LWORK >= (M-K+NB)*(N+NB) + N*NB if SIDE = MagmaLeft, and LWORK >= (N-K+NB)*(M+NB) + M*NB if SIDE = MagmaRight, where NB is the given blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the HWORK array, returns this value as the first entry of the HWORK array, and no error message related to LWORK is issued by XERBLA.
[in] dT COMPLEX array on the GPU that is the output (the 6th argument) of magma_cgeqrf_gpu.
[in] nb INTEGER This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_cgeqrf_gpu.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

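
The two LWORK bounds given for hwork depend on SIDE. The sketch below evaluates them in plain C. The helper name cunmqr_gpu_min_lwork and the integer flag side_left (standing in for SIDE = MagmaLeft) are hypothetical, and nb is assumed to be the blocking size that was used in magma_cgeqrf_gpu.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum LWORK from the documented formulas:
 *   left:  (M-K+NB)*(N+NB) + N*NB
 *   right: (N-K+NB)*(M+NB) + M*NB
 * side_left != 0 stands for SIDE = MagmaLeft, 0 for SIDE = MagmaRight.
 * Hypothetical helper, not part of MAGMA. */
size_t cunmqr_gpu_min_lwork(int side_left, int m, int n, int k, int nb)
{
    assert(m >= 0 && n >= 0 && k >= 0 && nb > 0);
    assert(k <= (side_left ? m : n));       /* K bounded by the order of Q */
    size_t M = (size_t)m, N = (size_t)n, K = (size_t)k, NB = (size_t)nb;
    return side_left ? (M - K + NB) * (N + NB) + N * NB
                     : (N - K + NB) * (M + NB) + M * NB;
}
```

With M = 10, N = 4, K = 3 and NB = 2 the left-side bound is 9*6 + 8 = 62 elements. The LWORK = -1 query described above remains the way to obtain the size the routine itself reports.
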
magma_int_t magma_cunmqr_m ( magma_int_t  ngpu,
magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaFloatComplex *  A,
magma_int_t  lda,
magmaFloatComplex *  tau,
magmaFloatComplex *  C,
magma_int_t  ldc,
magmaFloatComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

CUNMQR overwrites the general complex M-by-N matrix C with:

                                SIDE = MagmaLeft    SIDE = MagmaRight
    TRANS = MagmaNoTrans:       Q * C               C * Q
    TRANS = Magma_ConjTrans:    Q**H * C            C * Q**H
    

where Q is a complex unitary matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)

as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters:
[in] ngpu INTEGER Number of GPUs to use. ngpu > 0.
[in] side magma_side_t

  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in] trans magma_trans_t

  • = MagmaNoTrans: No transpose, apply Q;
  • = Magma_ConjTrans: Conjugate transpose, apply Q**H.
[in] m INTEGER The number of rows of the matrix C. M >= 0.
[in] n INTEGER The number of columns of the matrix C. N >= 0.
[in] k INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in] A COMPLEX array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument A.
[in] lda INTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
[in] tau COMPLEX array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF.
[in,out] C COMPLEX array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
[in] ldc INTEGER The leading dimension of the array C. LDC >= max(1,M).
[out] work (workspace) COMPLEX array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
[in] lwork INTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
[out] info INTEGER

  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Generated on 3 May 2015 for MAGMA by  doxygen 1.6.1