MAGMA  1.5.0
Matrix Algebra for GPU and Multicore Architectures
 All Files Functions Groups
double-complex precision

Functions

magma_int_t magma_zgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDoubleComplex *dA, magma_int_t ldda, magmaDoubleComplex *dwork, magmaDoubleComplex *work, magma_int_t *info)
 ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: More...
 
magma_int_t magma_zgeqr2x2_gpu (magma_int_t *m, magma_int_t *n, magmaDoubleComplex *dA, magma_int_t *ldda, magmaDoubleComplex *dtau, magmaDoubleComplex *dT, magmaDoubleComplex *ddA, double *dwork, magma_int_t *info)
 ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqr2x3_gpu (magma_int_t *m, magma_int_t *n, magmaDoubleComplex *dA, magma_int_t *ldda, magmaDoubleComplex *dtau, magmaDoubleComplex *dT, magmaDoubleComplex *ddA, double *dwork, magma_int_t *info)
 ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqr2x_gpu (magma_int_t *m, magma_int_t *n, magmaDoubleComplex *dA, magma_int_t *ldda, magmaDoubleComplex *dtau, magmaDoubleComplex *dT, magmaDoubleComplex *ddA, double *dwork, magma_int_t *info)
 ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqrf4 (magma_int_t num_gpus, magma_int_t m, magma_int_t n, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info)
 ZGEQRF4 computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R using multiple GPUs. More...
 
magma_int_t magma_zgeqrf (magma_int_t m, magma_int_t n, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info)
 ZGEQRF computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqrf2_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex *dA, magma_int_t ldda, magmaDoubleComplex *tau, magma_int_t *info)
 ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqrf3_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex *dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex *dT, magma_int_t *info)
 ZGEQRF3 computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqrf_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex *dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex *dT, magma_int_t *info)
 ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqrf2_mgpu (magma_int_t num_gpus, magma_int_t m, magma_int_t n, magmaDoubleComplex **dlA, magma_int_t ldda, magmaDoubleComplex *tau, magma_int_t *info)
 ZGEQRF2_MGPU computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqrf_ooc (magma_int_t m, magma_int_t n, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info)
 ZGEQRF_OOC computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_zungqr (magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *dT, magma_int_t nb, magma_int_t *info)
 ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. More...
 
magma_int_t magma_zungqr2 (magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magma_int_t *info)
 ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. More...
 
magma_int_t magma_zungqr_gpu (magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex *dT, magma_int_t nb, magma_int_t *info)
 ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. More...
 
magma_int_t magma_zungqr_m (magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *T, magma_int_t nb, magma_int_t *info)
 ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. More...
 
magma_int_t magma_zunmqr (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *C, magma_int_t ldc, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info)
 ZUNMQR overwrites the general complex M-by-N matrix C with. More...
 
magma_int_t magma_zunmqr2_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex *dC, magma_int_t lddc, magmaDoubleComplex *wA, magma_int_t ldwa, magma_int_t *info)
 ZUNMQR overwrites the general complex M-by-N matrix C with. More...
 
magma_int_t magma_zunmqr_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex *dC, magma_int_t lddc, magmaDoubleComplex *hwork, magma_int_t lwork, magmaDoubleComplex *dT, magma_int_t nb, magma_int_t *info)
 ZUNMQR_GPU overwrites the general complex M-by-N matrix C with. More...
 
magma_int_t magma_zunmqr_m (magma_int_t nrgpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *C, magma_int_t ldc, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info)
 ZUNMQR overwrites the general complex M-by-N matrix C with. More...
 
magma_int_t magma_zgeqr2x4_gpu (magma_int_t *m, magma_int_t *n, magmaDoubleComplex *dA, magma_int_t *ldda, magmaDoubleComplex *dtau, magmaDoubleComplex *dT, magmaDoubleComplex *ddA, double *dwork, magma_int_t *info, magma_queue_t stream)
 ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R. More...
 

Detailed Description

Function Documentation

magma_int_t magma_zgegqr_gpu ( magma_int_t  ikind,
magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex *  dA,
magma_int_t  ldda,
magmaDoubleComplex *  dwork,
magmaDoubleComplex *  work,
magma_int_t *  info 
)

ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:

    A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is given in work (on the CPU memory).

This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the M-by-N matrix Q with orthogonal columns.
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
dwork(GPU workspace) COMPLEX_16 array, dimension (N,N)
[out]work(CPU workspace) COMPLEX_16 array, dimension 3n^2. On exit, work(1:n^2) holds the rectangular matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.

Further Details

magma_int_t magma_zgeqr2x2_gpu ( magma_int_t *  m,
magma_int_t *  n,
magmaDoubleComplex *  dA,
magma_int_t *  ldda,
magmaDoubleComplex *  dtau,
magmaDoubleComplex *  dT,
magmaDoubleComplex *  ddA,
double *  dwork,
magma_int_t *  info 
)

ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

This expert routine requires two more arguments than the standard zgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's zgeqr2 routine (see below).

The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal nxn blocks of A, i.e., the diagonal submatrices of R. This routine implements the left looking QR.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX_16 array, dimension (LDA,N) On entry, the m by n matrix A. On exit, the unitary matrix Q as a product of elementary reflectors (see Further Details).
the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]dtauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]dTCOMPLEX_16 array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0.
[out]ddACOMPLEX_16 array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal.
dwork(workspace) DOUBLE_PRECISION array, dimension (3 N)
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqr2x3_gpu ( magma_int_t *  m,
magma_int_t *  n,
magmaDoubleComplex *  dA,
magma_int_t *  ldda,
magmaDoubleComplex *  dtau,
magmaDoubleComplex *  dT,
magmaDoubleComplex *  ddA,
double *  dwork,
magma_int_t *  info 
)

ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

This expert routine requires two more arguments than the standard zgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's zgeqr2 routine (see below).

The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal nxn blocks of A, i.e., the diagonal submatrices of R. This routine implements the left looking QR.

This version adds internal blocking.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX_16 array, dimension (LDA,N) On entry, the m by n matrix A. On exit, the unitary matrix Q as a product of elementary reflectors (see Further Details).
the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]dtauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]dTCOMPLEX_16 array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0.
[out]ddACOMPLEX_16 array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal.
dwork(workspace) DOUBLE_PRECISION array, dimension (3 N)
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqr2x4_gpu ( magma_int_t *  m,
magma_int_t *  n,
magmaDoubleComplex *  dA,
magma_int_t *  ldda,
magmaDoubleComplex *  dtau,
magmaDoubleComplex *  dT,
magmaDoubleComplex *  ddA,
double *  dwork,
magma_int_t *  info,
magma_queue_t  stream 
)

ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

This expert routine requires two more arguments than the standard zgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's zgeqr2 routine (see below).

The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal nxn blocks of A, i.e., the diagonal submatrices of R. This routine implements the left looking QR.

This version adds internal blocking.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX_16 array, dimension (LDA,N) On entry, the m by n matrix A. On exit, the unitary matrix Q as a product of elementary reflectors (see Further Details).
the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]dtauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]dTCOMPLEX_16 array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0.
[out]ddACOMPLEX_16 array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal.
dwork(workspace) DOUBLE_PRECISION array, dimension (3 N)
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqr2x_gpu ( magma_int_t *  m,
magma_int_t *  n,
magmaDoubleComplex *  dA,
magma_int_t *  ldda,
magmaDoubleComplex *  dtau,
magmaDoubleComplex *  dT,
magmaDoubleComplex *  ddA,
double *  dwork,
magma_int_t *  info 
)

ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.

This expert routine requires two more arguments than the standard zgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's zgeqr2 routine (see below).

The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal nxn blocks of A, i.e., the diagonal submatrices of R.

This version implements the right-looking QR.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX*16 array, dimension (LDA,N) On entry, the m by n matrix A. On exit, the unitary matrix Q as a product of elementary reflectors (see Further Details).
the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]dtauCOMPLEX*16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]dTCOMPLEX*16 array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0.
[out]ddACOMPLEX*16 array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal.
dwork(workspace) COMPLEX*16 array, dimension (N)
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex *  A,
magma_int_t  lda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

ZGEQRF computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.

If the current stream is NULL, this version replaces it with user defined stream to overlap computation with communication.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]ACOMPLEX_16 array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]ldaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]tauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]work(workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]lworkINTEGER The dimension of the array WORK. LWORK >= max( N*NB, 2*NB*NB ), where NB can be obtained through magma_get_zgeqrf_nb(M).
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf2_gpu ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex *  dA,
magma_int_t  ldda,
magmaDoubleComplex *  tau,
magma_int_t *  info 
)

ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This version has LAPACK-complaint arguments. If the current stream is NULL, this version replaces it with user defined stream to overlap computation with communication.

Other versions (magma_zgeqrf_gpu and magma_zgeqrf3_gpu) store the intermediate T matrices.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]tauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

This version has LAPACK-complaint arguments. This version assumes the computation runs through the NULL stream and therefore is not overlapping some computation with communication.

Other versions (magma_zgeqrf_gpu and magma_zgeqrf3_gpu) store the intermediate T matrices.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]tauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf2_mgpu ( magma_int_t  num_gpus,
magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex **  dlA,
magma_int_t  ldda,
magmaDoubleComplex *  tau,
magma_int_t *  info 
)

ZGEQRF2_MGPU computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This is a GPU interface of the routine.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix dA. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]tauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf3_gpu ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex *  dA,
magma_int_t  ldda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  dT,
magma_int_t *  info 
)

ZGEQRF3 computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them and the corresponding parts of the upper triangular R are stored separately in dT.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]tauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]dT(workspace) COMPLEX_16 array on the GPU, dimension (2*MIN(M, N) + (N+31)/32*32 )*NB, where NB can be obtained through magma_get_zgeqrf_nb(M). It starts with MIN(M,N)*NB block that store the triangular T matrices, followed by the MIN(M,N)*NB block of the diagonal matrices for the R matrix. The rest of the array is used as workspace.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf4 ( magma_int_t  num_gpus,
magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex *  A,
magma_int_t  lda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

ZGEQRF4 computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R using multiple GPUs.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.

Parameters
[in]num_gpusINTEGER The number of GPUs to be used for the factorization.
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]ACOMPLEX_16 array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]ldaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]tauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]work(workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]lworkINTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_zgeqrf_nb(M).
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf_gpu ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex *  dA,
magma_int_t  ldda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  dT,
magma_int_t *  info 
)

ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them. The corresponding parts of the upper triangular R are inverted and stored separately in dT.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dACOMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]tauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]dT(workspace) COMPLEX_16 array on the GPU, dimension (2*MIN(M, N) + (N+31)/32*32 )*NB, where NB can be obtained through magma_get_zgeqrf_nb(M). It starts with MIN(M,N)*NB block that store the triangular T matrices, followed by the MIN(M,N)*NB block of the diagonal inverses for the R matrix. The rest of the array is used as workspace.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf_ooc ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex *  A,
magma_int_t  lda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

ZGEQRF_OOC computes a QR factorization of a COMPLEX_16 M-by-N matrix A: A = Q * R.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine. This is an out-of-core (ooc) version that is similar to magma_zgeqrf but the difference is that this version can use a GPU even if the matrix does not fit into the GPU memory at once.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]ACOMPLEX_16 array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]ldaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]tauCOMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]work(workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]lworkINTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_zgeqrf_nb(M).
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zungqr ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDoubleComplex *  A,
magma_int_t  lda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  dT,
magma_int_t  nb,
magma_int_t *  info 
)

ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

  Q  =  H(1) H(2) . . . H(k)

as returned by ZGEQRF.

Parameters
[in]mINTEGER The number of rows of the matrix Q. M >= 0.
[in]nINTEGER The number of columns of the matrix Q. M >= N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out]ACOMPLEX_16 array A, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in]ldaINTEGER The first dimension of the array A. LDA >= max(1,M).
[in]tauCOMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF_GPU.
[in]dTCOMPLEX_16 array on the GPU device. DT contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_zgeqrf_gpu.
[in]nbINTEGER This is the block size used in ZGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument has an illegal value
magma_int_t magma_zungqr2 ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDoubleComplex *  A,
magma_int_t  lda,
magmaDoubleComplex *  tau,
magma_int_t *  info 
)

ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

  Q  =  H(1) H(2) . . . H(k)

as returned by ZGEQRF.

This version recomputes the T matrices on the CPU and sends them to the GPU.

Parameters
[in]mINTEGER The number of rows of the matrix Q. M >= 0.
[in]nINTEGER The number of columns of the matrix Q. M >= N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out]ACOMPLEX_16 array A, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in]ldaINTEGER The first dimension of the array A. LDA >= max(1,M).
[in]tauCOMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF_GPU.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument has an illegal value
magma_int_t magma_zungqr_gpu ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDoubleComplex *  dA,
magma_int_t  ldda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  dT,
magma_int_t  nb,
magma_int_t *  info 
)

ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

  Q  =  H(1) H(2) . . . H(k)

as returned by ZGEQRF_GPU.

Parameters
[in]mINTEGER The number of rows of the matrix Q. M >= 0.
[in]nINTEGER The number of columns of the matrix Q. M >= N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out]dACOMPLEX_16 array A on the GPU, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in]lddaINTEGER The first dimension of the array A. LDDA >= max(1,M).
[in]tauCOMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF_GPU.
[in]dT(workspace) COMPLEX_16 work space array on the GPU, dimension (2*MIN(M, N) + (N+31)/32*32 )*NB. This must be the 6th argument of magma_zgeqrf_gpu [ note that if N here is bigger than N in magma_zgeqrf_gpu, the workspace requirement DT in magma_zgeqrf_gpu must be as specified in this routine ].
[in]nbINTEGER This is the block size used in ZGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument has an illegal value
magma_int_t magma_zungqr_m ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDoubleComplex *  A,
magma_int_t  lda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  T,
magma_int_t  nb,
magma_int_t *  info 
)

ZUNGQR generates an M-by-N COMPLEX_16 matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q  =  H(1) H(2) . . . H(k)

as returned by ZGEQRF.

Parameters
[in]mINTEGER The number of rows of the matrix Q. M >= 0.
[in]nINTEGER The number of columns of the matrix Q. M >= N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out]ACOMPLEX_16 array A, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in]ldaINTEGER The first dimension of the array A. LDA >= max(1,M).
[in]tauCOMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF_GPU.
[in]TCOMPLEX_16 array, dimension (NB, min(M,N)). T contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_zgeqrf_gpu (except stored on the CPU, not the GPU).
[in]nbINTEGER This is the block size used in ZGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in T.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument has an illegal value
magma_int_t magma_zunmqr ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDoubleComplex *  A,
magma_int_t  lda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  C,
magma_int_t  ldc,
magmaDoubleComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

ZUNMQR overwrites the general complex M-by-N matrix C with.

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = MagmaConjTrans:   Q**H * C           C * Q**H

where Q is a complex orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)

as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
[in]sidemagma_side_t
  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in]transmagma_trans_t
  • = MagmaNoTrans: No transpose, apply Q;
  • = MagmaConjTrans: Conjugate transpose, apply Q**H.
[in]mINTEGER The number of rows of the matrix C. M >= 0.
[in]nINTEGER The number of columns of the matrix C. N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in]ACOMPLEX_16 array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument A. A is modified by the routine but restored on exit.
[in]ldaINTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
[in]tauCOMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF.
[in,out]CCOMPLEX_16 array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
[in]ldcINTEGER The leading dimension of the array C. LDC >= max(1,M).
[out]work(workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK(0) returns the optimal LWORK.
[in]lworkINTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
magma_int_t magma_zunmqr2_gpu ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDoubleComplex *  dA,
magma_int_t  ldda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  dC,
magma_int_t  lddc,
magmaDoubleComplex *  wA,
magma_int_t  ldwa,
magma_int_t *  info 
)

ZUNMQR overwrites the general complex M-by-N matrix C with.

                        SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:   Q * C              C * Q
TRANS = MagmaTrans:     Q**H * C           C * Q**H

where Q is a complex orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)

as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
[in]sidemagma_side_t
  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in]transmagma_trans_t
  • = MagmaNoTrans: No transpose, apply Q;
  • = MagmaTrans: Transpose, apply Q**H.
[in]mINTEGER The number of rows of the matrix C. M >= 0.
[in]nINTEGER The number of columns of the matrix C. N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in]dACOMPLEX_16 array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument A. The diagonal and the upper part are destroyed, the reflectors are not modified.
[in]lddaINTEGER The leading dimension of the array DA. LDDA >= max(1,M) if SIDE = MagmaLeft; LDDA >= max(1,N) if SIDE = MagmaRight.
[in]tauCOMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF.
[in,out]dCCOMPLEX_16 array, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q).
[in]lddcINTEGER The leading dimension of the array C. LDDC >= max(1,M).
[in]wA(workspace) COMPLEX_16 array, dimension (LDWA,M) if SIDE = MagmaLeft (LDWA,N) if SIDE = MagmaRight The vectors which define the elementary reflectors, as returned by ZHETRD_GPU.
[in]ldwaINTEGER The leading dimension of the array wA. LDWA >= max(1,M) if SIDE = MagmaLeft; LDWA >= max(1,N) if SIDE = MagmaRight.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
magma_int_t magma_zunmqr_gpu ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDoubleComplex *  dA,
magma_int_t  ldda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  dC,
magma_int_t  lddc,
magmaDoubleComplex *  hwork,
magma_int_t  lwork,
magmaDoubleComplex *  dT,
magma_int_t  nb,
magma_int_t *  info 
)

ZUNMQR_GPU overwrites the general complex M-by-N matrix C with.

                        SIDE = MagmaLeft     SIDE = MagmaRight
TRANS = MagmaNoTrans:   Q * C                C * Q
TRANS = MagmaTrans:     Q**H * C             C * Q**H

where Q is a complex orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)

as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
[in]sidemagma_side_t
  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in]transmagma_trans_t
  • = MagmaNoTrans: No transpose, apply Q;
  • = MagmaTrans: Transpose, apply Q**H.
[in]mINTEGER The number of rows of the matrix C. M >= 0.
[in]nINTEGER The number of columns of the matrix C. N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in]dACOMPLEX_16 array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument DA. DA is modified by the routine but restored on exit.
[in]lddaINTEGER The leading dimension of the array DA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
[in]tauCOMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF.
[in,out]dCCOMPLEX_16 array on the GPU, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
[in]lddcINTEGER The leading dimension of the array DC. LDDC >= max(1,M).
[out]hwork(workspace) COMPLEX_16 array, dimension (MAX(1,LWORK))
Currently, zgetrs_gpu assumes that on exit, hwork contains the last block of A and C. This will change and should not be relied on!
[in]lworkINTEGER The dimension of the array HWORK. LWORK >= (M-K+NB)*(N+NB) + N*NB if SIDE = MagmaLeft, and LWORK >= (N-K+NB)*(M+NB) + M*NB if SIDE = MagmaRight, where NB is the given blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the HWORK array, returns this value as the first entry of the HWORK array, and no error message related to LWORK is issued by XERBLA.
[in]dTCOMPLEX_16 array on the GPU that is the output (the 9th argument) of magma_zgeqrf_gpu.
[in]nbINTEGER This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_zgeqrf_gpu.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
magma_int_t magma_zunmqr_m ( magma_int_t  nrgpu,
magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDoubleComplex *  A,
magma_int_t  lda,
magmaDoubleComplex *  tau,
magmaDoubleComplex *  C,
magma_int_t  ldc,
magmaDoubleComplex *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

ZUNMQR overwrites the general complex M-by-N matrix C with.

                SIDE = MagmaLeft     SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C          C * Q
TRANS = MagmaTrans:      Q**H * C       C * Q**H

where Q is a complex orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)

as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
[in]nrgpuINTEGER Number of GPUs to use.
[in]sidemagma_side_t
  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in]transmagma_trans_t
  • = MagmaNoTrans: No transpose, apply Q;
  • = MagmaTrans: Transpose, apply Q**H.
[in]mINTEGER The number of rows of the matrix C. M >= 0.
[in]nINTEGER The number of columns of the matrix C. N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in]ACOMPLEX_16 array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument A.
[in]ldaINTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
[in]tauCOMPLEX_16 array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF.
[in,out]CCOMPLEX_16 array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
[in]ldcINTEGER The leading dimension of the array C. LDC >= max(1,M).
[out]work(workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK(0) returns the optimal LWORK.
[in]lworkINTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value