MAGMA  2.0.0
Matrix Algebra for GPU and Multicore Architectures

Functions

magma_int_t magma_dgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dwork, double *work, magma_int_t *info)
 DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: A = Q * R.
 
magma_int_t magma_dgeqrf (magma_int_t m, magma_int_t n, double *A, magma_int_t lda, double *tau, double *work, magma_int_t lwork, magma_int_t *info)
 DGEQRF computes a QR factorization of a DOUBLE PRECISION M-by-N matrix A: A = Q * R.
 
magma_int_t magma_dgeqrf2_gpu (magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, double *tau, magma_int_t *info)
 DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
 
magma_int_t magma_dgeqrf3_gpu (magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, double *tau, magmaDouble_ptr dT, magma_int_t *info)
 DGEQRF3 computes a QR factorization of a real M-by-N matrix A: A = Q * R.
 
magma_int_t magma_dgeqrf_batched (magma_int_t m, magma_int_t n, double **dA_array, magma_int_t ldda, double **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
 
magma_int_t magma_dgeqrf_expert_batched (magma_int_t m, magma_int_t n, double **dA_array, magma_int_t ldda, double **dR_array, magma_int_t lddr, double **dT_array, magma_int_t lddt, double **dtau_array, magma_int_t provide_RT, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
 
magma_int_t magma_dgeqrf_gpu (magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, double *tau, magmaDouble_ptr dT, magma_int_t *info)
 DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
 
magma_int_t magma_dgeqrf_m (magma_int_t ngpu, magma_int_t m, magma_int_t n, double *A, magma_int_t lda, double *tau, double *work, magma_int_t lwork, magma_int_t *info)
 DGEQRF computes a QR factorization of a DOUBLE PRECISION M-by-N matrix A: A = Q * R using multiple GPUs.
 
magma_int_t magma_dgeqrf2_mgpu (magma_int_t ngpu, magma_int_t m, magma_int_t n, magmaDouble_ptr dlA[], magma_int_t ldda, double *tau, magma_int_t *info)
 DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
 
magma_int_t magma_dgeqrf_ooc (magma_int_t m, magma_int_t n, double *A, magma_int_t lda, double *tau, double *work, magma_int_t lwork, magma_int_t *info)
 DGEQRF_OOC computes a QR factorization of a DOUBLE PRECISION M-by-N matrix A: A = Q * R.
 
magma_int_t magma_dorgqr (magma_int_t m, magma_int_t n, magma_int_t k, double *A, magma_int_t lda, double *tau, magmaDouble_ptr dT, magma_int_t nb, magma_int_t *info)
 DORGQR generates an M-by-N DOUBLE PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
 
magma_int_t magma_dorgqr2 (magma_int_t m, magma_int_t n, magma_int_t k, double *A, magma_int_t lda, double *tau, magma_int_t *info)
 DORGQR generates an M-by-N DOUBLE PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
 
magma_int_t magma_dorgqr_gpu (magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_ptr dA, magma_int_t ldda, double *tau, magmaDouble_ptr dT, magma_int_t nb, magma_int_t *info)
 DORGQR generates an M-by-N DOUBLE PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
 
magma_int_t magma_dorgqr_m (magma_int_t m, magma_int_t n, magma_int_t k, double *A, magma_int_t lda, double *tau, double *T, magma_int_t nb, magma_int_t *info)
 DORGQR generates an M-by-N DOUBLE PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
 
magma_int_t magma_dormqr (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, double *A, magma_int_t lda, double *tau, double *C, magma_int_t ldc, double *work, magma_int_t lwork, magma_int_t *info)
 DORMQR overwrites the general real M-by-N matrix C with Q*C, Q^T*C, C*Q, or C*Q^T, as selected by side and trans.
 
magma_int_t magma_dormqr2_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_ptr dA, magma_int_t ldda, double *tau, magmaDouble_ptr dC, magma_int_t lddc, const double *wA, magma_int_t ldwa, magma_int_t *info)
 DORMQR overwrites the general real M-by-N matrix C with Q*C, Q^T*C, C*Q, or C*Q^T, as selected by side and trans.
 
magma_int_t magma_dormqr_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_const_ptr dA, magma_int_t ldda, double const *tau, magmaDouble_ptr dC, magma_int_t lddc, double *hwork, magma_int_t lwork, magmaDouble_ptr dT, magma_int_t nb, magma_int_t *info)
 DORMQR_GPU overwrites the general real M-by-N matrix C with Q*C, Q^T*C, C*Q, or C*Q^T, as selected by side and trans.
 
magma_int_t magma_dormqr_m (magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, double *A, magma_int_t lda, double *tau, double *C, magma_int_t ldc, double *work, magma_int_t lwork, magma_int_t *info)
 DORMQR overwrites the general real M-by-N matrix C with Q*C, Q^T*C, C*Q, or C*Q^T, as selected by side and trans.
 

Detailed Description

Function Documentation

magma_int_t magma_dgegqr_gpu ( magma_int_t  ikind,
magma_int_t  m,
magma_int_t  n,
magmaDouble_ptr  dA,
magma_int_t  ldda,
magmaDouble_ptr  dwork,
double *  work,
magma_int_t *  info 
)

DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.

Parameters
[in]ikindINTEGER Several versions are implemented, indicated by the ikind value:
  • 1: This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
  • 2: This version uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_dgeqr2x3_gpu) and magma_dorgqr.
  • 3: Modified Gram-Schmidt (MGS).
  • 4: Cholesky QR [ Note: this method uses the normal equations, which squares the condition number of A; therefore ||I - Q'Q|| < O(eps cond(A)^2) ].
[in]mINTEGER The number of rows of the matrix A. m >= n >= 0.
[in]nINTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]dADOUBLE PRECISION array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
dwork(GPU workspace) DOUBLE PRECISION array, dimension:
  • n^2 for ikind = 1
  • 3 n^2 + min(m, n) + 2 for ikind = 2
  • 0 (not used) for ikind = 3
  • n^2 for ikind = 4
[out]work(CPU workspace) DOUBLE PRECISION array, dimension 3 n^2. On exit, work(1:n^2) holds the rectangular matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
  • > 0: for ikind = 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed.
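
The orthogonality bound quoted for ikind = 4 can be checked numerically once Q has been copied back to the CPU. The following is a minimal host-side sketch in plain C, not a MAGMA routine: the name orth_error_max is hypothetical, and Q is assumed to be stored column-major with leading dimension ldq. It returns the largest entry of |I - Q'Q|.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side helper (not part of MAGMA): returns
 * max over i,j of |(I - Q'Q)(i,j)| for an m-by-n column-major matrix Q. */
double orth_error_max(int m, int n, const double *Q, int ldq)
{
    assert(m >= 0 && n >= 0 && ldq >= (m > 1 ? m : 1));
    double err = 0.0;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            /* (Q'Q)(i,j) is the dot product of columns i and j. */
            double dot = 0.0;
            for (int r = 0; r < m; ++r)
                dot += Q[r + (size_t)i * ldq] * Q[r + (size_t)j * ldq];
            double d = (i == j ? 1.0 : 0.0) - dot;
            if (d < 0.0) d = -d;
            if (d > err) err = d;
        }
    }
    return err;
}
```

A value near machine epsilon suggests the columns are orthonormal to working precision; a much larger value for ikind = 4 would be consistent with the cond(A)^2 dependence noted above.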
magma_int_t magma_dgeqrf ( magma_int_t  m,
magma_int_t  n,
double *  A,
magma_int_t  lda,
double *  tau,
double *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

DGEQRF computes a QR factorization of a DOUBLE PRECISION M-by-N matrix A: A = Q * R.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.

This uses 2 queues to overlap communication and computation.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]ADOUBLE PRECISION array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]ldaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]tauDOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]work(workspace) DOUBLE PRECISION array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]lworkINTEGER The dimension of the array WORK. LWORK >= max( N*NB, 2*NB*NB ), where NB can be obtained through magma_get_dgeqrf_nb( M, N ).
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
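
The LWORK bound above is simple arithmetic once NB is known. The sketch below assumes nb has already been obtained from magma_get_dgeqrf_nb( M, N ); the helper name dgeqrf_min_lwork is hypothetical and not part of MAGMA.

```c
#include <assert.h>

/* Hypothetical helper (not part of MAGMA): smallest LWORK that satisfies
 * LWORK >= max(N*NB, 2*NB*NB). nb is assumed to be the value returned by
 * magma_get_dgeqrf_nb(m, n). */
long dgeqrf_min_lwork(long n, long nb)
{
    assert(n >= 0 && nb > 0);
    long a = n * nb;
    long b = 2 * nb * nb;
    return a > b ? a : b;
}
```

For example, with N = 1000 and NB = 64 the first term dominates (64000 elements), while for a narrow matrix with N = 100 and NB = 128 the 2*NB*NB term does (32768 elements). The LWORK = -1 query remains the authoritative way to obtain the optimal size.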

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
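
To make the reflector definition concrete, the following plain-C sketch applies a single H = I - tau * v * v' to a vector in place. It is illustrative only: apply_reflector is a hypothetical name, v is passed as a full length-m vector (including the leading zeros and the unit entry), and nothing here calls MAGMA.

```c
#include <assert.h>

/* Applies H = I - tau * v * v' to x in place; v and x have length m.
 * Hypothetical helper for illustration, not a MAGMA or LAPACK routine. */
void apply_reflector(int m, double tau, const double *v, double *x)
{
    assert(m >= 0);
    double vtx = 0.0;                 /* v' * x */
    for (int i = 0; i < m; ++i)
        vtx += v[i] * x[i];
    for (int i = 0; i < m; ++i)       /* x := x - tau * (v'x) * v */
        x[i] -= tau * vtx * v[i];
}
```

With v = (1, 1) and tau = 1 (which equals 2 / (v'v), so H is orthogonal), the vector (3, 4) is mapped to (-4, -3), and its Euclidean norm of 5 is preserved.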

magma_int_t magma_dgeqrf2_gpu ( magma_int_t  m,
magma_int_t  n,
magmaDouble_ptr  dA,
magma_int_t  ldda,
double *  tau,
magma_int_t *  info 
)

DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

This version has LAPACK-compliant arguments.

Other versions (magma_dgeqrf_gpu and magma_dgeqrf3_gpu) store the intermediate T matrices.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dADOUBLE PRECISION array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[out]tauDOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
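
The LDDA recommendation amounts to rounding the row count up to a multiple of 16. A minimal sketch, assuming a hypothetical helper name padded_ldda (not part of MAGMA) and taking the alignment as a parameter:

```c
#include <assert.h>

/* Hypothetical helper: smallest multiple of `align` that is >= max(1, m).
 * With align = 16 this satisfies both LDDA >= max(1,M) and the
 * divisibility recommendation. */
int padded_ldda(int m, int align)
{
    assert(m >= 0 && align > 0);
    int rows = m > 1 ? m : 1;
    return ((rows + align - 1) / align) * align;
}
```

For M = 100 this gives LDDA = 112; the device array then needs LDDA * N elements rather than M * N.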

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf2_mgpu ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magmaDouble_ptr  dlA[],
magma_int_t  ldda,
double *  tau,
magma_int_t *  info 
)

DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

This is a GPU interface of the routine.

Parameters
[in]ngpuINTEGER Number of GPUs to use. ngpu > 0.
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dlADOUBLE PRECISION array of pointers on the GPU, dimension (ngpu). On entry, the M-by-N matrix A distributed over GPUs (dlA[d] points to the local matrix on d-th GPU). It uses 1D block column cyclic format with the block size of nb, and each local matrix is stored by column. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of each local array dlA[d]. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[out]tauDOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
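
The 1D block column cyclic format used for dlA can be illustrated with a small index mapping. The sketch below assumes 0-based global column indices and that block b is assigned to GPU b mod ngpu, starting from GPU 0; the type col_location and the function locate_column are hypothetical names, not MAGMA API.

```c
#include <assert.h>

typedef struct {
    int gpu;        /* index d of the GPU that owns the column */
    int local_col;  /* 0-based column index within that GPU's local matrix */
} col_location;

/* Maps 0-based global column j to its owner in a 1D block-column cyclic
 * layout with block size nb over ngpu devices (hypothetical helper).
 * Assumption: block b is stored on GPU (b mod ngpu). */
col_location locate_column(int j, int nb, int ngpu)
{
    assert(j >= 0 && nb > 0 && ngpu > 0);
    int block = j / nb;               /* global block index of column j */
    col_location loc;
    loc.gpu = block % ngpu;
    loc.local_col = (block / ngpu) * nb + j % nb;
    return loc;
}
```

With nb = 4 and ngpu = 2, global column 9 lies in block 2, which lands on GPU 0 as its second local block, so its local column index is 5.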

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf3_gpu ( magma_int_t  m,
magma_int_t  n,
magmaDouble_ptr  dA,
magma_int_t  ldda,
double *  tau,
magmaDouble_ptr  dT,
magma_int_t *  info 
)

DGEQRF3 computes a QR factorization of a real M-by-N matrix A: A = Q * R.

This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them. The corresponding parts of the upper triangular R are stored separately in dT.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dADOUBLE PRECISION array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[out]tauDOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]dT(workspace) DOUBLE PRECISION array on the GPU, dimension (2*MIN(M, N) + ceil(N/32)*32 )*NB, where NB can be obtained through magma_get_dgeqrf_nb( M, N ). It starts with a MIN(M,N)*NB block that stores the triangular T matrices, followed by a MIN(M,N)*NB block that stores the diagonal blocks of the R matrix. The rest of the array is used as workspace.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
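
The dT dimension formula above can be evaluated as follows. This is a sketch under the assumption that nb comes from magma_get_dgeqrf_nb( M, N ); the helper name dgeqrf_dT_size is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: number of doubles required for dT,
 * (2*min(m,n) + ceil(n/32)*32) * nb. */
long dgeqrf_dT_size(long m, long n, long nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    long minmn = m < n ? m : n;
    long n_pad = ((n + 31) / 32) * 32;   /* ceil(n/32)*32 */
    return (2 * minmn + n_pad) * nb;
}
```

For M = 1000, N = 100, and NB = 32 this evaluates to (200 + 128) * 32 = 10496 elements.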

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf_batched ( magma_int_t  m,
magma_int_t  n,
double **  dA_array,
magma_int_t  ldda,
double **  dtau_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.
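
The batched interface takes an array of pointers, one per matrix. When the matrices are packed back to back in a single buffer, the pointer array is just a strided sequence of addresses. The sketch below shows only that address arithmetic, on ordinary host memory; set_batch_pointers is a hypothetical name, and in real use both the matrices and (as far as this page indicates) the pointer array would need to be set up for the GPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fills A_array[i] with the start of the i-th
 * ldda-by-n column-major matrix inside one contiguous buffer `base`. */
void set_batch_pointers(double **A_array, double *base,
                        int ldda, int n, int batchCount)
{
    assert(ldda >= 1 && n >= 0 && batchCount >= 0);
    size_t stride = (size_t)ldda * (size_t)n;   /* elements per matrix */
    for (int i = 0; i < batchCount; ++i)
        A_array[i] = base + (size_t)i * stride;
}
```

With ldda = 4, n = 2, and batchCount = 3, the three pointers are 8 elements apart.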

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf_expert_batched ( magma_int_t  m,
magma_int_t  n,
double **  dA_array,
magma_int_t  ldda,
double **  dR_array,
magma_int_t  lddr,
double **  dT_array,
magma_int_t  lddt,
double **  dtau_array,
magma_int_t  provide_RT,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[in,out]dR_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDR, N/NB) dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0.
[in]lddrINTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1; otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDR must be divisible by 16.
[in,out]dT_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDT, N/NB) dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0.
[in]lddtINTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDT must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[in]provide_RTINTEGER
  • provide_RT = 0: no R and no T in output. dR and dT are used as local workspace to store the R and T of each step.
  • provide_RT = 1: the whole R of size (min(M,N), N) and the NB-by-NB block of T are provided in output.
  • provide_RT = 2: the NB-by-NB diagonal blocks of R and of T are provided in output.
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf_gpu ( magma_int_t  m,
magma_int_t  n,
magmaDouble_ptr  dA,
magma_int_t  ldda,
double *  tau,
magmaDouble_ptr  dT,
magma_int_t *  info 
)

DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them. The corresponding parts of the upper triangular R are inverted and stored separately in dT.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dADOUBLE PRECISION array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[out]tauDOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]dT(workspace) DOUBLE PRECISION array on the GPU, dimension (2*MIN(M, N) + ceil(N/32)*32 )*NB, where NB can be obtained through magma_get_dgeqrf_nb( M, N ). It starts with a MIN(M,N)*NB block that stores the triangular T matrices, followed by a MIN(M,N)*NB block that stores inverses of the diagonal blocks of the R matrix. The rest of the array is used as workspace.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.
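
The three regions of dT described above start at fixed element offsets. The following sketch computes them from the stated block sizes; the struct dT_layout and the function dgeqrf_gpu_dT_layout are hypothetical names, and nb is assumed to come from magma_get_dgeqrf_nb( M, N ).

```c
#include <assert.h>

typedef struct {
    long t_offset;     /* start of the triangular T matrices */
    long r_offset;     /* start of the inverted diagonal blocks of R */
    long work_offset;  /* start of the remaining workspace */
} dT_layout;

/* Hypothetical helper: element offsets of the three regions inside dT,
 * following the MIN(M,N)*NB block sizes given in the parameter text. */
dT_layout dgeqrf_gpu_dT_layout(long m, long n, long nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    long minmn = m < n ? m : n;
    dT_layout layout;
    layout.t_offset = 0;
    layout.r_offset = minmn * nb;
    layout.work_offset = 2 * minmn * nb;
    return layout;
}
```

For M = 1000, N = 100, and NB = 32, the R-inverse blocks start at element 3200 and the workspace at element 6400.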

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf_m ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
double *  A,
magma_int_t  lda,
double *  tau,
double *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

DGEQRF computes a QR factorization of a DOUBLE PRECISION M-by-N matrix A: A = Q * R using multiple GPUs.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.

Parameters
[in]ngpuINTEGER Number of GPUs to use. ngpu > 0.
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]ADOUBLE PRECISION array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]ldaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]tauDOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]work(workspace) DOUBLE PRECISION array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]lworkINTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_dgeqrf_nb( M, N ).
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf_ooc ( magma_int_t  m,
magma_int_t  n,
double *  A,
magma_int_t  lda,
double *  tau,
double *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

DGEQRF_OOC computes a QR factorization of a DOUBLE PRECISION M-by-N matrix A: A = Q * R.

This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine. This is an out-of-core (ooc) version that is similar to magma_dgeqrf but the difference is that this version can use a GPU even if the matrix does not fit into the GPU memory at once.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]ADOUBLE PRECISION array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]ldaINTEGER The leading dimension of the array A. LDA >= max(1,M).
[out]tauDOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]work(workspace) DOUBLE PRECISION array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned.
[in]lworkINTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_dgeqrf_nb( M, N ).
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occurred, such as a failed memory allocation.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dorgqr ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
double *  A,
magma_int_t  lda,
double *  tau,
magmaDouble_ptr  dT,
magma_int_t  nb,
magma_int_t *  info 
)

DORGQR generates an M-by-N DOUBLE PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q = H(1) H(2) . . . H(k)

as returned by DGEQRF.

Parameters
[in]mINTEGER The number of rows of the matrix Q. M >= 0.
[in]nINTEGER The number of columns of the matrix Q. M >= N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out]ADOUBLE PRECISION array A, dimension (LDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in]ldaINTEGER The first dimension of the array A. LDA >= max(1,M).
[in]tauDOUBLE PRECISION array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF_GPU.
[in]dTDOUBLE PRECISION array on the GPU device. DT contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_dgeqrf_gpu.
[in]nbINTEGER This is the block size used in DGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
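
What "generating Q" means can be shown with an unblocked CPU loop in the spirit of LAPACK's DORG2R. This is a reference sketch only, not how magma_dorgqr works internally (which uses the blocked T matrices on the GPU). The name form_q_unblocked is hypothetical; A is column-major with the reflector vectors stored below the diagonal of its first k columns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine: overwrites the m-by-n column-major array A
 * with the first n columns of Q = H(1) H(2) ... H(k), where
 * H(i) = I - tau[i] * v * v' and v(i+1:m) is stored in A(i+1:m, i). */
void form_q_unblocked(int m, int n, int k, double *A, int lda,
                      const double *tau)
{
    assert(m >= n && n >= k && k >= 0 && lda >= (m > 1 ? m : 1));
    /* Columns k..n-1 start out as columns of the identity. */
    for (int j = k; j < n; ++j) {
        for (int r = 0; r < m; ++r)
            A[r + (size_t)j * lda] = 0.0;
        A[j + (size_t)j * lda] = 1.0;
    }
    for (int i = k - 1; i >= 0; --i) {
        double *v = A + i + (size_t)i * lda;   /* v(i:m); v[0] is the unit entry */
        v[0] = 1.0;
        /* Apply H(i) from the left to A(i:m, i+1:n). */
        for (int j = i + 1; j < n; ++j) {
            double *c = A + i + (size_t)j * lda;
            double dot = 0.0;
            for (int r = 0; r < m - i; ++r)
                dot += v[r] * c[r];
            for (int r = 0; r < m - i; ++r)
                c[r] -= tau[i] * dot * v[r];
        }
        /* Column i of Q is H(i) e_i = e_i - tau[i] * v. */
        for (int r = 1; r < m - i; ++r)
            v[r] *= -tau[i];
        v[0] = 1.0 - tau[i];
        for (int r = 0; r < i; ++r)            /* clear leftover R entries */
            A[r + (size_t)i * lda] = 0.0;
    }
}
```

For a 2-by-2 input with one reflector, v = (1, 1) and tau = 1, the result is Q = [0 -1; -1 0], regardless of what was stored on and above the diagonal.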
magma_int_t magma_dorgqr2 ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
double *  A,
magma_int_t  lda,
double *  tau,
magma_int_t *  info 
)

DORGQR generates an M-by-N DOUBLE PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q = H(1) H(2) . . . H(k)

as returned by DGEQRF.

This version recomputes the T matrices on the CPU and sends them to the GPU.

Parameters
[in]mINTEGER The number of rows of the matrix Q. M >= 0.
[in]nINTEGER The number of columns of the matrix Q. M >= N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out]ADOUBLE PRECISION array A, dimension (LDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in]ldaINTEGER The first dimension of the array A. LDA >= max(1,M).
[in]tauDOUBLE PRECISION array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF_GPU.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
magma_int_t magma_dorgqr_gpu ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDouble_ptr  dA,
magma_int_t  ldda,
double *  tau,
magmaDouble_ptr  dT,
magma_int_t  nb,
magma_int_t *  info 
)

DORGQR generates an M-by-N DOUBLE PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q = H(1) H(2) . . . H(k)

as returned by DGEQRF_GPU.

Parameters
[in]mINTEGER The number of rows of the matrix Q. M >= 0.
[in]nINTEGER The number of columns of the matrix Q. M >= N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out]dADOUBLE PRECISION array A on the GPU, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in]lddaINTEGER The first dimension of the array dA. LDDA >= max(1,M).
[in]tauDOUBLE PRECISION array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF_GPU.
[in]dT(workspace) DOUBLE PRECISION work space array on the GPU, dimension (2*MIN(M, N) + ceil(N/32)*32 )*NB. This must be the 6th argument of magma_dgeqrf_gpu [ note that if N here is bigger than N in magma_dgeqrf_gpu, the workspace requirement DT in magma_dgeqrf_gpu must be as specified in this routine ].
[in]nbINTEGER This is the block size used in DGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
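The dT dimension above is written with a real-valued ceil. In integer arithmetic, the round-up to a multiple of 32 is usually written as in the following sketch. The helper name dorgqr_gpu_dt_elems is hypothetical and is not part of MAGMA; it only evaluates the documented formula.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a MAGMA function): number of doubles needed
 * for dT according to the documented formula
 *   (2*min(m,n) + ceil(n/32)*32) * nb. */
size_t dorgqr_gpu_dt_elems(int m, int n, int nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    size_t minmn = (size_t)(m < n ? m : n);
    /* Integer form of ceil(n/32)*32: round n up to a multiple of 32. */
    size_t n_pad = (((size_t)n + 31) / 32) * 32;
    return (2 * minmn + n_pad) * (size_t)nb;
}
```

For example, m = 100, n = 50, nb = 32 gives (2*50 + 64)*32 = 5248 elements.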
magma_int_t magma_dorgqr_m ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
double *  A,
magma_int_t  lda,
double *  tau,
double *  T,
magma_int_t  nb,
magma_int_t *  info 
)

DORGQR generates an M-by-N DOUBLE PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q = H(1) H(2) . . . H(k)

as returned by DGEQRF.

Parameters
[in]mINTEGER The number of rows of the matrix Q. M >= 0.
[in]nINTEGER The number of columns of the matrix Q. M >= N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out]ADOUBLE PRECISION array A, dimension (LDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in]ldaINTEGER The first dimension of the array A. LDA >= max(1,M).
[in]tauDOUBLE PRECISION array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF_GPU.
[in]TDOUBLE PRECISION array, dimension (NB, min(M,N)). T contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_dgeqrf_gpu (except stored on the CPU, not the GPU).
[in]nbINTEGER This is the block size used in DGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in T.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
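To make the definition Q = H(1) H(2) . . . H(k) concrete, the following is a minimal unblocked CPU reference, not the MAGMA algorithm. It assumes the LAPACK DGEQRF storage convention (H(i) = I - tau(i) * v * v**T, where v has zeros above row i, an implicit 1 in row i, and its remaining entries below the diagonal in column i of A). The function name form_q_reference is hypothetical, and Q is written to a separate array rather than in place.

```c
#include <assert.h>

/* Reference construction of the first n columns of Q = H(1)...H(k).
 * Arrays are column-major with 0-based indices.
 * Assumption: LAPACK DGEQRF reflector storage (see lead-in). */
void form_q_reference(int m, int n, int k,
                      const double *A, int lda, const double *tau,
                      double *Q, int ldq)
{
    assert(m >= n && n >= k && k >= 0);
    assert(lda >= (m > 1 ? m : 1) && ldq >= (m > 1 ? m : 1));

    /* Start from the first n columns of the m-by-m identity. */
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            Q[i + j * ldq] = (i == j) ? 1.0 : 0.0;

    /* Apply H(k), ..., H(1) from the left, so that Q = H(1)...H(k) * I. */
    for (int r = k - 1; r >= 0; --r) {
        const double *v = A + r * lda;       /* column r holds v below row r */
        for (int j = 0; j < n; ++j) {
            double w = Q[r + j * ldq];       /* implicit v[r] = 1 */
            for (int i = r + 1; i < m; ++i)
                w += v[i] * Q[i + j * ldq];  /* w = v**T * Q(:,j) */
            w *= tau[r];
            Q[r + j * ldq] -= w;
            for (int i = r + 1; i < m; ++i)
                Q[i + j * ldq] -= w * v[i];  /* Q(:,j) -= tau * v * w */
        }
    }
}
```

The blocked routines documented here compute the same matrix but group reflectors into panels of width nb using the T matrices.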
magma_int_t magma_dormqr ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
double *  A,
magma_int_t  lda,
double *  tau,
double *  C,
magma_int_t  ldc,
double *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

DORMQR overwrites the general real M-by-N matrix C with:

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = MagmaTrans:  Q**H * C           C * Q**H

where Q is a real orthogonal matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)

as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
[in]sidemagma_side_t
  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in]transmagma_trans_t
  • = MagmaNoTrans: No transpose, apply Q;
  • = MagmaTrans: Conjugate transpose, apply Q**H.
[in]mINTEGER The number of rows of the matrix C. M >= 0.
[in]nINTEGER The number of columns of the matrix C. N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in]ADOUBLE PRECISION array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument A. A is modified by the routine but restored on exit.
[in]ldaINTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
[in]tauDOUBLE PRECISION array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF.
[in,out]CDOUBLE PRECISION array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
[in]ldcINTEGER The leading dimension of the array C. LDC >= max(1,M).
[out]work(workspace) DOUBLE PRECISION array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
[in]lworkINTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance if SIDE = MagmaLeft, LWORK >= N*NB; if SIDE = MagmaRight, LWORK >= M*NB, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
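The four SIDE/TRANS cases differ only in the side from which the reflectors are applied and in their order. The sketch below is an unblocked CPU reference, not the MAGMA implementation. It assumes the LAPACK DGEQRF reflector storage (implicit unit diagonal, vector entries below the diagonal of A). The enums ref_side and ref_trans and the function names are hypothetical stand-ins, not the magma_side_t and magma_trans_t types.

```c
#include <assert.h>

/* Local stand-ins for the SIDE/TRANS selectors (not MAGMA types). */
enum ref_side  { REF_LEFT, REF_RIGHT };
enum ref_trans { REF_NOTRANS, REF_TRANS };

/* Apply one reflector H = I - tau*v*v**T to the m-by-n matrix C, from the
 * left (H*C) or from the right (C*H). v has an implicit 1 at index r and
 * explicit entries at indices r+1 and beyond. */
static void apply_reflector(enum ref_side side, int m, int n, int r,
                            const double *v, double tau, double *C, int ldc)
{
    if (side == REF_LEFT) {
        for (int j = 0; j < n; ++j) {
            double w = C[r + j * ldc];
            for (int i = r + 1; i < m; ++i) w += v[i] * C[i + j * ldc];
            w *= tau;
            C[r + j * ldc] -= w;
            for (int i = r + 1; i < m; ++i) C[i + j * ldc] -= w * v[i];
        }
    } else {
        for (int i = 0; i < m; ++i) {
            double w = C[i + r * ldc];
            for (int j = r + 1; j < n; ++j) w += C[i + j * ldc] * v[j];
            w *= tau;
            C[i + r * ldc] -= w;
            for (int j = r + 1; j < n; ++j) C[i + j * ldc] -= w * v[j];
        }
    }
}

/* Overwrite C with Q*C, Q**T*C, C*Q, or C*Q**T, where Q = H(1)...H(k). */
void ormqr_reference(enum ref_side side, enum ref_trans trans,
                     int m, int n, int k,
                     const double *A, int lda, const double *tau,
                     double *C, int ldc)
{
    int nq = (side == REF_LEFT) ? m : n;   /* order of Q */
    assert(m >= 0 && n >= 0 && k >= 0 && k <= nq);

    /* Q*C and C*Q**T apply H(k) first; Q**T*C and C*Q apply H(1) first. */
    int forward = (side == REF_LEFT) == (trans == REF_TRANS);
    for (int s = 0; s < k; ++s) {
        int r = forward ? s : k - 1 - s;
        apply_reflector(side, m, n, r, A + r * lda, tau[r], C, ldc);
    }
}
```

Because each H(i) is symmetric and orthogonal, applying the routine with REF_NOTRANS and then with REF_TRANS on the same side returns C to its original value up to rounding.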
magma_int_t magma_dormqr2_gpu ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDouble_ptr  dA,
magma_int_t  ldda,
double *  tau,
magmaDouble_ptr  dC,
magma_int_t  lddc,
const double *  wA,
magma_int_t  ldwa,
magma_int_t *  info 
)

DORMQR overwrites the general real M-by-N matrix C with:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = MagmaTrans:   Q**H * C            C * Q**H

where Q is a real orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)

as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
[in]sidemagma_side_t
  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in]transmagma_trans_t
  • = MagmaNoTrans: No transpose, apply Q;
  • = MagmaTrans: Conjugate transpose, apply Q**H.
[in]mINTEGER The number of rows of the matrix C. M >= 0.
[in]nINTEGER The number of columns of the matrix C. N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in,out]dADOUBLE PRECISION array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument dA. The diagonal and the upper part are destroyed; the reflectors are not modified.
[in]lddaINTEGER The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
[in]tauDOUBLE PRECISION array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF.
[in,out]dCDOUBLE PRECISION array on the GPU, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q).
[in]lddcINTEGER The leading dimension of the array dC. LDDC >= max(1,M).
[in]wADOUBLE PRECISION array, dimension (LDWA,M) if SIDE = MagmaLeft, or (LDWA,N) if SIDE = MagmaRight. The vectors which define the elementary reflectors, as returned by DSYTRD_GPU. (A copy of the upper or lower part of dA, on the host.)
[in]ldwaINTEGER The leading dimension of the array wA. If SIDE = MagmaLeft, LDWA >= max(1,M); if SIDE = MagmaRight, LDWA >= max(1,N).
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
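The INFO = -i convention can be illustrated with a pre-flight check of the dimension constraints listed above. This is a sketch only: dormqr2_gpu_check_dims and the ref_side enum are hypothetical, and the order in which MAGMA itself tests its arguments is an assumption. The returned index i is the 1-based position of the argument in the magma_dormqr2_gpu signature.

```c
#include <assert.h>

/* Local stand-in for the SIDE selector (not magma_side_t). */
enum ref_side { REF_LEFT, REF_RIGHT };

static int imax(int a, int b) { return a > b ? a : b; }

/* Hypothetical check of the documented size constraints.
 * Returns 0 if consistent, otherwise -i for the first offending argument,
 * counting positions as in the magma_dormqr2_gpu argument list:
 * m = 3, n = 4, k = 5, ldda = 7, lddc = 10, ldwa = 12. */
int dormqr2_gpu_check_dims(enum ref_side side, int m, int n, int k,
                           int ldda, int lddc, int ldwa)
{
    int nq = (side == REF_LEFT) ? m : n;   /* order of Q */

    if (m < 0)               return -3;
    if (n < 0)               return -4;
    if (k < 0 || k > nq)     return -5;
    if (ldda < imax(1, nq))  return -7;
    if (lddc < imax(1, m))   return -10;
    if (ldwa < imax(1, nq))  return -12;
    return 0;
}
```

A caller can run such a check before allocating GPU memory, so that size mistakes surface with the same numbering the routine would report.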
magma_int_t magma_dormqr_gpu ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDouble_const_ptr  dA,
magma_int_t  ldda,
double const *  tau,
magmaDouble_ptr  dC,
magma_int_t  lddc,
double *  hwork,
magma_int_t  lwork,
magmaDouble_ptr  dT,
magma_int_t  nb,
magma_int_t *  info 
)

DORMQR_GPU overwrites the general real M-by-N matrix C with:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = MagmaTrans:   Q**H * C            C * Q**H

where Q is a real orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)

as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
[in]sidemagma_side_t
  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in]transmagma_trans_t
  • = MagmaNoTrans: No transpose, apply Q;
  • = MagmaTrans: Conjugate transpose, apply Q**H.
[in]mINTEGER The number of rows of the matrix C. M >= 0.
[in]nINTEGER The number of columns of the matrix C. N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in]dADOUBLE PRECISION array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument dA. dA is modified by the routine but restored on exit.
[in]lddaINTEGER The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
[in]tauDOUBLE PRECISION array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF.
[in,out]dCDOUBLE PRECISION array on the GPU, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q).
[in]lddcINTEGER The leading dimension of the array dC. LDDC >= max(1,M).
[out]hwork(workspace) DOUBLE PRECISION array, dimension (MAX(1,LWORK))
Currently, dgeqrs_gpu assumes that on exit, hwork contains the last block of A and C. This will change and should not be relied on!
[in]lworkINTEGER The dimension of the array HWORK. LWORK >= (M-K+NB)*(N+NB) + N*NB if SIDE = MagmaLeft, and LWORK >= (N-K+NB)*(M+NB) + M*NB if SIDE = MagmaRight, where NB is the given blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the HWORK array, returns this value as the first entry of the HWORK array, and no error message related to LWORK is issued by XERBLA.
[in,out]dTDOUBLE PRECISION array on the GPU that is the output (the 6th argument) of magma_dgeqrf_gpu. Part used as workspace.
[in]nbINTEGER This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_dgeqrf_gpu.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
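The HWORK size bound above depends on SIDE. A minimal sketch of that formula in plain C follows; dormqr_gpu_min_lwork is a hypothetical helper, not a MAGMA function, and nb is assumed to be the same block size that was passed to magma_dgeqrf_gpu.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest LWORK allowed by the documented bound.
 *   SIDE = MagmaLeft:  (M-K+NB)*(N+NB) + N*NB
 *   SIDE = MagmaRight: (N-K+NB)*(M+NB) + M*NB
 * side_is_left is nonzero for SIDE = MagmaLeft. */
size_t dormqr_gpu_min_lwork(int side_is_left, int m, int n, int k, int nb)
{
    int nq = side_is_left ? m : n;   /* order of Q, upper bound for k */
    assert(m >= 0 && n >= 0 && nb > 0 && k >= 0 && k <= nq);

    size_t M = (size_t)m, N = (size_t)n, K = (size_t)k, NB = (size_t)nb;
    return side_is_left ? (M - K + NB) * (N + NB) + N * NB
                        : (N - K + NB) * (M + NB) + M * NB;
}
```

For m = 100, n = 10, k = 10, nb = 32 this gives 5444 elements for SIDE = MagmaLeft and 7424 for SIDE = MagmaRight.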
magma_int_t magma_dormqr_m ( magma_int_t  ngpu,
magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
double *  A,
magma_int_t  lda,
double *  tau,
double *  C,
magma_int_t  ldc,
double *  work,
magma_int_t  lwork,
magma_int_t *  info 
)

DORMQR overwrites the general real M-by-N matrix C with:

                            SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:       Q * C               C * Q
TRANS = MagmaTrans:    Q**H * C            C * Q**H

where Q is a real orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)

as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
[in]ngpuINTEGER Number of GPUs to use. ngpu > 0.
[in]sidemagma_side_t
  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in]transmagma_trans_t
  • = MagmaNoTrans: No transpose, apply Q;
  • = MagmaTrans: Conjugate transpose, apply Q**H.
[in]mINTEGER The number of rows of the matrix C. M >= 0.
[in]nINTEGER The number of columns of the matrix C. N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in]ADOUBLE PRECISION array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument A.
[in]ldaINTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
[in]tauDOUBLE PRECISION array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF.
[in,out]CDOUBLE PRECISION array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
[in]ldcINTEGER The leading dimension of the array C. LDC >= max(1,M).
[out]work(workspace) DOUBLE PRECISION array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
[in]lworkINTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
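The minimum and recommended LWORK values described above can be computed ahead of a call as sketched below. The struct and function names are hypothetical. NB is described here only as "the optimal blocksize", so this sketch takes it as a caller-supplied assumption; a workspace query with LWORK = -1 remains the way to obtain the value the routine itself prefers.

```c
#include <assert.h>

/* Hypothetical container for the two documented bounds. */
struct lwork_bounds {
    int min;   /* smallest legal LWORK */
    int opt;   /* LWORK suggested for best performance */
};

/* Documented rule: SIDE = MagmaLeft  -> min max(1,N), optimal N*NB;
 *                  SIDE = MagmaRight -> min max(1,M), optimal M*NB.
 * nb is an assumed block size, not a value queried from MAGMA. */
struct lwork_bounds dormqr_lwork_bounds(int side_is_left, int m, int n, int nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    int nw = side_is_left ? n : m;        /* dimension that sizes WORK */

    struct lwork_bounds b;
    b.min = nw > 1 ? nw : 1;
    b.opt = nw * nb;
    if (b.opt < b.min)
        b.opt = b.min;                    /* never suggest less than the minimum */
    return b;
}
```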