MAGMA  2.7.1
Matrix Algebra for GPU and Multicore Architectures
 All Classes Files Functions Friends Groups Pages
gegqr: QR factorization and generate Q

Functions

magma_int_t magma_cgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dwork, magmaFloatComplex *work, magma_int_t *info)
 CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: More...
 
magma_int_t magma_dgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dwork, double *work, magma_int_t *info)
 DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: More...
 
magma_int_t magma_sgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dwork, float *work, magma_int_t *info)
 SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: More...
 
magma_int_t magma_zgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dwork, magmaDoubleComplex *work, magma_int_t *info)
 ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: More...
 

Detailed Description

Function Documentation

magma_int_t magma_cgegqr_gpu ( magma_int_t  ikind,
magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloatComplex_ptr  dwork,
magmaFloatComplex *  work,
magma_int_t *  info 
)

CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is given in work (on the CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.

Parameters
[in]ikindINTEGER Several versions are implemented indiceted by the ikind value: 1: This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: This version uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_cgeqr2x3_gpu) and magma_cungqr 3: Modified Gram-Schmidt (MGS)
  1. Cholesky QR [ Note: this method uses the normal equations which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ]
[in]mINTEGER The number of rows of the matrix A. m >= n >= 0.
[in]nINTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]dACOMPLEX array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalescent memory accesses LDDA must be divisible by 16.
dwork(GPU workspace) COMPLEX array, dimension: n^2 for ikind = 1 3 n^2 + min(m, n) + 2 for ikind = 2 0 (not used) for ikind = 3 n^2 for ikind = 4
[out]work(CPU workspace) COMPLEX array, dimension 3 n^2. On exit, work(1:n^2) holds the rectangular matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. For all these cases the rank (< n) of the space is returned.
magma_int_t magma_dgegqr_gpu ( magma_int_t  ikind,
magma_int_t  m,
magma_int_t  n,
magmaDouble_ptr  dA,
magma_int_t  ldda,
magmaDouble_ptr  dwork,
double *  work,
magma_int_t *  info 
)

DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is given in work (on the CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.

Parameters
[in]ikindINTEGER Several versions are implemented indiceted by the ikind value: 1: This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: This version uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_dgeqr2x3_gpu) and magma_dorgqr 3: Modified Gram-Schmidt (MGS)
  1. Cholesky QR [ Note: this method uses the normal equations which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ]
[in]mINTEGER The number of rows of the matrix A. m >= n >= 0.
[in]nINTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]dADOUBLE PRECISION array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalescent memory accesses LDDA must be divisible by 16.
dwork(GPU workspace) DOUBLE PRECISION array, dimension: n^2 for ikind = 1 3 n^2 + min(m, n) + 2 for ikind = 2 0 (not used) for ikind = 3 n^2 for ikind = 4
[out]work(CPU workspace) DOUBLE PRECISION array, dimension 3 n^2. On exit, work(1:n^2) holds the rectangular matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. For all these cases the rank (< n) of the space is returned.
magma_int_t magma_sgegqr_gpu ( magma_int_t  ikind,
magma_int_t  m,
magma_int_t  n,
magmaFloat_ptr  dA,
magma_int_t  ldda,
magmaFloat_ptr  dwork,
float *  work,
magma_int_t *  info 
)

SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is given in work (on the CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.

Parameters
[in]ikindINTEGER Several versions are implemented indiceted by the ikind value: 1: This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: This version uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_sgeqr2x3_gpu) and magma_sorgqr 3: Modified Gram-Schmidt (MGS)
  1. Cholesky QR [ Note: this method uses the normal equations which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ]
[in]mINTEGER The number of rows of the matrix A. m >= n >= 0.
[in]nINTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]dAREAL array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalescent memory accesses LDDA must be divisible by 16.
dwork(GPU workspace) REAL array, dimension: n^2 for ikind = 1 3 n^2 + min(m, n) + 2 for ikind = 2 0 (not used) for ikind = 3 n^2 for ikind = 4
[out]work(CPU workspace) REAL array, dimension 3 n^2. On exit, work(1:n^2) holds the rectangular matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. For all these cases the rank (< n) of the space is returned.
magma_int_t magma_zgegqr_gpu ( magma_int_t  ikind,
magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_ptr  dwork,
magmaDoubleComplex *  work,
magma_int_t *  info 
)

ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is given in work (on the CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.

Parameters
[in]ikindINTEGER Several versions are implemented indiceted by the ikind value: 1: This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: This version uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_zgeqr2x3_gpu) and magma_zungqr 3: Modified Gram-Schmidt (MGS)
  1. Cholesky QR [ Note: this method uses the normal equations which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ]
[in]mINTEGER The number of rows of the matrix A. m >= n >= 0.
[in]nINTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]dACOMPLEX_16 array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalescent memory accesses LDDA must be divisible by 16.
dwork(GPU workspace) COMPLEX_16 array, dimension: n^2 for ikind = 1 3 n^2 + min(m, n) + 2 for ikind = 2 0 (not used) for ikind = 3 n^2 for ikind = 4
[out]work(CPU workspace) COMPLEX_16 array, dimension 3 n^2. On exit, work(1:n^2) holds the rectangular matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. For all these cases the rank (< n) of the space is returned.