MAGMA 1.6.1
Matrix Algebra for GPU and Multicore Architectures
Functions | |
magma_int_t | magma_sgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dwork, float *work, magma_int_t *info) |
SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqr2x2_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dtau, magmaFloat_ptr dT, magmaFloat_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R. | |
magma_int_t | magma_sgeqr2x3_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dtau, magmaFloat_ptr dT, magmaFloat_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R. | |
magma_int_t | magma_sgeqr2x_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dtau, magmaFloat_ptr dT, magmaFloat_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf4 (magma_int_t ngpu, magma_int_t m, magma_int_t n, float *A, magma_int_t lda, float *tau, float *work, magma_int_t lwork, magma_int_t *info) |
SGEQRF4 computes a QR factorization of a REAL M-by-N matrix A: A = Q * R using multiple GPUs. | |
magma_int_t | magma_sgeqrf (magma_int_t m, magma_int_t n, float *A, magma_int_t lda, float *tau, float *work, magma_int_t lwork, magma_int_t *info) |
SGEQRF computes a QR factorization of a REAL M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf2_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, float *tau, magma_int_t *info) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf3_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, float *tau, magmaFloat_ptr dT, magma_int_t *info) |
SGEQRF3 computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf_batched (magma_int_t m, magma_int_t n, float **dA_array, magma_int_t ldda, float **tau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, float *tau, magmaFloat_ptr dT, magma_int_t *info) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf2_mgpu (magma_int_t ngpu, magma_int_t m, magma_int_t n, magmaFloat_ptr dlA[], magma_int_t ldda, float *tau, magma_int_t *info) |
SGEQRF2_MGPU computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf_ooc (magma_int_t m, magma_int_t n, float *A, magma_int_t lda, float *tau, float *work, magma_int_t lwork, magma_int_t *info) |
SGEQRF_OOC computes a QR factorization of a REAL M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sorgqr (magma_int_t m, magma_int_t n, magma_int_t k, float *A, magma_int_t lda, float *tau, magmaFloat_ptr dT, magma_int_t nb, magma_int_t *info) |
SORGQR generates an M-by-N REAL matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. | |
magma_int_t | magma_sorgqr2 (magma_int_t m, magma_int_t n, magma_int_t k, float *A, magma_int_t lda, float *tau, magma_int_t *info) |
SORGQR generates an M-by-N REAL matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. | |
magma_int_t | magma_sorgqr_gpu (magma_int_t m, magma_int_t n, magma_int_t k, magmaFloat_ptr dA, magma_int_t ldda, float *tau, magmaFloat_ptr dT, magma_int_t nb, magma_int_t *info) |
SORGQR generates an M-by-N REAL matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. | |
magma_int_t | magma_sorgqr_m (magma_int_t m, magma_int_t n, magma_int_t k, float *A, magma_int_t lda, float *tau, float *T, magma_int_t nb, magma_int_t *info) |
SORGQR generates an M-by-N REAL matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. | |
magma_int_t | magma_sormqr (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, float *A, magma_int_t lda, float *tau, float *C, magma_int_t ldc, float *work, magma_int_t lwork, magma_int_t *info) |
SORMQR overwrites the general real M-by-N matrix C with Q * C, Q' * C, C * Q, or C * Q', depending on side and trans. | |
magma_int_t | magma_sormqr2_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloat_ptr dA, magma_int_t ldda, float *tau, magmaFloat_ptr dC, magma_int_t lddc, float *wA, magma_int_t ldwa, magma_int_t *info) |
SORMQR overwrites the general real M-by-N matrix C with Q * C, Q' * C, C * Q, or C * Q', depending on side and trans. | |
magma_int_t | magma_sormqr_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloat_ptr dA, magma_int_t ldda, float *tau, magmaFloat_ptr dC, magma_int_t lddc, float *hwork, magma_int_t lwork, magmaFloat_ptr dT, magma_int_t nb, magma_int_t *info) |
SORMQR_GPU overwrites the general real M-by-N matrix C with Q * C, Q' * C, C * Q, or C * Q', depending on side and trans. | |
magma_int_t | magma_sormqr_m (magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, float *A, magma_int_t lda, float *tau, float *C, magma_int_t ldc, float *work, magma_int_t lwork, magma_int_t *info) |
SORMQR overwrites the general real M-by-N matrix C with Q * C, Q' * C, C * Q, or C * Q', depending on side and trans. | |
magma_int_t | magma_sgeqr2_batched (magma_int_t m, magma_int_t n, float **dA_array, magma_int_t lda, float **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R. | |
magma_int_t | magma_sgeqr2x4_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dtau, magmaFloat_ptr dT, magmaFloat_ptr ddA, magmaFloat_ptr dwork, magma_queue_t queue, magma_int_t *info) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R. | |
magma_int_t magma_sgegqr_gpu | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloat_ptr | dwork, | ||
float * | work, | ||
magma_int_t * | info | ||
) |
SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate; the other variants are selected through ikind.
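Since the routine returns Q in place of A, a caller may want to confirm how close the returned columns are to orthonormal. The following is a minimal host-side sketch, assuming Q has already been copied back to CPU memory in column-major order; orthogonality_error is a hypothetical helper, not a MAGMA routine.

```c
#include <assert.h>
#include <stddef.h>

/* Largest absolute entry of (I - Q' * Q) for a column-major m-by-n matrix q
 * with leading dimension ldq. Hypothetical helper, not part of MAGMA. */
float orthogonality_error(int m, int n, const float *q, int ldq)
{
    assert(m >= n && n >= 0 && ldq >= (m > 1 ? m : 1));
    float worst = 0.0f;
    for (int j = 0; j < n; ++j) {
        for (int k = 0; k <= j; ++k) {
            /* dot product of columns k and j */
            float dot = 0.0f;
            for (int i = 0; i < m; ++i)
                dot += q[i + (size_t)k * ldq] * q[i + (size_t)j * ldq];
            /* compare against the identity: 1 on the diagonal, 0 elsewhere */
            float d = (k == j ? 1.0f : 0.0f) - dot;
            if (d < 0.0f) d = -d;
            if (d > worst) worst = d;
        }
    }
    return worst;
}
```

A value near single-precision round-off suggests the columns are orthonormal to working accuracy; a value near 1 indicates that at least two columns are far from orthogonal.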
[in] | ikind | INTEGER Several versions are implemented, indicated by the ikind value:
1: This version uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
2: This version uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_sgeqr2x3_gpu) and magma_sorgqr.
3: MGS (modified Gram-Schmidt).
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
dwork | (GPU workspace) REAL array, dimension: n^2 for ikind = 1; 3 n^2 + min(m, n) + 2 for ikind = 2; 0 (not used) for ikind = 3; n^2 for ikind = 4. | |
[out] | work | (CPU workspace) REAL array, dimension 3 n^2. On exit, work(1:n^2) holds the rectangular matrix R. Preferably, for higher performance, work should be in pinned memory. |
[out] | info | INTEGER
|
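The workspace sizes listed for dwork and work can be computed before allocating. The sketch below simply transcribes the table above into C; the function names are hypothetical, and the behaviour for ikind values outside 1 to 4 (returning 0) is an assumption of this example, not something the documentation specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Number of floats needed for the GPU workspace dwork of magma_sgegqr_gpu,
 * following the dimension table in the parameter description.
 * Returns 0 for ikind = 3 (not used) and, by assumption, for any ikind
 * the table does not list. Hypothetical helper. */
size_t sgegqr_dwork_len(int ikind, int m, int n)
{
    assert(m >= n && n >= 0 && n <= 128);
    size_t nn = (size_t)n * (size_t)n;
    size_t mn = (size_t)(m < n ? m : n);
    switch (ikind) {
    case 1:  return nn;
    case 2:  return 3 * nn + mn + 2;
    case 3:  return 0;
    case 4:  return nn;
    default: return 0;
    }
}

/* Number of floats needed for the CPU workspace work: 3 n^2. */
size_t sgegqr_work_len(int n)
{
    assert(n >= 0 && n <= 128);
    return 3 * (size_t)n * (size_t)n;
}
```

For example, with m = 1000 and n = 32, ikind = 2 needs 3 * 1024 + 32 + 2 = 3106 floats on the GPU and 3072 floats on the CPU.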
magma_int_t magma_sgeqr2_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
float ** | dA_array, | ||
magma_int_t | lda, | ||
float ** | dtau_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard sgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's sgeqr2 routine (see below).
The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R.
This version implements the right-looking QR without blocking.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | REAL array, dimension (LDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | dtau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | dT | REAL array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. |
dwork | (workspace) REAL array, dimension (N) * ( sizeof(float) + sizeof(float)) | |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
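The storage convention above means a single reflector H(i) can be applied to a vector using only column i of the factored array and tau(i). The following is a minimal host-side sketch under that convention; apply_stored_reflector is a hypothetical helper, indices are 0-based (so i here corresponds to i+1 in the text), and the array is assumed to be column-major in CPU memory.

```c
#include <assert.h>
#include <stddef.h>

/* Apply H(i) = I - tau(i) * v * v' to x (length m) in place, where
 * v(0:i-1) = 0, v(i) = 1 (implicit), and v(i+1:m-1) is read from
 * column i of the factored array a, below the diagonal.
 * Hypothetical helper, not part of MAGMA. */
void apply_stored_reflector(int m, int i, const float *a, int lda,
                            const float *tau, float *x)
{
    assert(0 <= i && i < m && lda >= m);
    const float *col = a + (size_t)i * lda;

    /* dot = v' * x; the unit entry v(i) contributes x[i] directly */
    float dot = x[i];
    for (int r = i + 1; r < m; ++r)
        dot += col[r] * x[r];

    /* x = x - tau * dot * v */
    float s = tau[i] * dot;
    x[i] -= s;
    for (int r = i + 1; r < m; ++r)
        x[r] -= s * col[r];
}
```

The diagonal entry of column i is never read, because on exit it holds an element of R rather than v(i).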
magma_int_t magma_sgeqr2x2_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloat_ptr | dtau, | ||
magmaFloat_ptr | dT, | ||
magmaFloat_ptr | ddA, | ||
magmaFloat_ptr | dwork, | ||
magma_int_t * | info | ||
) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard sgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's sgeqr2 routine (see below).
The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | REAL array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). |
[out] | dtau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | dT | REAL array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. |
[out] | ddA | REAL array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. |
dwork | (workspace) REAL array, dimension (3 N) | |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqr2x3_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloat_ptr | dtau, | ||
magmaFloat_ptr | dT, | ||
magmaFloat_ptr | ddA, | ||
magmaFloat_ptr | dwork, | ||
magma_int_t * | info | ||
) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard sgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's sgeqr2 routine (see below).
The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | REAL array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). |
[out] | dtau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | dT | REAL array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. |
[out] | ddA | REAL array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. |
dwork | (workspace) REAL array, dimension (3 N) | |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqr2x4_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloat_ptr | dtau, | ||
magmaFloat_ptr | dT, | ||
magmaFloat_ptr | ddA, | ||
magmaFloat_ptr | dwork, | ||
magma_queue_t | queue, | ||
magma_int_t * | info | ||
) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard sgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's sgeqr2 routine (see below).
The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | REAL array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). |
[out] | dtau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | dT | REAL array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. |
[out] | ddA | REAL array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. |
dwork | (workspace) REAL array, dimension (3 N) | |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqr2x_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloat_ptr | dtau, | ||
magmaFloat_ptr | dT, | ||
magmaFloat_ptr | ddA, | ||
magmaFloat_ptr | dwork, | ||
magma_int_t * | info | ||
) |
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard sgeqr2, namely, dT and ddA, explained below. The storage for A is also not as in the LAPACK's sgeqr2 routine (see below).
The first is used to output the triangular n x n factor T of the block reflector used in the factorization. The second holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R.
This version implements the right-looking QR. N is hard-coded to satisfy N <= min(M, 128); for larger N, use a blocked QR version.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 0 <= N <= min(M, 128). |
[in,out] | dA | REAL array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). |
[out] | dtau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | dT | REAL array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. |
[out] | ddA | REAL array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. |
dwork | (workspace) REAL array, dimension (N) | |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqrf | ( | magma_int_t | m, |
magma_int_t | n, | ||
float * | A, | ||
magma_int_t | lda, | ||
float * | tau, | ||
float * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
SGEQRF computes a QR factorization of a REAL M-by-N matrix A: A = Q * R.
This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.
If the current stream is NULL, this version replaces it with a new stream to overlap computation with communication.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | A | REAL array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | tau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | work | (workspace) REAL array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lwork | INTEGER The dimension of the array WORK. LWORK >= max( N*NB, 2*NB*NB ), where NB can be obtained through magma_get_sgeqrf_nb(M). If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
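The LWORK bound quoted for magma_sgeqrf, max( N*NB, 2*NB*NB ), can be evaluated ahead of the call. The sketch below takes nb as a plain argument so that it stays self-contained; in a real program the value would come from magma_get_sgeqrf_nb(M) as the text describes. The function name sgeqrf_min_lwork is hypothetical.

```c
#include <assert.h>

/* Smallest LWORK that satisfies LWORK >= max(n*nb, 2*nb*nb).
 * nb stands in for the value returned by magma_get_sgeqrf_nb(M).
 * Hypothetical helper, not part of MAGMA. */
long sgeqrf_min_lwork(long n, long nb)
{
    assert(n >= 0 && nb > 0);
    long by_columns = n * nb;
    long by_block   = 2 * nb * nb;
    return by_columns > by_block ? by_columns : by_block;
}
```

With nb = 128, a matrix with n = 1000 columns needs at least 128000 floats, while a narrow matrix with n = 100 is bounded by the 2*NB*NB term and needs 32768. Passing LWORK = -1 remains the documented way to ask the routine itself for the optimal size.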
magma_int_t magma_sgeqrf2_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
float * | tau, | ||
magma_int_t * | info | ||
) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
This version has LAPACK-compliant arguments.
If the current stream is NULL, this version replaces it with a new stream to overlap computation with communication.
Other versions (magma_sgeqrf_gpu and magma_sgeqrf3_gpu) store the intermediate T matrices.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | tau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
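The ldda description asks for LDDA >= max(1,M) and notes that divisibility by 16 helps memory coalescing. One way to pick such a value is to round M up to the next multiple of 16; the helpers below are a small sketch with hypothetical names, not MAGMA functions.

```c
#include <assert.h>

/* Round x up to the next multiple of `multiple`. Hypothetical helper. */
int round_up(int x, int multiple)
{
    assert(x >= 0 && multiple > 0);
    return ((x + multiple - 1) / multiple) * multiple;
}

/* A leading dimension that satisfies LDDA >= max(1, m) and is
 * divisible by 16. Hypothetical helper. */
int padded_ldda(int m)
{
    assert(m >= 0);
    return round_up(m > 1 ? m : 1, 16);
}
```

For m = 1000 this gives ldda = 1008, so the device array would be allocated with 1008 * n floats and only the first 1000 rows of each column hold matrix data.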
magma_int_t magma_sgeqrf2_mgpu | ( | magma_int_t | ngpu, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaFloat_ptr | dlA[], | ||
magma_int_t | ldda, | ||
float * | tau, | ||
magma_int_t * | info | ||
) |
SGEQRF2_MGPU computes a QR factorization of a real M-by-N matrix A: A = Q * R.
This is a GPU interface of the routine.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix dA. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | tau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqrf3_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
float * | tau, | ||
magmaFloat_ptr | dT, | ||
magma_int_t * | info | ||
) |
SGEQRF3 computes a QR factorization of a real M-by-N matrix A: A = Q * R.
This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them and the corresponding parts of the upper triangular R are stored separately in dT.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | tau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | dT | (workspace) REAL array on the GPU, dimension (2*MIN(M, N) + (N+31)/32*32 )*NB, where NB can be obtained through magma_get_sgeqrf_nb(M). It starts with a MIN(M,N)*NB block that stores the triangular T matrices, followed by a MIN(M,N)*NB block of the diagonal matrices for the R matrix. The rest of the array is used as workspace. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
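The dT dimension formula, (2*MIN(M, N) + (N+31)/32*32)*NB, relies on integer division to round N up to a multiple of 32. The sketch below spells out that arithmetic and the offsets of the three regions the text describes; the function names are hypothetical, and nb is passed in rather than obtained from magma_get_sgeqrf_nb(M) so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Total number of floats in dT: (2*min(m,n) + roundup(n,32)) * nb.
 * Hypothetical helper, not part of MAGMA. */
size_t sgeqrf_gpu_dT_len(int m, int n, int nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    size_t minmn = (size_t)(m < n ? m : n);
    size_t npad  = ((size_t)n + 31) / 32 * 32;  /* n rounded up to a multiple of 32 */
    return (2 * minmn + npad) * (size_t)nb;
}

/* Offset (in floats) of the second min(m,n)*nb block, which follows
 * the block of triangular T matrices at offset 0. */
size_t sgeqrf_gpu_dT_second_block(int m, int n, int nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    return (size_t)(m < n ? m : n) * (size_t)nb;
}

/* Offset (in floats) of the remaining workspace region. */
size_t sgeqrf_gpu_dT_workspace(int m, int n, int nb)
{
    return 2 * sgeqrf_gpu_dT_second_block(m, n, nb);
}
```

For m = 1000, n = 100, and nb = 64, the array holds (200 + 128) * 64 = 20992 floats; the second block starts at offset 6400 and the workspace at offset 12800.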
magma_int_t magma_sgeqrf4 | ( | magma_int_t | ngpu, |
magma_int_t | m, | ||
magma_int_t | n, | ||
float * | A, | ||
magma_int_t | lda, | ||
float * | tau, | ||
float * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
SGEQRF4 computes a QR factorization of a REAL M-by-N matrix A: A = Q * R using multiple GPUs.
This version does not require work space on the GPU passed as input. GPU memory is allocated in the routine.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | A | REAL array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | tau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | work | (workspace) REAL array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lwork | INTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_sgeqrf_nb(M). If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
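Because R and the reflector vectors share the output array A, code that needs R on its own has to copy the upper trapezoid out and discard what lies below the diagonal. The following is a minimal host-side sketch, assuming column-major storage in CPU memory; extract_r is a hypothetical helper, not a MAGMA routine.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the min(m,n)-by-n upper trapezoidal factor R out of a factored
 * column-major array a (leading dimension lda) into r (leading dimension
 * ldr >= max(1, min(m,n))), writing zeros below the diagonal where a
 * holds the reflector vectors. Hypothetical helper, not part of MAGMA. */
void extract_r(int m, int n, const float *a, int lda, float *r, int ldr)
{
    int k = m < n ? m : n;
    assert(m >= 0 && n >= 0);
    assert(lda >= (m > 1 ? m : 1) && ldr >= (k > 1 ? k : 1));
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < k; ++i)
            r[i + (size_t)j * ldr] = (i <= j) ? a[i + (size_t)j * lda] : 0.0f;
}
```

The array a is left untouched, so it can still be passed with tau to routines such as magma_sormqr or magma_sorgqr afterwards.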
magma_int_t magma_sgeqrf_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
float ** | dA_array, | ||
magma_int_t | ldda, | ||
float ** | tau_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue | ||
) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | tau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqrf_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
float * | tau, | ||
magmaFloat_ptr | dT, | ||
magma_int_t * | info | ||
) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
This version stores the triangular dT matrices used in the block QR factorization so that they can be applied directly (i.e., without being recomputed) later. As a result, the application of Q is much faster. Also, the upper triangular matrices for V have 0s in them. The corresponding parts of the upper triangular R are inverted and stored separately in dT.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | tau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | dT | (workspace) REAL array on the GPU, dimension (2*MIN(M, N) + (N+31)/32*32 )*NB, where NB can be obtained through magma_get_sgeqrf_nb(M). It starts with a MIN(M,N)*NB block that stores the triangular T matrices, followed by a MIN(M,N)*NB block holding the diagonal inverses for the R matrix. The rest of the array is used as workspace. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
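The size arithmetic for dA and dT can be sketched in plain C. The helper names below are hypothetical and not part of MAGMA; nb is passed in as a parameter, whereas a real caller would obtain it from magma_get_sgeqrf_nb(M) as stated above. The sketch only evaluates the documented formulas and allocates nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest LDDA >= max(1,m) that is divisible by 16. */
static int padded_ldda(int m)
{
    assert(m >= 0);
    int ld = (m + 15) / 16 * 16;
    return ld > 0 ? ld : 16;
}

/* Hypothetical helper: number of floats required for dT,
 * (2*min(m,n) + (n+31)/32*32) * nb. */
static size_t sgeqrf_gpu_dT_size(int m, int n, int nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    size_t minmn = (size_t)(m < n ? m : n);
    size_t n_pad = ((size_t)n + 31) / 32 * 32;  /* n rounded up to a multiple of 32 */
    return (2 * minmn + n_pad) * (size_t)nb;
}

/* Hypothetical helper: offset, in floats, of the block of diagonal inverses
 * for R, which follows the min(m,n)*nb block of T matrices. */
static size_t sgeqrf_gpu_dT_rinv_offset(int m, int n, int nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    return (size_t)(m < n ? m : n) * (size_t)nb;
}
```

For example, with m = 1000, n = 100 and an assumed nb = 32, the formula gives (200 + 128) * 32 floats for dT, and dA needs a leading dimension of at least 1008.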
magma_int_t magma_sgeqrf_ooc | ( | magma_int_t | m, |
magma_int_t | n, | ||
float * | A, | ||
magma_int_t | lda, | ||
float * | tau, | ||
float * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
SGEQRF_OOC computes a QR factorization of a REAL M-by-N matrix A: A = Q * R.
This version does not require GPU workspace to be passed as input; GPU memory is allocated inside the routine. It is an out-of-core (ooc) variant of magma_sgeqrf: unlike magma_sgeqrf, it can use a GPU even when the matrix does not fit into GPU memory all at once.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | A | REAL array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). Higher performance is achieved if A is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,M). |
[out] | tau | REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). |
[out] | work | (workspace) REAL array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. Higher performance is achieved if WORK is in pinned memory, e.g. allocated using magma_malloc_pinned. |
[in] | lwork | INTEGER The dimension of the array WORK. LWORK >= N*NB, where NB can be obtained through magma_get_sgeqrf_nb(M). If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued. |
[out] | info | INTEGER
|
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
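The LWORK = -1 workspace query described above follows a two-call pattern that the sketch below imitates. The function mock_geqrf is a hypothetical stand-in, not a MAGMA routine: it performs no factorization and reproduces only the documented LWORK behavior, with nb passed explicitly instead of coming from magma_get_sgeqrf_nb(M). The negative return code for a too-small workspace is an assumption modeled on the usual LAPACK convention.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in (NOT a MAGMA routine): imitates only the LWORK
 * convention. No factorization is performed. */
static int mock_geqrf(int m, int n, float *work, int lwork, int nb, int *info)
{
    int lwmin = n * nb;            /* documented bound: LWORK >= N*NB */
    (void)m;
    if (lwmin < 1) lwmin = 1;
    *info = 0;
    if (lwork == -1) {             /* workspace query: report the size and return */
        work[0] = (float)lwmin;
        return *info;
    }
    if (lwork < lwmin) {           /* assumed convention: -4 flags the 4th argument */
        *info = -4;
        return *info;
    }
    work[0] = (float)lwmin;        /* on exit, WORK[0] holds the optimal LWORK */
    return *info;
}

/* Two-step pattern: query with lwork = -1, then allocate and call again. */
static int query_and_run(int m, int n, int nb, int *lwork_out)
{
    float query = 0.0f;
    int info = 0;

    mock_geqrf(m, n, &query, -1, nb, &info);
    if (info != 0) return info;

    int lwork = (int)query;
    float *work = malloc((size_t)lwork * sizeof *work);
    if (work == NULL) return -100; /* arbitrary code for allocation failure */

    mock_geqrf(m, n, work, lwork, nb, &info);
    free(work);
    *lwork_out = lwork;
    return info;
}
```

Since the size is returned through a float, very large workspace sizes may not be represented exactly; callers may prefer to round the queried value up.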
magma_int_t magma_sorgqr | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | k, | ||
float * | A, | ||
magma_int_t | lda, | ||
float * | tau, | ||
magmaFloat_ptr | dT, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
SORGQR generates an M-by-N REAL matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
Q = H(1) H(2) . . . H(k)
as returned by SGEQRF.
[in] | m | INTEGER The number of rows of the matrix Q. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix Q. M >= N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0. |
[in,out] | A | REAL array, dimension (LDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q. |
[in] | lda | INTEGER The first dimension of the array A. LDA >= max(1,M). |
[in] | tau | REAL array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF_GPU. |
[in] | dT | REAL array on the GPU device. DT contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_sgeqrf_gpu. |
[in] | nb | INTEGER This is the block size used in SGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT. |
[out] | info | INTEGER
|
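What the routine computes can be checked on small inputs with an unblocked, CPU-only reference. The sketch below is not how MAGMA implements SORGQR (MAGMA uses a blocked algorithm with the T matrices on the GPU); the function name form_q_reference and the separate output array Q are assumptions for illustration. It builds the first n columns of Q = H(1) H(2) . . . H(k) by applying the reflectors to the leading columns of the identity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not part of MAGMA): form the first n
 * columns of Q = H(1) H(2) ... H(k). A is column-major with the reflector
 * vectors stored below the diagonal of its first k columns; Q is an
 * m-by-n column-major output array with leading dimension ldq. */
static void form_q_reference(int m, int n, int k, const float *A, int lda,
                             const float *tau, float *Q, int ldq)
{
    assert(m >= n && n >= k && k >= 0 && lda >= m && ldq >= m);

    /* Start from the first n columns of the m-by-m identity. */
    for (int j = 0; j < n; ++j)
        for (int r = 0; r < m; ++r)
            Q[r + (size_t)j * ldq] = (r == j) ? 1.0f : 0.0f;

    /* Q = H(1) * (H(2) * ... * (H(k) * I)), so H(k) is applied first. */
    for (int i = k - 1; i >= 0; --i) {
        const float *v = A + (size_t)i * lda;
        for (int j = 0; j < n; ++j) {
            float *q = Q + (size_t)j * ldq;
            float w = q[i];                       /* implicit v[i] = 1 */
            for (int r = i + 1; r < m; ++r)
                w += v[r] * q[r];
            q[i] -= tau[i] * w;
            for (int r = i + 1; r < m; ++r)
                q[r] -= tau[i] * w * v[r];
        }
    }
}
```

A reference like this is only practical for small matrices, but it gives a result against which the output of the GPU routines can be compared.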
magma_int_t magma_sorgqr2 | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | k, | ||
float * | A, | ||
magma_int_t | lda, | ||
float * | tau, | ||
magma_int_t * | info | ||
) |
SORGQR generates an M-by-N REAL matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
Q = H(1) H(2) . . . H(k)
as returned by SGEQRF.
This version recomputes the T matrices on the CPU and sends them to the GPU.
[in] | m | INTEGER The number of rows of the matrix Q. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix Q. M >= N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0. |
[in,out] | A | REAL array, dimension (LDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q. |
[in] | lda | INTEGER The first dimension of the array A. LDA >= max(1,M). |
[in] | tau | REAL array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF_GPU. |
[out] | info | INTEGER
|
magma_int_t magma_sorgqr_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
float * | tau, | ||
magmaFloat_ptr | dT, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
SORGQR generates an M-by-N REAL matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
Q = H(1) H(2) . . . H(k)
as returned by SGEQRF_GPU.
[in] | m | INTEGER The number of rows of the matrix Q. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix Q. M >= N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0. |
[in,out] | dA | REAL array A on the GPU, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q. |
[in] | ldda | INTEGER The first dimension of the array A. LDDA >= max(1,M). |
[in] | tau | REAL array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF_GPU. |
[in] | dT | (workspace) REAL work space array on the GPU, dimension (2*MIN(M, N) + (N+31)/32*32 )*NB. This must be the 6th argument of magma_sgeqrf_gpu [ note that if N here is bigger than N in magma_sgeqrf_gpu, the workspace requirement DT in magma_sgeqrf_gpu must be as specified in this routine ]. |
[in] | nb | INTEGER This is the block size used in SGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT. |
[out] | info | INTEGER
|
magma_int_t magma_sorgqr_m | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | k, | ||
float * | A, | ||
magma_int_t | lda, | ||
float * | tau, | ||
float * | T, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
SORGQR generates an M-by-N REAL matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
Q = H(1) H(2) . . . H(k)
as returned by SGEQRF.
[in] | m | INTEGER The number of rows of the matrix Q. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix Q. M >= N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0. |
[in,out] | A | REAL array, dimension (LDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q. |
[in] | lda | INTEGER The first dimension of the array A. LDA >= max(1,M). |
[in] | tau | REAL array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF_GPU. |
[in] | T | REAL array, dimension (NB, min(M,N)). T contains the T matrices used in blocking the elementary reflectors H(i), e.g., this can be the 6th argument of magma_sgeqrf_gpu (except stored on the CPU, not the GPU). |
[in] | nb | INTEGER This is the block size used in SGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in T. |
[out] | info | INTEGER
|
magma_int_t magma_sormqr | ( | magma_side_t | side, |
magma_trans_t | trans, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | k, | ||
float * | A, | ||
magma_int_t | lda, | ||
float * | tau, | ||
float * | C, | ||
magma_int_t | ldc, | ||
float * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
SORMQR overwrites the general real M-by-N matrix C with
| SIDE = MagmaLeft | SIDE = MagmaRight |
TRANS = MagmaNoTrans: | Q * C | C * Q |
TRANS = MagmaTrans: | Q**H * C | C * Q**H |
where Q is a real orthogonal matrix defined as the product of k elementary reflectors
Q = H(1) H(2) . . . H(k)
as returned by SGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
[in] | side | magma_side_t
|
[in] | trans | magma_trans_t
|
[in] | m | INTEGER The number of rows of the matrix C. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix C. N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0. |
[in] | A | REAL array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF in the first k columns of its array argument A. A is modified by the routine but restored on exit. |
[in] | lda | INTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N). |
[in] | tau | REAL array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF. |
[in,out] | C | REAL array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q. |
[in] | ldc | INTEGER The leading dimension of the array C. LDC >= max(1,M). |
[out] | work | (workspace) REAL array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. |
[in] | lwork | INTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance if SIDE = MagmaLeft, LWORK >= N*NB; if SIDE = MagmaRight, LWORK >= M*NB, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA. |
[out] | info | INTEGER
|
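For SIDE = MagmaLeft, the two TRANS cases differ only in the order in which the reflectors are applied, which the following CPU-only sketch makes explicit. It is an unblocked illustration, not MAGMA's implementation; the function name apply_q_left and the integer flag transpose (standing in for the magma_trans_t argument) are assumptions. Each H(i) is symmetric, so Q * C applies H(k) first, while Q**H * C applies H(1) first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not part of MAGMA): overwrite the m-by-n
 * column-major matrix C with Q*C (transpose == 0) or Q**H*C (transpose != 0),
 * where Q = H(1) H(2) ... H(k) and the reflector vectors sit below the
 * diagonal of the first k columns of A. */
static void apply_q_left(int transpose, int m, int n, int k,
                         const float *A, int lda, const float *tau,
                         float *C, int ldc)
{
    assert(m >= k && k >= 0 && n >= 0 && lda >= m && ldc >= m);

    for (int step = 0; step < k; ++step) {
        /* Q*C    = H(1)...H(k)*C : H(k) acts on C first, so count down.
         * Q**H*C = H(k)...H(1)*C : H(1) acts on C first, so count up. */
        int i = transpose ? step : k - 1 - step;
        const float *v = A + (size_t)i * lda;

        for (int j = 0; j < n; ++j) {
            float *c = C + (size_t)j * ldc;
            float w = c[i];                       /* implicit v[i] = 1 */
            for (int r = i + 1; r < m; ++r)
                w += v[r] * c[r];
            c[i] -= tau[i] * w;
            for (int r = i + 1; r < m; ++r)
                c[r] -= tau[i] * w * v[r];
        }
    }
}
```

The SIDE = MagmaRight cases follow the same idea with the reflectors acting on the rows of C instead of its columns.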
magma_int_t magma_sormqr2_gpu | ( | magma_side_t | side, |
magma_trans_t | trans, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
float * | tau, | ||
magmaFloat_ptr | dC, | ||
magma_int_t | lddc, | ||
float * | wA, | ||
magma_int_t | ldwa, | ||
magma_int_t * | info | ||
) |
SORMQR overwrites the general real M-by-N matrix C with
| SIDE = MagmaLeft | SIDE = MagmaRight |
TRANS = MagmaNoTrans: | Q * C | C * Q |
TRANS = MagmaTrans: | Q**H * C | C * Q**H |
where Q is a real orthogonal matrix defined as the product of k elementary reflectors
Q = H(1) H(2) . . . H(k)
as returned by SGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
[in] | side | magma_side_t
|
[in] | trans | magma_trans_t
|
[in] | m | INTEGER The number of rows of the matrix C. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix C. N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0. |
[in] | dA | REAL array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF in the first k columns of its array argument A. The diagonal and the upper part are destroyed; the reflectors are not modified. |
[in] | ldda | INTEGER The leading dimension of the array DA. LDDA >= max(1,M) if SIDE = MagmaLeft; LDDA >= max(1,N) if SIDE = MagmaRight. |
[in] | tau | REAL array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF. |
[in,out] | dC | REAL array, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q). |
[in] | lddc | INTEGER The leading dimension of the array dC. LDDC >= max(1,M). |
[in] | wA | (workspace) REAL array, dimension (LDWA,M) if SIDE = MagmaLeft; (LDWA,N) if SIDE = MagmaRight. The vectors which define the elementary reflectors, as returned by SSYTRD_GPU. |
[in] | ldwa | INTEGER The leading dimension of the array wA. LDWA >= max(1,M) if SIDE = MagmaLeft; LDWA >= max(1,N) if SIDE = MagmaRight. |
[out] | info | INTEGER
|
magma_int_t magma_sormqr_gpu | ( | magma_side_t | side, |
magma_trans_t | trans, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
float * | tau, | ||
magmaFloat_ptr | dC, | ||
magma_int_t | lddc, | ||
float * | hwork, | ||
magma_int_t | lwork, | ||
magmaFloat_ptr | dT, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
SORMQR_GPU overwrites the general real M-by-N matrix C with
| SIDE = MagmaLeft | SIDE = MagmaRight |
TRANS = MagmaNoTrans: | Q * C | C * Q |
TRANS = MagmaTrans: | Q**H * C | C * Q**H |
where Q is a real orthogonal matrix defined as the product of k elementary reflectors
Q = H(1) H(2) . . . H(k)
as returned by SGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
[in] | side | magma_side_t
|
[in] | trans | magma_trans_t
|
[in] | m | INTEGER The number of rows of the matrix C. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix C. N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0. |
[in] | dA | REAL array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF in the first k columns of its array argument DA. DA is modified by the routine but restored on exit. |
[in] | ldda | INTEGER The leading dimension of the array DA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N). |
[in] | tau | REAL array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF. |
[in,out] | dC | REAL array on the GPU, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q. |
[in] | lddc | INTEGER The leading dimension of the array DC. LDDC >= max(1,M). |
[out] | hwork | (workspace) REAL array, dimension (MAX(1,LWORK)) Currently, sgeqrs_gpu assumes that on exit, hwork contains the last block of A and C. This will change and should not be relied on! |
[in] | lwork | INTEGER The dimension of the array HWORK. LWORK >= (M-K+NB)*(N+NB) + N*NB if SIDE = MagmaLeft, and LWORK >= (N-K+NB)*(M+NB) + M*NB if SIDE = MagmaRight, where NB is the given blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the HWORK array, returns this value as the first entry of the HWORK array, and no error message related to LWORK is issued by XERBLA. |
[in] | dT | REAL array on the GPU that is the output (the 6th argument) of magma_sgeqrf_gpu. |
[in] | nb | INTEGER This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_sgeqrf_gpu. |
[out] | info | INTEGER
|
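The minimum HWORK size stated for lwork can be evaluated with a small helper. The function below is hypothetical and not part of MAGMA; the integer flag left stands in for SIDE = MagmaLeft, and nb is the blocking size given to the routine. In practice the LWORK = -1 query described above remains the authoritative way to obtain the size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of MAGMA): minimum HWORK length, in floats.
 *   left != 0 (SIDE = MagmaLeft):  (M-K+NB)*(N+NB) + N*NB
 *   left == 0 (SIDE = MagmaRight): (N-K+NB)*(M+NB) + M*NB */
static size_t sormqr_gpu_min_lwork(int left, int m, int n, int k, int nb)
{
    assert(m >= 0 && n >= 0 && nb > 0);
    assert(k >= 0 && k <= (left ? m : n));

    size_t M = (size_t)m, N = (size_t)n, K = (size_t)k, NB = (size_t)nb;
    if (left)
        return (M - K + NB) * (N + NB) + N * NB;
    return (N - K + NB) * (M + NB) + M * NB;
}
```

For instance, with m = 100, n = 50, k = 20 and nb = 32, the left-side formula gives 112 * 82 + 50 * 32 floats.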
magma_int_t magma_sormqr_m | ( | magma_int_t | ngpu, |
magma_side_t | side, | ||
magma_trans_t | trans, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | k, | ||
float * | A, | ||
magma_int_t | lda, | ||
float * | tau, | ||
float * | C, | ||
magma_int_t | ldc, | ||
float * | work, | ||
magma_int_t | lwork, | ||
magma_int_t * | info | ||
) |
SORMQR overwrites the general real M-by-N matrix C with
| SIDE = MagmaLeft | SIDE = MagmaRight |
TRANS = MagmaNoTrans: | Q * C | C * Q |
TRANS = MagmaTrans: | Q**H * C | C * Q**H |
where Q is a real orthogonal matrix defined as the product of k elementary reflectors
Q = H(1) H(2) . . . H(k)
as returned by SGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
[in] | ngpu | INTEGER Number of GPUs to use. ngpu > 0. |
[in] | side | magma_side_t
|
[in] | trans | magma_trans_t
|
[in] | m | INTEGER The number of rows of the matrix C. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix C. N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0. |
[in] | A | REAL array, dimension (LDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF in the first k columns of its array argument A. |
[in] | lda | INTEGER The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N). |
[in] | tau | REAL array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF. |
[in,out] | C | REAL array, dimension (LDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q. |
[in] | ldc | INTEGER The leading dimension of the array C. LDC >= max(1,M). |
[out] | work | (workspace) REAL array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. |
[in] | lwork | INTEGER The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA. |
[out] | info | INTEGER
|