MAGMA 2.9.0
Matrix Algebra for GPU and Multicore Architectures
|
Functions | |
magma_int_t | magma_cgegqr_expert_gpu_work (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, void *host_work, magma_int_t *lwork_host, void *device_work, magma_int_t *lwork_device, magma_int_t *info, magma_queue_t queue) |
CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: | |
magma_int_t | magma_cgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dwork, magmaFloatComplex *work, magma_int_t *info) |
CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: | |
magma_int_t | magma_dgegqr_expert_gpu_work (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, void *host_work, magma_int_t *lwork_host, void *device_work, magma_int_t *lwork_device, magma_int_t *info, magma_queue_t queue) |
DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: | |
magma_int_t | magma_dgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dwork, double *work, magma_int_t *info) |
DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: | |
magma_int_t | magma_sgegqr_expert_gpu_work (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, void *host_work, magma_int_t *lwork_host, void *device_work, magma_int_t *lwork_device, magma_int_t *info, magma_queue_t queue) |
SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: | |
magma_int_t | magma_sgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dwork, float *work, magma_int_t *info) |
SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: | |
magma_int_t | magma_zgegqr_expert_gpu_work (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, void *host_work, magma_int_t *lwork_host, void *device_work, magma_int_t *lwork_device, magma_int_t *info, magma_queue_t queue) |
ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: | |
magma_int_t | magma_zgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dwork, magmaDoubleComplex *work, magma_int_t *info) |
ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: | |
magma_int_t magma_cgegqr_expert_gpu_work | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
void * | host_work, | ||
magma_int_t * | lwork_host, | ||
void * | device_work, | ||
magma_int_t * | lwork_device, | ||
magma_int_t * | info, | ||
magma_queue_t | queue ) |
CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in host_work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
This is an expert API that exposes more controls to the user.
[in] | ikind | INTEGER. Several versions are implemented, selected by the ikind value: 1: normal equations and SVD in an iterative process that makes the computation numerically accurate; 2: a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_cgeqr2x3_gpu) and magma_cungqr; 3: Modified Gram-Schmidt (MGS).
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | COMPLEX array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | host_work | CPU workspace, size determined by lwork_host. On exit, the first n^2 COMPLEX elements hold the n-by-n matrix R. Preferably, for higher performance, host_work should be in pinned memory. |
[in,out] | lwork_host | INTEGER pointer The size of the CPU workspace (host_work) in bytes
|
device_work | GPU workspace, size determined by lwork_device | |
[in,out] | lwork_device | INTEGER pointer The size of the GPU workspace (device_work) in bytes
|
[out] | info | INTEGER
|
[in] | queue | magma_queue_t
|
magma_int_t magma_cgegqr_gpu | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloatComplex_ptr | dwork, | ||
magmaFloatComplex * | work, | ||
magma_int_t * | info ) |
CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
[in] | ikind | INTEGER. Several versions are implemented, selected by the ikind value: 1: normal equations and SVD in an iterative process that makes the computation numerically accurate; 2: a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_cgeqr2x3_gpu) and magma_cungqr; 3: Modified Gram-Schmidt (MGS).
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | COMPLEX array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
dwork | (GPU workspace) COMPLEX array, dimension: n^2 for ikind = 1; 3 n^2 + min(m, n) + 2 for ikind = 2; 0 (not used) for ikind = 3; n^2 for ikind = 4. | |
[out] | work | (CPU workspace) COMPLEX array. The workspace size for ikind = 1 has changed as of release 2.9.0. Dimension: 5 n^2 + 7n + 64 for ikind = 1 (not backward compatible); 3 n^2 otherwise (backward compatible). On exit, work(1:n^2) holds the n-by-n matrix R. Preferably, for higher performance, work should be in pinned memory. |
[out] | info | INTEGER
|
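The dwork and work dimensions listed above depend on ikind, so it can help to compute them in one place before allocating. The following is a minimal sketch in plain C that transcribes those formulas; the helper names `gegqr_dwork_elems` and `gegqr_work_elems` are hypothetical and not part of MAGMA, and the results are element counts, not bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (not part of MAGMA): element counts for the
 * dwork (GPU) and work (CPU) arrays of magma_?gegqr_gpu, transcribed
 * from the parameter descriptions above. Multiply by the size of the
 * element type to obtain a size in bytes. */

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* GPU workspace in elements; 0 for ikind = 3, where dwork is not used. */
size_t gegqr_dwork_elems(int ikind, size_t m, size_t n)
{
    assert(ikind >= 1 && ikind <= 4);
    switch (ikind) {
    case 2:  return 3 * n * n + min_sz(m, n) + 2;
    case 3:  return 0;
    default: return n * n;              /* ikind = 1 and ikind = 4 */
    }
}

/* CPU workspace in elements, using the sizes documented for release 2.9.0. */
size_t gegqr_work_elems(int ikind, size_t n)
{
    assert(ikind >= 1 && ikind <= 4);
    return ikind == 1 ? 5 * n * n + 7 * n + 64 : 3 * n * n;
}
```

For the single-precision complex routine, a caller would multiply these counts by `sizeof(magmaFloatComplex)` when allocating; the same formulas apply to the other three precisions.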
magma_int_t magma_dgegqr_expert_gpu_work | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDouble_ptr | dA, | ||
magma_int_t | ldda, | ||
void * | host_work, | ||
magma_int_t * | lwork_host, | ||
void * | device_work, | ||
magma_int_t * | lwork_device, | ||
magma_int_t * | info, | ||
magma_queue_t | queue ) |
DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in host_work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
This is an expert API that exposes more controls to the user.
[in] | ikind | INTEGER. Several versions are implemented, selected by the ikind value: 1: normal equations and SVD in an iterative process that makes the computation numerically accurate; 2: a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_dgeqr2x3_gpu) and magma_dorgqr; 3: Modified Gram-Schmidt (MGS).
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | DOUBLE PRECISION array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | host_work | CPU workspace, size determined by lwork_host. On exit, the first n^2 DOUBLE PRECISION elements hold the n-by-n matrix R. Preferably, for higher performance, host_work should be in pinned memory. |
[in,out] | lwork_host | INTEGER pointer The size of the CPU workspace (host_work) in bytes
|
device_work | GPU workspace, size determined by lwork_device | |
[in,out] | lwork_device | INTEGER pointer The size of the GPU workspace (device_work) in bytes
|
[out] | info | INTEGER
|
[in] | queue | magma_queue_t
|
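Because the expert API passes host_work as an untyped pointer with its size in bytes, reading R back takes a cast and a size check. The sketch below is plain C with hypothetical helper names (`extract_r`, `r_at`); it assumes R is stored column-major with leading dimension n, which the description above (the first n^2 elements hold R) suggests but does not state explicitly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of MAGMA): copy the n-by-n factor R
 * out of the first n^2 double-precision elements of host_work.
 * Returns 0 on success, -1 if a pointer is NULL or the workspace
 * (size given in bytes) is too small to contain R. */
int extract_r(const void *host_work, size_t lwork_host_bytes,
              size_t n, double *R)
{
    size_t need = n * n * sizeof(double);
    if (host_work == NULL || R == NULL || lwork_host_bytes < need)
        return -1;
    memcpy(R, host_work, need);
    return 0;
}

/* Element R(i,j), assuming column-major storage with leading dimension n. */
double r_at(const double *R, size_t n, size_t i, size_t j)
{
    return R[i + j * n];
}
```

Copying R out before reusing the workspace keeps the factor available if host_work is passed to a later call.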
magma_int_t magma_dgegqr_gpu | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDouble_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaDouble_ptr | dwork, | ||
double * | work, | ||
magma_int_t * | info ) |
DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
[in] | ikind | INTEGER. Several versions are implemented, selected by the ikind value: 1: normal equations and SVD in an iterative process that makes the computation numerically accurate; 2: a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_dgeqr2x3_gpu) and magma_dorgqr; 3: Modified Gram-Schmidt (MGS).
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | DOUBLE PRECISION array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
dwork | (GPU workspace) DOUBLE PRECISION array, dimension: n^2 for ikind = 1; 3 n^2 + min(m, n) + 2 for ikind = 2; 0 (not used) for ikind = 3; n^2 for ikind = 4. | |
[out] | work | (CPU workspace) DOUBLE PRECISION array. The workspace size for ikind = 1 has changed as of release 2.9.0. Dimension: 5 n^2 + 7n + 64 for ikind = 1 (not backward compatible); 3 n^2 otherwise (backward compatible). On exit, work(1:n^2) holds the n-by-n matrix R. Preferably, for higher performance, work should be in pinned memory. |
[out] | info | INTEGER
|
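One way to sanity-check the result of A = Q * R is to measure how far Q^T Q is from the identity and how far Q * R is from the original A. The sketch below does this on the CPU in plain C. It assumes that Q and a saved copy of A have been copied back to host memory in column-major layout, and that R is stored column-major; the function names are hypothetical and not part of MAGMA.

```c
#include <stddef.h>

/* Absolute value without depending on the math library. */
static double abs_d(double x) { return x < 0.0 ? -x : x; }

/* Largest entry of |Q^T Q - I| for a column-major m-by-n matrix Q
 * with leading dimension ldq. Only the upper triangle is visited,
 * since Q^T Q is symmetric. */
double max_orth_error(size_t m, size_t n, const double *Q, size_t ldq)
{
    double worst = 0.0;
    for (size_t j = 0; j < n; ++j) {
        for (size_t k = 0; k <= j; ++k) {
            double dot = 0.0;
            for (size_t i = 0; i < m; ++i)
                dot += Q[i + k * ldq] * Q[i + j * ldq];
            double err = abs_d(dot - (j == k ? 1.0 : 0.0));
            if (err > worst) worst = err;
        }
    }
    return worst;
}

/* Largest entry of |A - Q R|, where R is n-by-n, column-major,
 * with leading dimension ldr. */
double max_residual(size_t m, size_t n, const double *A, size_t lda,
                    const double *Q, size_t ldq,
                    const double *R, size_t ldr)
{
    double worst = 0.0;
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double qr = 0.0;
            for (size_t k = 0; k < n; ++k)
                qr += Q[i + k * ldq] * R[k + j * ldr];
            double err = abs_d(A[i + j * lda] - qr);
            if (err > worst) worst = err;
        }
    }
    return worst;
}
```

Both values should be small multiples of machine precision for a well-conditioned A; comparing them across ikind values is a simple way to see how the variants differ in accuracy on a given input.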
magma_int_t magma_sgegqr_expert_gpu_work | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
void * | host_work, | ||
magma_int_t * | lwork_host, | ||
void * | device_work, | ||
magma_int_t * | lwork_device, | ||
magma_int_t * | info, | ||
magma_queue_t | queue ) |
SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in host_work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
This is an expert API that exposes more controls to the user.
[in] | ikind | INTEGER. Several versions are implemented, selected by the ikind value: 1: normal equations and SVD in an iterative process that makes the computation numerically accurate; 2: a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_sgeqr2x3_gpu) and magma_sorgqr; 3: Modified Gram-Schmidt (MGS).
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | host_work | CPU workspace, size determined by lwork_host. On exit, the first n^2 REAL elements hold the n-by-n matrix R. Preferably, for higher performance, host_work should be in pinned memory. |
[in,out] | lwork_host | INTEGER pointer The size of the CPU workspace (host_work) in bytes
|
device_work | GPU workspace, size determined by lwork_device | |
[in,out] | lwork_device | INTEGER pointer The size of the GPU workspace (device_work) in bytes
|
[out] | info | INTEGER
|
[in] | queue | magma_queue_t
|
magma_int_t magma_sgegqr_gpu | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaFloat_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloat_ptr | dwork, | ||
float * | work, | ||
magma_int_t * | info ) |
SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
[in] | ikind | INTEGER. Several versions are implemented, selected by the ikind value: 1: normal equations and SVD in an iterative process that makes the computation numerically accurate; 2: a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_sgeqr2x3_gpu) and magma_sorgqr; 3: Modified Gram-Schmidt (MGS).
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | REAL array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
dwork | (GPU workspace) REAL array, dimension: n^2 for ikind = 1; 3 n^2 + min(m, n) + 2 for ikind = 2; 0 (not used) for ikind = 3; n^2 for ikind = 4. | |
[out] | work | (CPU workspace) REAL array. The workspace size for ikind = 1 has changed as of release 2.9.0. Dimension: 5 n^2 + 7n + 64 for ikind = 1 (not backward compatible); 3 n^2 otherwise (backward compatible). On exit, work(1:n^2) holds the n-by-n matrix R. Preferably, for higher performance, work should be in pinned memory. |
[out] | info | INTEGER
|
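The ldda description above combines a hard constraint (LDDA >= max(1,m)) with an alignment recommendation (divisible by 16). A small rounding helper covers both; the sketch below is plain C, and the name `padded_ldda` is hypothetical rather than a MAGMA function.

```c
#include <assert.h>

/* Hypothetical helper (not part of MAGMA): smallest multiple of
 * `align` that is >= max(1, m). With align = 16 the result satisfies
 * both the LDDA >= max(1,m) constraint and the divisibility
 * recommendation. */
long padded_ldda(long m, long align)
{
    assert(m >= 0 && align > 0);
    long lower = m > 1 ? m : 1;
    return ((lower + align - 1) / align) * align;
}
```

The device array then needs room for ldda * n elements rather than m * n, since each column is padded to the leading dimension.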
magma_int_t magma_zgegqr_expert_gpu_work | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDoubleComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
void * | host_work, | ||
magma_int_t * | lwork_host, | ||
void * | device_work, | ||
magma_int_t * | lwork_device, | ||
magma_int_t * | info, | ||
magma_queue_t | queue ) |
ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in host_work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
This is an expert API that exposes more controls to the user.
[in] | ikind | INTEGER. Several versions are implemented, selected by the ikind value: 1: normal equations and SVD in an iterative process that makes the computation numerically accurate; 2: a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_zgeqr2x3_gpu) and magma_zungqr; 3: Modified Gram-Schmidt (MGS).
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | COMPLEX_16 array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | host_work | CPU workspace, size determined by lwork_host. On exit, the first n^2 COMPLEX_16 elements hold the n-by-n matrix R. Preferably, for higher performance, host_work should be in pinned memory. |
[in,out] | lwork_host | INTEGER pointer The size of the CPU workspace (host_work) in bytes
|
device_work | GPU workspace, size determined by lwork_device | |
[in,out] | lwork_device | INTEGER pointer The size of the GPU workspace (device_work) in bytes
|
[out] | info | INTEGER
|
[in] | queue | magma_queue_t
|
magma_int_t magma_zgegqr_gpu | ( | magma_int_t | ikind, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDoubleComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex_ptr | dwork, | ||
magmaDoubleComplex * | work, | ||
magma_int_t * | info ) |
ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:
A = Q * R.
On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.
With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate.
[in] | ikind | INTEGER. Several versions are implemented, selected by the ikind value: 1: normal equations and SVD in an iterative process that makes the computation numerically accurate; 2: a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_zgeqr2x3_gpu) and magma_zungqr; 3: Modified Gram-Schmidt (MGS).
|
[in] | m | INTEGER The number of rows of the matrix A. m >= n >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. 128 >= n >= 0. |
[in,out] | dA | COMPLEX_16 array on the GPU, dimension (ldda,n) On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
dwork | (GPU workspace) COMPLEX_16 array, dimension: n^2 for ikind = 1; 3 n^2 + min(m, n) + 2 for ikind = 2; 0 (not used) for ikind = 3; n^2 for ikind = 4. | |
[out] | work | (CPU workspace) COMPLEX_16 array. The workspace size for ikind = 1 has changed as of release 2.9.0. Dimension: 5 n^2 + 7n + 64 for ikind = 1 (not backward compatible); 3 n^2 otherwise (backward compatible). On exit, work(1:n^2) holds the n-by-n matrix R. Preferably, for higher performance, work should be in pinned memory. |
[out] | info | INTEGER
|
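The ikind = 1 description ("normal equations and SVD in an iterative process") is terse, so the following worked equations sketch one standard way such a scheme can be built. This is an illustration under stated assumptions, not a statement of what the MAGMA implementation does internally; the symbols G, V, Sigma and Q-hat are introduced here for the derivation only.

```latex
\begin{aligned}
G &= A^{H} A = V \Sigma V^{H}
  && \text{(normal equations: Gram matrix and its SVD)} \\
\Sigma^{1/2} V^{H} &= \hat{Q} R
  && \text{(QR factorization of an } n \times n \text{ matrix)} \\
R^{H} R &= V \Sigma^{1/2} \hat{Q} \hat{Q}^{H} \Sigma^{1/2} V^{H} = G \\
Q &= A R^{-1}, \qquad Q^{H} Q = R^{-H} G R^{-1} = I
\end{aligned}
```

In exact arithmetic a single pass gives A = Q * R with orthonormal columns in Q. In floating point, forming G squares the condition number of A, so Q may lose orthogonality. Repeating the step on the computed Q and accumulating the small triangular factors is one plausible reading of the "iterative process" mentioned above.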