Functions
magma_int_t	magma_cgegqr_expert_gpu_work (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, void *host_work, magma_int_t *lwork_host, void *device_work, magma_int_t *lwork_device, magma_int_t *info, magma_queue_t queue)
	CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: A = Q * R.

magma_int_t	magma_cgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dwork, magmaFloatComplex *work, magma_int_t *info)
	CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: A = Q * R.

magma_int_t	magma_dgegqr_expert_gpu_work (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, void *host_work, magma_int_t *lwork_host, void *device_work, magma_int_t *lwork_device, magma_int_t *info, magma_queue_t queue)
	DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: A = Q * R.

magma_int_t	magma_dgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dwork, double *work, magma_int_t *info)
	DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: A = Q * R.

magma_int_t	magma_sgegqr_expert_gpu_work (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, void *host_work, magma_int_t *lwork_host, void *device_work, magma_int_t *lwork_device, magma_int_t *info, magma_queue_t queue)
	SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: A = Q * R.

magma_int_t	magma_sgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dwork, float *work, magma_int_t *info)
	SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A: A = Q * R.

magma_int_t	magma_zgegqr_expert_gpu_work (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, void *host_work, magma_int_t *lwork_host, void *device_work, magma_int_t *lwork_device, magma_int_t *info, magma_queue_t queue)
	ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: A = Q * R.

magma_int_t	magma_zgegqr_gpu (magma_int_t ikind, magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dwork, magmaDoubleComplex *work, magma_int_t *info)
	ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A: A = Q * R.

Detailed Description

Function Documentation

◆ magma_cgegqr_expert_gpu_work()

magma_int_t magma_cgegqr_expert_gpu_work	(	magma_int_t	ikind,
		magma_int_t	m,
		magma_int_t	n,
		magmaFloatComplex_ptr	dA,
		magma_int_t	ldda,
		void *	host_work,
		magma_int_t *	lwork_host,
		void *	device_work,
		magma_int_t *	lwork_device,
		magma_int_t *	info,
		magma_queue_t	queue )

CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in host_work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate; other ikind values select the alternative methods described under Parameters.

This is an expert API that exposes more control to the user: the caller supplies the host and device workspaces and the queue.

Parameters

[in]	ikind	INTEGER Several versions are implemented, indicated by the ikind value: 1: Uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: Uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_cgeqr2x3_gpu) and magma_cungqr. 3: Modified Gram-Schmidt (MGS). 4: Cholesky QR [ Note: this method uses the normal equations, which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ].
[in]	m	INTEGER The number of rows of the matrix A. m >= n >= 0.
[in]	n	INTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]	dA	COMPLEX array on the GPU, dimension (ldda,n). On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]	ldda	INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[out]	host_work	CPU workspace, size determined by lwork_host. On exit, the first n^2 COMPLEX elements hold the n-by-n matrix R. Preferably, for higher performance, host_work should be in pinned memory.
[in,out]	lwork_host	INTEGER pointer. The size of the CPU workspace (host_work) in bytes. lwork_host[0] < 0: a workspace query is assumed; the routine calculates the required amount of workspace and returns it in lwork_host. The workspace itself is not referenced, and no computation is performed. lwork_host[0] >= 0: the routine assumes that the user has provided a workspace with the size in lwork_host.
	device_work	GPU workspace, size determined by lwork_device.
[in,out]	lwork_device	INTEGER pointer. The size of the GPU workspace (device_work) in bytes. lwork_device[0] < 0: a workspace query is assumed; the routine calculates the required amount of workspace and returns it in lwork_device. The workspace itself is not referenced, and no computation is performed. lwork_device[0] >= 0: the routine assumes that the user has provided a workspace with the size in lwork_device.
[out]	info	INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation. > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. In all these cases the rank (< n) of the space is returned.
[in]	queue	magma_queue_t created and destroyed by the user outside the routine.
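
The two-call pattern described for lwork_host and lwork_device can be sketched as follows. Because the real routine needs MAGMA and a GPU, this sketch substitutes a hypothetical stand-in, stub_gegqr_work, that only mimics the documented size contract. Its reported sizes, its treatment of either negative size as a query, its error code, and the use of malloc in place of pinned-host and device allocation are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in, NOT a MAGMA routine. It reproduces only the
 * documented workspace contract: negative size = query, otherwise the
 * caller-provided sizes are checked. */
static int stub_gegqr_work(int n,
                           void *host_work, long *lwork_host,
                           void *device_work, long *lwork_device,
                           int *info)
{
    /* Placeholder requirements in bytes; real sizes come from the query. */
    const long need_host = (long)n * n * 2 * (long)sizeof(float);
    const long need_dev  = (long)n * n * 2 * (long)sizeof(float);

    *info = 0;
    if (*lwork_host < 0 || *lwork_device < 0) {
        /* Workspace query: report sizes, touch no workspace, compute nothing. */
        *lwork_host   = need_host;
        *lwork_device = need_dev;
        return *info;
    }
    if (host_work == NULL || device_work == NULL ||
        *lwork_host < need_host || *lwork_device < need_dev) {
        *info = -1;  /* placeholder error code */
        return *info;
    }
    /* ... the factorization itself would run here ... */
    return *info;
}

/* Query first, allocate what was reported, then call again to compute. */
int gegqr_two_call(int n)
{
    long lwork_host = -1, lwork_device = -1;
    int info = 0;
    assert(n >= 0 && n <= 128);

    stub_gegqr_work(n, NULL, &lwork_host, NULL, &lwork_device, &info);
    if (info != 0) return info;

    /* malloc stands in for pinned-host and GPU allocation. */
    void *host_work   = malloc(lwork_host   > 0 ? (size_t)lwork_host   : 1);
    void *device_work = malloc(lwork_device > 0 ? (size_t)lwork_device : 1);
    if (host_work == NULL || device_work == NULL) {
        free(host_work);
        free(device_work);
        return -1;
    }
    stub_gegqr_work(n, host_work, &lwork_host, device_work, &lwork_device, &info);

    free(host_work);
    free(device_work);
    return info;
}
```

Setting both sizes to a negative value before the first call keeps the query unambiguous regardless of how the routine combines the two checks.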

◆ magma_cgegqr_gpu()

magma_int_t magma_cgegqr_gpu	(	magma_int_t	ikind,
		magma_int_t	m,
		magma_int_t	n,
		magmaFloatComplex_ptr	dA,
		magma_int_t	ldda,
		magmaFloatComplex_ptr	dwork,
		magmaFloatComplex *	work,
		magma_int_t *	info )

CGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate; other ikind values select the alternative methods described under Parameters.

Parameters

[in]	ikind	INTEGER Several versions are implemented, indicated by the ikind value: 1: Uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: Uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_cgeqr2x3_gpu) and magma_cungqr. 3: Modified Gram-Schmidt (MGS). 4: Cholesky QR [ Note: this method uses the normal equations, which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ].
[in]	m	INTEGER The number of rows of the matrix A. m >= n >= 0.
[in]	n	INTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]	dA	COMPLEX array on the GPU, dimension (ldda,n). On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]	ldda	INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
	dwork	(GPU workspace) COMPLEX array, dimension: n^2 for ikind = 1; 3 n^2 + min(m, n) + 2 for ikind = 2; 0 (not used) for ikind = 3; n^2 for ikind = 4.
[out]	work	(CPU workspace) COMPLEX array. The workspace size has changed for ikind = 1 since release 2.9.0: 5 n^2 + 7n + 64 for ikind = 1 (not backward compatible); 3 n^2 otherwise (backward compatible). On exit, work(1:n^2) holds the n-by-n matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]	info	INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation. > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. In all these cases the rank (< n) of the space is returned.
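
The dwork and work dimensions listed above depend on ikind, so it can help to compute them in one place before allocating. The following is a minimal sketch; the helper names gegqr_work_elems and gegqr_dwork_elems are hypothetical and not part of MAGMA. They return element counts, not bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Elements of the CPU workspace `work`:
 * 5 n^2 + 7n + 64 for ikind = 1, 3 n^2 otherwise. */
size_t gegqr_work_elems(int ikind, int n)
{
    assert(ikind >= 1 && ikind <= 4 && n >= 0 && n <= 128);
    size_t nn = (size_t)n * (size_t)n;
    return (ikind == 1) ? 5 * nn + 7 * (size_t)n + 64 : 3 * nn;
}

/* Elements of the GPU workspace `dwork` for each ikind. */
size_t gegqr_dwork_elems(int ikind, int m, int n)
{
    assert(ikind >= 1 && ikind <= 4 && n >= 0 && n <= 128 && m >= n);
    size_t nn = (size_t)n * (size_t)n;
    switch (ikind) {
    case 1:  return nn;
    case 2:  return 3 * nn + (size_t)(m < n ? m : n) + 2;
    case 3:  return 0;   /* dwork is not used */
    default: return nn;  /* ikind == 4 */
    }
}
```

To obtain a byte count, multiply the result by the element size of the precision in use, for example sizeof(magmaFloatComplex) for the routine documented here.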

◆ magma_dgegqr_expert_gpu_work()

magma_int_t magma_dgegqr_expert_gpu_work	(	magma_int_t	ikind,
		magma_int_t	m,
		magma_int_t	n,
		magmaDouble_ptr	dA,
		magma_int_t	ldda,
		void *	host_work,
		magma_int_t *	lwork_host,
		void *	device_work,
		magma_int_t *	lwork_device,
		magma_int_t *	info,
		magma_queue_t	queue )

DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in host_work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate; other ikind values select the alternative methods described under Parameters.

This is an expert API that exposes more control to the user: the caller supplies the host and device workspaces and the queue.

Parameters

[in]	ikind	INTEGER Several versions are implemented, indicated by the ikind value: 1: Uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: Uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_dgeqr2x3_gpu) and magma_dorgqr. 3: Modified Gram-Schmidt (MGS). 4: Cholesky QR [ Note: this method uses the normal equations, which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ].
[in]	m	INTEGER The number of rows of the matrix A. m >= n >= 0.
[in]	n	INTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]	dA	DOUBLE PRECISION array on the GPU, dimension (ldda,n). On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]	ldda	INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[out]	host_work	CPU workspace, size determined by lwork_host. On exit, the first n^2 DOUBLE PRECISION elements hold the n-by-n matrix R. Preferably, for higher performance, host_work should be in pinned memory.
[in,out]	lwork_host	INTEGER pointer. The size of the CPU workspace (host_work) in bytes. lwork_host[0] < 0: a workspace query is assumed; the routine calculates the required amount of workspace and returns it in lwork_host. The workspace itself is not referenced, and no computation is performed. lwork_host[0] >= 0: the routine assumes that the user has provided a workspace with the size in lwork_host.
	device_work	GPU workspace, size determined by lwork_device.
[in,out]	lwork_device	INTEGER pointer. The size of the GPU workspace (device_work) in bytes. lwork_device[0] < 0: a workspace query is assumed; the routine calculates the required amount of workspace and returns it in lwork_device. The workspace itself is not referenced, and no computation is performed. lwork_device[0] >= 0: the routine assumes that the user has provided a workspace with the size in lwork_device.
[out]	info	INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation. > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. In all these cases the rank (< n) of the space is returned.
[in]	queue	magma_queue_t created and destroyed by the user outside the routine.
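
The ldda requirement (at least max(1,m), and a multiple of 16 for coalesced access) amounts to a small rounding step before allocating dA. A minimal sketch follows; padded_ldda and dA_offset are hypothetical helper names, and the indexing assumes the column-major layout implied by the (ldda,n) dimension.

```c
#include <assert.h>

/* Smallest multiple of 16 that is >= max(1, m). */
int padded_ldda(int m)
{
    assert(m >= 0);
    int lower = (m > 1) ? m : 1;
    return ((lower + 15) / 16) * 16;
}

/* Linear offset of element (i, j), zero-based, in a column-major
 * array with leading dimension ldda. */
long dA_offset(int i, int j, int ldda)
{
    assert(i >= 0 && j >= 0 && i < ldda);
    return (long)i + (long)j * (long)ldda;
}
```

With this padding, dA needs ldda * n elements rather than m * n, and rows m through ldda - 1 of each column are unused.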

◆ magma_dgegqr_gpu()

magma_int_t magma_dgegqr_gpu	(	magma_int_t	ikind,
		magma_int_t	m,
		magma_int_t	n,
		magmaDouble_ptr	dA,
		magma_int_t	ldda,
		magmaDouble_ptr	dwork,
		double *	work,
		magma_int_t *	info )

DGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate; other ikind values select the alternative methods described under Parameters.

Parameters

[in]	ikind	INTEGER Several versions are implemented, indicated by the ikind value: 1: Uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: Uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_dgeqr2x3_gpu) and magma_dorgqr. 3: Modified Gram-Schmidt (MGS). 4: Cholesky QR [ Note: this method uses the normal equations, which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ].
[in]	m	INTEGER The number of rows of the matrix A. m >= n >= 0.
[in]	n	INTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]	dA	DOUBLE PRECISION array on the GPU, dimension (ldda,n). On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]	ldda	INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
	dwork	(GPU workspace) DOUBLE PRECISION array, dimension: n^2 for ikind = 1; 3 n^2 + min(m, n) + 2 for ikind = 2; 0 (not used) for ikind = 3; n^2 for ikind = 4.
[out]	work	(CPU workspace) DOUBLE PRECISION array. The workspace size has changed for ikind = 1 since release 2.9.0: 5 n^2 + 7n + 64 for ikind = 1 (not backward compatible); 3 n^2 otherwise (backward compatible). On exit, work(1:n^2) holds the n-by-n matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]	info	INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation. > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. In all these cases the rank (< n) of the space is returned.

◆ magma_sgegqr_expert_gpu_work()

magma_int_t magma_sgegqr_expert_gpu_work	(	magma_int_t	ikind,
		magma_int_t	m,
		magma_int_t	n,
		magmaFloat_ptr	dA,
		magma_int_t	ldda,
		void *	host_work,
		magma_int_t *	lwork_host,
		void *	device_work,
		magma_int_t *	lwork_device,
		magma_int_t *	info,
		magma_queue_t	queue )

SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in host_work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate; other ikind values select the alternative methods described under Parameters.

This is an expert API that exposes more control to the user: the caller supplies the host and device workspaces and the queue.

Parameters

[in]	ikind	INTEGER Several versions are implemented, indicated by the ikind value: 1: Uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: Uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_sgeqr2x3_gpu) and magma_sorgqr. 3: Modified Gram-Schmidt (MGS). 4: Cholesky QR [ Note: this method uses the normal equations, which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ].
[in]	m	INTEGER The number of rows of the matrix A. m >= n >= 0.
[in]	n	INTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]	dA	REAL array on the GPU, dimension (ldda,n). On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]	ldda	INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[out]	host_work	CPU workspace, size determined by lwork_host. On exit, the first n^2 REAL elements hold the n-by-n matrix R. Preferably, for higher performance, host_work should be in pinned memory.
[in,out]	lwork_host	INTEGER pointer. The size of the CPU workspace (host_work) in bytes. lwork_host[0] < 0: a workspace query is assumed; the routine calculates the required amount of workspace and returns it in lwork_host. The workspace itself is not referenced, and no computation is performed. lwork_host[0] >= 0: the routine assumes that the user has provided a workspace with the size in lwork_host.
	device_work	GPU workspace, size determined by lwork_device.
[in,out]	lwork_device	INTEGER pointer. The size of the GPU workspace (device_work) in bytes. lwork_device[0] < 0: a workspace query is assumed; the routine calculates the required amount of workspace and returns it in lwork_device. The workspace itself is not referenced, and no computation is performed. lwork_device[0] >= 0: the routine assumes that the user has provided a workspace with the size in lwork_device.
[out]	info	INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation. > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. In all these cases the rank (< n) of the space is returned.
[in]	queue	magma_queue_t created and destroyed by the user outside the routine.
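
Because info carries three different meanings depending on its sign, callers may want to decode it in one place. The sketch below does only that; the enum and the function gegqr_classify are hypothetical names, not part of MAGMA.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    GEGQR_OK,              /* info == 0 */
    GEGQR_ARG_OR_ERROR,    /* info <  0 */
    GEGQR_RANK_DEFICIENT   /* info >  0 */
} gegqr_status;

/* Classifies info and stores a detail value:
 *   info <  0: the index i of the offending argument (info == -i), which the
 *              documentation says may also signal another error such as a
 *              failed allocation;
 *   info >  0: the reported rank (< n) of the space;
 *   info == 0: zero. */
gegqr_status gegqr_classify(int info, int *detail)
{
    assert(detail != NULL);
    if (info == 0) {
        *detail = 0;
        return GEGQR_OK;
    }
    if (info < 0) {
        *detail = -info;
        return GEGQR_ARG_OR_ERROR;
    }
    *detail = info;
    return GEGQR_RANK_DEFICIENT;
}
```

A positive info means Q has not been computed, so dA should not be used as an orthogonal basis in that case.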

◆ magma_sgegqr_gpu()

magma_int_t magma_sgegqr_gpu	(	magma_int_t	ikind,
		magma_int_t	m,
		magma_int_t	n,
		magmaFloat_ptr	dA,
		magma_int_t	ldda,
		magmaFloat_ptr	dwork,
		float *	work,
		magma_int_t *	info )

SGEGQR orthogonalizes the N vectors given by a real M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate; other ikind values select the alternative methods described under Parameters.

Parameters

[in]	ikind	INTEGER Several versions are implemented, indicated by the ikind value: 1: Uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: Uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_sgeqr2x3_gpu) and magma_sorgqr. 3: Modified Gram-Schmidt (MGS). 4: Cholesky QR [ Note: this method uses the normal equations, which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ].
[in]	m	INTEGER The number of rows of the matrix A. m >= n >= 0.
[in]	n	INTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]	dA	REAL array on the GPU, dimension (ldda,n). On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]	ldda	INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
	dwork	(GPU workspace) REAL array, dimension: n^2 for ikind = 1; 3 n^2 + min(m, n) + 2 for ikind = 2; 0 (not used) for ikind = 3; n^2 for ikind = 4.
[out]	work	(CPU workspace) REAL array. The workspace size has changed for ikind = 1 since release 2.9.0: 5 n^2 + 7n + 64 for ikind = 1 (not backward compatible); 3 n^2 otherwise (backward compatible). On exit, work(1:n^2) holds the n-by-n matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]	info	INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation. > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. In all these cases the rank (< n) of the space is returned.

◆ magma_zgegqr_expert_gpu_work()

magma_int_t magma_zgegqr_expert_gpu_work	(	magma_int_t	ikind,
		magma_int_t	m,
		magma_int_t	n,
		magmaDoubleComplex_ptr	dA,
		magma_int_t	ldda,
		void *	host_work,
		magma_int_t *	lwork_host,
		void *	device_work,
		magma_int_t *	lwork_device,
		magma_int_t *	info,
		magma_queue_t	queue )

ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in host_work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate; other ikind values select the alternative methods described under Parameters.

This is an expert API that exposes more control to the user: the caller supplies the host and device workspaces and the queue.

Parameters

[in]	ikind	INTEGER Several versions are implemented, indicated by the ikind value: 1: Uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: Uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_zgeqr2x3_gpu) and magma_zungqr. 3: Modified Gram-Schmidt (MGS). 4: Cholesky QR [ Note: this method uses the normal equations, which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ].
[in]	m	INTEGER The number of rows of the matrix A. m >= n >= 0.
[in]	n	INTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]	dA	COMPLEX_16 array on the GPU, dimension (ldda,n). On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]	ldda	INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
[out]	host_work	CPU workspace, size determined by lwork_host. On exit, the first n^2 COMPLEX_16 elements hold the n-by-n matrix R. Preferably, for higher performance, host_work should be in pinned memory.
[in,out]	lwork_host	INTEGER pointer. The size of the CPU workspace (host_work) in bytes. lwork_host[0] < 0: a workspace query is assumed; the routine calculates the required amount of workspace and returns it in lwork_host. The workspace itself is not referenced, and no computation is performed. lwork_host[0] >= 0: the routine assumes that the user has provided a workspace with the size in lwork_host.
	device_work	GPU workspace, size determined by lwork_device.
[in,out]	lwork_device	INTEGER pointer. The size of the GPU workspace (device_work) in bytes. lwork_device[0] < 0: a workspace query is assumed; the routine calculates the required amount of workspace and returns it in lwork_device. The workspace itself is not referenced, and no computation is performed. lwork_device[0] >= 0: the routine assumes that the user has provided a workspace with the size in lwork_device.
[out]	info	INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation. > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. In all these cases the rank (< n) of the space is returned.
[in]	queue	magma_queue_t created and destroyed by the user outside the routine.

◆ magma_zgegqr_gpu()

magma_int_t magma_zgegqr_gpu	(	magma_int_t	ikind,
		magma_int_t	m,
		magma_int_t	n,
		magmaDoubleComplex_ptr	dA,
		magma_int_t	ldda,
		magmaDoubleComplex_ptr	dwork,
		magmaDoubleComplex *	work,
		magma_int_t *	info )

ZGEGQR orthogonalizes the N vectors given by a complex M-by-N matrix A:

A = Q * R.

On exit, if successful, the orthogonal vectors Q overwrite A and R is returned in work (in CPU memory). The routine is designed for tall-and-skinny matrices: M >> N, N <= 128.

With ikind = 1, the routine uses normal equations and SVD in an iterative process that makes the computation numerically accurate; other ikind values select the alternative methods described under Parameters.

Parameters

[in]	ikind	INTEGER Several versions are implemented, indicated by the ikind value: 1: Uses normal equations and SVD in an iterative process that makes the computation numerically accurate. 2: Uses a standard LAPACK-based orthogonalization through MAGMA's QR panel factorization (magma_zgeqr2x3_gpu) and magma_zungqr. 3: Modified Gram-Schmidt (MGS). 4: Cholesky QR [ Note: this method uses the normal equations, which squares the condition number of A, therefore ||I - Q'Q|| < O(eps cond(A)^2) ].
[in]	m	INTEGER The number of rows of the matrix A. m >= n >= 0.
[in]	n	INTEGER The number of columns of the matrix A. 128 >= n >= 0.
[in,out]	dA	COMPLEX_16 array on the GPU, dimension (ldda,n). On entry, the m-by-n matrix A. On exit, the m-by-n matrix Q with orthogonal columns.
[in]	ldda	INTEGER The leading dimension of the array dA. LDDA >= max(1,m). To benefit from coalesced memory accesses, LDDA must be divisible by 16.
	dwork	(GPU workspace) COMPLEX_16 array, dimension: n^2 for ikind = 1; 3 n^2 + min(m, n) + 2 for ikind = 2; 0 (not used) for ikind = 3; n^2 for ikind = 4.
[out]	work	(CPU workspace) COMPLEX_16 array. The workspace size has changed for ikind = 1 since release 2.9.0: 5 n^2 + 7n + 64 for ikind = 1 (not backward compatible); 3 n^2 otherwise (backward compatible). On exit, work(1:n^2) holds the n-by-n matrix R. Preferably, for higher performance, work should be in pinned memory.
[out]	info	INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value, or another error occurred, such as a failed memory allocation. > 0: for ikind = 1 and 4, the normal equations were not positive definite, so the factorization could not be completed, and the solution has not been computed. For ikind = 3, the space is not linearly independent. In all these cases the rank (< n) of the space is returned.
