MAGMA 2.9.0
Matrix Algebra for GPU and Multicore Architectures
Functions | |
magma_int_t | magma_cgeqrf_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_cgeqrf_expert_batched (magma_int_t m, magma_int_t n, magma_int_t nb, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dR_array, magma_int_t lddr, magmaFloatComplex **dT_array, magma_int_t lddt, magmaFloatComplex **dtau_array, magma_int_t provide_RT, magmaFloatComplex **dW_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_dgeqrf_batched (magma_int_t m, magma_int_t n, double **dA_array, magma_int_t ldda, double **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_dgeqrf_expert_batched (magma_int_t m, magma_int_t n, magma_int_t nb, double **dA_array, magma_int_t ldda, double **dR_array, magma_int_t lddr, double **dT_array, magma_int_t lddt, double **dtau_array, magma_int_t provide_RT, double **dW_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf_batched (magma_int_t m, magma_int_t n, float **dA_array, magma_int_t ldda, float **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf_expert_batched (magma_int_t m, magma_int_t n, magma_int_t nb, float **dA_array, magma_int_t ldda, float **dR_array, magma_int_t lddr, float **dT_array, magma_int_t lddt, float **dtau_array, magma_int_t provide_RT, float **dW_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_zgeqrf_batched (magma_int_t m, magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t ldda, magmaDoubleComplex **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_zgeqrf_expert_batched (magma_int_t m, magma_int_t n, magma_int_t nb, magmaDoubleComplex **dA_array, magma_int_t ldda, magmaDoubleComplex **dR_array, magma_int_t lddr, magmaDoubleComplex **dT_array, magma_int_t lddt, magmaDoubleComplex **dtau_array, magma_int_t provide_RT, magmaDoubleComplex **dW_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_cgeqrf_batched_smallsq (magma_int_t n, magmaFloatComplex **dA_array, magma_int_t Ai, magma_int_t Aj, magma_int_t ldda, magmaFloatComplex **dtau_array, magma_int_t taui, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_dgeqrf_batched_smallsq (magma_int_t n, double **dA_array, magma_int_t Ai, magma_int_t Aj, magma_int_t ldda, double **dtau_array, magma_int_t taui, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_sgeqrf_batched_smallsq (magma_int_t n, float **dA_array, magma_int_t Ai, magma_int_t Aj, magma_int_t ldda, float **dtau_array, magma_int_t taui, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. | |
magma_int_t | magma_zgeqrf_batched_smallsq (magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t Ai, magma_int_t Aj, magma_int_t ldda, magmaDoubleComplex **dtau_array, magma_int_t taui, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. | |
magma_int_t magma_cgeqrf_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaFloatComplex ** | dtau_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array, dimension (min(M,N)). The scalar factors of the elementary reflectors (see Further Details). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
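As a rough illustration of the `dA_array` and `ldda` conventions above, the following host-only C sketch pads a leading dimension to a multiple of 16 and builds an array of per-matrix pointers into one contiguous buffer. The helper names (`padded_ld`, `set_batch_pointers`, `column_major_index`) are hypothetical and not part of MAGMA, and `float complex` stands in for `magmaFloatComplex`; in real use the matrices would live in GPU memory.

```c
#include <complex.h>
#include <stddef.h>

/* Round x up to the next multiple of `multiple` (multiple > 0). */
size_t round_up(size_t x, size_t multiple)
{
    return ((x + multiple - 1) / multiple) * multiple;
}

/* Leading dimension that is >= max(1, m) and divisible by 16. */
size_t padded_ld(size_t m)
{
    return round_up(m > 0 ? m : 1, 16);
}

/*
 * Fill ptrs[0..batch_count-1] so that ptrs[b] points at matrix b inside one
 * contiguous buffer of batch_count column-major matrices, each occupying
 * ld * n elements.
 */
void set_batch_pointers(float complex **ptrs, float complex *base,
                        size_t ld, size_t n, size_t batch_count)
{
    for (size_t b = 0; b < batch_count; ++b)
        ptrs[b] = base + b * ld * n;
}

/* Linear offset of element (i, j), 0-based, in a column-major matrix. */
size_t column_major_index(size_t i, size_t j, size_t ld)
{
    return i + j * ld;
}
```

With this layout, element (i, j) of matrix b is `ptrs[b][column_major_index(i, j, ld)]`, and rows `m..ld-1` of each column are padding.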
magma_int_t magma_cgeqrf_expert_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | nb, | ||
magmaFloatComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaFloatComplex ** | dR_array, | ||
magma_int_t | lddr, | ||
magmaFloatComplex ** | dT_array, | ||
magma_int_t | lddt, | ||
magmaFloatComplex ** | dtau_array, | ||
magma_int_t | provide_RT, | ||
magmaFloatComplex ** | dW_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in] | nb | INTEGER The blocking size. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[in,out] | dR_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDR, N/NB). dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0. |
[in] | lddr | INTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1; otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDR must be divisible by 16. |
[in,out] | dT_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDT, N/NB). dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0. |
[in] | lddt | INTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDT must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array, dimension (min(M,N)). The scalar factors of the elementary reflectors (see Further Details). |
[in] | provide_RT | INTEGER provide_RT = 0: no R and no T in output; dR and dT are used as local workspace to store the R and T of each step. provide_RT = 1: the whole R of size (min(M,N), N) and the nb-by-nb block of T are provided in output. provide_RT = 2: the nb-by-nb diagonal block of R and of T are provided in output. |
[out] | dW_array | Array of pointers, dimension (2*batchCount). Each is a COMPLEX array on the GPU, dimension (LDDW, N), where LDDW >= NB. Used as a workspace. |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
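The lower bounds on `lddr` and `lddt` stated in the parameter list depend on `provide_RT` and `nb`. A minimal C sketch of those bounds follows; the function names `min_lddr` and `min_lddt` are hypothetical helpers, not MAGMA routines, and the values returned are only the documented minimums (a caller may still want to round them up to a multiple of 16).

```c
/* Smaller of two ints. */
int min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Documented lower bound for LDDR:
 *   min(M,N)           when provide_RT == 1,
 *   min(NB, min(M,N))  otherwise.
 */
int min_lddr(int m, int n, int nb, int provide_RT)
{
    int k = min_int(m, n);
    return provide_RT == 1 ? k : min_int(nb, k);
}

/* Documented lower bound for LDDT: min(NB, min(M,N)). */
int min_lddt(int m, int n, int nb)
{
    return min_int(nb, min_int(m, n));
}
```

For example, with m = 100, n = 40 and nb = 32, the bound on `lddr` is 40 when `provide_RT == 1` and 32 otherwise.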
magma_int_t magma_dgeqrf_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
double ** | dA_array, | ||
magma_int_t | ldda, | ||
double ** | dtau_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array, dimension (min(M,N)). The scalar factors of the elementary reflectors (see Further Details). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
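To make the compact storage of Q concrete, the host-side C sketch below applies Q or its transpose to a vector using only the reflector tails stored below the diagonal of A and the scalars in TAU. It is an illustration of the representation described above, not MAGMA code; the function names are hypothetical, indices are 0-based, and A is assumed to be column-major with leading dimension `lda`.

```c
/*
 * x := H(i) x with H(i) = I - tau * v * v', where v(0:i-1) = 0, v(i) = 1
 * and v(i+1:m-1) is read from A(i+1:m-1, i). A(i,i) holds an entry of R
 * and is deliberately not used as part of v.
 */
static void apply_reflector(int m, int i, const double *A, int lda,
                            double tau, double *x)
{
    double dot = x[i];                      /* contribution of v(i) = 1 */
    for (int r = i + 1; r < m; ++r)
        dot += A[r + i * lda] * x[r];       /* v' * x */
    dot *= tau;
    x[i] -= dot;
    for (int r = i + 1; r < m; ++r)
        x[r] -= dot * A[r + i * lda];       /* x -= tau * (v' x) * v */
}

/* x := Q x = H(0) H(1) ... H(k-1) x, so H(k-1) is applied first. */
void apply_q(int m, int k, const double *A, int lda,
             const double *tau, double *x)
{
    for (int i = k - 1; i >= 0; --i)
        apply_reflector(m, i, A, lda, tau[i], x);
}

/* x := Q' x = H(k-1) ... H(0) x; each real H(i) is symmetric. */
void apply_qt(int m, int k, const double *A, int lda,
              const double *tau, double *x)
{
    for (int i = 0; i < k; ++i)
        apply_reflector(m, i, A, lda, tau[i], x);
}
```

Because Q is orthogonal, `apply_q` followed by `apply_qt` should return the original vector up to rounding, which is a convenient sanity check on a factorization copied back from the GPU.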
magma_int_t magma_dgeqrf_expert_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | nb, | ||
double ** | dA_array, | ||
magma_int_t | ldda, | ||
double ** | dR_array, | ||
magma_int_t | lddr, | ||
double ** | dT_array, | ||
magma_int_t | lddt, | ||
double ** | dtau_array, | ||
magma_int_t | provide_RT, | ||
double ** | dW_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in] | nb | INTEGER The blocking size. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[in,out] | dR_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDR, N/NB). dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0. |
[in] | lddr | INTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1; otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDR must be divisible by 16. |
[in,out] | dT_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDT, N/NB). dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0. |
[in] | lddt | INTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDT must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array, dimension (min(M,N)). The scalar factors of the elementary reflectors (see Further Details). |
[in] | provide_RT | INTEGER provide_RT = 0: no R and no T in output; dR and dT are used as local workspace to store the R and T of each step. provide_RT = 1: the whole R of size (min(M,N), N) and the nb-by-nb block of T are provided in output. provide_RT = 2: the nb-by-nb diagonal block of R and of T are provided in output. |
[out] | dW_array | Array of pointers, dimension (2*batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDW, N), where LDDW >= NB. Used as a workspace. |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqrf_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
float ** | dA_array, | ||
magma_int_t | ldda, | ||
float ** | dtau_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a REAL array, dimension (min(M,N)). The scalar factors of the elementary reflectors (see Further Details). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqrf_expert_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | nb, | ||
float ** | dA_array, | ||
magma_int_t | ldda, | ||
float ** | dR_array, | ||
magma_int_t | lddr, | ||
float ** | dT_array, | ||
magma_int_t | lddt, | ||
float ** | dtau_array, | ||
magma_int_t | provide_RT, | ||
float ** | dW_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in] | nb | INTEGER The blocking size. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[in,out] | dR_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDR, N/NB). dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0. |
[in] | lddr | INTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1; otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDR must be divisible by 16. |
[in,out] | dT_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDT, N/NB). dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0. |
[in] | lddt | INTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDT must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a REAL array, dimension (min(M,N)). The scalar factors of the elementary reflectors (see Further Details). |
[in] | provide_RT | INTEGER provide_RT = 0: no R and no T in output; dR and dT are used as local workspace to store the R and T of each step. provide_RT = 1: the whole R of size (min(M,N), N) and the nb-by-nb block of T are provided in output. provide_RT = 2: the nb-by-nb diagonal block of R and of T are provided in output. |
[out] | dW_array | Array of pointers, dimension (2*batchCount). Each is a REAL array on the GPU, dimension (LDDW, N), where LDDW >= NB. Used as a workspace. |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex ** | dtau_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array, dimension (min(M,N)). The scalar factors of the elementary reflectors (see Further Details). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
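In the complex case tau may be complex, so H(i) is unitary but generally not Hermitian, and applying the conjugate transpose of H(i) amounts to using conj(tau). The C99 sketch below illustrates this on the host, assuming v' denotes the conjugate transpose as in LAPACK's complex routines. The function name `apply_reflector_z` is hypothetical, `double complex` stands in for `magmaDoubleComplex`, and the values of v and tau in the usage note are chosen by hand so that H is unitary; they are not output from MAGMA.

```c
#include <complex.h>

/*
 * x := (I - tau * v * v^H) x for a reflector of length len, where
 * v = (1, vtail[0], ..., vtail[len-2]). Passing conj(tau) instead of tau
 * applies the conjugate transpose of the same reflector.
 */
void apply_reflector_z(int len, const double complex *vtail,
                       double complex tau, double complex *x)
{
    double complex dot = x[0];              /* conj(v(1)) * x(1), v(1) = 1 */
    for (int r = 1; r < len; ++r)
        dot += conj(vtail[r - 1]) * x[r];   /* v^H * x */
    dot *= tau;
    x[0] -= dot;
    for (int r = 1; r < len; ++r)
        x[r] -= dot * vtail[r - 1];         /* x -= tau * (v^H x) * v */
}
```

For instance, with v = (1, 1+i) and tau = (1+i)/3, the condition 2 Re(tau) = |tau|^2 (v^H v) holds, so H is unitary: applying H with `tau` and then with `conj(tau)` returns the original vector up to rounding.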
magma_int_t magma_zgeqrf_expert_batched | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | nb, | ||
magmaDoubleComplex ** | dA_array, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex ** | dR_array, | ||
magma_int_t | lddr, | ||
magmaDoubleComplex ** | dT_array, | ||
magma_int_t | lddt, | ||
magmaDoubleComplex ** | dtau_array, | ||
magma_int_t | provide_RT, | ||
magmaDoubleComplex ** | dW_array, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
[in] | m | INTEGER The number of rows of the matrix A. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix A. N >= 0. |
[in] | nb | INTEGER The blocking size. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of min(m,n) elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[in,out] | dR_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDR, N/NB). dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0. |
[in] | lddr | INTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1; otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDR must be divisible by 16. |
[in,out] | dT_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDT, N/NB). dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0. |
[in] | lddt | INTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalesced memory accesses, LDDT must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array, dimension (min(M,N)). The scalar factors of the elementary reflectors (see Further Details). |
[in] | provide_RT | INTEGER provide_RT = 0: no R and no T in output; dR and dT are used as local workspace to store the R and T of each step. provide_RT = 1: the whole R of size (min(M,N), N) and the nb-by-nb block of T are provided in output. provide_RT = 2: the nb-by-nb diagonal block of R and of T are provided in output. |
[out] | dW_array | Array of pointers, dimension (2*batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDW, N), where LDDW >= NB. Used as a workspace. |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_cgeqrf_batched_smallsq | ( | magma_int_t | n, |
magmaFloatComplex ** | dA_array, | ||
magma_int_t | Ai, | ||
magma_int_t | Aj, | ||
magma_int_t | ldda, | ||
magmaFloatComplex ** | dtau_array, | ||
magma_int_t | taui, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
This is a batched version of the routine, and works only for small square matrices of size up to 32.
[in] | n | INTEGER The size of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, the N-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the N-by-N upper triangular matrix R; the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of N elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a COMPLEX array, dimension (N). The scalar factors of the elementary reflectors (see Further Details). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_dgeqrf_batched_smallsq | ( | magma_int_t | n, |
double ** | dA_array, | ||
magma_int_t | Ai, | ||
magma_int_t | Aj, | ||
magma_int_t | ldda, | ||
double ** | dtau_array, | ||
magma_int_t | taui, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
This is a batched version of the routine, and works only for small square matrices of size up to 32.
[in] | n | INTEGER The size of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, the N-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the N-by-N upper triangular matrix R; the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of N elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array, dimension (N). The scalar factors of the elementary reflectors (see Further Details). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqrf_batched_smallsq | ( | magma_int_t | n, |
float ** | dA_array, | ||
magma_int_t | Ai, | ||
magma_int_t | Aj, | ||
magma_int_t | ldda, | ||
float ** | dtau_array, | ||
magma_int_t | taui, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.
This is a batched version of the routine, and works only for small square matrices of size up to 32.
[in] | n | INTEGER The size of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, the N-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the N-by-N upper triangular matrix R; the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of N elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a REAL array, dimension (N). The scalar factors of the elementary reflectors (see Further Details). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqrf_batched_smallsq | ( | magma_int_t | n, |
magmaDoubleComplex ** | dA_array, | ||
magma_int_t | Ai, | ||
magma_int_t | Aj, | ||
magma_int_t | ldda, | ||
magmaDoubleComplex ** | dtau_array, | ||
magma_int_t | taui, | ||
magma_int_t * | info_array, | ||
magma_int_t | batchCount, | ||
magma_queue_t | queue ) |
ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.
This is a batched version of the routine, and works only for small square matrices of size up to 32.
[in] | n | INTEGER The size of the matrix A. N >= 0. |
[in,out] | dA_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, the N-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the N-by-N upper triangular matrix R; the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of N elementary reflectors (see Further Details). |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,N). To benefit from coalesced memory accesses, LDDA must be divisible by 16. |
[out] | dtau_array | Array of pointers, dimension (batchCount). Each is a COMPLEX_16 array, dimension (N). The scalar factors of the elementary reflectors (see Further Details). |
[out] | info_array | Array of INTEGERs, dimension (batchCount), for corresponding matrices. |
[in] | batchCount | INTEGER The number of matrices to operate on. |
[in] | queue | magma_queue_t Queue to execute in. |
The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).