MAGMA  2.7.1
Matrix Algebra for GPU and Multicore Architectures
 All Classes Files Functions Friends Groups Pages
geqrf: QR factorization

Functions

magma_int_t magma_cgeqrf_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_cgeqrf_expert_batched (magma_int_t m, magma_int_t n, magma_int_t nb, magmaFloatComplex **dA_array, magma_int_t ldda, magmaFloatComplex **dR_array, magma_int_t lddr, magmaFloatComplex **dT_array, magma_int_t lddt, magmaFloatComplex **dtau_array, magma_int_t provide_RT, magmaFloatComplex **dW_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_dgeqrf_batched (magma_int_t m, magma_int_t n, double **dA_array, magma_int_t ldda, double **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_dgeqrf_expert_batched (magma_int_t m, magma_int_t n, magma_int_t nb, double **dA_array, magma_int_t ldda, double **dR_array, magma_int_t lddr, double **dT_array, magma_int_t lddt, double **dtau_array, magma_int_t provide_RT, double **dW_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_sgeqrf_batched (magma_int_t m, magma_int_t n, float **dA_array, magma_int_t ldda, float **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_sgeqrf_expert_batched (magma_int_t m, magma_int_t n, magma_int_t nb, float **dA_array, magma_int_t ldda, float **dR_array, magma_int_t lddr, float **dT_array, magma_int_t lddt, float **dtau_array, magma_int_t provide_RT, float **dW_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqrf_batched (magma_int_t m, magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t ldda, magmaDoubleComplex **dtau_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqrf_expert_batched (magma_int_t m, magma_int_t n, magma_int_t nb, magmaDoubleComplex **dA_array, magma_int_t ldda, magmaDoubleComplex **dR_array, magma_int_t lddr, magmaDoubleComplex **dT_array, magma_int_t lddt, magmaDoubleComplex **dtau_array, magma_int_t provide_RT, magmaDoubleComplex **dW_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_cgeqrf_batched_smallsq (magma_int_t n, magmaFloatComplex **dA_array, magma_int_t Ai, magma_int_t Aj, magma_int_t ldda, magmaFloatComplex **dtau_array, magma_int_t taui, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_dgeqrf_batched_smallsq (magma_int_t n, double **dA_array, magma_int_t Ai, magma_int_t Aj, magma_int_t ldda, double **dtau_array, magma_int_t taui, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_sgeqrf_batched_smallsq (magma_int_t n, float **dA_array, magma_int_t Ai, magma_int_t Aj, magma_int_t ldda, float **dtau_array, magma_int_t taui, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R. More...
 
magma_int_t magma_zgeqrf_batched_smallsq (magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t Ai, magma_int_t Aj, magma_int_t ldda, magmaDoubleComplex **dtau_array, magma_int_t taui, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R. More...
 

Detailed Description

Function Documentation

magma_int_t magma_cgeqrf_batched ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  ldda,
magmaFloatComplex **  dtau_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqrf_expert_batched ( magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaFloatComplex **  dA_array,
magma_int_t  ldda,
magmaFloatComplex **  dR_array,
magma_int_t  lddr,
magmaFloatComplex **  dT_array,
magma_int_t  lddt,
magmaFloatComplex **  dtau_array,
magma_int_t  provide_RT,
magmaFloatComplex **  dW_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in]nbINTEGER The blocking size.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[in,out]dR_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDR, N/NB) dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0.
[in]lddrINTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1 otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16.
[in,out]dT_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDT, N/NB) dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0.
[in]lddtINTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[in]provide_RTINTEGER provide_RT = 0 no R and no T in output. dR and dT are used as local workspace to store the R and T of each step. provide_RT = 1 the whole R of size (min(M,N), N) and the nbxnb block of T are provided in output. provide_RT = 2 the nbxnb diag block of R and of T are provided in output.
[out]dW_arrayArray of pointers, dimension (2*batchCount). Each is a COMPLEX array on the GPU, dimension (LDDW, N), where LDDW >= NB. Used as a workspace.
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf_batched ( magma_int_t  m,
magma_int_t  n,
double **  dA_array,
magma_int_t  ldda,
double **  dtau_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf_expert_batched ( magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
double **  dA_array,
magma_int_t  ldda,
double **  dR_array,
magma_int_t  lddr,
double **  dT_array,
magma_int_t  lddt,
double **  dtau_array,
magma_int_t  provide_RT,
double **  dW_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in]nbINTEGER The blocking size.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[in,out]dR_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDR, N/NB) dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0.
[in]lddrINTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1 otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16.
[in,out]dT_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDT, N/NB) dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0.
[in]lddtINTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[in]provide_RTINTEGER provide_RT = 0 no R and no T in output. dR and dT are used as local workspace to store the R and T of each step. provide_RT = 1 the whole R of size (min(M,N), N) and the nbxnb block of T are provided in output. provide_RT = 2 the nbxnb diag block of R and of T are provided in output.
[out]dW_arrayArray of pointers, dimension (2*batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDW, N), where LDDW >= NB. Used as a workspace.
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_sgeqrf_batched ( magma_int_t  m,
magma_int_t  n,
float **  dA_array,
magma_int_t  ldda,
float **  dtau_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_sgeqrf_expert_batched ( magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
float **  dA_array,
magma_int_t  ldda,
float **  dR_array,
magma_int_t  lddr,
float **  dT_array,
magma_int_t  lddt,
float **  dtau_array,
magma_int_t  provide_RT,
float **  dW_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in]nbINTEGER The blocking size.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[in,out]dR_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDR, N/NB) dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0.
[in]lddrINTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1 otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16.
[in,out]dT_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDT, N/NB) dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0.
[in]lddtINTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[in]provide_RTINTEGER provide_RT = 0 no R and no T in output. dR and dT are used as local workspace to store the R and T of each step. provide_RT = 1 the whole R of size (min(M,N), N) and the nbxnb block of T are provided in output. provide_RT = 2 the nbxnb diag block of R and of T are provided in output.
[out]dW_arrayArray of pointers, dimension (2*batchCount). Each is a REAL array on the GPU, dimension (LDDW, N), where LDDW >= NB. Used as a workspace.
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf_batched ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex **  dA_array,
magma_int_t  ldda,
magmaDoubleComplex **  dtau_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf_expert_batched ( magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaDoubleComplex **  dA_array,
magma_int_t  ldda,
magmaDoubleComplex **  dR_array,
magma_int_t  lddr,
magmaDoubleComplex **  dT_array,
magma_int_t  lddt,
magmaDoubleComplex **  dtau_array,
magma_int_t  provide_RT,
magmaDoubleComplex **  dW_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

Parameters
[in]mINTEGER The number of rows of the matrix A. M >= 0.
[in]nINTEGER The number of columns of the matrix A. N >= 0.
[in]nbINTEGER The blocking size.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[in,out]dR_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDR, N/NB) dR should be of size (LDDR, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of R are stored in dR only when provide_RT > 0.
[in]lddrINTEGER The leading dimension of the array dR. LDDR >= min(M,N) when provide_RT == 1 otherwise LDDR >= min(NB, min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16.
[in,out]dT_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDT, N/NB) dT should be of size (LDDT, N) when provide_RT > 0 and of size (LDDT, NB) otherwise. NB is the local blocking size. On exit, the elements of T are stored in dT only when provide_RT > 0.
[in]lddtINTEGER The leading dimension of the array dT. LDDT >= min(NB,min(M,N)). NB is the local blocking size. To benefit from coalescent memory accesses LDDR must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[in]provide_RTINTEGER provide_RT = 0 no R and no T in output. dR and dT are used as local workspace to store the R and T of each step. provide_RT = 1 the whole R of size (min(M,N), N) and the nbxnb block of T are provided in output. provide_RT = 2 the nbxnb diag block of R and of T are provided in output.
[out]dW_arrayArray of pointers, dimension (2*batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDW, N), where LDDW >= NB. Used as a workspace.
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_cgeqrf_batched_smallsq ( magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  Ai,
magma_int_t  Aj,
magma_int_t  ldda,
magmaFloatComplex **  dtau_array,
magma_int_t  taui,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

CGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This is a batched version of the routine, and works only for small square matrices of size up to 32.

Parameters
[in]nINTEGER The size of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_dgeqrf_batched_smallsq ( magma_int_t  n,
double **  dA_array,
magma_int_t  Ai,
magma_int_t  Aj,
magma_int_t  ldda,
double **  dtau_array,
magma_int_t  taui,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

DGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

This is a batched version of the routine, and works only for small square matrices of size up to 32.

Parameters
[in]nINTEGER The size of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_sgeqrf_batched_smallsq ( magma_int_t  n,
float **  dA_array,
magma_int_t  Ai,
magma_int_t  Aj,
magma_int_t  ldda,
float **  dtau_array,
magma_int_t  taui,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

SGEQRF computes a QR factorization of a real M-by-N matrix A: A = Q * R.

This is a batched version of the routine, and works only for small square matrices of size up to 32.

Parameters
[in]nINTEGER The size of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).

magma_int_t magma_zgeqrf_batched_smallsq ( magma_int_t  n,
magmaDoubleComplex **  dA_array,
magma_int_t  Ai,
magma_int_t  Aj,
magma_int_t  ldda,
magmaDoubleComplex **  dtau_array,
magma_int_t  taui,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

ZGEQRF computes a QR factorization of a complex M-by-N matrix A: A = Q * R.

This is a batched version of the routine, and works only for small square matrices of size up to 32.

Parameters
[in]nINTEGER The size of the matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N) On entry, the M-by-N matrix A. On exit, the elements on and above the diagonal of the array contain the min(M,N)-by-N upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of min(m,n) elementary reflectors (see Further Details).
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M). To benefit from coalescent memory accesses LDDA must be divisible by 16.
[out]dtau_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

Further Details

The matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(k), where k = min(m,n).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).