MAGMA 2.9.0
Matrix Algebra for GPU and Multicore Architectures
laqps: Partial factorization; used by geqp3

Functions

magma_int_t magma_claqps (magma_int_t m, magma_int_t n, magma_int_t offset, magma_int_t nb, magma_int_t *kb, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex_ptr dA, magma_int_t ldda, magma_int_t *jpvt, magmaFloatComplex *tau, float *vn1, float *vn2, magmaFloatComplex *auxv, magmaFloatComplex *F, magma_int_t ldf, magmaFloatComplex_ptr dF, magma_int_t lddf)
 CLAQPS computes a step of QR factorization with column pivoting of a complex M-by-N matrix A by using Blas-3.
 
magma_int_t magma_dlaqps (magma_int_t m, magma_int_t n, magma_int_t offset, magma_int_t nb, magma_int_t *kb, double *A, magma_int_t lda, magmaDouble_ptr dA, magma_int_t ldda, magma_int_t *jpvt, double *tau, double *vn1, double *vn2, double *auxv, double *F, magma_int_t ldf, magmaDouble_ptr dF, magma_int_t lddf)
 DLAQPS computes a step of QR factorization with column pivoting of a real M-by-N matrix A by using Blas-3.
 
magma_int_t magma_slaqps (magma_int_t m, magma_int_t n, magma_int_t offset, magma_int_t nb, magma_int_t *kb, float *A, magma_int_t lda, magmaFloat_ptr dA, magma_int_t ldda, magma_int_t *jpvt, float *tau, float *vn1, float *vn2, float *auxv, float *F, magma_int_t ldf, magmaFloat_ptr dF, magma_int_t lddf)
 SLAQPS computes a step of QR factorization with column pivoting of a real M-by-N matrix A by using Blas-3.
 
magma_int_t magma_zlaqps (magma_int_t m, magma_int_t n, magma_int_t offset, magma_int_t nb, magma_int_t *kb, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex_ptr dA, magma_int_t ldda, magma_int_t *jpvt, magmaDoubleComplex *tau, double *vn1, double *vn2, magmaDoubleComplex *auxv, magmaDoubleComplex *F, magma_int_t ldf, magmaDoubleComplex_ptr dF, magma_int_t lddf)
 ZLAQPS computes a step of QR factorization with column pivoting of a complex M-by-N matrix A by using Blas-3.
 
magma_int_t magma_claqps2_gpu (magma_int_t m, magma_int_t n, magma_int_t offset, magma_int_t nb, magma_int_t *kb, magmaFloatComplex_ptr dA, magma_int_t ldda, magma_int_t *jpvt, magmaFloatComplex_ptr dtau, magmaFloat_ptr dvn1, magmaFloat_ptr dvn2, magmaFloatComplex_ptr dauxv, magmaFloatComplex_ptr dF, magma_int_t lddf, magmaFloat_ptr dlsticcs, magma_queue_t queue)
 CLAQPS computes a step of QR factorization with column pivoting of a complex M-by-N matrix A by using Blas-3.
 
magma_int_t magma_dlaqps2_gpu (magma_int_t m, magma_int_t n, magma_int_t offset, magma_int_t nb, magma_int_t *kb, magmaDouble_ptr dA, magma_int_t ldda, magma_int_t *jpvt, magmaDouble_ptr dtau, magmaDouble_ptr dvn1, magmaDouble_ptr dvn2, magmaDouble_ptr dauxv, magmaDouble_ptr dF, magma_int_t lddf, magmaDouble_ptr dlsticcs, magma_queue_t queue)
 DLAQPS computes a step of QR factorization with column pivoting of a real M-by-N matrix A by using Blas-3.
 
magma_int_t magma_slaqps2_gpu (magma_int_t m, magma_int_t n, magma_int_t offset, magma_int_t nb, magma_int_t *kb, magmaFloat_ptr dA, magma_int_t ldda, magma_int_t *jpvt, magmaFloat_ptr dtau, magmaFloat_ptr dvn1, magmaFloat_ptr dvn2, magmaFloat_ptr dauxv, magmaFloat_ptr dF, magma_int_t lddf, magmaFloat_ptr dlsticcs, magma_queue_t queue)
 SLAQPS computes a step of QR factorization with column pivoting of a real M-by-N matrix A by using Blas-3.
 
magma_int_t magma_zlaqps2_gpu (magma_int_t m, magma_int_t n, magma_int_t offset, magma_int_t nb, magma_int_t *kb, magmaDoubleComplex_ptr dA, magma_int_t ldda, magma_int_t *jpvt, magmaDoubleComplex_ptr dtau, magmaDouble_ptr dvn1, magmaDouble_ptr dvn2, magmaDoubleComplex_ptr dauxv, magmaDoubleComplex_ptr dF, magma_int_t lddf, magmaDouble_ptr dlsticcs, magma_queue_t queue)
 ZLAQPS computes a step of QR factorization with column pivoting of a complex M-by-N matrix A by using Blas-3.
 

Detailed Description

Function Documentation

◆ magma_claqps()

magma_int_t magma_claqps ( magma_int_t m,
magma_int_t n,
magma_int_t offset,
magma_int_t nb,
magma_int_t * kb,
magmaFloatComplex * A,
magma_int_t lda,
magmaFloatComplex_ptr dA,
magma_int_t ldda,
magma_int_t * jpvt,
magmaFloatComplex * tau,
float * vn1,
float * vn2,
magmaFloatComplex * auxv,
magmaFloatComplex * F,
magma_int_t ldf,
magmaFloatComplex_ptr dF,
magma_int_t lddf )

CLAQPS computes a step of QR factorization with column pivoting of a complex M-by-N matrix A by using Blas-3.

It tries to factorize NB columns from A starting from the row OFFSET+1, and updates all of the matrix with Blas-3 xGEMM.

In some cases, due to catastrophic cancellations, it cannot factorize NB columns. Hence, the actual number of factorized columns is returned in KB.

Block A(1:OFFSET,1:N) is accordingly pivoted, but not factorized.

Parameters
[in] m INTEGER. The number of rows of the matrix A. M >= 0.
[in] n INTEGER. The number of columns of the matrix A. N >= 0.
[in] offset INTEGER. The number of rows of A that have been factorized in previous steps.
[in] nb INTEGER. The number of columns to factorize.
[out] kb INTEGER. The number of columns actually factorized.
[in,out] A COMPLEX array, dimension (LDA,N). On entry, the M-by-N matrix A. On exit, block A(OFFSET+1:M,1:KB) is the triangular factor obtained and block A(1:OFFSET,1:N) has been accordingly pivoted, but not factorized. The rest of the matrix, block A(OFFSET+1:M,KB+1:N), has been updated.
[in] lda INTEGER. The leading dimension of the array A. LDA >= max(1,M).
[in,out] dA COMPLEX array, dimension (LDDA,N). Copy of A on the GPU. Portions of A are updated on the CPU; portions of dA are updated on the GPU. See code for details.
[in] ldda INTEGER. The leading dimension of the array dA. LDDA >= max(1,M).
[in,out] jpvt INTEGER array, dimension (N). JPVT(I) = K <==> Column K of the full matrix A has been permuted into position I in AP.
[out] tau COMPLEX array, dimension (KB). The scalar factors of the elementary reflectors.
[in,out] vn1 REAL array, dimension (N). The vector with the partial column norms.
[in,out] vn2 REAL array, dimension (N). The vector with the exact column norms.
[in,out] auxv COMPLEX array, dimension (NB). Auxiliary vector.
[in,out] F COMPLEX array, dimension (LDF,NB). Matrix F' = L*Y'*A.
[in] ldf INTEGER. The leading dimension of the array F. LDF >= max(1,N).
[in,out] dF COMPLEX array, dimension (LDDF,NB). Copy of F on the GPU. See code for details.
[in] lddf INTEGER. The leading dimension of the array dF. LDDF >= max(1,N).
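The block notation used in the parameter list above, such as A(OFFSET+1:M,1:KB), follows the 1-based, column-major LAPACK convention, in which consecutive columns are LDA elements apart. The following is a minimal sketch of how those blocks map onto 0-based array offsets in C. The type `block_t` and the helper functions are hypothetical names introduced for illustration; they are not part of MAGMA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: 0-based offset of a block's first element,
 * plus its row and column counts. Not a MAGMA type. */
typedef struct {
    size_t first;
    int rows;
    int cols;
} block_t;

/* 0-based linear index of the 1-based entry A(i,j) in a column-major
 * array with leading dimension lda. */
size_t colmajor_index(int i, int j, int lda)
{
    assert(i >= 1 && i <= lda && j >= 1);
    return (size_t)(i - 1) + (size_t)(j - 1) * (size_t)lda;
}

/* A(OFFSET+1:M, 1:KB): the triangular factor computed by this step. */
block_t factored_block(int m, int offset, int kb)
{
    block_t b = { (size_t)offset, m - offset, kb };
    return b;
}

/* A(1:OFFSET, 1:N): rows from earlier steps, pivoted but not factorized. */
block_t pivoted_block(int n, int offset)
{
    block_t b = { 0, offset, n };
    return b;
}

/* A(OFFSET+1:M, KB+1:N): the trailing part that has been updated. */
block_t trailing_block(int m, int n, int offset, int kb, int lda)
{
    block_t b = { (size_t)offset + (size_t)kb * (size_t)lda,
                  m - offset, n - kb };
    return b;
}
```

With m = 6, n = 4, offset = 2, kb = 3 and lda = 8, the trailing block starts at offset 26, which is the same element that `colmajor_index(3, 4, 8)` addresses.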

◆ magma_dlaqps()

magma_int_t magma_dlaqps ( magma_int_t m,
magma_int_t n,
magma_int_t offset,
magma_int_t nb,
magma_int_t * kb,
double * A,
magma_int_t lda,
magmaDouble_ptr dA,
magma_int_t ldda,
magma_int_t * jpvt,
double * tau,
double * vn1,
double * vn2,
double * auxv,
double * F,
magma_int_t ldf,
magmaDouble_ptr dF,
magma_int_t lddf )

DLAQPS computes a step of QR factorization with column pivoting of a real M-by-N matrix A by using Blas-3.

It tries to factorize NB columns from A starting from the row OFFSET+1, and updates all of the matrix with Blas-3 xGEMM.

In some cases, due to catastrophic cancellations, it cannot factorize NB columns. Hence, the actual number of factorized columns is returned in KB.

Block A(1:OFFSET,1:N) is accordingly pivoted, but not factorized.

Parameters
[in] m INTEGER. The number of rows of the matrix A. M >= 0.
[in] n INTEGER. The number of columns of the matrix A. N >= 0.
[in] offset INTEGER. The number of rows of A that have been factorized in previous steps.
[in] nb INTEGER. The number of columns to factorize.
[out] kb INTEGER. The number of columns actually factorized.
[in,out] A DOUBLE PRECISION array, dimension (LDA,N). On entry, the M-by-N matrix A. On exit, block A(OFFSET+1:M,1:KB) is the triangular factor obtained and block A(1:OFFSET,1:N) has been accordingly pivoted, but not factorized. The rest of the matrix, block A(OFFSET+1:M,KB+1:N), has been updated.
[in] lda INTEGER. The leading dimension of the array A. LDA >= max(1,M).
[in,out] dA DOUBLE PRECISION array, dimension (LDDA,N). Copy of A on the GPU. Portions of A are updated on the CPU; portions of dA are updated on the GPU. See code for details.
[in] ldda INTEGER. The leading dimension of the array dA. LDDA >= max(1,M).
[in,out] jpvt INTEGER array, dimension (N). JPVT(I) = K <==> Column K of the full matrix A has been permuted into position I in AP.
[out] tau DOUBLE PRECISION array, dimension (KB). The scalar factors of the elementary reflectors.
[in,out] vn1 DOUBLE PRECISION array, dimension (N). The vector with the partial column norms.
[in,out] vn2 DOUBLE PRECISION array, dimension (N). The vector with the exact column norms.
[in,out] auxv DOUBLE PRECISION array, dimension (NB). Auxiliary vector.
[in,out] F DOUBLE PRECISION array, dimension (LDF,NB). Matrix F' = L*Y'*A.
[in] ldf INTEGER. The leading dimension of the array F. LDF >= max(1,N).
[in,out] dF DOUBLE PRECISION array, dimension (LDDF,NB). Copy of F on the GPU. See code for details.
[in] lddf INTEGER. The leading dimension of the array dF. LDDF >= max(1,N).
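The roles of jpvt, vn1 and vn2 are easiest to see in a single pivoting step. The sketch below picks the remaining column with the largest partial norm, swaps it into the current position, and records the swap in jpvt. It follows the general pattern of reference LAPACK xLAQPS, not MAGMA's source; `pivot_column` is a hypothetical name, indices are 0-based for C, and the matching row swap in F is left out.

```c
#include <assert.h>

/* Hypothetical helper, not a MAGMA routine.
 * Chooses the pivot among columns k..n-1 of the m-by-n column-major
 * matrix A (leading dimension lda) and swaps it into column k.
 * Returns the 0-based index the pivot column came from. */
int pivot_column(int m, int n, int k, double *A, int lda,
                 int *jpvt, double *vn1, double *vn2)
{
    assert(k >= 0 && k < n && lda >= m);

    /* Column with the largest partial norm among those not yet factorized. */
    int pvt = k;
    for (int j = k + 1; j < n; ++j)
        if (vn1[j] > vn1[pvt])
            pvt = j;

    if (pvt != k) {
        /* Swap full columns, so rows handled in earlier steps are pivoted too. */
        for (int i = 0; i < m; ++i) {
            double tmp = A[i + pvt * lda];
            A[i + pvt * lda] = A[i + k * lda];
            A[i + k * lda] = tmp;
        }
        int jt = jpvt[pvt];
        jpvt[pvt] = jpvt[k];
        jpvt[k] = jt;

        /* The norms of the column moved out of position k travel with it;
         * the norms stored at position k are not needed again. */
        vn1[pvt] = vn1[k];
        vn2[pvt] = vn2[k];
    }
    return pvt;
}
```

After the call, jpvt[k] holds whatever label the caller had stored for the chosen column, which matches the reading "column jpvt[k] of the original matrix now sits in position k".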

◆ magma_slaqps()

magma_int_t magma_slaqps ( magma_int_t m,
magma_int_t n,
magma_int_t offset,
magma_int_t nb,
magma_int_t * kb,
float * A,
magma_int_t lda,
magmaFloat_ptr dA,
magma_int_t ldda,
magma_int_t * jpvt,
float * tau,
float * vn1,
float * vn2,
float * auxv,
float * F,
magma_int_t ldf,
magmaFloat_ptr dF,
magma_int_t lddf )

SLAQPS computes a step of QR factorization with column pivoting of a real M-by-N matrix A by using Blas-3.

It tries to factorize NB columns from A starting from the row OFFSET+1, and updates all of the matrix with Blas-3 xGEMM.

In some cases, due to catastrophic cancellations, it cannot factorize NB columns. Hence, the actual number of factorized columns is returned in KB.

Block A(1:OFFSET,1:N) is accordingly pivoted, but not factorized.

Parameters
[in] m INTEGER. The number of rows of the matrix A. M >= 0.
[in] n INTEGER. The number of columns of the matrix A. N >= 0.
[in] offset INTEGER. The number of rows of A that have been factorized in previous steps.
[in] nb INTEGER. The number of columns to factorize.
[out] kb INTEGER. The number of columns actually factorized.
[in,out] A REAL array, dimension (LDA,N). On entry, the M-by-N matrix A. On exit, block A(OFFSET+1:M,1:KB) is the triangular factor obtained and block A(1:OFFSET,1:N) has been accordingly pivoted, but not factorized. The rest of the matrix, block A(OFFSET+1:M,KB+1:N), has been updated.
[in] lda INTEGER. The leading dimension of the array A. LDA >= max(1,M).
[in,out] dA REAL array, dimension (LDDA,N). Copy of A on the GPU. Portions of A are updated on the CPU; portions of dA are updated on the GPU. See code for details.
[in] ldda INTEGER. The leading dimension of the array dA. LDDA >= max(1,M).
[in,out] jpvt INTEGER array, dimension (N). JPVT(I) = K <==> Column K of the full matrix A has been permuted into position I in AP.
[out] tau REAL array, dimension (KB). The scalar factors of the elementary reflectors.
[in,out] vn1 REAL array, dimension (N). The vector with the partial column norms.
[in,out] vn2 REAL array, dimension (N). The vector with the exact column norms.
[in,out] auxv REAL array, dimension (NB). Auxiliary vector.
[in,out] F REAL array, dimension (LDF,NB). Matrix F' = L*Y'*A.
[in] ldf INTEGER. The leading dimension of the array F. LDF >= max(1,N).
[in,out] dF REAL array, dimension (LDDF,NB). Copy of F on the GPU. See code for details.
[in] lddf INTEGER. The leading dimension of the array dF. LDDF >= max(1,N).
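Because KB may be smaller than NB, a caller cannot advance by a fixed block size. The sketch below shows one way a driver loop can consume the offset, nb and kb arguments, modeled loosely on the panel loop of reference LAPACK xGEQP3. Everything in it is hypothetical: `panel_step_fn` stands in for a routine such as the one documented here, `capped_step` is a stub that pretends every panel stops after at most two columns, and MAGMA's own geqp3 driver may be organized differently.

```c
#include <assert.h>

/* Hypothetical callback: factorize up to nb columns of the trailing
 * m-by-n_sub matrix whose first `offset` rows are already done, and
 * return the number of columns actually factorized (kb). */
typedef int (*panel_step_fn)(int m, int n_sub, int offset, int nb, void *ctx);

/* Advance through min(m,n) columns, one panel at a time.
 * Returns the total number of columns factorized. */
int drive_panels(int m, int n, int nb, panel_step_fn step, void *ctx)
{
    int minmn = m < n ? m : n;
    int j = 0;                                   /* columns done so far */
    while (j < minmn) {
        int jb = nb < minmn - j ? nb : minmn - j;    /* columns requested */
        int kb = step(m, n - j, j, jb, ctx);         /* offset equals j */
        assert(kb >= 1 && kb <= jb);                 /* assumed progress */
        j += kb;                                 /* advance by kb, not jb */
    }
    return j;
}

/* Stub standing in for a real panel routine: counts calls through ctx
 * and reports at most 2 factorized columns per panel. */
int capped_step(int m, int n_sub, int offset, int nb, void *ctx)
{
    (void)m; (void)n_sub; (void)offset;
    int *calls = ctx;
    ++*calls;
    return nb < 2 ? nb : 2;
}
```

For a 10-by-7 matrix with nb = 4, the stub yields panels of 2, 2, 2 and 1 columns, so the loop makes four calls before all seven columns are done.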

◆ magma_zlaqps()

magma_int_t magma_zlaqps ( magma_int_t m,
magma_int_t n,
magma_int_t offset,
magma_int_t nb,
magma_int_t * kb,
magmaDoubleComplex * A,
magma_int_t lda,
magmaDoubleComplex_ptr dA,
magma_int_t ldda,
magma_int_t * jpvt,
magmaDoubleComplex * tau,
double * vn1,
double * vn2,
magmaDoubleComplex * auxv,
magmaDoubleComplex * F,
magma_int_t ldf,
magmaDoubleComplex_ptr dF,
magma_int_t lddf )

ZLAQPS computes a step of QR factorization with column pivoting of a complex M-by-N matrix A by using Blas-3.

It tries to factorize NB columns from A starting from the row OFFSET+1, and updates all of the matrix with Blas-3 xGEMM.

In some cases, due to catastrophic cancellations, it cannot factorize NB columns. Hence, the actual number of factorized columns is returned in KB.

Block A(1:OFFSET,1:N) is accordingly pivoted, but not factorized.

Parameters
[in] m INTEGER. The number of rows of the matrix A. M >= 0.
[in] n INTEGER. The number of columns of the matrix A. N >= 0.
[in] offset INTEGER. The number of rows of A that have been factorized in previous steps.
[in] nb INTEGER. The number of columns to factorize.
[out] kb INTEGER. The number of columns actually factorized.
[in,out] A COMPLEX_16 array, dimension (LDA,N). On entry, the M-by-N matrix A. On exit, block A(OFFSET+1:M,1:KB) is the triangular factor obtained and block A(1:OFFSET,1:N) has been accordingly pivoted, but not factorized. The rest of the matrix, block A(OFFSET+1:M,KB+1:N), has been updated.
[in] lda INTEGER. The leading dimension of the array A. LDA >= max(1,M).
[in,out] dA COMPLEX_16 array, dimension (LDDA,N). Copy of A on the GPU. Portions of A are updated on the CPU; portions of dA are updated on the GPU. See code for details.
[in] ldda INTEGER. The leading dimension of the array dA. LDDA >= max(1,M).
[in,out] jpvt INTEGER array, dimension (N). JPVT(I) = K <==> Column K of the full matrix A has been permuted into position I in AP.
[out] tau COMPLEX_16 array, dimension (KB). The scalar factors of the elementary reflectors.
[in,out] vn1 DOUBLE PRECISION array, dimension (N). The vector with the partial column norms.
[in,out] vn2 DOUBLE PRECISION array, dimension (N). The vector with the exact column norms.
[in,out] auxv COMPLEX_16 array, dimension (NB). Auxiliary vector.
[in,out] F COMPLEX_16 array, dimension (LDF,NB). Matrix F' = L*Y'*A.
[in] ldf INTEGER. The leading dimension of the array F. LDF >= max(1,N).
[in,out] dF COMPLEX_16 array, dimension (LDDF,NB). Copy of F on the GPU. See code for details.
[in] lddf INTEGER. The leading dimension of the array dF. LDDF >= max(1,N).

◆ magma_claqps2_gpu()

magma_int_t magma_claqps2_gpu ( magma_int_t m,
magma_int_t n,
magma_int_t offset,
magma_int_t nb,
magma_int_t * kb,
magmaFloatComplex_ptr dA,
magma_int_t ldda,
magma_int_t * jpvt,
magmaFloatComplex_ptr dtau,
magmaFloat_ptr dvn1,
magmaFloat_ptr dvn2,
magmaFloatComplex_ptr dauxv,
magmaFloatComplex_ptr dF,
magma_int_t lddf,
magmaFloat_ptr dlsticcs,
magma_queue_t queue )

CLAQPS computes a step of QR factorization with column pivoting of a complex M-by-N matrix A by using Blas-3.

It tries to factorize NB columns from A starting from the row OFFSET+1, and updates all of the matrix with Blas-3 xGEMM.

In some cases, due to catastrophic cancellations, it cannot factorize NB columns. Hence, the actual number of factorized columns is returned in KB.

Block A(1:OFFSET,1:N) is accordingly pivoted, but not factorized.

Parameters
[in] m INTEGER. The number of rows of the matrix A. M >= 0.
[in] n INTEGER. The number of columns of the matrix A. N >= 0.
[in] offset INTEGER. The number of rows of A that have been factorized in previous steps.
[in] nb INTEGER. The number of columns to factorize.
[out] kb INTEGER. The number of columns actually factorized.
[in,out] dA COMPLEX array, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, block A(OFFSET+1:M,1:KB) is the triangular factor obtained and block A(1:OFFSET,1:N) has been accordingly pivoted, but not factorized. The rest of the matrix, block A(OFFSET+1:M,KB+1:N), has been updated.
[in] ldda INTEGER. The leading dimension of the array dA. LDDA >= max(1,M).
[in,out] jpvt INTEGER array, dimension (N). JPVT(I) = K <==> Column K of the full matrix A has been permuted into position I in AP.
[out] dtau COMPLEX array, dimension (KB). The scalar factors of the elementary reflectors.
[in,out] dvn1 REAL array, dimension (N). The vector with the partial column norms.
[in,out] dvn2 REAL array, dimension (N). The vector with the exact column norms.
[in,out] dauxv COMPLEX array, dimension (NB). Auxiliary vector.
[in,out] dF COMPLEX array, dimension (LDDF,NB). Matrix F**H = L * Y**H * A.
[in] lddf INTEGER. The leading dimension of the array dF. LDDF >= max(1,N).
dlsticcs TODO: undocumented
[in] queue magma_queue_t. Queue to execute in.

◆ magma_dlaqps2_gpu()

magma_int_t magma_dlaqps2_gpu ( magma_int_t m,
magma_int_t n,
magma_int_t offset,
magma_int_t nb,
magma_int_t * kb,
magmaDouble_ptr dA,
magma_int_t ldda,
magma_int_t * jpvt,
magmaDouble_ptr dtau,
magmaDouble_ptr dvn1,
magmaDouble_ptr dvn2,
magmaDouble_ptr dauxv,
magmaDouble_ptr dF,
magma_int_t lddf,
magmaDouble_ptr dlsticcs,
magma_queue_t queue )

DLAQPS computes a step of QR factorization with column pivoting of a real M-by-N matrix A by using Blas-3.

It tries to factorize NB columns from A starting from the row OFFSET+1, and updates all of the matrix with Blas-3 xGEMM.

In some cases, due to catastrophic cancellations, it cannot factorize NB columns. Hence, the actual number of factorized columns is returned in KB.

Block A(1:OFFSET,1:N) is accordingly pivoted, but not factorized.

Parameters
[in] m INTEGER. The number of rows of the matrix A. M >= 0.
[in] n INTEGER. The number of columns of the matrix A. N >= 0.
[in] offset INTEGER. The number of rows of A that have been factorized in previous steps.
[in] nb INTEGER. The number of columns to factorize.
[out] kb INTEGER. The number of columns actually factorized.
[in,out] dA DOUBLE PRECISION array, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, block A(OFFSET+1:M,1:KB) is the triangular factor obtained and block A(1:OFFSET,1:N) has been accordingly pivoted, but not factorized. The rest of the matrix, block A(OFFSET+1:M,KB+1:N), has been updated.
[in] ldda INTEGER. The leading dimension of the array dA. LDDA >= max(1,M).
[in,out] jpvt INTEGER array, dimension (N). JPVT(I) = K <==> Column K of the full matrix A has been permuted into position I in AP.
[out] dtau DOUBLE PRECISION array, dimension (KB). The scalar factors of the elementary reflectors.
[in,out] dvn1 DOUBLE PRECISION array, dimension (N). The vector with the partial column norms.
[in,out] dvn2 DOUBLE PRECISION array, dimension (N). The vector with the exact column norms.
[in,out] dauxv DOUBLE PRECISION array, dimension (NB). Auxiliary vector.
[in,out] dF DOUBLE PRECISION array, dimension (LDDF,NB). Matrix F**H = L * Y**H * A.
[in] lddf INTEGER. The leading dimension of the array dF. LDDF >= max(1,N).
dlsticcs TODO: undocumented
[in] queue magma_queue_t. Queue to execute in.

◆ magma_slaqps2_gpu()

magma_int_t magma_slaqps2_gpu ( magma_int_t m,
magma_int_t n,
magma_int_t offset,
magma_int_t nb,
magma_int_t * kb,
magmaFloat_ptr dA,
magma_int_t ldda,
magma_int_t * jpvt,
magmaFloat_ptr dtau,
magmaFloat_ptr dvn1,
magmaFloat_ptr dvn2,
magmaFloat_ptr dauxv,
magmaFloat_ptr dF,
magma_int_t lddf,
magmaFloat_ptr dlsticcs,
magma_queue_t queue )

SLAQPS computes a step of QR factorization with column pivoting of a real M-by-N matrix A by using Blas-3.

It tries to factorize NB columns from A starting from the row OFFSET+1, and updates all of the matrix with Blas-3 xGEMM.

In some cases, due to catastrophic cancellations, it cannot factorize NB columns. Hence, the actual number of factorized columns is returned in KB.

Block A(1:OFFSET,1:N) is accordingly pivoted, but not factorized.

Parameters
[in] m INTEGER. The number of rows of the matrix A. M >= 0.
[in] n INTEGER. The number of columns of the matrix A. N >= 0.
[in] offset INTEGER. The number of rows of A that have been factorized in previous steps.
[in] nb INTEGER. The number of columns to factorize.
[out] kb INTEGER. The number of columns actually factorized.
[in,out] dA REAL array, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, block A(OFFSET+1:M,1:KB) is the triangular factor obtained and block A(1:OFFSET,1:N) has been accordingly pivoted, but not factorized. The rest of the matrix, block A(OFFSET+1:M,KB+1:N), has been updated.
[in] ldda INTEGER. The leading dimension of the array dA. LDDA >= max(1,M).
[in,out] jpvt INTEGER array, dimension (N). JPVT(I) = K <==> Column K of the full matrix A has been permuted into position I in AP.
[out] dtau REAL array, dimension (KB). The scalar factors of the elementary reflectors.
[in,out] dvn1 REAL array, dimension (N). The vector with the partial column norms.
[in,out] dvn2 REAL array, dimension (N). The vector with the exact column norms.
[in,out] dauxv REAL array, dimension (NB). Auxiliary vector.
[in,out] dF REAL array, dimension (LDDF,NB). Matrix F**H = L * Y**H * A.
[in] lddf INTEGER. The leading dimension of the array dF. LDDF >= max(1,N).
dlsticcs TODO: undocumented
[in] queue magma_queue_t. Queue to execute in.

◆ magma_zlaqps2_gpu()

magma_int_t magma_zlaqps2_gpu ( magma_int_t m,
magma_int_t n,
magma_int_t offset,
magma_int_t nb,
magma_int_t * kb,
magmaDoubleComplex_ptr dA,
magma_int_t ldda,
magma_int_t * jpvt,
magmaDoubleComplex_ptr dtau,
magmaDouble_ptr dvn1,
magmaDouble_ptr dvn2,
magmaDoubleComplex_ptr dauxv,
magmaDoubleComplex_ptr dF,
magma_int_t lddf,
magmaDouble_ptr dlsticcs,
magma_queue_t queue )

ZLAQPS computes a step of QR factorization with column pivoting of a complex M-by-N matrix A by using Blas-3.

It tries to factorize NB columns from A starting from the row OFFSET+1, and updates all of the matrix with Blas-3 xGEMM.

In some cases, due to catastrophic cancellations, it cannot factorize NB columns. Hence, the actual number of factorized columns is returned in KB.

Block A(1:OFFSET,1:N) is accordingly pivoted, but not factorized.

Parameters
[in] m INTEGER. The number of rows of the matrix A. M >= 0.
[in] n INTEGER. The number of columns of the matrix A. N >= 0.
[in] offset INTEGER. The number of rows of A that have been factorized in previous steps.
[in] nb INTEGER. The number of columns to factorize.
[out] kb INTEGER. The number of columns actually factorized.
[in,out] dA COMPLEX*16 array, dimension (LDDA,N). On entry, the M-by-N matrix A. On exit, block A(OFFSET+1:M,1:KB) is the triangular factor obtained and block A(1:OFFSET,1:N) has been accordingly pivoted, but not factorized. The rest of the matrix, block A(OFFSET+1:M,KB+1:N), has been updated.
[in] ldda INTEGER. The leading dimension of the array dA. LDDA >= max(1,M).
[in,out] jpvt INTEGER array, dimension (N). JPVT(I) = K <==> Column K of the full matrix A has been permuted into position I in AP.
[out] dtau COMPLEX*16 array, dimension (KB). The scalar factors of the elementary reflectors.
[in,out] dvn1 DOUBLE PRECISION array, dimension (N). The vector with the partial column norms.
[in,out] dvn2 DOUBLE PRECISION array, dimension (N). The vector with the exact column norms.
[in,out] dauxv COMPLEX*16 array, dimension (NB). Auxiliary vector.
[in,out] dF COMPLEX*16 array, dimension (LDDF,NB). Matrix F**H = L * Y**H * A.
[in] lddf INTEGER. The leading dimension of the array dF. LDDF >= max(1,N).
dlsticcs TODO: undocumented
[in] queue magma_queue_t. Queue to execute in.