MAGMA  2.0.0
Matrix Algebra for GPU and Multicore Architectures

Functions

magma_int_t magma_dlahr2 (magma_int_t n, magma_int_t k, magma_int_t nb, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dV, magma_int_t lddv, double *A, magma_int_t lda, double *tau, double *T, magma_int_t ldt, double *Y, magma_int_t ldy, magma_queue_t queue)
 DLAHR2 reduces the first NB columns of a real general n-by-(n-k+1) matrix A so that elements below the k-th subdiagonal are zero.
 
magma_int_t magma_dlahr2_m (magma_int_t n, magma_int_t k, magma_int_t nb, double *A, magma_int_t lda, double *tau, double *T, magma_int_t ldt, double *Y, magma_int_t ldy, struct dgehrd_data *data)
 DLAHR2 reduces the first NB columns of a real general n-by-(n-k+1) matrix A so that elements below the k-th subdiagonal are zero.
 
magma_int_t magma_dlahru (magma_int_t n, magma_int_t ihi, magma_int_t k, magma_int_t nb, double *A, magma_int_t lda, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dY, magma_int_t lddy, magmaDouble_ptr dV, magma_int_t lddv, magmaDouble_ptr dT, magmaDouble_ptr dwork, magma_queue_t queue)
 DLAHRU is an auxiliary MAGMA routine that is used in DGEHRD to update the trailing sub-matrices after the reductions of the corresponding panels.
 
magma_int_t magma_dlahru_m (magma_int_t n, magma_int_t ihi, magma_int_t k, magma_int_t nb, double *A, magma_int_t lda, struct dgehrd_data *data)
 DLAHRU is an auxiliary MAGMA routine that is used in DGEHRD to update the trailing sub-matrices after the reductions of the corresponding panels.
 
magma_int_t magma_dlaqtrsd (magma_trans_t trans, magma_int_t n, const double *T, magma_int_t ldt, double *x, magma_int_t ldx, const double *cnorm, magma_int_t *info)
 DLAQTRSD is used by DTREVC to solve one of the (singular) quasi-triangular systems with modified diagonal (T - lambda*I) * x = 0 or (T - lambda*I)**T * x = 0 with scaling to prevent overflow.
 

Function Documentation

magma_int_t magma_dlahr2 ( magma_int_t  n,
magma_int_t  k,
magma_int_t  nb,
magmaDouble_ptr  dA,
magma_int_t  ldda,
magmaDouble_ptr  dV,
magma_int_t  lddv,
double *  A,
magma_int_t  lda,
double *  tau,
double *  T,
magma_int_t  ldt,
double *  Y,
magma_int_t  ldy,
magma_queue_t  queue 
)

DLAHR2 reduces the first NB columns of a real general n-by-(n-k+1) matrix A so that elements below the k-th subdiagonal are zero.

The reduction is performed by an orthogonal similarity transformation Q' * A * Q. The routine returns the matrices V and T, which determine Q as a block reflector I - V*T*V', and also the matrix Y = A * V. (Note that this differs from LAPACK, which computes Y = A * V * T.)

This is an auxiliary routine called by DGEHRD.

Parameters
[in]  n  INTEGER. The order of the matrix A.
[in]  k  INTEGER. The offset for the reduction. Elements below the k-th subdiagonal in the first NB columns are reduced to zero. K < N.
[in]  nb  INTEGER. The number of columns to be reduced.
[in,out]  dA  DOUBLE PRECISION array on the GPU, dimension (LDDA,N-K+1). On entry, the n-by-(n-k+1) general matrix A. On exit, the elements in rows K:N of the first NB columns are overwritten with the matrix Y.
[in]  ldda  INTEGER. The leading dimension of the array dA. LDDA >= max(1,N).
[out]  dV  DOUBLE PRECISION array on the GPU, dimension (LDDV,NB). On exit, this n-by-nb array contains the Householder vectors of the transformation.
[in]  lddv  INTEGER. The leading dimension of the array dV. LDDV >= max(1,N).
[in,out]  A  DOUBLE PRECISION array, dimension (LDA,N-K+1). On entry, the n-by-(n-k+1) general matrix A. On exit, the elements on and above the k-th subdiagonal in the first NB columns are overwritten with the corresponding elements of the reduced matrix; the elements below the k-th subdiagonal, with the array TAU, represent the matrix Q as a product of elementary reflectors. The other columns of A are unchanged. See Further Details.
[in]  lda  INTEGER. The leading dimension of the array A. LDA >= max(1,N).
[out]  tau  DOUBLE PRECISION array, dimension (NB). The scalar factors of the elementary reflectors. See Further Details.
[out]  T  DOUBLE PRECISION array, dimension (LDT,NB). The upper triangular matrix T.
[in]  ldt  INTEGER. The leading dimension of the array T. LDT >= NB.
[out]  Y  DOUBLE PRECISION array, dimension (LDY,NB). The n-by-nb matrix Y.
[in]  ldy  INTEGER. The leading dimension of the array Y. LDY >= N.
[in]  queue  magma_queue_t. Queue to execute in.
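
The documented dimensions translate directly into element counts for the host-side arrays. The following is a minimal sketch, not part of MAGMA: the struct `dlahr2_host_sizes` and the helper `dlahr2_host_sizes_for` are hypothetical names introduced only for illustration, and the checks simply restate the constraints listed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a MAGMA API): element counts for the host
   arrays of magma_dlahr2, taken from the documented dimensions. */
typedef struct {
    size_t a;    /* A:   dimension (LDA, N-K+1) */
    size_t tau;  /* tau: dimension (NB)         */
    size_t t;    /* T:   dimension (LDT, NB)    */
    size_t y;    /* Y:   dimension (LDY, NB)    */
} dlahr2_host_sizes;

static dlahr2_host_sizes dlahr2_host_sizes_for(int n, int k, int nb,
                                               int lda, int ldt, int ldy)
{
    /* Documented constraints: K < N, LDA >= max(1,N), LDT >= NB, LDY >= N. */
    assert(n >= 0 && nb >= 0 && k < n);
    assert(lda >= (n > 1 ? n : 1));
    assert(ldt >= nb);
    assert(ldy >= n);

    dlahr2_host_sizes s;
    s.a   = (size_t)lda * (size_t)(n - k + 1);
    s.tau = (size_t)nb;
    s.t   = (size_t)ldt * (size_t)nb;
    s.y   = (size_t)ldy * (size_t)nb;
    return s;
}
```

Each count is in elements of type double; multiply by `sizeof(double)` when allocating.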

Further Details

The matrix Q is represented as a product of nb elementary reflectors

Q = H(1) H(2) . . . H(nb).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i+k-1) = 0, v(i+k) = 1; v(i+k+1:n) is stored on exit in A(i+k+1:n,i), and tau in TAU(i).
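
The storage convention for v can be made concrete with a small host-side sketch. The two functions below are hypothetical helpers, not MAGMA routines: `unpack_v` rebuilds the full vector v for reflector i from column i of a column-major array A, and `apply_reflector` applies H = I - tau * v * v' to a vector. Indices i and k are 1-based as in the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Apply H = I - tau * v * v' to x in place: x := x - tau * v * (v' * x).
   Both vectors have length n. */
static void apply_reflector(int n, double tau, const double *v, double *x)
{
    assert(n >= 0 && v != NULL && x != NULL);
    double dot = 0.0;
    for (int j = 0; j < n; ++j)
        dot += v[j] * x[j];
    for (int j = 0; j < n; ++j)
        x[j] -= tau * dot * v[j];
}

/* Rebuild the full length-n vector v of reflector i (1-based) from
   column i of the column-major array A with leading dimension lda:
   v(1:i+k-1) = 0, v(i+k) = 1, v(i+k+1:n) = A(i+k+1:n, i). */
static void unpack_v(int n, int k, int i, const double *A, int lda, double *v)
{
    assert(i >= 1 && i + k <= n && lda >= n);
    for (int r = 1; r <= n; ++r) {
        if (r < i + k)
            v[r - 1] = 0.0;                 /* leading zeros            */
        else if (r == i + k)
            v[r - 1] = 1.0;                 /* implicit unit entry      */
        else
            v[r - 1] = A[(size_t)(r - 1) + (size_t)(i - 1) * (size_t)lda];
    }
}
```

Because the first i+k-1 entries of v are zero, applying H(i) leaves the first i+k-1 entries of any vector unchanged.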

The elements of the vectors v together form the (n-k+1)-by-nb matrix V which is needed, with T and Y, to apply the transformation to the unreduced part of the matrix, using an update of the form: A := (I - V*T*V') * (A - Y*T*V').
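
The right-hand factor of this update follows from the definition of Y. As a short worked derivation (the symbol Y_L for LAPACK's matrix is notation introduced here, not taken from either library), multiplying A by the block reflector from the right gives the same result under both conventions:

```latex
\begin{aligned}
A\,(I - V T V^{T}) &= A - (A V)\,T V^{T} = A - Y\,T V^{T},
  &\quad Y &= A V \quad \text{(this routine)},\\
A\,(I - V T V^{T}) &= A - (A V T)\,V^{T} = A - Y_{L}\,V^{T},
  &\quad Y_{L} &= A V T \quad \text{(LAPACK)}.
\end{aligned}
```

The two forms differ only in whether the multiplication by T is folded into the stored matrix or applied during the update.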

The contents of A on exit are illustrated by the following example with n = 7, k = 3 and nb = 2:

   ( a   a   a   a   a )
   ( a   a   a   a   a )
   ( a   a   a   a   a )
   ( h   h   a   a   a )
   ( v1  h   a   a   a )
   ( v1  v2  a   a   a )
   ( v1  v2  a   a   a )

where "a" denotes an element of the original matrix A, h denotes a modified element of the upper Hessenberg matrix H, and vi denotes an element of the vector defining H(i).
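
The layout in this example generalizes to other values of n, k and nb. The sketch below is a hypothetical helper, not part of MAGMA, that classifies each entry using the same symbols as the illustration; it assumes 1-based row and column indices and follows the picture above rather than any MAGMA source code.

```c
#include <assert.h>
#include <stdio.h>

/* Classify entry (r, c), 1-based, of the n-by-(n-k+1) array A on exit:
     'a' = original entry, 'h' = reduced entry,
     'v' = stored part of the vector defining H(c). */
static char dlahr2_exit_entry(int n, int k, int nb, int r, int c)
{
    assert(r >= 1 && r <= n && c >= 1 && c <= n - k + 1);
    if (c > nb)     return 'a';   /* columns beyond the panel          */
    if (r <= k)     return 'a';   /* first k rows, as in the picture   */
    if (r <= k + c) return 'h';   /* rows k+1 .. k+c of panel column c */
    return 'v';                   /* rows k+c+1 .. n hold v(c)         */
}

/* Print the whole layout, one matrix row per line. */
static void dlahr2_print_layout(int n, int k, int nb)
{
    for (int r = 1; r <= n; ++r) {
        printf("   (");
        for (int c = 1; c <= n - k + 1; ++c) {
            char e = dlahr2_exit_entry(n, k, nb, r, c);
            if (e == 'v') printf(" v%d ", c);
            else          printf(" %c  ", e);
        }
        printf(")\n");
    }
}
```

Calling `dlahr2_print_layout(7, 3, 2)` walks the same 7-by-5 case as the illustration.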

This implementation follows the hybrid algorithm and notations described in

S. Tomov and J. Dongarra, "Accelerating the reduction to upper Hessenberg form through hybrid GPU-based computing," University of Tennessee Computer Science Technical Report, UT-CS-09-642 (also LAPACK Working Note 219), May 24, 2009.

magma_int_t magma_dlahr2_m ( magma_int_t  n,
magma_int_t  k,
magma_int_t  nb,
double *  A,
magma_int_t  lda,
double *  tau,
double *  T,
magma_int_t  ldt,
double *  Y,
magma_int_t  ldy,
struct dgehrd_data *  data 
)

DLAHR2 reduces the first NB columns of a real general n-by-(n-k+1) matrix A so that elements below the k-th subdiagonal are zero.

The reduction is performed by an orthogonal similarity transformation Q' * A * Q. The routine returns the matrices V and T, which determine Q as a block reflector I - V*T*V', and also the matrix Y = A * V. (Note that this differs from LAPACK, which computes Y = A * V * T.)

This is an auxiliary routine called by DGEHRD.

Parameters
[in]  n  INTEGER. The order of the matrix A.
[in]  k  INTEGER. The offset for the reduction. Elements below the k-th subdiagonal in the first NB columns are reduced to zero. K < N.
[in]  nb  INTEGER. The number of columns to be reduced.
[in,out]  A  DOUBLE PRECISION array, dimension (LDA,N-K+1). On entry, the n-by-(n-k+1) general matrix A. On exit, the elements on and above the k-th subdiagonal in the first NB columns are overwritten with the corresponding elements of the reduced matrix; the elements below the k-th subdiagonal, with the array TAU, represent the matrix Q as a product of elementary reflectors. The other columns of A are unchanged. See Further Details.
[in]  lda  INTEGER. The leading dimension of the array A. LDA >= max(1,N).
[out]  tau  DOUBLE PRECISION array, dimension (NB). The scalar factors of the elementary reflectors. See Further Details.
[out]  T  DOUBLE PRECISION array, dimension (LDT,NB). The upper triangular matrix T.
[in]  ldt  INTEGER. The leading dimension of the array T. LDT >= NB.
[out]  Y  DOUBLE PRECISION array, dimension (LDY,NB). The n-by-nb matrix Y.
[in]  ldy  INTEGER. The leading dimension of the array Y. LDY >= N.
[in,out]  data  Structure with pointers to dA, dT, dV, dW, dY, which are distributed across multiple GPUs.

Further Details

The matrix Q is represented as a product of nb elementary reflectors

Q = H(1) H(2) . . . H(nb).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i+k-1) = 0, v(i+k) = 1; v(i+k+1:n) is stored on exit in A(i+k+1:n,i), and tau in TAU(i).

The elements of the vectors v together form the (n-k+1)-by-nb matrix V which is needed, with T and Y, to apply the transformation to the unreduced part of the matrix, using an update of the form: A := (I - V*T*V') * (A - Y*T*V').

The contents of A on exit are illustrated by the following example with n = 7, k = 3 and nb = 2:

   ( a   a   a   a   a )
   ( a   a   a   a   a )
   ( a   a   a   a   a )
   ( h   h   a   a   a )
   ( v1  h   a   a   a )
   ( v1  v2  a   a   a )
   ( v1  v2  a   a   a )

where "a" denotes an element of the original matrix A, h denotes a modified element of the upper Hessenberg matrix H, and vi denotes an element of the vector defining H(i).

This implementation follows the hybrid algorithm and notations described in

S. Tomov and J. Dongarra, "Accelerating the reduction to upper Hessenberg form through hybrid GPU-based computing," University of Tennessee Computer Science Technical Report, UT-CS-09-642 (also LAPACK Working Note 219), May 24, 2009.

magma_int_t magma_dlahru ( magma_int_t  n,
magma_int_t  ihi,
magma_int_t  k,
magma_int_t  nb,
double *  A,
magma_int_t  lda,
magmaDouble_ptr  dA,
magma_int_t  ldda,
magmaDouble_ptr  dY,
magma_int_t  lddy,
magmaDouble_ptr  dV,
magma_int_t  lddv,
magmaDouble_ptr  dT,
magmaDouble_ptr  dwork,
magma_queue_t  queue 
)

DLAHRU is an auxiliary MAGMA routine that is used in DGEHRD to update the trailing sub-matrices after the reductions of the corresponding panels.

See further details below.

Parameters
[in]  n  INTEGER. The order of the matrix A. N >= 0.
[in]  ihi  INTEGER. Last row to update. Same as IHI in dgehrd.
[in]  k  INTEGER. Number of rows of the matrix Am (see details below).
[in]  nb  INTEGER. Block size.
[out]  A  DOUBLE PRECISION array, dimension (LDA,N-K). On entry, the N-by-(N-K) general matrix to be updated. The computation is done on the GPU. After Am is updated on the GPU, only Am(:,1:NB) is transferred to the CPU to update the corresponding Am matrix. See Further Details below.
[in]  lda  INTEGER. The leading dimension of the array A. LDA >= max(1,N).
[in,out]  dA  DOUBLE PRECISION array on the GPU, dimension (LDDA,N-K). On entry, the N-by-(N-K) general matrix to be updated. On exit, the first K rows (matrix Am) of A are updated by applying an orthogonal transformation from the right, Am = Am (I - V T V'), and the sub-matrix Ag is updated by Ag = (I - V T V') Ag (I - V T V(NB+1:)' ), where Q = I - V T V' represents the orthogonal matrix (as a product of elementary reflectors V) used to reduce the current panel of A to upper Hessenberg form. After Am is updated, Am(:,1:NB) is sent to the CPU. See Further Details below.
[in]  ldda  INTEGER. The leading dimension of the array dA. LDDA >= max(1,N).
[in,out]  dY  (workspace) DOUBLE PRECISION array on the GPU, dimension (LDDY,NB). On entry, the (N-K)-by-NB matrix Y = A V. It is used internally as workspace, so its value is changed on exit.
[in]  lddy  INTEGER. The leading dimension of the array dY. LDDY >= max(1,N).
[in,out]  dV  (workspace) DOUBLE PRECISION array on the GPU, dimension (LDDV,NB). On entry, the (N-K)-by-NB matrix V of elementary reflectors used to reduce the current panel of A to upper Hessenberg form. The remaining K-by-NB part is used as workspace. V is unchanged on exit.
[in]  lddv  INTEGER. The leading dimension of the array dV. LDDV >= max(1,N).
[in]  dT  DOUBLE PRECISION array on the GPU, dimension (NB,NB). On entry, the NB-by-NB upper triangular matrix defining the orthogonal Hessenberg reduction transformation matrix for the current panel. The entries of the lower triangular part are zero.
dwork  (workspace) DOUBLE PRECISION array on the GPU, dimension N*NB.
[in]  queue  magma_queue_t. Queue to execute in.

Further Details

This implementation follows the algorithm and notations described in:

S. Tomov and J. Dongarra, "Accelerating the reduction to upper Hessenberg form through hybrid GPU-based computing," University of Tennessee Computer Science Technical Report, UT-CS-09-642 (also LAPACK Working Note 219), May 24, 2009.

The difference is that here Am is computed on the GPU. M is renamed Am, G is renamed Ag.
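
The right-hand update of Am described for dA, Am = Am (I - V T V'), can be illustrated with a plain host-side loop nest. This is only a sketch under stated assumptions: `right_apply_block_reflector` is a hypothetical name, all arrays are column-major host arrays, and the code makes no attempt to mirror how MAGMA organizes the computation on the GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Am := Am * (I - V*T*V') = Am - ((Am*V)*T) * V', column-major storage.
   Am: k-by-m (ldam), V: m-by-nb (ldv), T: nb-by-nb upper triangular (ldt).
   Returns 0 on success, -1 if the temporary buffers cannot be allocated. */
static int right_apply_block_reflector(int k, int m, int nb,
                                       double *Am, int ldam,
                                       const double *V, int ldv,
                                       const double *T, int ldt)
{
    assert(k >= 0 && m >= 0 && nb >= 0);
    assert(ldam >= (k > 1 ? k : 1) && ldv >= (m > 1 ? m : 1)
           && ldt >= (nb > 1 ? nb : 1));
    if (k == 0 || m == 0 || nb == 0) return 0;

    size_t ks = (size_t)k;
    double *W  = calloc(ks * (size_t)nb, sizeof *W);   /* W  = Am * V */
    double *WT = calloc(ks * (size_t)nb, sizeof *WT);  /* WT = W * T  */
    if (W == NULL || WT == NULL) { free(W); free(WT); return -1; }

    for (int j = 0; j < nb; ++j)
        for (int p = 0; p < m; ++p)
            for (int i = 0; i < k; ++i)
                W[i + j * ks] += Am[i + (size_t)p * ldam] * V[p + (size_t)j * ldv];

    /* T is upper triangular: only rows p <= j contribute to column j. */
    for (int j = 0; j < nb; ++j)
        for (int p = 0; p <= j; ++p)
            for (int i = 0; i < k; ++i)
                WT[i + j * ks] += W[i + p * ks] * T[p + (size_t)j * ldt];

    /* Am := Am - WT * V' */
    for (int c = 0; c < m; ++c)
        for (int j = 0; j < nb; ++j)
            for (int i = 0; i < k; ++i)
                Am[i + (size_t)c * ldam] -= WT[i + j * ks] * V[c + (size_t)j * ldv];

    free(W);
    free(WT);
    return 0;
}
```

Forming Am*V first keeps the intermediate results at k-by-nb, so the m-by-m matrix I - V T V' is never built explicitly.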

magma_int_t magma_dlahru_m ( magma_int_t  n,
magma_int_t  ihi,
magma_int_t  k,
magma_int_t  nb,
double *  A,
magma_int_t  lda,
struct dgehrd_data *  data 
)

DLAHRU is an auxiliary MAGMA routine that is used in DGEHRD to update the trailing sub-matrices after the reductions of the corresponding panels.

See further details below.

Parameters
[in]  n  INTEGER. The order of the matrix A. N >= 0.
[in]  ihi  INTEGER. Last row to update. Same as IHI in dgehrd.
[in]  k  INTEGER. Number of rows of the matrix Am (see details below).
[in]  nb  INTEGER. Block size.
[out]  A  DOUBLE PRECISION array, dimension (LDA,N-K). On entry, the N-by-(N-K) general matrix to be updated. The computation is done on the GPU. After Am is updated on the GPU, only Am(:,1:NB) is transferred to the CPU to update the corresponding Am matrix. See Further Details below.
[in]  lda  INTEGER. The leading dimension of the array A. LDA >= max(1,N).
[in,out]  data  Structure with pointers to dA, dT, dV, dW, dY, which are distributed across multiple GPUs.

Further Details

This implementation follows the algorithm and notations described in:

S. Tomov and J. Dongarra, "Accelerating the reduction to upper Hessenberg form through hybrid GPU-based computing," University of Tennessee Computer Science Technical Report, UT-CS-09-642 (also LAPACK Working Note 219), May 24, 2009.

The difference is that here Am is computed on the GPU. M is renamed Am, G is renamed Ag.

magma_int_t magma_dlaqtrsd ( magma_trans_t  trans,
magma_int_t  n,
const double *  T,
magma_int_t  ldt,
double *  x,
magma_int_t  ldx,
const double *  cnorm,
magma_int_t *  info 
)

DLAQTRSD is used by DTREVC to solve one of the (singular) quasi-triangular systems with modified diagonal (T - lambda*I) * x = 0 or (T - lambda*I)**T * x = 0 with scaling to prevent overflow.

Here T is an upper quasi-triangular matrix with 1x1 or 2x2 diagonal blocks, A**T denotes the transpose of A, and x is an n-element real or complex vector. The eigenvalue lambda is computed from the block diagonal of T. The routine does not modify T during the computation.

If trans = MagmaNoTrans, lambda is an eigenvalue for the lower 1x1 or 2x2 block, and it solves

   ( [ That  u      ]              )
   ( [ 0     lambda ]  - lambda*I  ) * x = 0,

which becomes (That - lambda*I) * w = -s*u, with x = [ w; 1 ] and scaling s. If the lower block is 1x1, lambda and x are real; if the lower block is 2x2, lambda and x are complex.

If trans = MagmaTrans, lambda is an eigenvalue for the upper 1x1 or 2x2 block, and it solves

   ( [ lambda  v^T  ]              )
   ( [ 0       That ]  - lambda*I  )**T * x = 0,

which becomes (That - lambda*I)**T * w = -s*v, with x = [ 1; w ] and scaling s. If the upper block is 1x1, lambda and x are real; if the upper block is 2x2, lambda and x are complex.
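
For the simplest case (no transpose, real lambda from a 1x1 lower block), the reduced system (That - lambda*I) * w = -s*u can be solved by back substitution. The sketch below is a hypothetical helper, not the MAGMA routine: it assumes T is strictly upper triangular in structure (all diagonal blocks 1x1), fixes the scaling at s = 1 (so it has none of the overflow protection described above), and gives up if a pivot is exactly zero.

```c
#include <assert.h>
#include <stddef.h>

/* Solve (T - lambda*I) * x = 0 for lambda = T(n,n), with x = [ w; 1 ],
   by back substitution on (That - lambda*I) * w = -u.
   T is n-by-n upper triangular, column-major, leading dimension ldt.
   Returns 0 on success, -1 if a diagonal entry of That equals lambda. */
static int solve_last_eigvec(int n, const double *T, int ldt, double *x)
{
    assert(n >= 1 && ldt >= n && T != NULL && x != NULL);
    const size_t ld   = (size_t)ldt;
    const size_t last = (size_t)(n - 1);
    const double lambda = T[last + last * ld];

    x[n - 1] = 1.0;                                /* fixed last component */
    for (int i = n - 2; i >= 0; --i) {
        double rhs = -T[(size_t)i + last * ld];    /* -u(i), with s = 1    */
        for (int j = i + 1; j < n - 1; ++j)
            rhs -= T[(size_t)i + (size_t)j * ld] * x[j];
        const double piv = T[(size_t)i + (size_t)i * ld] - lambda;
        if (piv == 0.0)
            return -1;                             /* repeated eigenvalue  */
        x[i] = rhs / piv;
    }
    return 0;
}
```

The returned x is an eigenvector of T for lambda, so (T - lambda*I) * x vanishes up to rounding.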

Parameters
[in]  trans  magma_trans_t. Specifies the operation applied to T.
  • = MagmaNoTrans: Solve (T - lambda*I) * x = 0 (No transpose)
  • = MagmaTrans: Solve (T - lambda*I)**T * x = 0 (Transpose)
[in]  n  INTEGER. The order of the matrix T. N >= 0.
[in]  T  DOUBLE PRECISION array, dimension (LDT,N). The triangular matrix T. The leading n-by-n upper triangular part of the array T contains the upper triangular matrix, and the strictly lower triangular part of T is not referenced.
[in]  ldt  INTEGER. The leading dimension of the array T. LDT >= max(1,N).
[out]  x  DOUBLE PRECISION array, dimension (LDX,1) or (LDX,2). On exit, X is overwritten by the solution vector x. If LAMBDAI .EQ. 0, X is real and has dimension (LDX,1). If LAMBDAI .NE. 0, X is complex and has dimension (LDX,2); the real part is in X(:,0), the imaginary part in X(:,1).
[in]  ldx  INTEGER. The leading dimension of the array X. LDX >= max(1,N).
[in]  cnorm  DOUBLE PRECISION array, dimension (N). CNORM(j) contains the norm of the off-diagonal part of the j-th column of T. If TRANS = MagmaNoTrans, CNORM(j) must be greater than or equal to the infinity-norm, and if TRANS = MagmaTrans or MagmaConjTrans, CNORM(j) must be greater than or equal to the 1-norm.
[out]  info  INTEGER
  • = 0: successful exit
  • < 0: if INFO = -k, the k-th argument had an illegal value
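
One way to fill cnorm so that it satisfies both conditions is to use the 1-norm of each column's off-diagonal part, since the 1-norm of a vector is never smaller than its infinity-norm. The sketch below is a hypothetical helper, not a MAGMA routine, and it assumes that "off-diagonal part" means the entries strictly above the diagonal in each column.

```c
#include <assert.h>
#include <stddef.h>

/* cnorm[j] = sum of |T(i,j)| over i < j (0-based), i.e. the 1-norm of
   the strictly upper part of column j of the column-major array T. */
static void offdiag_column_norms(int n, const double *T, int ldt, double *cnorm)
{
    assert(n >= 0 && ldt >= (n > 1 ? n : 1));
    for (int j = 0; j < n; ++j) {
        double sum = 0.0;
        for (int i = 0; i < j; ++i) {
            const double t = T[(size_t)i + (size_t)j * (size_t)ldt];
            sum += (t < 0.0) ? -t : t;     /* absolute value */
        }
        cnorm[j] = sum;
    }
}
```

The first entry is always zero because the first column has no entries above the diagonal.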