MAGMA  1.6.3
Matrix Algebra for GPU and Multicore Architectures
 All Classes Files Functions Friends Groups Pages

Functions

magma_int_t magma_dorgqr_2stage_gpu (magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_ptr dA, magma_int_t ldda, double *tau, magmaDouble_ptr dT, magma_int_t nb, magma_int_t *info)
 DORGQR generates an M-by-N DOUBLE_PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. More...
 
magma_int_t magma_dormqr_gpu_2stages (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dC, magma_int_t lddc, magmaDouble_ptr dT, magma_int_t nb, magma_int_t *info)
 DORMQR_GPU overwrites the general real M-by-N matrix C with. More...
 
magma_int_t magma_dsytrd_sb2st (magma_uplo_t uplo, magma_int_t n, magma_int_t nb, magma_int_t Vblksiz, double *A, magma_int_t lda, double *d, double *e, double *V, magma_int_t ldv, double *TAU, magma_int_t wantz, double *T, magma_int_t ldt)
 
magma_int_t magma_dsytrd_sy2sb (magma_uplo_t uplo, magma_int_t n, magma_int_t nb, double *A, magma_int_t lda, double *tau, double *work, magma_int_t lwork, magmaDouble_ptr dT, magma_int_t *info)
 DSYTRD_HE2HB reduces a real symmetric matrix A to real symmetric band-diagonal form T by an orthogonal similarity transformation: Q**H * A * Q = T. More...
 
magma_int_t magma_dsytrd_sy2sb_mgpu (magma_uplo_t uplo, magma_int_t n, magma_int_t nb, double *A, magma_int_t lda, double *tau, double *work, magma_int_t lwork, magmaDouble_ptr dAmgpu[], magma_int_t ldda, magmaDouble_ptr dTmgpu[], magma_int_t lddt, magma_int_t ngpu, magma_int_t distblk, magma_queue_t queues[][20], magma_int_t nqueue, magma_int_t *info)
 DSYTRD_HE2HB reduces a real symmetric matrix A to real symmetric band-diagonal form T by an orthogonal similarity transformation: Q**H * A * Q = T. More...
 
magma_int_t magma_dsytrd_sy2sb_mgpu_spec (magma_uplo_t uplo, magma_int_t n, magma_int_t nb, double *A, magma_int_t lda, double *tau, double *work, magma_int_t lwork, magmaDouble_ptr dAmgpu[], magma_int_t ldda, magmaDouble_ptr dTmgpu[], magma_int_t lddt, magma_int_t ngpu, magma_int_t distblk, magma_queue_t queues[][20], magma_int_t nqueue, magma_int_t *info)
 DSYTRD_HE2HB reduces a real symmetric matrix A to real symmetric band-diagonal form T by an orthogonal similarity transformation: Q**H * A * Q = T. More...
 

Detailed Description

Function Documentation

magma_int_t magma_dorgqr_2stage_gpu ( magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDouble_ptr  dA,
magma_int_t  ldda,
double *  tau,
magmaDouble_ptr  dT,
magma_int_t  nb,
magma_int_t *  info 
)

DORGQR generates an M-by-N DOUBLE_PRECISION matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.

Q = H(1) H(2) . . . H(k)

as returned by DGEQRF_GPU.

Parameters
[in]mINTEGER The number of rows of the matrix Q. M >= 0.
[in]nINTEGER The number of columns of the matrix Q. M >= N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0.
[in,out]dADOUBLE_PRECISION array A on the GPU device, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q.
[in]lddaINTEGER The first dimension of the array A. LDDA >= max(1,M).
[in]tauDOUBLE_PRECISION array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF_GPU.
[in]dTDOUBLE_PRECISION work space array on the GPU device, dimension (MIN(M, N) )*NB. This must be the 6th argument of magma_dgeqrf_gpu [ note that if N here is bigger than N in magma_dgeqrf_gpu, the workspace requirement DT in magma_dgeqrf_gpu must be as specified in this routine ].
[in]nbINTEGER This is the block size used in DGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument has an illegal value
magma_int_t magma_dormqr_gpu_2stages ( magma_side_t  side,
magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magma_int_t  k,
magmaDouble_ptr  dA,
magma_int_t  ldda,
magmaDouble_ptr  dC,
magma_int_t  lddc,
magmaDouble_ptr  dT,
magma_int_t  nb,
magma_int_t *  info 
)

DORMQR_GPU overwrites the general real M-by-N matrix C with.

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = MagmaTrans:   Q**H * C            C * Q**H

where Q is a real unitary matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)

as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
[in]sidemagma_side_t
  • = MagmaLeft: apply Q or Q**H from the Left;
  • = MagmaRight: apply Q or Q**H from the Right.
[in]transmagma_trans_t
  • = MagmaNoTrans: No transpose, apply Q;
  • = MagmaTrans: Conjugate transpose, apply Q**H.
[in]mINTEGER The number of rows of the matrix C. M >= 0.
[in]nINTEGER The number of columns of the matrix C. N >= 0.
[in]kINTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
[in]dADOUBLE_PRECISION array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument DA. DA is modified by the routine but restored on exit.
[in]lddaINTEGER The leading dimension of the array DA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
[in,out]dCDOUBLE_PRECISION array on the GPU, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
[in]lddcINTEGER The leading dimension of the array DC. LDDC >= max(1,M).
[in]dTDOUBLE_PRECISION array on the GPU that is the output (the 9th argument) of magma_dgeqrf_gpu.
[in]nbINTEGER This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_dgeqrf_gpu.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value
magma_int_t magma_dsytrd_sb2st ( magma_uplo_t  uplo,
magma_int_t  n,
magma_int_t  nb,
magma_int_t  Vblksiz,
double *  A,
magma_int_t  lda,
double *  d,
double *  e,
double *  V,
magma_int_t  ldv,
double *  TAU,
magma_int_t  wantz,
double *  T,
magma_int_t  ldt 
)
Parameters
[in]uplomagma_uplo_t
  • = MagmaUpper: Upper triangles of A is stored;
  • = MagmaLower: Lower triangles of A is stored.
[in]nINTEGER The order of the matrix A. n >= 0.
[in]nbINTEGER The order of the band matrix A. n >= nb >= 0.
[in]VblksizINTEGER The size of the block of householder vectors applied at once.
[in]A(workspace) DOUBLE_PRECISION array, dimension (lda, n) On entry the band matrix stored in the following way:
[in]ldaINTEGER The leading dimension of the array A. lda >= 2*nb.
[out]dDOUBLE array, dimension (n) The diagonal elements of the tridiagonal matrix T: D(i) = A(i,i).
[out]eDOUBLE array, dimension (n-1) The off-diagonal elements of the tridiagonal matrix T: E(i) = A(i,i+1) if UPLO = MagmaUpper, E(i) = A(i+1,i) if UPLO = MagmaLower.
[out]VDOUBLE_PRECISION array, dimension (BLKCNT, LDV, VBLKSIZ) On exit it contains the blocks of householder reflectors BLKCNT is the number of block and it is returned by the funtion MAGMA_BULGE_GET_BLKCNT.
[in]ldvINTEGER The leading dimension of V. LDV > nb + VBLKSIZ + 1
[out]TAUDOUBLE_PRECISION dimension(BLKCNT, VBLKSIZ) ???
[in]wantzINTEGER if COMPT = 0 T is not computed if COMPT = 1 T is computed
[out]TDOUBLE_PRECISION dimension(LDT *) if COMPT = 1 on exit contains the matrices T needed for Q2 if COMPT = 0 T is not referenced
[in]ldtINTEGER The leading dimension of T. LDT > Vblksiz
magma_int_t magma_dsytrd_sy2sb ( magma_uplo_t  uplo,
magma_int_t  n,
magma_int_t  nb,
double *  A,
magma_int_t  lda,
double *  tau,
double *  work,
magma_int_t  lwork,
magmaDouble_ptr  dT,
magma_int_t *  info 
)

DSYTRD_HE2HB reduces a real symmetric matrix A to real symmetric band-diagonal form T by an orthogonal similarity transformation: Q**H * A * Q = T.

This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').

Parameters
[in]uplomagma_uplo_t
  • = MagmaUpper: Upper triangle of A is stored;
  • = MagmaLower: Lower triangle of A is stored.
[in]nINTEGER The order of the matrix A. N >= 0.
[in,out]ADOUBLE_PRECISION array, dimension (LDA,N) On entry, the symmetric matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the Upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the the Lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors. See Further Details.
[in]ldaINTEGER The leading dimension of the array A. LDA >= max(1,N).
[out]tauDOUBLE_PRECISION array, dimension (N-1) The scalar factors of the elementary reflectors (see Further Details).
[out]work(workspace) DOUBLE_PRECISION array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
[in]lworkINTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
[out]dTDOUBLE_PRECISION array on the GPU, dimension N*NB, where NB is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details

If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors

Q = H(n-1) . . . H(2) H(1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).

If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(n-1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).

The contents of A on exit are illustrated by the following examples with n = 5:

if UPLO = MagmaUpper: if UPLO = MagmaLower:

( d e v2 v3 v4 ) ( d ) ( d e v3 v4 ) ( e d ) ( d e v4 ) ( v1 e d ) ( d e ) ( v1 v2 e d ) ( d ) ( v1 v2 v3 e d )

where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i).

magma_int_t magma_dsytrd_sy2sb_mgpu ( magma_uplo_t  uplo,
magma_int_t  n,
magma_int_t  nb,
double *  A,
magma_int_t  lda,
double *  tau,
double *  work,
magma_int_t  lwork,
magmaDouble_ptr  dAmgpu[],
magma_int_t  ldda,
magmaDouble_ptr  dTmgpu[],
magma_int_t  lddt,
magma_int_t  ngpu,
magma_int_t  distblk,
magma_queue_t  queues[][20],
magma_int_t  nqueue,
magma_int_t *  info 
)

DSYTRD_HE2HB reduces a real symmetric matrix A to real symmetric band-diagonal form T by an orthogonal similarity transformation: Q**H * A * Q = T.

This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').

Parameters
[in]uplomagma_uplo_t
  • = MagmaUpper: Upper triangle of A is stored;
  • = MagmaLower: Lower triangle of A is stored.
[in]nINTEGER The order of the matrix A. N >= 0.
[in,out]ADOUBLE_PRECISION array, dimension (LDA,N) On entry, the symmetric matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the Upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the the Lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors. See Further Details.
[in]ldaINTEGER The leading dimension of the array A. LDA >= max(1,N).
[out]tauDOUBLE_PRECISION array, dimension (N-1) The scalar factors of the elementary reflectors (see Further Details).
[out]work(workspace) DOUBLE_PRECISION array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
[in]lworkINTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
[out]dTDOUBLE_PRECISION array on the GPU, dimension N*NB, where NB is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details

If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors

Q = H(n-1) . . . H(2) H(1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).

If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(n-1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).

The contents of A on exit are illustrated by the following examples with n = 5:

if UPLO = MagmaUpper: if UPLO = MagmaLower:

(  d   e   v2  v3  v4 )              (  d                  )
(      d   e   v3  v4 )              (  e   d              )
(          d   e   v4 )              (  v1  e   d          )
(              d   e  )              (  v1  v2  e   d      )
(                  d  )              (  v1  v2  v3  e   d  )

where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i).

magma_int_t magma_dsytrd_sy2sb_mgpu_spec ( magma_uplo_t  uplo,
magma_int_t  n,
magma_int_t  nb,
double *  A,
magma_int_t  lda,
double *  tau,
double *  work,
magma_int_t  lwork,
magmaDouble_ptr  dAmgpu[],
magma_int_t  ldda,
magmaDouble_ptr  dTmgpu[],
magma_int_t  lddt,
magma_int_t  ngpu,
magma_int_t  distblk,
magma_queue_t  queues[][20],
magma_int_t  nqueue,
magma_int_t *  info 
)

DSYTRD_HE2HB reduces a real symmetric matrix A to real symmetric band-diagonal form T by an orthogonal similarity transformation: Q**H * A * Q = T.

This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').

Parameters
[in]uplomagma_uplo_t
  • = MagmaUpper: Upper triangle of A is stored;
  • = MagmaLower: Lower triangle of A is stored.
[in]nINTEGER The order of the matrix A. N >= 0.
[in,out]ADOUBLE_PRECISION array, dimension (LDA,N) On entry, the symmetric matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the Upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the the Lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors. See Further Details.
[in]ldaINTEGER The leading dimension of the array A. LDA >= max(1,N).
[out]tauDOUBLE_PRECISION array, dimension (N-1) The scalar factors of the elementary reflectors (see Further Details).
[out]work(workspace) DOUBLE_PRECISION array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
[in]lworkINTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
[out]dTDOUBLE_PRECISION array on the GPU, dimension N*NB, where NB is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another.
[out]infoINTEGER
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value

Further Details

If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors

Q = H(n-1) . . . H(2) H(1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).

If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(n-1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).

The contents of A on exit are illustrated by the following examples with n = 5:

if UPLO = MagmaUpper: if UPLO = MagmaLower:

( d e v2 v3 v4 ) ( d ) ( d e v3 v4 ) ( e d ) ( d e v4 ) ( v1 e d ) ( d e ) ( v1 v2 e d ) ( d ) ( v1 v2 v3 e d )

where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i).