MAGMA
1.6.1
Matrix Algebra for GPU and Multicore Architectures
|
Functions | |
magma_int_t | magma_chetrd_hb2st (magma_uplo_t uplo, magma_int_t n, magma_int_t nb, magma_int_t Vblksiz, magmaFloatComplex *A, magma_int_t lda, float *d, float *e, magmaFloatComplex *V, magma_int_t ldv, magmaFloatComplex *TAU, magma_int_t compT, magmaFloatComplex *T, magma_int_t ldt) |
magma_int_t | magma_chetrd_he2hb (magma_uplo_t uplo, magma_int_t n, magma_int_t nb, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *work, magma_int_t lwork, magmaFloatComplex_ptr dT, magma_int_t *info) |
CHETRD_HE2HB reduces a complex Hermitian matrix A to Hermitian band-diagonal form T by a unitary similarity transformation: Q**H * A * Q = T. | |
magma_int_t | magma_chetrd_he2hb_mgpu (magma_uplo_t uplo, magma_int_t n, magma_int_t nb, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *work, magma_int_t lwork, magmaFloatComplex_ptr dAmgpu[], magma_int_t ldda, magmaFloatComplex_ptr dTmgpu[], magma_int_t lddt, magma_int_t ngpu, magma_int_t distblk, magma_queue_t queues[][20], magma_int_t nqueue, magma_int_t *info) |
CHETRD_HE2HB reduces a complex Hermitian matrix A to Hermitian band-diagonal form T by a unitary similarity transformation: Q**H * A * Q = T. | |
magma_int_t | magma_chetrd_he2hb_mgpu_spec (magma_uplo_t uplo, magma_int_t n, magma_int_t nb, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *work, magma_int_t lwork, magmaFloatComplex_ptr dAmgpu[], magma_int_t ldda, magmaFloatComplex_ptr dTmgpu[], magma_int_t lddt, magma_int_t ngpu, magma_int_t distblk, magma_queue_t queues[][20], magma_int_t nqueue, magma_int_t *info) |
CHETRD_HE2HB reduces a complex Hermitian matrix A to Hermitian band-diagonal form T by a unitary similarity transformation: Q**H * A * Q = T. | |
magma_int_t | magma_cungqr_2stage_gpu (magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex *tau, magmaFloatComplex_ptr dT, magma_int_t nb, magma_int_t *info) |
CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M. | |
magma_int_t | magma_cunmqr_gpu_2stages (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dC, magma_int_t lddc, magmaFloatComplex_ptr dT, magma_int_t nb, magma_int_t *info) |
CUNMQR_GPU overwrites the general complex M-by-N matrix C with Q * C, Q**H * C, C * Q, or C * Q**H. | |
magma_int_t magma_chetrd_hb2st | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
magma_int_t | nb, | ||
magma_int_t | Vblksiz, | ||
magmaFloatComplex * | A, | ||
magma_int_t | lda, | ||
float * | d, | ||
float * | e, | ||
magmaFloatComplex * | V, | ||
magma_int_t | ldv, | ||
magmaFloatComplex * | TAU, | ||
magma_int_t | compT, | ||
magmaFloatComplex * | T, | ||
magma_int_t | ldt | ||
) |
[in] | uplo | magma_uplo_t
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in] | nb | INTEGER The bandwidth of the band matrix A. N >= NB >= 0. |
[in] | Vblksiz | INTEGER The size of the block of Householder vectors applied at once. |
[in] | A | (workspace) COMPLEX array, dimension (LDA, N) On entry the band matrix stored in the following way: |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= 2*NB. |
[out] | d | REAL array, dimension (N) The diagonal elements of the tridiagonal matrix T: D(i) = A(i,i). |
[out] | e | REAL array, dimension (N-1) The off-diagonal elements of the tridiagonal matrix T: E(i) = A(i,i+1) if UPLO = MagmaUpper, E(i) = A(i+1,i) if UPLO = MagmaLower. |
[out] | V | COMPLEX array, dimension (BLKCNT, LDV, VBLKSIZ) On exit, contains the blocks of Householder reflectors. BLKCNT is the number of blocks, as returned by the function MAGMA_BULGE_GET_BLKCNT. |
[in] | ldv | INTEGER The leading dimension of V. LDV > NB + VBLKSIZ + 1 |
[out] | TAU | COMPLEX dimension(BLKCNT, VBLKSIZ) ??? |
[in] | compT | INTEGER If COMPT = 0, T is not computed; if COMPT = 1, T is computed. |
[out] | T | COMPLEX dimension(LDT *) If COMPT = 1, on exit contains the matrices T needed for Q2; if COMPT = 0, T is not referenced. |
[in] | ldt | INTEGER The leading dimension of T. LDT > Vblksiz |
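The dimension constraints listed above can be checked before the call. The following is a minimal sketch and not part of MAGMA: `hb2st_check_dims` is a hypothetical helper, and its return value (minus the position of the offending argument in the `magma_chetrd_hb2st` signature) is an assumed LAPACK-style convention, since this routine has no `info` argument.

```c
/* Hypothetical helper (not a MAGMA function): checks the documented
 * constraints on the integer arguments of magma_chetrd_hb2st.
 * Returns 0 if all hold, otherwise -(position of the first bad argument). */
int hb2st_check_dims(int n, int nb, int vblksiz,
                     int lda, int ldv, int compT, int ldt)
{
    if (n < 0)                    return -2;   /* N >= 0                 */
    if (nb < 0 || nb > n)         return -3;   /* N >= NB >= 0           */
    if (lda < 2 * nb)             return -6;   /* LDA >= 2*NB            */
    if (ldv <= nb + vblksiz + 1)  return -10;  /* LDV > NB + VBLKSIZ + 1 */
    if (compT != 0 && compT != 1) return -12;  /* COMPT is 0 or 1        */
    if (ldt <= vblksiz)           return -14;  /* LDT > Vblksiz          */
    return 0;
}
```

For example, with n = 100, nb = 32 and Vblksiz = 16, the smallest values that pass are lda = 64, ldv = 50 and ldt = 17. Note that the bounds on LDV and LDT are strict.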
magma_int_t magma_chetrd_he2hb | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
magma_int_t | nb, | ||
magmaFloatComplex * | A, | ||
magma_int_t | lda, | ||
magmaFloatComplex * | tau, | ||
magmaFloatComplex * | work, | ||
magma_int_t | lwork, | ||
magmaFloatComplex_ptr | dT, | ||
magma_int_t * | info | ||
) |
CHETRD_HE2HB reduces a complex Hermitian matrix A to Hermitian band-diagonal form T by a unitary similarity transformation: Q**H * A * Q = T.
This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').
[in] | uplo | magma_uplo_t
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | A | COMPLEX array, dimension (LDA,N) On entry, the Hermitian matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band-diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors. See Further Details. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,N). |
[out] | tau | COMPLEX array, dimension (N-1) The scalar factors of the elementary reflectors (see Further Details). |
[out] | work | (workspace) COMPLEX array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. |
[in] | lwork | INTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA. |
[out] | dT | COMPLEX array on the GPU, dimension N*NB, where NB is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another. |
[out] | info | INTEGER
|
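The buffer sizes stated for `work` and `dT` can be written down as small helpers. This is a sketch under stated assumptions, not MAGMA code: the function names are hypothetical, and the block offset assumes that each nb x nb matrix T occupies nb*nb contiguous elements, which is one reading of "ordered consecutively in memory".

```c
#include <stddef.h>

/* Hypothetical helpers (not MAGMA functions) restating the documented sizes. */

/* LWORK for good performance: at least 1, ideally N*NB. */
size_t he2hb_lwork(size_t n, size_t nb)
{
    size_t want = n * nb;
    return want > 1 ? want : 1;
}

/* Number of complex elements in dT: N*NB. */
size_t he2hb_dT_elems(size_t n, size_t nb)
{
    return n * nb;
}

/* Offset (in elements) of the k-th nb-by-nb T block inside dT,
 * ASSUMING the blocks are stored back to back, nb*nb elements each. */
size_t he2hb_T_offset(size_t k, size_t nb)
{
    return k * nb * nb;
}
```

With n = 1000 and nb = 64, both `work` and `dT` would hold 64000 complex elements, and the fourth T block (k = 3) would start at element 12288.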
If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors
Q = H(n-1) . . . H(2) H(1).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).
If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(n-1).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).
The contents of A on exit are illustrated by the following examples with n = 5:
if UPLO = MagmaUpper:            if UPLO = MagmaLower:

  ( d   e   v2  v3  v4 )           ( d                  )
  (     d   e   v3  v4 )           ( e   d              )
  (         d   e   v4 )           ( v1  e   d          )
  (             d   e  )           ( v1  v2  e   d      )
  (                 d  )           ( v1  v2  v3  e   d  )
where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i).
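To make the form H(i) = I - tau * v * v' concrete, the sketch below applies one such reflector to a vector on the CPU. It is an illustration only and not how MAGMA applies reflectors: `cfloat`, `c_mul`, `c_conj` and `apply_reflector` are hypothetical names, `cfloat` merely stands in for `magmaFloatComplex`, and v' is taken to mean the conjugate transpose of v.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for magmaFloatComplex (real and imaginary part). */
typedef struct { float re, im; } cfloat;

static cfloat c_mul(cfloat a, cfloat b)
{
    cfloat r = { a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re };
    return r;
}

static cfloat c_conj(cfloat a)
{
    cfloat r = { a.re, -a.im };
    return r;
}

/* x := (I - tau * v * v^H) x for vectors of length n. */
void apply_reflector(size_t n, cfloat tau, const cfloat *v, cfloat *x)
{
    assert(v != NULL && x != NULL);

    cfloat w = { 0.0f, 0.0f };            /* w = v^H x */
    for (size_t i = 0; i < n; ++i) {
        cfloat t = c_mul(c_conj(v[i]), x[i]);
        w.re += t.re;
        w.im += t.im;
    }

    cfloat s = c_mul(tau, w);             /* s = tau * (v^H x) */
    for (size_t i = 0; i < n; ++i) {      /* x := x - s * v */
        cfloat t = c_mul(s, v[i]);
        x[i].re -= t.re;
        x[i].im -= t.im;
    }
}
```

The reflector is never formed as a matrix: one inner product and one vector update suffice, which costs O(n) work per reflector instead of O(n^2).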
magma_int_t magma_chetrd_he2hb_mgpu | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
magma_int_t | nb, | ||
magmaFloatComplex * | A, | ||
magma_int_t | lda, | ||
magmaFloatComplex * | tau, | ||
magmaFloatComplex * | work, | ||
magma_int_t | lwork, | ||
magmaFloatComplex_ptr | dAmgpu[], | ||
magma_int_t | ldda, | ||
magmaFloatComplex_ptr | dTmgpu[], | ||
magma_int_t | lddt, | ||
magma_int_t | ngpu, | ||
magma_int_t | distblk, | ||
magma_queue_t | queues[][20], | ||
magma_int_t | nqueue, | ||
magma_int_t * | info | ||
) |
CHETRD_HE2HB reduces a complex Hermitian matrix A to Hermitian band-diagonal form T by a unitary similarity transformation: Q**H * A * Q = T.
This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').
[in] | uplo | magma_uplo_t
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | A | COMPLEX array, dimension (LDA,N) On entry, the Hermitian matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band-diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors. See Further Details. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,N). |
[out] | tau | COMPLEX array, dimension (N-1) The scalar factors of the elementary reflectors (see Further Details). |
[out] | work | (workspace) COMPLEX array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. |
[in] | lwork | INTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA. |
[out] | dT | COMPLEX array on the GPU, dimension N*NB, where NB is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another. |
[out] | info | INTEGER
|
If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors
Q = H(n-1) . . . H(2) H(1).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).
If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(n-1).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).
The contents of A on exit are illustrated by the following examples with n = 5:
if UPLO = MagmaUpper:            if UPLO = MagmaLower:

  ( d   e   v2  v3  v4 )           ( d                  )
  (     d   e   v3  v4 )           ( e   d              )
  (         d   e   v4 )           ( v1  e   d          )
  (             d   e  )           ( v1  v2  e   d      )
  (                 d  )           ( v1  v2  v3  e   d  )
where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i).
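The n = 5 pictures can be read as a rule that maps an index pair to the kind of data stored there on exit. The sketch below encodes that rule with 0-based indices. `he2hb_entry_kind` is a hypothetical helper, not a MAGMA function; with nb = 1 it reproduces the pictures, while treating every entry within nb of the diagonal as part of T is an assumption about the general band case.

```c
#include <assert.h>

/* Hypothetical helper (not a MAGMA function). For 0-based indices (i, j),
 * report what the entry of A holds on exit:
 *   'd' diagonal of T, 'e' off-diagonal band entry of T,
 *   'v' reflector data, ' ' not referenced.
 * upper != 0 selects UPLO = MagmaUpper, otherwise UPLO = MagmaLower. */
char he2hb_entry_kind(int upper, int nb, int i, int j)
{
    assert(nb >= 1 && i >= 0 && j >= 0);

    int dist = upper ? j - i : i - j;   /* distance into the stored triangle */
    if (dist < 0)   return ' ';         /* opposite triangle: not referenced */
    if (dist == 0)  return 'd';
    if (dist <= nb) return 'e';         /* inside the band (assumed for nb > 1) */
    return 'v';                         /* outside the band: reflector data */
}
```

For instance, with UPLO = MagmaUpper and nb = 1, the first row is classified as d, e, v, v, v, matching ( d e v2 v3 v4 ).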
magma_int_t magma_chetrd_he2hb_mgpu_spec | ( | magma_uplo_t | uplo, |
magma_int_t | n, | ||
magma_int_t | nb, | ||
magmaFloatComplex * | A, | ||
magma_int_t | lda, | ||
magmaFloatComplex * | tau, | ||
magmaFloatComplex * | work, | ||
magma_int_t | lwork, | ||
magmaFloatComplex_ptr | dAmgpu[], | ||
magma_int_t | ldda, | ||
magmaFloatComplex_ptr | dTmgpu[], | ||
magma_int_t | lddt, | ||
magma_int_t | ngpu, | ||
magma_int_t | distblk, | ||
magma_queue_t | queues[][20], | ||
magma_int_t | nqueue, | ||
magma_int_t * | info | ||
) |
CHETRD_HE2HB reduces a complex Hermitian matrix A to Hermitian band-diagonal form T by a unitary similarity transformation: Q**H * A * Q = T.
This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').
[in] | uplo | magma_uplo_t
|
[in] | n | INTEGER The order of the matrix A. N >= 0. |
[in,out] | A | COMPLEX array, dimension (LDA,N) On entry, the Hermitian matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band-diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors. See Further Details. |
[in] | lda | INTEGER The leading dimension of the array A. LDA >= max(1,N). |
[out] | tau | COMPLEX array, dimension (N-1) The scalar factors of the elementary reflectors (see Further Details). |
[out] | work | (workspace) COMPLEX array, dimension (MAX(1,LWORK)) On exit, if INFO = 0, WORK[0] returns the optimal LWORK. |
[in] | lwork | INTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA. |
[out] | dT | COMPLEX array on the GPU, dimension N*NB, where NB is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another. |
[out] | info | INTEGER
|
If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors
Q = H(n-1) . . . H(2) H(1).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).
If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(n-1).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).
The contents of A on exit are illustrated by the following examples with n = 5:
if UPLO = MagmaUpper:            if UPLO = MagmaLower:

  ( d   e   v2  v3  v4 )           ( d                  )
  (     d   e   v3  v4 )           ( e   d              )
  (         d   e   v4 )           ( v1  e   d          )
  (             d   e  )           ( v1  v2  e   d      )
  (                 d  )           ( v1  v2  v3  e   d  )
where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector defining H(i).
magma_int_t magma_cungqr_2stage_gpu | ( | magma_int_t | m, |
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloatComplex * | tau, | ||
magmaFloatComplex_ptr | dT, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
CUNGQR generates an M-by-N COMPLEX matrix Q with orthonormal columns, which is defined as the first N columns of a product of K elementary reflectors of order M.
Q = H(1) H(2) . . . H(k)
as returned by CGEQRF_GPU.
[in] | m | INTEGER The number of rows of the matrix Q. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix Q. M >= N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. N >= K >= 0. |
[in,out] | dA | COMPLEX array A on the GPU device, dimension (LDDA,N). On entry, the i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF_GPU in the first k columns of its array argument A. On exit, the M-by-N matrix Q. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). |
[in] | tau | COMPLEX array, dimension (K) TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF_GPU. |
[in] | dT | COMPLEX work space array on the GPU device, dimension (MIN(M, N) )*NB. This must be the 6th argument of magma_cgeqrf_gpu [ note that if N here is bigger than N in magma_cgeqrf_gpu, the workspace requirement DT in magma_cgeqrf_gpu must be as specified in this routine ]. |
[in] | nb | INTEGER This is the block size used in CGEQRF_GPU, and correspondingly the size of the T matrices, used in the factorization, and stored in DT. |
[out] | info | INTEGER
|
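The bracketed note on `dT` implies that the workspace should be sized for the larger of the two column counts when Q is generated with more columns than were factored. The sketch below restates that using only the MIN(M,N)*NB size given above. Both helpers are hypothetical and not part of MAGMA, and the size actually required by `magma_cgeqrf_gpu` should be taken from that routine's own documentation.

```c
#include <stddef.h>

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Hypothetical helper: element count documented for dT, MIN(M,N)*NB. */
size_t cungqr_dT_elems(size_t m, size_t n, size_t nb)
{
    return min_sz(m, n) * nb;
}

/* Element count to allocate for dT before the factorization when Q will
 * later be generated with n_q columns, where n_q may exceed n_factor. */
size_t dT_elems_for_both(size_t m, size_t n_factor, size_t n_q, size_t nb)
{
    size_t a = cungqr_dT_elems(m, n_factor, nb);
    size_t b = cungqr_dT_elems(m, n_q, nb);
    return a > b ? a : b;
}
```

For m = 1000, nb = 32, a factorization of 200 columns and a Q with 500 columns, the larger requirement is 500 * 32 = 16000 elements.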
magma_int_t magma_cunmqr_gpu_2stages | ( | magma_side_t | side, |
magma_trans_t | trans, | ||
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | k, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloatComplex_ptr | dC, | ||
magma_int_t | lddc, | ||
magmaFloatComplex_ptr | dT, | ||
magma_int_t | nb, | ||
magma_int_t * | info | ||
) |
CUNMQR_GPU overwrites the general complex M-by-N matrix C with

                             SIDE = MagmaLeft    SIDE = MagmaRight
  TRANS = MagmaNoTrans:      Q * C               C * Q
  TRANS = Magma_ConjTrans:   Q**H * C            C * Q**H

where Q is a complex unitary matrix defined as the product of k elementary reflectors
Q = H(1) H(2) . . . H(k)
as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
[in] | side | magma_side_t
|
[in] | trans | magma_trans_t
|
[in] | m | INTEGER The number of rows of the matrix C. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix C. N >= 0. |
[in] | k | INTEGER The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0. |
[in] | dA | COMPLEX array on the GPU, dimension (LDDA,K) The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument DA. DA is modified by the routine but restored on exit. |
[in] | ldda | INTEGER The leading dimension of the array DA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N). |
[in,out] | dC | COMPLEX array on the GPU, dimension (LDDC,N) On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q. |
[in] | lddc | INTEGER The leading dimension of the array DC. LDDC >= max(1,M). |
[in] | dT | COMPLEX array on the GPU that is the output (the 6th argument) of magma_cgeqrf_gpu. |
[in] | nb | INTEGER This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_cgeqrf_gpu. |
[out] | info | INTEGER
|
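Several bounds above depend on SIDE, because the order of Q is M when it is applied from the left and N when it is applied from the right. The sketch below collects those bounds. It is not MAGMA code: `qr_side`, `cunmqr_q_order` and `cunmqr_dims_ok` are hypothetical names, and the enum is only a stand-in for `magma_side_t`.

```c
#include <assert.h>

/* Hypothetical stand-in for magma_side_t (MagmaLeft / MagmaRight). */
typedef enum { SIDE_LEFT, SIDE_RIGHT } qr_side;

/* Order of Q: M when Q multiplies C from the left, N from the right. */
int cunmqr_q_order(qr_side side, int m, int n)
{
    assert(m >= 0 && n >= 0);
    return side == SIDE_LEFT ? m : n;
}

/* Returns 1 if k, ldda and lddc satisfy the documented bounds, else 0. */
int cunmqr_dims_ok(qr_side side, int m, int n, int k, int ldda, int lddc)
{
    int nq = cunmqr_q_order(side, m, n);
    int min_ldda = nq > 1 ? nq : 1;   /* LDDA >= max(1, order of Q) */
    int min_lddc = m  > 1 ? m  : 1;   /* LDDC >= max(1, M)          */
    return k >= 0 && k <= nq && ldda >= min_ldda && lddc >= min_lddc;
}
```

With an 8-by-3 matrix C and k = 5 reflectors, the bounds hold for SIDE = MagmaLeft (Q has order 8) but not for SIDE = MagmaRight (Q has order 3, so k may not exceed 3).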