MAGMA  2.0.0
Matrix Algebra for GPU and Multicore Architectures
double-complex precision

Functions

void magma_zgemv (magma_trans_t transA, magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy)
 Perform matrix-vector product.
 
void magma_zgerc (magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex_const_ptr dy, magma_int_t incy, magmaDoubleComplex_ptr dA, magma_int_t ldda)
 Perform rank-1 update, \( A = \alpha x y^H + A \).
 
void magma_zgeru (magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex_const_ptr dy, magma_int_t incy, magmaDoubleComplex_ptr dA, magma_int_t ldda)
 Perform rank-1 update (unconjugated), \( A = \alpha x y^T + A \).
 
void magma_zhemv (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy)
 Perform Hermitian matrix-vector product, \( y = \alpha A x + \beta y \).
 
void magma_zher (magma_uplo_t uplo, magma_int_t n, double alpha, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex_ptr dA, magma_int_t ldda)
 Perform Hermitian rank-1 update, \( A = \alpha x x^H + A \).
 
void magma_zher2 (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex_const_ptr dy, magma_int_t incy, magmaDoubleComplex_ptr dA, magma_int_t ldda)
 Perform Hermitian rank-2 update, \( A = \alpha x y^H + conj(\alpha) y x^H + A \).
 
void magma_ztrmv (magma_uplo_t uplo, magma_trans_t trans, magma_diag_t diag, magma_int_t n, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dx, magma_int_t incx)
 Perform triangular matrix-vector product.
 
void magma_ztrsv (magma_uplo_t uplo, magma_trans_t trans, magma_diag_t diag, magma_int_t n, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dx, magma_int_t incx)
 Solve triangular matrix-vector system (one right-hand side).
 
void magma_zgemv_q (magma_trans_t transA, magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy, magma_queue_t queue)
 Perform matrix-vector product.
 
void magma_zgerc_q (magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex_const_ptr dy, magma_int_t incy, magmaDoubleComplex_ptr dA, magma_int_t ldda, magma_queue_t queue)
 Perform rank-1 update, \( A = \alpha x y^H + A \).
 
void magma_zgeru_q (magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex_const_ptr dy, magma_int_t incy, magmaDoubleComplex_ptr dA, magma_int_t ldda, magma_queue_t queue)
 Perform rank-1 update (unconjugated), \( A = \alpha x y^T + A \).
 
void magma_zhemv_q (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy, magma_queue_t queue)
 Perform Hermitian matrix-vector product, \( y = \alpha A x + \beta y \).
 
void magma_zher_q (magma_uplo_t uplo, magma_int_t n, double alpha, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex_ptr dA, magma_int_t ldda, magma_queue_t queue)
 Perform Hermitian rank-1 update, \( A = \alpha x x^H + A \).
 
void magma_zher2_q (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex_const_ptr dy, magma_int_t incy, magmaDoubleComplex_ptr dA, magma_int_t ldda, magma_queue_t queue)
 Perform Hermitian rank-2 update, \( A = \alpha x y^H + conj(\alpha) y x^H + A \).
 
void magma_ztrmv_q (magma_uplo_t uplo, magma_trans_t trans, magma_diag_t diag, magma_int_t n, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dx, magma_int_t incx, magma_queue_t queue)
 Perform triangular matrix-vector product.
 
void magma_ztrsv_q (magma_uplo_t uplo, magma_trans_t trans, magma_diag_t diag, magma_int_t n, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dx, magma_int_t incx, magma_queue_t queue)
 Solve triangular matrix-vector system (one right-hand side).
 
void magmablas_zgemv_conj_q (magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy, magma_queue_t queue)
 ZGEMV_CONJ performs the matrix-vector operation y := alpha*A*conj(x) + beta*y.
 
void magmablas_zgemv_conj (magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy)
 
void magmablas_zgemv_q (magma_trans_t trans, magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy, magma_queue_t queue)
 ZGEMV performs one of the matrix-vector operations.
 
void magmablas_zgemv (magma_trans_t trans, magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy)
 
magma_int_t magmablas_zhemv_work (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy, magmaDoubleComplex_ptr dwork, magma_int_t lwork, magma_queue_t queue)
 magmablas_zhemv_work performs the matrix-vector operation y := alpha*A*x + beta*y.
 
magma_int_t magmablas_zhemv (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy)
 magmablas_zhemv performs the matrix-vector operation y := alpha*A*x + beta*y.
 
magma_int_t magmablas_zhemv_mgpu (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr const d_lA[], magma_int_t ldda, magma_int_t offset, magmaDoubleComplex const *x, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex *y, magma_int_t incy, magmaDoubleComplex *hwork, magma_int_t lhwork, magmaDoubleComplex_ptr dwork[], magma_int_t ldwork, magma_int_t ngpu, magma_int_t nb, magma_queue_t queues[])
 magmablas_zhemv_mgpu performs the matrix-vector operation y := alpha*A*x + beta*y.
 
magma_int_t magmablas_zhemv_mgpu_sync (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr const d_lA[], magma_int_t ldda, magma_int_t offset, magmaDoubleComplex const *x, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex *y, magma_int_t incy, magmaDoubleComplex *hwork, magma_int_t lhwork, magmaDoubleComplex_ptr dwork[], magma_int_t ldwork, magma_int_t ngpu, magma_int_t nb, magma_queue_t queues[])
 Synchronizes and accumulates the final zhemv result.
 
void magmablas_zswapblk_q (magma_order_t order, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dB, magma_int_t lddb, magma_int_t i1, magma_int_t i2, const magma_int_t *ipiv, magma_int_t inci, magma_int_t offset, magma_queue_t queue)
 
 
 
void magmablas_zswapblk (magma_order_t order, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dB, magma_int_t lddb, magma_int_t i1, magma_int_t i2, const magma_int_t *ipiv, magma_int_t inci, magma_int_t offset)
 
magma_int_t magmablas_zsymv_work (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy, magmaDoubleComplex_ptr dwork, magma_int_t lwork, magma_queue_t queue)
 magmablas_zsymv_work performs the matrix-vector operation y := alpha*A*x + beta*y.
 
magma_int_t magmablas_zsymv (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy)
 magmablas_zsymv performs the matrix-vector operation y := alpha*A*x + beta*y.
 
void magmablas_ztrsv (magma_uplo_t uplo, magma_trans_t trans, magma_diag_t diag, magma_int_t n, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr db, magma_int_t incb, magma_queue_t queue)
 ztrsv solves one of the matrix equations on the GPU.
 
void magmablas_ztrsv_batched (magma_uplo_t uplo, magma_trans_t trans, magma_diag_t diag, magma_int_t n, magmaDoubleComplex **A_array, magma_int_t lda, magmaDoubleComplex **b_array, magma_int_t incb, magma_int_t batchCount, magma_queue_t queue)
 ztrsv solves one of the matrix equations on the GPU.
 

Detailed Description

Function Documentation

void magma_zgemv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy 
)

Perform matrix-vector product.

\( y = \alpha A x + \beta y \) (transA == MagmaNoTrans), or
\( y = \alpha A^T x + \beta y \) (transA == MagmaTrans), or
\( y = \alpha A^H x + \beta y \) (transA == MagmaConjTrans).

Parameters
[in] transA  Operation to perform on A.
[in] m  Number of rows of A. m >= 0.
[in] n  Number of columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,m). The m-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in] dx  COMPLEX_16 array on GPU device. If transA == MagmaNoTrans, the n element vector x of dimension (1 + (n-1)*incx); otherwise, the m element vector x of dimension (1 + (m-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] beta  Scalar \( \beta \).
[in,out] dy  COMPLEX_16 array on GPU device. If transA == MagmaNoTrans, the m element vector y of dimension (1 + (m-1)*incy); otherwise, the n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
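The three transA cases can be checked against a small CPU reference. The sketch below is not MAGMA code: `ref_zgemv` and `ref_trans_t` are hypothetical names, the arrays live in host memory, and positive strides are assumed. It only illustrates the column-major indexing (element (i,j) of A is `A[i + j*lda]`) and how the lengths of x and y swap when A is transposed.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical stand-in for magma_trans_t, used only by this sketch. */
typedef enum { REF_NO_TRANS, REF_TRANS, REF_CONJ_TRANS } ref_trans_t;

/* CPU reference for y = alpha*op(A)*x + beta*y with column-major A.
   Assumes positive strides; element (i,j) of A is A[i + j*lda]. */
static void ref_zgemv(ref_trans_t trans, int m, int n, double complex alpha,
                      const double complex *A, int lda,
                      const double complex *x, int incx,
                      double complex beta, double complex *y, int incy)
{
    assert(m >= 0 && n >= 0 && lda >= (m > 1 ? m : 1));
    assert(incx > 0 && incy > 0);
    int leny = (trans == REF_NO_TRANS) ? m : n;  /* length of y */
    int lenx = (trans == REF_NO_TRANS) ? n : m;  /* length of x */
    for (int i = 0; i < leny; ++i) {
        double complex sum = 0.0;
        for (int j = 0; j < lenx; ++j) {
            double complex a;
            if (trans == REF_NO_TRANS)    a = A[i + j * lda];
            else if (trans == REF_TRANS)  a = A[j + i * lda];
            else                          a = conj(A[j + i * lda]);
            sum += a * x[j * incx];
        }
        y[i * incy] = alpha * sum + beta * y[i * incy];
    }
}
```

A reference like this is mainly useful for validating GPU results on small problems.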
void magma_zgemv_q ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy,
magma_queue_t  queue 
)

Perform matrix-vector product.

\( y = \alpha A x + \beta y \) (transA == MagmaNoTrans), or
\( y = \alpha A^T x + \beta y \) (transA == MagmaTrans), or
\( y = \alpha A^H x + \beta y \) (transA == MagmaConjTrans).

Parameters
[in] transA  Operation to perform on A.
[in] m  Number of rows of A. m >= 0.
[in] n  Number of columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,m). The m-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in] dx  COMPLEX_16 array on GPU device. If transA == MagmaNoTrans, the n element vector x of dimension (1 + (n-1)*incx); otherwise, the m element vector x of dimension (1 + (m-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] beta  Scalar \( \beta \).
[in,out] dy  COMPLEX_16 array on GPU device. If transA == MagmaNoTrans, the m element vector y of dimension (1 + (m-1)*incy); otherwise, the n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
[in] queue  Queue to execute in.
void magma_zgerc ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex_const_ptr  dy,
magma_int_t  incy,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda 
)

Perform rank-1 update, \( A = \alpha x y^H + A \).

Parameters
[in] m  Number of rows of A. m >= 0.
[in] n  Number of columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dx  COMPLEX_16 array on GPU device. The m element vector x of dimension (1 + (m-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] dy  COMPLEX_16 array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
[in,out] dA  COMPLEX_16 array on GPU device. The m-by-n matrix A of dimension (ldda,n), ldda >= max(1,m).
[in] ldda  Leading dimension of dA.
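As a sanity check on the conjugated update, the following CPU sketch applies \( A = \alpha x y^H + A \) element by element. `ref_zgerc` is a hypothetical name, not part of MAGMA; host arrays and positive strides are assumed.

```c
#include <complex.h>

/* CPU reference for the conjugated rank-1 update A = alpha*x*y^H + A.
   A is m-by-n, column-major with leading dimension lda; strides are
   assumed positive in this sketch. */
static void ref_zgerc(int m, int n, double complex alpha,
                      const double complex *x, int incx,
                      const double complex *y, int incy,
                      double complex *A, int lda)
{
    for (int j = 0; j < n; ++j) {
        double complex t = alpha * conj(y[j * incy]);  /* alpha * conj(y_j) */
        for (int i = 0; i < m; ++i)
            A[i + j * lda] += x[i * incx] * t;
    }
}
```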
void magma_zgerc_q ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex_const_ptr  dy,
magma_int_t  incy,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda,
magma_queue_t  queue 
)

Perform rank-1 update, \( A = \alpha x y^H + A \).

Parameters
[in] m  Number of rows of A. m >= 0.
[in] n  Number of columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dx  COMPLEX_16 array on GPU device. The m element vector x of dimension (1 + (m-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] dy  COMPLEX_16 array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
[in,out] dA  COMPLEX_16 array on GPU device. The m-by-n matrix A of dimension (ldda,n), ldda >= max(1,m).
[in] ldda  Leading dimension of dA.
[in] queue  Queue to execute in.
void magma_zgeru ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex_const_ptr  dy,
magma_int_t  incy,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda 
)

Perform rank-1 update (unconjugated), \( A = \alpha x y^T + A \).

Parameters
[in] m  Number of rows of A. m >= 0.
[in] n  Number of columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dx  COMPLEX_16 array on GPU device. The m element vector x of dimension (1 + (m-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] dy  COMPLEX_16 array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
[in,out] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,m). The m-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
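Because the update is unconjugated, y enters as \( y^T \) rather than \( y^H \). The CPU sketch below (hypothetical name `ref_zgeru`, host arrays, positive strides assumed) makes the difference visible: with x = (1, i) and y = (i), the second entry of the update is i*i = -1, where a conjugated update would give +1.

```c
#include <complex.h>

/* CPU reference for the unconjugated rank-1 update A = alpha*x*y^T + A.
   Column-major A with leading dimension lda; positive strides assumed. */
static void ref_zgeru(int m, int n, double complex alpha,
                      const double complex *x, int incx,
                      const double complex *y, int incy,
                      double complex *A, int lda)
{
    for (int j = 0; j < n; ++j) {
        double complex t = alpha * y[j * incy];  /* no conjugation of y_j */
        for (int i = 0; i < m; ++i)
            A[i + j * lda] += x[i * incx] * t;
    }
}
```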
void magma_zgeru_q ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex_const_ptr  dy,
magma_int_t  incy,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda,
magma_queue_t  queue 
)

Perform rank-1 update (unconjugated), \( A = \alpha x y^T + A \).

Parameters
[in] m  Number of rows of A. m >= 0.
[in] n  Number of columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dx  COMPLEX_16 array on GPU device. The m element vector x of dimension (1 + (m-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] dy  COMPLEX_16 array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
[in,out] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,m). The m-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in] queue  Queue to execute in.
void magma_zhemv ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy 
)

Perform Hermitian matrix-vector product, \( y = \alpha A x + \beta y \).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] n  Number of rows and columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in] dx  COMPLEX_16 array on GPU device. The n element vector x of dimension (1 + (n-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] beta  Scalar \( \beta \).
[in,out] dy  COMPLEX_16 array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
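Only the triangle selected by uplo is read; the other half of A is implied by \( A = A^H \). A minimal CPU sketch of that convention follows. `ref_zhemv` and the `upper` flag are hypothetical stand-ins, arrays are in host memory, and unit strides are assumed for brevity.

```c
#include <complex.h>

/* CPU reference for y = alpha*A*x + beta*y with Hermitian A, reading only
   the triangle selected by `upper` (a hypothetical stand-in for
   magma_uplo_t). Unit strides are assumed to keep the sketch short. */
static void ref_zhemv(int upper, int n, double complex alpha,
                      const double complex *A, int lda,
                      const double complex *x, double complex beta,
                      double complex *y)
{
    for (int i = 0; i < n; ++i) {
        double complex sum = 0.0;
        for (int j = 0; j < n; ++j) {
            /* Is (i,j) inside the stored triangle? */
            int stored = upper ? (i <= j) : (i >= j);
            double complex a;
            if (i == j)       a = creal(A[i + i * lda]);  /* diagonal is real */
            else if (stored)  a = A[i + j * lda];
            else              a = conj(A[j + i * lda]);   /* mirrored entry */
            sum += a * x[j];
        }
        y[i] = alpha * sum + beta * y[i];
    }
}
```

Entries outside the selected triangle may hold arbitrary values; the sketch never touches them.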
void magma_zhemv_q ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy,
magma_queue_t  queue 
)

Perform Hermitian matrix-vector product, \( y = \alpha A x + \beta y \).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] n  Number of rows and columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in] dx  COMPLEX_16 array on GPU device. The n element vector x of dimension (1 + (n-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] beta  Scalar \( \beta \).
[in,out] dy  COMPLEX_16 array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
[in] queue  Queue to execute in.
void magma_zher ( magma_uplo_t  uplo,
magma_int_t  n,
double  alpha,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda 
)

Perform Hermitian rank-1 update, \( A = \alpha x x^H + A \).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] n  Number of rows and columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dx  COMPLEX_16 array on GPU device. The n element vector x of dimension (1 + (n-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in,out] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
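Note that alpha is a real `double` here, which keeps the updated matrix Hermitian. The CPU sketch below (hypothetical `ref_zher`, host arrays, positive incx assumed) updates only the triangle selected by a hypothetical `upper` flag and leaves the other triangle untouched.

```c
#include <complex.h>

/* CPU reference for A = alpha*x*x^H + A, touching only one triangle of the
   column-major n-by-n array A. `upper` is a hypothetical stand-in for
   magma_uplo_t; incx is assumed positive and alpha is real. */
static void ref_zher(int upper, int n, double alpha,
                     const double complex *x, int incx,
                     double complex *A, int lda)
{
    for (int j = 0; j < n; ++j) {
        int first = upper ? 0 : j;      /* first stored row of column j */
        int last  = upper ? j : n - 1;  /* last stored row of column j  */
        for (int i = first; i <= last; ++i)
            A[i + j * lda] += alpha * x[i * incx] * conj(x[j * incx]);
        /* A Hermitian matrix has a real diagonal; drop any imaginary part. */
        A[j + j * lda] = creal(A[j + j * lda]);
    }
}
```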
void magma_zher2 ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex_const_ptr  dy,
magma_int_t  incy,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda 
)

Perform Hermitian rank-2 update, \( A = \alpha x y^H + conj(\alpha) y x^H + A \).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] n  Number of rows and columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dx  COMPLEX_16 array on GPU device. The n element vector x of dimension (1 + (n-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] dy  COMPLEX_16 array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
[in,out] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
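The two terms of the rank-2 update are conjugate transposes of each other, so their sum is Hermitian even for complex alpha. A CPU sketch under the same assumptions as the other references here (hypothetical `ref_zher2` and `upper` flag, host arrays, positive strides):

```c
#include <complex.h>

/* CPU reference for A = alpha*x*y^H + conj(alpha)*y*x^H + A on one
   triangle of the column-major n-by-n array A. */
static void ref_zher2(int upper, int n, double complex alpha,
                      const double complex *x, int incx,
                      const double complex *y, int incy,
                      double complex *A, int lda)
{
    for (int j = 0; j < n; ++j) {
        int first = upper ? 0 : j;      /* first stored row of column j */
        int last  = upper ? j : n - 1;  /* last stored row of column j  */
        for (int i = first; i <= last; ++i)
            A[i + j * lda] += alpha * x[i * incx] * conj(y[j * incy])
                            + conj(alpha) * y[i * incy] * conj(x[j * incx]);
        /* The diagonal of a Hermitian matrix is real. */
        A[j + j * lda] = creal(A[j + j * lda]);
    }
}
```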
void magma_zher2_q ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex_const_ptr  dy,
magma_int_t  incy,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda,
magma_queue_t  queue 
)

Perform Hermitian rank-2 update, \( A = \alpha x y^H + conj(\alpha) y x^H + A \).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] n  Number of rows and columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dx  COMPLEX_16 array on GPU device. The n element vector x of dimension (1 + (n-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] dy  COMPLEX_16 array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in] incy  Stride between consecutive elements of dy. incy != 0.
[in,out] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in] queue  Queue to execute in.
void magma_zher_q ( magma_uplo_t  uplo,
magma_int_t  n,
double  alpha,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda,
magma_queue_t  queue 
)

Perform Hermitian rank-1 update, \( A = \alpha x x^H + A \).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] n  Number of rows and columns of A. n >= 0.
[in] alpha  Scalar \( \alpha \).
[in] dx  COMPLEX_16 array on GPU device. The n element vector x of dimension (1 + (n-1)*incx).
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in,out] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in] queue  Queue to execute in.
void magma_ztrmv ( magma_uplo_t  uplo,
magma_trans_t  trans,
magma_diag_t  diag,
magma_int_t  n,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_ptr  dx,
magma_int_t  incx 
)

Perform triangular matrix-vector product.

\( x = A x \) (trans == MagmaNoTrans), or
\( x = A^T x \) (trans == MagmaTrans), or
\( x = A^H x \) (trans == MagmaConjTrans).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] trans  Operation to perform on A.
[in] diag  Whether the diagonal of A is assumed to be unit or non-unit.
[in] n  Number of rows and columns of A. n >= 0.
[in] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in,out] dx  COMPLEX_16 array on GPU device. On entry, the n element vector x of dimension (1 + (n-1)*incx). On exit, overwritten with the product.
[in] incx  Stride between consecutive elements of dx. incx != 0.
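Since the product overwrites x, the order in which rows are processed matters. The CPU sketch below covers only the trans == MagmaNoTrans case; `ref_ztrmv_notrans` and its `upper` and `unit_diag` flags are hypothetical stand-ins for the MAGMA enums, and host arrays with a positive stride are assumed.

```c
#include <complex.h>

/* CPU reference for the in-place product x = A*x with triangular,
   column-major A (no-transpose case only). */
static void ref_ztrmv_notrans(int upper, int unit_diag, int n,
                              const double complex *A, int lda,
                              double complex *x, int incx)
{
    if (upper) {
        /* Row i reads x[i..n-1], so go top to bottom: later entries are
           still the original values when they are needed. */
        for (int i = 0; i < n; ++i) {
            double complex sum = unit_diag ? x[i * incx]
                                           : A[i + i * lda] * x[i * incx];
            for (int j = i + 1; j < n; ++j)
                sum += A[i + j * lda] * x[j * incx];
            x[i * incx] = sum;
        }
    } else {
        /* Row i reads x[0..i], so go bottom to top. */
        for (int i = n - 1; i >= 0; --i) {
            double complex sum = unit_diag ? x[i * incx]
                                           : A[i + i * lda] * x[i * incx];
            for (int j = 0; j < i; ++j)
                sum += A[i + j * lda] * x[j * incx];
            x[i * incx] = sum;
        }
    }
}
```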
void magma_ztrmv_q ( magma_uplo_t  uplo,
magma_trans_t  trans,
magma_diag_t  diag,
magma_int_t  n,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_ptr  dx,
magma_int_t  incx,
magma_queue_t  queue 
)

Perform triangular matrix-vector product.

\( x = A x \) (trans == MagmaNoTrans), or
\( x = A^T x \) (trans == MagmaTrans), or
\( x = A^H x \) (trans == MagmaConjTrans).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] trans  Operation to perform on A.
[in] diag  Whether the diagonal of A is assumed to be unit or non-unit.
[in] n  Number of rows and columns of A. n >= 0.
[in] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in,out] dx  COMPLEX_16 array on GPU device. On entry, the n element vector x of dimension (1 + (n-1)*incx). On exit, overwritten with the product.
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] queue  Queue to execute in.
void magma_ztrsv ( magma_uplo_t  uplo,
magma_trans_t  trans,
magma_diag_t  diag,
magma_int_t  n,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_ptr  dx,
magma_int_t  incx 
)

Solve triangular matrix-vector system (one right-hand side).

\( A x = b \) (trans == MagmaNoTrans), or
\( A^T x = b \) (trans == MagmaTrans), or
\( A^H x = b \) (trans == MagmaConjTrans).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] trans  Operation to perform on A.
[in] diag  Whether the diagonal of A is assumed to be unit or non-unit.
[in] n  Number of rows and columns of A. n >= 0.
[in] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in,out] dx  COMPLEX_16 array on GPU device. On entry, the n element RHS vector b of dimension (1 + (n-1)*incx). On exit, overwritten with the solution vector x.
[in] incx  Stride between consecutive elements of dx. incx != 0.
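For the no-transpose case, the solve amounts to forward substitution (lower triangle) or back substitution (upper triangle), with b overwritten by x. A CPU sketch, assuming host arrays, a positive stride, and hypothetical names (`ref_ztrsv_notrans`, `upper`, `unit_diag`):

```c
#include <complex.h>

/* CPU reference for solving A*x = b in place (x holds b on entry),
   triangular column-major A, no-transpose case only. */
static void ref_ztrsv_notrans(int upper, int unit_diag, int n,
                              const double complex *A, int lda,
                              double complex *x, int incx)
{
    for (int k = 0; k < n; ++k) {
        /* Back substitution runs from the last row, forward from the first. */
        int i = upper ? n - 1 - k : k;
        double complex s = x[i * incx];
        if (upper) {
            for (int j = i + 1; j < n; ++j)
                s -= A[i + j * lda] * x[j * incx];
        } else {
            for (int j = 0; j < i; ++j)
                s -= A[i + j * lda] * x[j * incx];
        }
        /* With a unit diagonal, the stored diagonal is not read. */
        x[i * incx] = unit_diag ? s : s / A[i + i * lda];
    }
}
```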
void magma_ztrsv_q ( magma_uplo_t  uplo,
magma_trans_t  trans,
magma_diag_t  diag,
magma_int_t  n,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_ptr  dx,
magma_int_t  incx,
magma_queue_t  queue 
)

Solve triangular matrix-vector system (one right-hand side).

\( A x = b \) (trans == MagmaNoTrans), or
\( A^T x = b \) (trans == MagmaTrans), or
\( A^H x = b \) (trans == MagmaConjTrans).

Parameters
[in] uplo  Whether the upper or lower triangle of A is referenced.
[in] trans  Operation to perform on A.
[in] diag  Whether the diagonal of A is assumed to be unit or non-unit.
[in] n  Number of rows and columns of A. n >= 0.
[in] dA  COMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in] ldda  Leading dimension of dA.
[in,out] dx  COMPLEX_16 array on GPU device. On entry, the n element RHS vector b of dimension (1 + (n-1)*incx). On exit, overwritten with the solution vector x.
[in] incx  Stride between consecutive elements of dx. incx != 0.
[in] queue  Queue to execute in.
void magmablas_zgemv ( magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy 
)
void magmablas_zgemv_conj ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy 
)
void magmablas_zgemv_conj_q ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy,
magma_queue_t  queue 
)

ZGEMV_CONJ performs the matrix-vector operation.

y := alpha*A*conj(x) + beta*y,

where alpha and beta are scalars, x and y are vectors and A is an m by n matrix.

Parameters
[in] m  INTEGER. On entry, m specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, n specifies the number of columns of the matrix A.
[in] alpha  COMPLEX_16. On entry, ALPHA specifies the scalar alpha.
[in] dA  COMPLEX_16 array of dimension ( LDDA, n ) on the GPU.
[in] ldda  INTEGER. LDDA specifies the leading dimension of A.
[in] dx  COMPLEX_16 array of dimension n.
[in] incx  Specifies the increment for the elements of X. INCX must not be zero.
[in] beta  COMPLEX_16. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[out] dy  COMPLEX_16 array of dimension m.
[in] incy  Specifies the increment for the elements of Y. INCY must not be zero.
[in] queue  magma_queue_t. Queue to execute in.
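The operation differs from a plain no-transpose ZGEMV only in that x is conjugated before the product. A CPU sketch with a hypothetical name (`ref_zgemv_conj`), host arrays, and positive strides assumed:

```c
#include <complex.h>

/* CPU reference for y = alpha*A*conj(x) + beta*y with column-major,
   m-by-n A. Positive strides are assumed in this sketch. */
static void ref_zgemv_conj(int m, int n, double complex alpha,
                           const double complex *A, int lda,
                           const double complex *x, int incx,
                           double complex beta, double complex *y, int incy)
{
    for (int i = 0; i < m; ++i) {
        double complex sum = 0.0;
        for (int j = 0; j < n; ++j)
            sum += A[i + j * lda] * conj(x[j * incx]);  /* x is conjugated */
        y[i * incy] = alpha * sum + beta * y[i * incy];
    }
}
```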
void magmablas_zgemv_q ( magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy,
magma_queue_t  queue 
)

ZGEMV performs one of the matrix-vector operations.

y := alpha*A*x + beta*y, or
y := alpha*A^T*x + beta*y, or
y := alpha*A^H*x + beta*y,

where alpha and beta are scalars, x and y are vectors and A is an m by n matrix.

Parameters
[in] trans  magma_trans_t. On entry, TRANS specifies the operation to be performed as follows:
  • = MagmaNoTrans: y := alpha*A*x + beta*y
  • = MagmaTrans: y := alpha*A^T*x + beta*y
  • = MagmaConjTrans: y := alpha*A^H*x + beta*y
[in] m  INTEGER. On entry, m specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, n specifies the number of columns of the matrix A.
[in] alpha  COMPLEX_16. On entry, ALPHA specifies the scalar alpha.
[in] dA  COMPLEX_16 array of dimension ( LDDA, n ) on the GPU.
[in] ldda  INTEGER. LDDA specifies the leading dimension of A.
[in] dx  COMPLEX_16 array of dimension n if trans == MagmaNoTrans, or m if trans == MagmaTrans or MagmaConjTrans.
[in] incx  Specifies the increment for the elements of X. INCX must not be zero.
[in] beta  COMPLEX_16. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[out] dy  COMPLEX_16 array of dimension m if trans == MagmaNoTrans, or n if trans == MagmaTrans or MagmaConjTrans.
[in] incy  Specifies the increment for the elements of Y. INCY must not be zero.
[in] queue  magma_queue_t. Queue to execute in.
magma_int_t magmablas_zhemv ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy 
)

magmablas_zhemv performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n Hermitian matrix.

Parameters
[in] uplo  magma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in] n  INTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in] alpha  COMPLEX_16. On entry, ALPHA specifies the scalar alpha.
[in] dA  COMPLEX_16 array of DIMENSION ( LDDA, n ). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the Hermitian matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the Hermitian matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero.
[in] ldda  INTEGER. On entry, LDDA specifies the first dimension of A as declared in the calling (sub) program. LDDA must be at least max( 1, n ). It is recommended that ldda be a multiple of 16; otherwise performance degrades because the memory accesses are not fully coalesced.
[in] dx  COMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in] incx  INTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in] beta  COMPLEX_16. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out] dy  COMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in] incy  INTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
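The recommendation that ldda be a multiple of 16 can be met by padding the leading dimension when the matrix is allocated. The helper below is a hypothetical illustration (`padded_ldda` is not a MAGMA function): it returns the smallest multiple of 16 that is at least max(1, n).

```c
/* Smallest multiple of 16 that is >= max(1, n); a hypothetical helper for
   choosing a padded leading dimension before allocating the matrix. */
static int padded_ldda(int n)
{
    int ld = (n > 1) ? n : 1;
    return ((ld + 15) / 16) * 16;
}
```

For example, a 100-by-100 matrix would be stored with ldda = 112, leaving 12 unused rows at the bottom of each column.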
magma_int_t magmablas_zhemv_mgpu ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr const  d_lA[],
magma_int_t  ldda,
magma_int_t  offset,
magmaDoubleComplex const *  x,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex *  y,
magma_int_t  incy,
magmaDoubleComplex *  hwork,
magma_int_t  lhwork,
magmaDoubleComplex_ptr  dwork[],
magma_int_t  ldwork,
magma_int_t  ngpu,
magma_int_t  nb,
magma_queue_t  queues[] 
)

magmablas_zhemv_mgpu performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n Hermitian matrix.
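The d_lA parameter described under Parameters expects A in a block-column distribution across the GPUs, where device dev holds nlocal columns. The sketch below computes nlocal under the assumption that the layout is the usual 1-D block-cyclic one, so that the device owning the last, partial block receives n % nb extra columns. `nlocal_columns` is a hypothetical helper, not part of MAGMA.

```c
/* Number of columns of the n-column matrix stored on device `dev`, for a
   1-D block-cyclic column distribution with block size nb over ngpu
   devices (assumed layout; see the lead-in). */
static int nlocal_columns(int n, int nb, int ngpu, int dev)
{
    int full_blocks = n / nb;                /* floor(n/nb)            */
    int base  = (full_blocks / ngpu) * nb;   /* floor(n/nb/ngpu)*nb    */
    int owner = full_blocks % ngpu;          /* device of partial block */
    if (dev < owner)  return base + nb;      /* one extra full block   */
    if (dev == owner) return base + n % nb;  /* the partial block      */
    return base;
}
```

Summing the result over all devices gives back n, which is a quick way to check a chosen n, nb, and ngpu.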

Parameters
[in]uplomagma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced. Not currently supported.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in]nINTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in]alphaCOMPLEX_16. On entry, ALPHA specifies the scalar alpha.
[in]d_lAArray of pointers, dimension (ngpu), to block-column distributed matrix A, with block size nb. d_lA[dev] is a COMPLEX_16 array on GPU dev, of dimension (LDDA, nlocal), where
{ floor(n/nb/ngpu)*nb + nb if dev < floor(n/nb) % ngpu, nlocal = { floor(n/nb/ngpu)*nb + nnb if dev == floor(n/nb) % ngpu, { floor(n/nb/ngpu)*nb otherwise.
Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the Hermitian matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the Hermitian matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero.
[in]offsetINTEGER. Row & column offset to start of matrix A within the distributed d_lA structure. Note that N is the size of this multiply, excluding the offset, so the size of the original parent matrix is N+offset. Also, x and y do not have an offset.
[in]lddaINTEGER. On entry, LDDA specifies the first dimension of A as declared in the calling (sub) program. LDDA must be at least max( 1, n + offset ). It is recommended that LDDA be a multiple of 16; otherwise performance degrades because memory accesses are not fully coalesced.
[in]xCOMPLEX_16 array on the CPU (not the GPU), of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in]incxINTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in]betaCOMPLEX_16. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out]yCOMPLEX_16 array on the CPU (not the GPU), of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in]incyINTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
hwork(workspace) COMPLEX_16 array on the CPU, of dimension (lhwork).
[in]lhworkINTEGER. The dimension of the array hwork. lhwork >= ngpu*nb.
dwork(workspaces) Array of pointers, dimension (ngpu), to workspace on each GPU. dwork[dev] is a COMPLEX_16 array on GPU dev, of dimension (ldwork).
[in]ldworkINTEGER. The dimension of each array dwork[dev]. ldwork >= ldda*( ceil((n + offset % nb) / nb) + 1 ).
[in]ngpuINTEGER. The number of GPUs to use.
[in]nbINTEGER. The block size used for distributing d_lA. Must be 64.
[in]queuesmagma_queue_t array of dimension (ngpu). queues[dev] is an execution queue on GPU dev.
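
The sizing rules above can be sketched in plain C. This is a minimal host-side illustration, not part of MAGMA: the helper names local_cols, min_lhwork and min_ldwork are hypothetical, and the middle case of nlocal is assumed to add the remainder n % nb.

```c
#include <assert.h>

/* Hypothetical helper: number of columns of A held by GPU `dev` under the
 * block-column distribution described for d_lA (block size nb, ngpu devices). */
static int local_cols(int n, int nb, int ngpu, int dev)
{
    assert(n >= 0 && nb > 0 && ngpu > 0 && dev >= 0 && dev < ngpu);
    int nblocks = n / nb;                 /* floor(n/nb): number of full blocks */
    int base    = (nblocks / ngpu) * nb;  /* floor(n/nb/ngpu)*nb                */
    int owner   = nblocks % ngpu;         /* device that gets the partial block */
    if (dev < owner)  return base + nb;
    if (dev == owner) return base + n % nb;  /* assumption: remainder is n % nb */
    return base;
}

/* Hypothetical helper: smallest lhwork allowed by "lhwork >= ngpu*nb". */
static int min_lhwork(int ngpu, int nb)
{
    return ngpu * nb;
}

/* Hypothetical helper: smallest ldwork allowed by
 * "ldwork >= ldda*( ceil((n + offset % nb) / nb) + 1 )". */
static int min_ldwork(int n, int offset, int ldda, int nb)
{
    int span = n + offset % nb;
    return ldda * ((span + nb - 1) / nb + 1);  /* integer ceiling division */
}
```

For example, with n = 1000, nb = 64 and ngpu = 3, the three devices hold 360, 320 and 320 columns, which sum to n.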
magma_int_t magmablas_zhemv_mgpu_sync ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr const  d_lA[],
magma_int_t  ldda,
magma_int_t  offset,
magmaDoubleComplex const *  x,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex *  y,
magma_int_t  incy,
magmaDoubleComplex *  hwork,
magma_int_t  lhwork,
magmaDoubleComplex_ptr  dwork[],
magma_int_t  ldwork,
magma_int_t  ngpu,
magma_int_t  nb,
magma_queue_t  queues[] 
)

Synchronizes and accumulates the final zhemv result.

For convenience, the parameters are identical to magmablas_zhemv_mgpu (though some are unused here).

See also
magmablas_zhemv_mgpu
magma_int_t magmablas_zhemv_work ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy,
magmaDoubleComplex_ptr  dwork,
magma_int_t  lwork,
magma_queue_t  queue 
)

magmablas_zhemv_work performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n Hermitian matrix.

Parameters
[in]uplomagma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in]nINTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in]alphaCOMPLEX_16. On entry, ALPHA specifies the scalar alpha.
[in]dACOMPLEX_16 array of DIMENSION ( LDDA, n ). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the Hermitian matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the Hermitian matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero.
[in]lddaINTEGER. On entry, LDDA specifies the first dimension of A as declared in the calling (sub) program. LDDA must be at least max( 1, n ). It is recommended that LDDA be a multiple of 16; otherwise performance degrades because memory accesses are not fully coalesced.
[in]dxCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in]incxINTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in]betaCOMPLEX_16. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out]dyCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in]incyINTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
[in]dwork(workspace) COMPLEX_16 array on the GPU, dimension (MAX(1, LWORK)).
[in]lworkINTEGER. The dimension of the array DWORK. LWORK >= LDDA * ceil( N / NB_X ), where NB_X = 64.
[in]queuemagma_queue_t. Queue to execute in.

MAGMA implements zhemv through two steps: 1) perform the multiplication in each thread block and put the intermediate value in dwork. 2) sum the intermediate values and store the final result in y.

magmablas_zhemv_work requires the caller to provide a workspace, while magmablas_zhemv is a wrapper routine that allocates the workspace internally and provides the same interface as cuBLAS.

If zhemv is called frequently, we suggest using magmablas_zhemv_work instead of magmablas_zhemv, because the overhead of allocating and freeing device memory inside magmablas_zhemv hurts performance. Our tests show that this penalty is about 10 Gflop/s when the matrix size is around 10000.
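
The two-step scheme can be emulated on the CPU to make the role of the workspace concrete. The sketch below is an illustration only and is not MAGMA's kernel: the names herm_at and hemv_two_step are hypothetical, only the lower triangle is read, unit increments are assumed, and each "thread block" is modeled as one panel of NB_X columns.

```c
#include <assert.h>
#include <complex.h>

#define NB_X 64  /* block width quoted in the lwork requirement */

/* Element (i,j) of Hermitian A, reading only the lower triangle (column-major). */
static double complex herm_at(const double complex *A, int ldda, int i, int j)
{
    if (i == j) return creal(A[i + j * ldda]);  /* imaginary part of diagonal ignored */
    return (i > j) ? A[i + j * ldda] : conj(A[j + i * ldda]);
}

/* y := alpha*A*x + beta*y using per-block partial sums stored in work.
 * work must hold at least ldda * ceil(n / NB_X) elements. */
static void hemv_two_step(int n, double complex alpha,
                          const double complex *A, int ldda,
                          const double complex *x, double complex beta,
                          double complex *y, double complex *work, int lwork)
{
    int nblocks = (n + NB_X - 1) / NB_X;
    assert(n >= 0 && ldda >= (n > 1 ? n : 1) && lwork >= ldda * nblocks);

    /* Step 1: block b multiplies its column panel by its slice of x
     * and stores the partial result in column b of work. */
    for (int b = 0; b < nblocks; ++b) {
        int j0 = b * NB_X;
        int j1 = (j0 + NB_X < n) ? j0 + NB_X : n;
        for (int i = 0; i < n; ++i) {
            double complex s = 0.0;
            for (int j = j0; j < j1; ++j)
                s += herm_at(A, ldda, i, j) * x[j];
            work[i + b * ldda] = s;
        }
    }

    /* Step 2: sum the partial results and apply alpha and beta.
     * y is not read when beta is zero. */
    for (int i = 0; i < n; ++i) {
        double complex s = 0.0;
        for (int b = 0; b < nblocks; ++b)
            s += work[i + b * ldda];
        y[i] = alpha * s + (beta == 0.0 ? 0.0 : beta * y[i]);
    }
}
```

Because step 1 writes one length-n partial vector per block with stride ldda, the workspace needs ldda * ceil(n / NB_X) elements, which matches the lwork bound in the parameter list.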

void magmablas_zswapblk ( magma_order_t  order,
magma_int_t  n,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_ptr  dB,
magma_int_t  lddb,
magma_int_t  i1,
magma_int_t  i2,
const magma_int_t *  ipiv,
magma_int_t  inci,
magma_int_t  offset 
)
magma_int_t magmablas_zsymv ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy 
)

magmablas_zsymv performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n complex symmetric matrix.

Parameters
[in]uplomagma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in]nINTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in]alphaCOMPLEX_16. On entry, ALPHA specifies the scalar alpha.
[in]dACOMPLEX_16 array of DIMENSION ( LDDA, n ). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero.
[in]lddaINTEGER. On entry, LDDA specifies the first dimension of A as declared in the calling (sub) program. LDDA must be at least max( 1, n ). It is recommended that LDDA be a multiple of 16; otherwise performance degrades because memory accesses are not fully coalesced.
[in]dxCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in]incxINTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in]betaCOMPLEX_16. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out]dyCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in]incyINTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
magma_int_t magmablas_zsymv_work ( magma_uplo_t  uplo,
magma_int_t  n,
magmaDoubleComplex  alpha,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_const_ptr  dx,
magma_int_t  incx,
magmaDoubleComplex  beta,
magmaDoubleComplex_ptr  dy,
magma_int_t  incy,
magmaDoubleComplex_ptr  dwork,
magma_int_t  lwork,
magma_queue_t  queue 
)

magmablas_zsymv_work performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n complex symmetric matrix.

Parameters
[in]uplomagma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in]nINTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in]alphaCOMPLEX_16. On entry, ALPHA specifies the scalar alpha.
[in]dACOMPLEX_16 array of DIMENSION ( LDDA, n ). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero.
[in]lddaINTEGER. On entry, LDDA specifies the first dimension of A as declared in the calling (sub) program. LDDA must be at least max( 1, n ). It is recommended that LDDA be a multiple of 16; otherwise performance degrades because memory accesses are not fully coalesced.
[in]dxCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in]incxINTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in]betaCOMPLEX_16. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out]dyCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in]incyINTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
[in]dwork(workspace) COMPLEX_16 array on the GPU, dimension (MAX(1, LWORK)).
[in]lworkINTEGER. The dimension of the array DWORK. LWORK >= LDDA * ceil( N / NB_X ), where NB_X = 64.
[in]queuemagma_queue_t. Queue to execute in.

MAGMA implements zsymv through two steps: 1) perform the multiplication in each thread block and put the intermediate value in dwork. 2) sum the intermediate values and store the final result in y.

magmablas_zsymv_work requires the caller to provide a workspace, while magmablas_zsymv is a wrapper routine that allocates the workspace internally and provides the same interface as cuBLAS.

If zsymv is called frequently, we suggest using magmablas_zsymv_work instead of magmablas_zsymv, because the overhead of allocating and freeing device memory inside magmablas_zsymv hurts performance. Our tests show that this penalty is about 10 Gflop/s when the matrix size is around 10000.
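
When the workspace is allocated once and reused across calls, its size follows from the LDDA and LWORK rules in the parameter list. The helpers below are a hedged sketch in plain C; padded_ldda and min_lwork are hypothetical names, not MAGMA routines.

```c
#include <assert.h>

#define NB_X 64  /* block width quoted in the lwork requirement */

/* Hypothetical helper: smallest multiple of 16 that is >= max(1, n),
 * following the recommendation that ldda be a multiple of 16. */
static int padded_ldda(int n)
{
    assert(n >= 0);
    int m = (n > 1) ? n : 1;
    return ((m + 15) / 16) * 16;
}

/* Hypothetical helper: smallest lwork allowed by
 * "LWORK >= LDDA * ceil( N / NB_X )". Returned as long long to avoid
 * overflow for large matrices. */
static long long min_lwork(int n, int ldda)
{
    assert(n >= 0 && ldda >= (n > 1 ? n : 1));
    return (long long)ldda * ((n + NB_X - 1) / NB_X);
}
```

For n = 10000 this gives ldda = 10000 and lwork = 1570000 elements, so a caller that invokes the routine repeatedly pays for that allocation once instead of on every call.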

void magmablas_ztrsv ( magma_uplo_t  uplo,
magma_trans_t  trans,
magma_diag_t  diag,
magma_int_t  n,
magmaDoubleComplex_const_ptr  dA,
magma_int_t  ldda,
magmaDoubleComplex_ptr  db,
magma_int_t  incb,
magma_queue_t  queue 
)

magmablas_ztrsv solves the triangular system on the GPU:

op(A)*x = b,

where x and b are n element vectors, A is an n by n unit or non-unit, upper or lower triangular matrix, and op(A) is one of

op(A) = A,    or
op(A) = A^T,  or
op(A) = A^H.

On exit, the solution vector x overwrites b.

Parameters
[in]uplomagma_uplo_t. On entry, uplo specifies whether the matrix A is an upper or lower triangular matrix as follows:
  • = MagmaUpper: A is an upper triangular matrix.
  • = MagmaLower: A is a lower triangular matrix.
[in]transmagma_trans_t. On entry, trans specifies the form of op(A) to be used in the matrix multiplication as follows:
  • = MagmaNoTrans: op(A) = A.
  • = MagmaTrans: op(A) = A^T.
  • = MagmaConjTrans: op(A) = A^H.
[in]diagmagma_diag_t. On entry, diag specifies whether or not A is unit triangular as follows:
  • = MagmaUnit: A is assumed to be unit triangular.
  • = MagmaNonUnit: A is not assumed to be unit triangular.
[in]nINTEGER. On entry, n specifies the order of the matrix A. n >= 0.
[in]dACOMPLEX_16 array of dimension ( ldda, n ). Before entry with uplo = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular matrix and the strictly lower triangular part of A is not referenced. Before entry with uplo = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular matrix and the strictly upper triangular part of A is not referenced. Note that when diag = MagmaUnit, the diagonal elements of A are not referenced either, but are assumed to be unity.
[in]lddaINTEGER. On entry, ldda specifies the first dimension of A. ldda >= max( 1, n ).
[in,out]dbCOMPLEX_16 array of dimension n. On entry, b holds the right-hand side vector. On exit, b is overwritten with the solution vector x.
[in]incbINTEGER. On entry, incb specifies the increment for the elements of b. incb must not be zero. Unchanged on exit.
[in]queuemagma_queue_t. Queue to execute in.
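
As a reference for the in-place semantics, the following CPU sketch performs forward substitution for one of the cases above (uplo = MagmaLower, trans = MagmaNoTrans). It is an illustration under stated assumptions, not MAGMA's GPU implementation: the name trsv_lower_notrans_ref is hypothetical, A is column-major, and incb is assumed positive.

```c
#include <assert.h>
#include <complex.h>
#include <stdbool.h>

/* Solve A*x = b in place for lower triangular A (no transpose).
 * On entry b holds the right-hand side; on exit it holds x. */
static void trsv_lower_notrans_ref(bool unit_diag, int n,
                                   const double complex *A, int ldda,
                                   double complex *b, int incb)
{
    assert(n >= 0 && ldda >= (n > 1 ? n : 1) && incb > 0);
    for (int i = 0; i < n; ++i) {
        double complex s = b[i * incb];
        /* Entries b[0..i-1] already hold the computed x[0..i-1]. */
        for (int j = 0; j < i; ++j)
            s -= A[i + j * ldda] * b[j * incb];
        /* With a unit diagonal, A(i,i) is not referenced. */
        b[i * incb] = unit_diag ? s : s / A[i + i * ldda];
    }
}
```

Only the lower triangle of A is read, and the diagonal is skipped entirely in the unit case, mirroring the description of dA and diag.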
void magmablas_ztrsv_batched ( magma_uplo_t  uplo,
magma_trans_t  trans,
magma_diag_t  diag,
magma_int_t  n,
magmaDoubleComplex **  A_array,
magma_int_t  lda,
magmaDoubleComplex **  b_array,
magma_int_t  incb,
magma_int_t  batchCount,
magma_queue_t  queue 
)

magmablas_ztrsv_batched solves, for each of the batchCount problems, the triangular system on the GPU:

op(A)*x = b,

where x and b are n element vectors, A is an n by n unit or non-unit, upper or lower triangular matrix, and op(A) is one of

op(A) = A,    or
op(A) = A^T,  or
op(A) = A^H.

On exit, each solution vector x overwrites the corresponding b.

Parameters
[in]uplomagma_uplo_t. On entry, uplo specifies whether the matrix A is an upper or lower triangular matrix as follows:
  • = MagmaUpper: A is an upper triangular matrix.
  • = MagmaLower: A is a lower triangular matrix.
[in]transmagma_trans_t. On entry, trans specifies the form of op(A) to be used in the matrix multiplication as follows:
  • = MagmaNoTrans: op(A) = A.
  • = MagmaTrans: op(A) = A^T.
  • = MagmaConjTrans: op(A) = A^H.
[in]diagmagma_diag_t. On entry, diag specifies whether or not A is unit triangular as follows:
  • = MagmaUnit: A is assumed to be unit triangular.
  • = MagmaNonUnit: A is not assumed to be unit triangular.
[in]nINTEGER. On entry, n specifies the order of the matrix A. n >= 0.
[in]A_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array A of dimension ( lda, n ). Before entry with uplo = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular matrix and the strictly lower triangular part of A is not referenced. Before entry with uplo = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular matrix and the strictly upper triangular part of A is not referenced. Note that when diag = MagmaUnit, the diagonal elements of A are not referenced either, but are assumed to be unity.
[in]ldaINTEGER. On entry, lda specifies the first dimension of A. lda >= max( 1, n ).
[in,out]b_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array b of dimension n. On entry, b holds the right-hand side vector. On exit, b is overwritten with the solution vector x.
[in]incbINTEGER. On entry, incb specifies the increment for the elements of b. incb must not be zero. Unchanged on exit.
[in]batchCountINTEGER. The number of matrices to operate on.
[in]queuemagma_queue_t. Queue to execute in.
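
One common way to build A_array and b_array is to store all problems contiguously and point at each one with a fixed stride. The sketch below shows only that pointer arithmetic in plain C; set_batch_pointers is a hypothetical helper, and whether the pointer arrays themselves must reside in device memory is not stated above, so check the library before relying on this layout.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper: fill ptrs[k] = base + k*stride for k in [0, batchCount).
 * For matrices stored back to back, stride would be lda*n; for vectors
 * with unit increment, stride would be n. */
static void set_batch_pointers(double complex *base, size_t stride,
                               double complex **ptrs, int batchCount)
{
    assert(base != NULL && ptrs != NULL && batchCount >= 0);
    for (int k = 0; k < batchCount; ++k)
        ptrs[k] = base + (size_t)k * stride;
}
```

Each entry then addresses one lda by n matrix (or one length-n vector), matching the per-problem dimensions given for A_array and b_array.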