MAGMA 2.9.0
Matrix Algebra for GPU and Multicore Architectures
Loading...
Searching...
No Matches

\(y = \alpha Ax + \beta y\) More...

Functions

void magma_csymv (magma_uplo_t uplo, magma_int_t n, magmaFloatComplex alpha, magmaFloatComplex_const_ptr dA, magma_int_t ldda, magmaFloatComplex_const_ptr dx, magma_int_t incx, magmaFloatComplex beta, magmaFloatComplex_ptr dy, magma_int_t incy, magma_queue_t queue)
 Perform symmetric matrix-vector product, \( y = \alpha A x + \beta y, \) where \( A \) is symmetric.
 
void magma_dsymv (magma_uplo_t uplo, magma_int_t n, double alpha, magmaDouble_const_ptr dA, magma_int_t ldda, magmaDouble_const_ptr dx, magma_int_t incx, double beta, magmaDouble_ptr dy, magma_int_t incy, magma_queue_t queue)
 Perform symmetric matrix-vector product, \( y = \alpha A x + \beta y, \) where \( A \) is symmetric.
 
void magma_ssymv (magma_uplo_t uplo, magma_int_t n, float alpha, magmaFloat_const_ptr dA, magma_int_t ldda, magmaFloat_const_ptr dx, magma_int_t incx, float beta, magmaFloat_ptr dy, magma_int_t incy, magma_queue_t queue)
 Perform symmetric matrix-vector product, \( y = \alpha A x + \beta y, \) where \( A \) is symmetric.
 
void magma_zsymv (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy, magma_queue_t queue)
 Perform symmetric matrix-vector product, \( y = \alpha A x + \beta y, \) where \( A \) is symmetric.
 
magma_int_t magmablas_csymv_work (magma_uplo_t uplo, magma_int_t n, magmaFloatComplex alpha, magmaFloatComplex_const_ptr dA, magma_int_t ldda, magmaFloatComplex_const_ptr dx, magma_int_t incx, magmaFloatComplex beta, magmaFloatComplex_ptr dy, magma_int_t incy, magmaFloatComplex_ptr dwork, magma_int_t lwork, magma_queue_t queue)
 magmablas_csymv_work performs the matrix-vector operation:
 
magma_int_t magmablas_csymv (magma_uplo_t uplo, magma_int_t n, magmaFloatComplex alpha, magmaFloatComplex_const_ptr dA, magma_int_t ldda, magmaFloatComplex_const_ptr dx, magma_int_t incx, magmaFloatComplex beta, magmaFloatComplex_ptr dy, magma_int_t incy, magma_queue_t queue)
 magmablas_csymv performs the matrix-vector operation:
 
magma_int_t magmablas_zsymv_work (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy, magmaDoubleComplex_ptr dwork, magma_int_t lwork, magma_queue_t queue)
 magmablas_zsymv_work performs the matrix-vector operation:
 
magma_int_t magmablas_zsymv (magma_uplo_t uplo, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex_const_ptr dx, magma_int_t incx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t incy, magma_queue_t queue)
 magmablas_zsymv performs the matrix-vector operation:
 

Detailed Description

\(y = \alpha Ax + \beta y\)

Function Documentation

◆ magma_csymv()

void magma_csymv ( magma_uplo_t uplo,
magma_int_t n,
magmaFloatComplex alpha,
magmaFloatComplex_const_ptr dA,
magma_int_t ldda,
magmaFloatComplex_const_ptr dx,
magma_int_t incx,
magmaFloatComplex beta,
magmaFloatComplex_ptr dy,
magma_int_t incy,
magma_queue_t queue )

Perform symmetric matrix-vector product, \( y = \alpha A x + \beta y, \) where \( A \) is symmetric.

Parameters
[in]uploWhether the upper or lower triangle of A is referenced.
[in]nNumber of rows and columns of A. n >= 0.
[in]alphaScalar \( \alpha \)
[in]dACOMPLEX array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in]lddaLeading dimension of dA.
[in]dxCOMPLEX array on GPU device. The m element vector x of dimension (1 + (m-1)*incx).
[in]incxStride between consecutive elements of dx. incx != 0.
[in]betaScalar \( \beta \)
[in,out]dyCOMPLEX array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in]incyStride between consecutive elements of dy. incy != 0.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_dsymv()

void magma_dsymv ( magma_uplo_t uplo,
magma_int_t n,
double alpha,
magmaDouble_const_ptr dA,
magma_int_t ldda,
magmaDouble_const_ptr dx,
magma_int_t incx,
double beta,
magmaDouble_ptr dy,
magma_int_t incy,
magma_queue_t queue )

Perform symmetric matrix-vector product, \( y = \alpha A x + \beta y, \) where \( A \) is symmetric.

Parameters
[in]uploWhether the upper or lower triangle of A is referenced.
[in]nNumber of rows and columns of A. n >= 0.
[in]alphaScalar \( \alpha \)
[in]dADOUBLE PRECISION array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in]lddaLeading dimension of dA.
[in]dxDOUBLE PRECISION array on GPU device. The m element vector x of dimension (1 + (m-1)*incx).
[in]incxStride between consecutive elements of dx. incx != 0.
[in]betaScalar \( \beta \)
[in,out]dyDOUBLE PRECISION array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in]incyStride between consecutive elements of dy. incy != 0.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_ssymv()

void magma_ssymv ( magma_uplo_t uplo,
magma_int_t n,
float alpha,
magmaFloat_const_ptr dA,
magma_int_t ldda,
magmaFloat_const_ptr dx,
magma_int_t incx,
float beta,
magmaFloat_ptr dy,
magma_int_t incy,
magma_queue_t queue )

Perform symmetric matrix-vector product, \( y = \alpha A x + \beta y, \) where \( A \) is symmetric.

Parameters
[in]uploWhether the upper or lower triangle of A is referenced.
[in]nNumber of rows and columns of A. n >= 0.
[in]alphaScalar \( \alpha \)
[in]dAREAL array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in]lddaLeading dimension of dA.
[in]dxREAL array on GPU device. The m element vector x of dimension (1 + (m-1)*incx).
[in]incxStride between consecutive elements of dx. incx != 0.
[in]betaScalar \( \beta \)
[in,out]dyREAL array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in]incyStride between consecutive elements of dy. incy != 0.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_zsymv()

void magma_zsymv ( magma_uplo_t uplo,
magma_int_t n,
magmaDoubleComplex alpha,
magmaDoubleComplex_const_ptr dA,
magma_int_t ldda,
magmaDoubleComplex_const_ptr dx,
magma_int_t incx,
magmaDoubleComplex beta,
magmaDoubleComplex_ptr dy,
magma_int_t incy,
magma_queue_t queue )

Perform symmetric matrix-vector product, \( y = \alpha A x + \beta y, \) where \( A \) is symmetric.

Parameters
[in]uploWhether the upper or lower triangle of A is referenced.
[in]nNumber of rows and columns of A. n >= 0.
[in]alphaScalar \( \alpha \)
[in]dACOMPLEX_16 array of dimension (ldda,n), ldda >= max(1,n). The n-by-n matrix A, on GPU device.
[in]lddaLeading dimension of dA.
[in]dxCOMPLEX_16 array on GPU device. The m element vector x of dimension (1 + (m-1)*incx).
[in]incxStride between consecutive elements of dx. incx != 0.
[in]betaScalar \( \beta \)
[in,out]dyCOMPLEX_16 array on GPU device. The n element vector y of dimension (1 + (n-1)*incy).
[in]incyStride between consecutive elements of dy. incy != 0.
[in]queuemagma_queue_t Queue to execute in.

◆ magmablas_csymv_work()

magma_int_t magmablas_csymv_work ( magma_uplo_t uplo,
magma_int_t n,
magmaFloatComplex alpha,
magmaFloatComplex_const_ptr dA,
magma_int_t ldda,
magmaFloatComplex_const_ptr dx,
magma_int_t incx,
magmaFloatComplex beta,
magmaFloatComplex_ptr dy,
magma_int_t incy,
magmaFloatComplex_ptr dwork,
magma_int_t lwork,
magma_queue_t queue )

magmablas_csymv_work performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n complex symmetric matrix.

Parameters
[in]uplomagma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in]nINTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in]alphaCOMPLEX. On entry, ALPHA specifies the scalar alpha.
[in]dACOMPLEX array of DIMENSION ( LDDA, n ). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero.
[in]lddaINTEGER. On entry, LDDA specifies the first dimension of A as declared in the calling (sub) program. LDDA must be at least max( 1, n ). It is recommended that ldda is multiple of 16. Otherwise performance would be deteriorated as the memory accesses would not be fully coalescent.
[in]dxCOMPLEX array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in]incxINTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in]betaCOMPLEX. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out]dyCOMPLEX array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in]incyINTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
[in]dwork(workspace) COMPLEX array on the GPU, dimension (MAX(1, LWORK)),
[in]lworkINTEGER. The dimension of the array DWORK. LWORK >= LDDA * ceil( N / NB_X ), where NB_X = 64.
[in]queuemagma_queue_t. Queue to execute in.

MAGMA implements csymv through two steps: 1) perform the multiplication in each thread block and put the intermediate value in dwork. 2) sum the intermediate values and store the final result in y.

magamblas_csymv_work requires users to provide a workspace, while magmablas_csymv is a wrapper routine allocating the workspace inside the routine and provides the same interface as cublas.

If users need to call csymv frequently, we suggest using magmablas_csymv_work instead of magmablas_csymv. As the overhead to allocate and free in device memory in magmablas_csymv would hurt performance. Our tests show that this penalty is about 10 Gflop/s when the matrix size is around 10000.

◆ magmablas_csymv()

magma_int_t magmablas_csymv ( magma_uplo_t uplo,
magma_int_t n,
magmaFloatComplex alpha,
magmaFloatComplex_const_ptr dA,
magma_int_t ldda,
magmaFloatComplex_const_ptr dx,
magma_int_t incx,
magmaFloatComplex beta,
magmaFloatComplex_ptr dy,
magma_int_t incy,
magma_queue_t queue )

magmablas_csymv performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n complex symmetric matrix.

Parameters
[in]uplomagma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in]nINTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in]alphaCOMPLEX. On entry, ALPHA specifies the scalar alpha.
[in]dACOMPLEX array of DIMENSION ( LDDA, n ). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero.
[in]lddaINTEGER. On entry, LDDA specifies the first dimension of A as declared in the calling (sub) program. LDDA must be at least max( 1, n ). It is recommended that ldda is multiple of 16. Otherwise performance would be deteriorated as the memory accesses would not be fully coalescent.
[in]dxCOMPLEX array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in]incxINTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in]betaCOMPLEX. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out]dyCOMPLEX array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in]incyINTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
[in]queuemagma_queue_t Queue to execute in.

◆ magmablas_zsymv_work()

magma_int_t magmablas_zsymv_work ( magma_uplo_t uplo,
magma_int_t n,
magmaDoubleComplex alpha,
magmaDoubleComplex_const_ptr dA,
magma_int_t ldda,
magmaDoubleComplex_const_ptr dx,
magma_int_t incx,
magmaDoubleComplex beta,
magmaDoubleComplex_ptr dy,
magma_int_t incy,
magmaDoubleComplex_ptr dwork,
magma_int_t lwork,
magma_queue_t queue )

magmablas_zsymv_work performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n complex symmetric matrix.

Parameters
[in]uplomagma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in]nINTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in]alphaCOMPLEX_16. On entry, ALPHA specifies the scalar alpha.
[in]dACOMPLEX_16 array of DIMENSION ( LDDA, n ). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero.
[in]lddaINTEGER. On entry, LDDA specifies the first dimension of A as declared in the calling (sub) program. LDDA must be at least max( 1, n ). It is recommended that ldda is multiple of 16. Otherwise performance would be deteriorated as the memory accesses would not be fully coalescent.
[in]dxCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in]incxINTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in]betaCOMPLEX_16. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out]dyCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in]incyINTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
[in]dwork(workspace) COMPLEX_16 array on the GPU, dimension (MAX(1, LWORK)),
[in]lworkINTEGER. The dimension of the array DWORK. LWORK >= LDDA * ceil( N / NB_X ), where NB_X = 64.
[in]queuemagma_queue_t. Queue to execute in.

MAGMA implements zsymv through two steps: 1) perform the multiplication in each thread block and put the intermediate value in dwork. 2) sum the intermediate values and store the final result in y.

magamblas_zsymv_work requires users to provide a workspace, while magmablas_zsymv is a wrapper routine allocating the workspace inside the routine and provides the same interface as cublas.

If users need to call zsymv frequently, we suggest using magmablas_zsymv_work instead of magmablas_zsymv. As the overhead to allocate and free in device memory in magmablas_zsymv would hurt performance. Our tests show that this penalty is about 10 Gflop/s when the matrix size is around 10000.

◆ magmablas_zsymv()

magma_int_t magmablas_zsymv ( magma_uplo_t uplo,
magma_int_t n,
magmaDoubleComplex alpha,
magmaDoubleComplex_const_ptr dA,
magma_int_t ldda,
magmaDoubleComplex_const_ptr dx,
magma_int_t incx,
magmaDoubleComplex beta,
magmaDoubleComplex_ptr dy,
magma_int_t incy,
magma_queue_t queue )

magmablas_zsymv performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n complex symmetric matrix.

Parameters
[in]uplomagma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in]nINTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in]alphaCOMPLEX_16. On entry, ALPHA specifies the scalar alpha.
[in]dACOMPLEX_16 array of DIMENSION ( LDDA, n ). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero.
[in]lddaINTEGER. On entry, LDDA specifies the first dimension of A as declared in the calling (sub) program. LDDA must be at least max( 1, n ). It is recommended that ldda is multiple of 16. Otherwise performance would be deteriorated as the memory accesses would not be fully coalescent.
[in]dxCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in]incxINTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in]betaCOMPLEX_16. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out]dyCOMPLEX_16 array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in]incyINTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
[in]queuemagma_queue_t Queue to execute in.