MAGMA 2.9.0
Matrix Algebra for GPU and Multicore Architectures
Functions | |
magma_int_t | magma_zcustomspmv (magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex beta, magmaDoubleComplex *x, magmaDoubleComplex *y, magma_queue_t queue) |
This is an interface to any custom sparse matrix vector product. | |
magma_int_t | magma_dsgecsrmv_mixed_prec (magma_trans_t transA, magma_int_t m, magma_int_t n, double alpha, magmaDouble_ptr ddiagval, magmaFloat_ptr doffdiagval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_z_spmv (magmaDoubleComplex alpha, magma_z_matrix A, magma_z_matrix x, magmaDoubleComplex beta, magma_z_matrix y, magma_queue_t queue) |
For a given input matrix A and vectors x, y and scalars alpha, beta the wrapper determines the suitable SpMV computing y = alpha * A * x + beta * y. | |
magma_int_t | magma_z_spmv_shift (magmaDoubleComplex alpha, magma_z_matrix A, magmaDoubleComplex lambda, magma_z_matrix x, magmaDoubleComplex beta, magma_int_t offset, magma_int_t blocksize, magma_index_t *add_rows, magma_z_matrix y, magma_queue_t queue) |
For a given input matrix A and vectors x, y and scalars alpha, beta the wrapper determines the suitable SpMV computing y = alpha * ( A - lambda I ) * x + beta * y. | |
magma_int_t | magma_z_spmm (magmaDoubleComplex alpha, magma_z_matrix A, magma_z_matrix B, magma_z_matrix *C, magma_queue_t queue) |
For given input matrices A and B and scalar alpha, the wrapper determines the suitable SpMM computing C = alpha * A * B. | |
magma_int_t | magma_zcuspaxpy (magmaDoubleComplex *alpha, magma_z_matrix A, magmaDoubleComplex *beta, magma_z_matrix B, magma_z_matrix *AB, magma_queue_t queue) |
This is an interface to the cuSPARSE routine csrgeam computing the sum of two sparse matrices stored in CSR format: AB = alpha * A + beta * B. | |
magma_int_t | magma_zcuspmm (magma_z_matrix A, magma_z_matrix B, magma_z_matrix *AB, magma_queue_t queue) |
This is an interface to the cuSPARSE routine csrmm computing the product of two sparse matrices stored in CSR format. | |
magma_int_t | magma_zcgecsrmv_mixed_prec (magma_trans_t transA, magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_ptr ddiagval, magmaFloatComplex_ptr doffdiagval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_zge3pt (magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex beta, magmaDoubleComplex_ptr dx, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine is a 3-point stencil operator derived from a finite-difference (FD) scheme in 2D with Dirichlet boundary conditions. | |
magma_int_t | magma_zgeaxpy (magmaDoubleComplex alpha, magma_z_matrix X, magmaDoubleComplex beta, magma_z_matrix *Y, magma_queue_t queue) |
This routine computes Y = alpha * X + beta * Y on the GPU. | |
magma_int_t | magma_zgecsr5mv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t p, magmaDoubleComplex alpha, magma_int_t sigma, magma_int_t bit_y_offset, magma_int_t bit_scansum_offset, magma_int_t num_packet, magmaUIndex_ptr dtile_ptr, magmaUIndex_ptr dtile_desc, magmaIndex_ptr dtile_desc_offset_ptr, magmaIndex_ptr dtile_desc_offset, magmaDoubleComplex_ptr dcalibrator, magma_int_t tail_tile_start, magmaDoubleComplex_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_zgecsrmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_zgecsrmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex lambda, magmaDoubleComplex_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magma_int_t offset, magma_int_t blocksize, magma_index_t *addrows, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU. | |
magma_int_t | magma_zgecsrreimsplit (magma_z_matrix A, magma_z_matrix *ReA, magma_z_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in CSR format located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary parts of A. | |
magma_int_t | magma_zgedensereimsplit (magma_z_matrix A, magma_z_matrix *ReA, magma_z_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in DENSE format located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary parts of A. | |
magma_int_t | magma_zgeellmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, magmaDoubleComplex alpha, magmaDoubleComplex_ptr dval, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_zgeellmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, magmaDoubleComplex alpha, magmaDoubleComplex lambda, magmaDoubleComplex_ptr dval, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magma_int_t offset, magma_int_t blocksize, magmaIndex_ptr addrows, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU. | |
magma_int_t | magma_zgeellrtmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, magmaDoubleComplex alpha, magmaDoubleComplex_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowlength, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_int_t alignment, magma_int_t blocksize, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_zgeelltmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, magmaDoubleComplex alpha, magmaDoubleComplex lambda, magmaDoubleComplex_ptr dval, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magma_int_t offset, magma_int_t blocksize, magmaIndex_ptr addrows, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU. | |
magma_int_t | magma_zmdotc (magma_int_t n, magma_int_t k, magmaDoubleComplex_ptr v, magmaDoubleComplex_ptr r, magmaDoubleComplex_ptr d1, magmaDoubleComplex_ptr d2, magmaDoubleComplex_ptr skp, magma_queue_t queue) |
Computes the scalar products of a set of vectors v_i with a vector r such that skp = ( <v_0,r>, <v_1,r>, .. ). | |
magma_int_t | magma_zgesellpmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, magmaDoubleComplex alpha, magmaDoubleComplex_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A^t * x + beta * y on the GPU. | |
magma_int_t | magma_zgesellcmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, magmaDoubleComplex alpha, magmaDoubleComplex_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A^t * x + beta * y on the GPU. | |
magma_int_t | magma_zmdotc_shfl (magma_int_t n, magma_int_t k, magmaDoubleComplex_ptr v, magmaDoubleComplex_ptr r, magmaDoubleComplex_ptr d1, magmaDoubleComplex_ptr d2, magmaDoubleComplex_ptr skp, magma_queue_t queue) |
Computes the scalar products of a set of vectors v_i with a vector r. | |
magma_int_t | magma_zmdotc4 (magma_int_t n, magmaDoubleComplex_ptr v0, magmaDoubleComplex_ptr w0, magmaDoubleComplex_ptr v1, magmaDoubleComplex_ptr w1, magmaDoubleComplex_ptr v2, magmaDoubleComplex_ptr w2, magmaDoubleComplex_ptr v3, magmaDoubleComplex_ptr w3, magmaDoubleComplex_ptr d1, magmaDoubleComplex_ptr d2, magmaDoubleComplex_ptr skp, magma_queue_t queue) |
Computes the scalar products of 4 vector pairs (v_i, w_i). | |
magma_int_t | magma_zmgecsrmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magmaDoubleComplex alpha, magmaDoubleComplex_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU. | |
magma_int_t | magma_zmgeellmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t nnz_per_row, magmaDoubleComplex alpha, magmaDoubleComplex_ptr dval, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU. | |
magma_int_t | magma_zmgeelltmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t nnz_per_row, magmaDoubleComplex alpha, magmaDoubleComplex_ptr dval, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU. | |
magma_int_t | magma_zmgesellpmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, magmaDoubleComplex alpha, magmaDoubleComplex_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes Y = alpha * A^t * X + beta * Y on the GPU. | |
magma_int_t magma_zcustomspmv | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex * | x, | ||
magmaDoubleComplex * | y, | ||
magma_queue_t | queue ) |
This is an interface to any custom sparse matrix vector product.
It should compute y = alpha*FUNCTION(x) + beta*y. The vectors are located on the device, the scalars on the CPU.
[in] | m | magma_int_t number of rows |
[in] | n | magma_int_t number of columns |
[in] | alpha | magmaDoubleComplex scalar alpha |
[in] | x | magmaDoubleComplex * input vector x |
[in] | beta | magmaDoubleComplex scalar beta |
[out] | y | magmaDoubleComplex * output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dsgecsrmv_mixed_prec | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
double | alpha, | ||
magmaDouble_ptr | ddiagval, | ||
magmaFloat_ptr | doffdiagval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDouble_ptr | dx, | ||
double | beta, | ||
magmaDouble_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
A is a matrix in mixed precision, i.e. the diagonal values are stored in high precision, the offdiagonal values in low precision. The input format is a CSR (val, row, col) in single precision (float) storing all offdiagonal elements and an array containing the diagonal values in double precision (double).
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | alpha | double scalar multiplier |
[in] | ddiagval | magmaDouble_ptr array containing diagonal values of A in double precision |
[in] | doffdiagval | magmaFloat_ptr array containing offdiag values of A in CSR, in single precision |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
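The mixed-precision storage described above can be illustrated with a small CPU reference. This is only a sketch under stated assumptions, not MAGMA's GPU kernel: the name `ref_dsgecsrmv_mixed_prec` is hypothetical, A is assumed square, plain `int` stands in for `magma_index_t`, and the transposition parameter is ignored.

```c
#include <assert.h>

/* CPU reference sketch (not MAGMA code): y = alpha*A*x + beta*y.
   The diagonal of A is held in double (diagval); the off-diagonal entries
   are held in float in CSR arrays (offdiagval, rowptr, colind).
   Assumption: A is square with m rows. */
void ref_dsgecsrmv_mixed_prec(int m, double alpha,
                              const double *diagval, const float *offdiagval,
                              const int *rowptr, const int *colind,
                              const double *x, double beta, double *y)
{
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        /* diagonal contribution in full double precision */
        double dot = diagval[i] * x[i];
        /* off-diagonal values are promoted from float before accumulation */
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k)
            dot += (double)offdiagval[k] * x[colind[k]];
        y[i] = alpha * dot + beta * y[i];
    }
}
```

Accumulating in double while reading the off-diagonal entries as float roughly halves the memory traffic for those entries, which is the usual motivation for this kind of storage.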
magma_int_t magma_z_spmv | ( | magmaDoubleComplex | alpha, |
magma_z_matrix | A, | ||
magma_z_matrix | x, | ||
magmaDoubleComplex | beta, | ||
magma_z_matrix | y, | ||
magma_queue_t | queue ) |
For a given input matrix A and vectors x, y and scalars alpha, beta the wrapper determines the suitable SpMV computing y = alpha * A * x + beta * y.
[in] | alpha | magmaDoubleComplex scalar alpha |
[in] | A | magma_z_matrix sparse matrix A |
[in] | x | magma_z_matrix input vector x |
[in] | beta | magmaDoubleComplex scalar beta |
[out] | y | magma_z_matrix output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_z_spmv_shift | ( | magmaDoubleComplex | alpha, |
magma_z_matrix | A, | ||
magmaDoubleComplex | lambda, | ||
magma_z_matrix | x, | ||
magmaDoubleComplex | beta, | ||
magma_int_t | offset, | ||
magma_int_t | blocksize, | ||
magma_index_t * | add_rows, | ||
magma_z_matrix | y, | ||
magma_queue_t | queue ) |
For a given input matrix A and vectors x, y and scalars alpha, beta the wrapper determines the suitable SpMV computing y = alpha * ( A - lambda I ) * x + beta * y.
[in] | alpha | magmaDoubleComplex scalar alpha |
[in] | A | magma_z_matrix sparse matrix A |
[in] | lambda | magmaDoubleComplex scalar lambda |
[in] | x | magma_z_matrix input vector x |
[in] | beta | magmaDoubleComplex scalar beta |
[in] | offset | magma_int_t in case not the main diagonal is scaled |
[in] | blocksize | magma_int_t in case of processing multiple vectors |
[in] | add_rows | magma_index_t* in case the matrixpowerskernel is used |
[out] | y | magma_z_matrix output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_z_spmm | ( | magmaDoubleComplex | alpha, |
magma_z_matrix | A, | ||
magma_z_matrix | B, | ||
magma_z_matrix * | C, | ||
magma_queue_t | queue ) |
For given input matrices A and B and scalar alpha, the wrapper determines the suitable SpMM computing C = alpha * A * B.
[in] | alpha | magmaDoubleComplex scalar alpha |
[in] | A | magma_z_matrix sparse matrix A |
[in] | B | magma_z_matrix sparse matrix B |
[out] | C | magma_z_matrix * output sparse matrix C |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zcuspaxpy | ( | magmaDoubleComplex * | alpha, |
magma_z_matrix | A, | ||
magmaDoubleComplex * | beta, | ||
magma_z_matrix | B, | ||
magma_z_matrix * | AB, | ||
magma_queue_t | queue ) |
This is an interface to the cuSPARSE routine csrgeam computing the sum of two sparse matrices stored in CSR format:
AB = alpha * A + beta * B
[in] | alpha | magmaDoubleComplex* scalar |
[in] | A | magma_z_matrix input matrix |
[in] | beta | magmaDoubleComplex* scalar |
[in] | B | magma_z_matrix input matrix |
[out] | AB | magma_z_matrix* output matrix AB = alpha * A + beta * B |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zcuspmm | ( | magma_z_matrix | A, |
magma_z_matrix | B, | ||
magma_z_matrix * | AB, | ||
magma_queue_t | queue ) |
This is an interface to the cuSPARSE routine csrmm computing the product of two sparse matrices stored in CSR format.
[in] | A | magma_z_matrix input matrix |
[in] | B | magma_z_matrix input matrix |
[out] | AB | magma_z_matrix* output matrix AB = A * B |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zcgecsrmv_mixed_prec | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | ddiagval, | ||
magmaFloatComplex_ptr | doffdiagval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
A is a matrix in mixed precision, i.e. the diagonal values are stored in high precision, the offdiagonal values in low precision. The input format is a CSR (val, row, col) in FloatComplex storing all offdiagonal elements and an array containing the diagonal values in DoubleComplex.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | ddiagval | magmaDoubleComplex_ptr array containing diagonal values of A in DoubleComplex |
[in] | doffdiagval | magmaFloatComplex_ptr array containing offdiag values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zge3pt | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine is a 3-point stencil operator derived from a finite-difference (FD) scheme in 2D with Dirichlet boundary conditions.
It computes y_i = -2 x_i + x_{i-1} + x_{i+1}
[in] | m | magma_int_t number of rows in x and y |
[in] | n | magma_int_t number of columns in x and y |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[out] | dy | magmaDoubleComplex_ptr output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
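The stencil formula y_i = -2 x_i + x_{i-1} + x_{i+1} can be sketched on the CPU as follows. This is an illustration only: the name `ref_stencil3` is hypothetical, the vector is treated as one-dimensional of length n, neighbours outside the index range are assumed to be zero (homogeneous Dirichlet boundary), and the roles of alpha and beta are left out because the text above does not specify them.

```c
#include <assert.h>
#include <complex.h>

/* CPU reference sketch (not MAGMA code) of the 3-point stencil
   y_i = -2 x_i + x_{i-1} + x_{i+1}.
   Assumption: values outside [0, n) are zero (Dirichlet boundary). */
void ref_stencil3(int n, const double complex *x, double complex *y)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        double complex left  = (i > 0)     ? x[i - 1] : 0.0;
        double complex right = (i < n - 1) ? x[i + 1] : 0.0;
        y[i] = -2.0 * x[i] + left + right;
    }
}
```

Because the operator is applied directly from the formula, no matrix needs to be stored; only the input and output vectors are touched.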
magma_int_t magma_zgeaxpy | ( | magmaDoubleComplex | alpha, |
magma_z_matrix | X, | ||
magmaDoubleComplex | beta, | ||
magma_z_matrix * | Y, | ||
magma_queue_t | queue ) |
This routine computes Y = alpha * X + beta * Y on the GPU.
The input format is magma_z_matrix. The routine handles both dense matrices (vector blocks) and CSR matrices; for the latter, it calls the cuSPARSE library.
[in] | alpha | magmaDoubleComplex scalar multiplier. |
[in] | X | magma_z_matrix input matrix X. |
[in] | beta | magmaDoubleComplex scalar multiplier. |
[in,out] | Y | magma_z_matrix* input/output matrix Y. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zgecsr5mv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | p, | ||
magmaDoubleComplex | alpha, | ||
magma_int_t | sigma, | ||
magma_int_t | bit_y_offset, | ||
magma_int_t | bit_scansum_offset, | ||
magma_int_t | num_packet, | ||
magmaUIndex_ptr | dtile_ptr, | ||
magmaUIndex_ptr | dtile_desc, | ||
magmaIndex_ptr | dtile_desc_offset_ptr, | ||
magmaIndex_ptr | dtile_desc_offset, | ||
magmaDoubleComplex_ptr | dcalibrator, | ||
magma_int_t | tail_tile_start, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
The input format is CSR5 (val (tile-wise column-major), row_pointer, col (tile-wise column-major), tile_pointer, tile_desc).
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | p | magma_int_t number of tiles in A |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | sigma | magma_int_t sigma in A in CSR5 |
[in] | bit_y_offset | magma_int_t bit_y_offset in A in CSR5 |
[in] | bit_scansum_offset | magma_int_t bit_scansum_offset in A in CSR5 |
[in] | num_packet | magma_int_t num_packet in A in CSR5 |
[in] | dtile_ptr | magmaUIndex_ptr tilepointer of A in CSR5 |
[in] | dtile_desc | magmaUIndex_ptr tiledescriptor of A in CSR5 |
[in] | dtile_desc_offset_ptr | magmaIndex_ptr tiledescriptor_offsetpointer of A in CSR5 |
[in] | dtile_desc_offset | magmaIndex_ptr tiledescriptor_offset of A in CSR5 |
[in] | dcalibrator | magmaDoubleComplex_ptr calibrator of A in CSR5 |
[in] | tail_tile_start | magma_int_t start of the last tile in A |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zgecsrmv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
The input format is CSR (val, row, col).
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
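For readers unfamiliar with the CSR (val, row, col) layout, the operation y = alpha * A * x + beta * y can be sketched as a CPU reference. The function name `ref_zgecsrmv` is hypothetical, C99 `double complex` stands in for `magmaDoubleComplex`, plain `int` stands in for `magma_index_t`, and the transposition parameter is ignored; this is not the GPU kernel MAGMA uses.

```c
#include <assert.h>
#include <complex.h>

/* CPU reference sketch (not MAGMA code): y = alpha*A*x + beta*y, A in CSR.
   rowptr has m+1 entries; row i owns val[rowptr[i] .. rowptr[i+1]-1],
   and colind[k] is the column index of val[k]. */
void ref_zgecsrmv(int m, double complex alpha,
                  const double complex *val, const int *rowptr,
                  const int *colind, const double complex *x,
                  double complex beta, double complex *y)
{
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        double complex dot = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k)
            dot += val[k] * x[colind[k]];
        y[i] = alpha * dot + beta * y[i];
    }
}
```

Each row is independent of the others, which is what makes the row loop a natural unit of parallel work on a GPU.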
magma_int_t magma_zgecsrmv_shift | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex | lambda, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magma_int_t | offset, | ||
magma_int_t | blocksize, | ||
magma_index_t * | addrows, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.
It is a shifted version of the CSR-SpMV.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | lambda | magmaDoubleComplex scalar shift lambda |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in] | offset | magma_int_t in case not the main diagonal is scaled |
[in] | blocksize | magma_int_t in case of processing multiple vectors |
[in] | addrows | magma_index_t* in case the matrixpowerskernel is used |
[out] | dy | magmaDoubleComplex_ptr output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
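The shifted product can be formed without ever building A - lambda I explicitly: the shift is applied to the row result. The following CPU sketch shows this idea under simplifying assumptions. The name `ref_zgecsrmv_shift` is hypothetical, A is assumed square, and the `offset`, `blocksize` and `addrows` parameters are ignored (a single vector and a shift on the main diagonal are assumed).

```c
#include <assert.h>
#include <complex.h>

/* CPU reference sketch (not MAGMA code):
   y = alpha * (A - lambda*I) * x + beta * y, A square in CSR.
   Assumptions: main-diagonal shift, one vector, no extra rows. */
void ref_zgecsrmv_shift(int m, double complex alpha, double complex lambda,
                        const double complex *val, const int *rowptr,
                        const int *colind, const double complex *x,
                        double complex beta, double complex *y)
{
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        double complex dot = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k)
            dot += val[k] * x[colind[k]];
        /* (A - lambda*I) x = A x - lambda x, applied row by row */
        y[i] = alpha * (dot - lambda * x[i]) + beta * y[i];
    }
}
```

Applying the shift this way leaves the stored matrix untouched, so the same CSR arrays can be reused with different values of lambda.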
magma_int_t magma_zgecsrreimsplit | ( | magma_z_matrix | A, |
magma_z_matrix * | ReA, | ||
magma_z_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in CSR format located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary parts of A.
The output matrices are allocated within the routine.
[in] | A | magma_z_matrix input matrix A. |
[out] | ReA | magma_z_matrix* output matrix containing real contributions. |
[out] | ImA | magma_z_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zgedensereimsplit | ( | magma_z_matrix | A, |
magma_z_matrix * | ReA, | ||
magma_z_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in DENSE format located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary parts of A.
The output matrices are allocated within the routine.
[in] | A | magma_z_matrix input matrix A. |
[out] | ReA | magma_z_matrix* output matrix containing real contributions. |
[out] | ImA | magma_z_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zgeellmv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | nnz_per_row, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
Input format is ELLPACK.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in ELLPACK |
[in] | dcolind | magmaIndex_ptr columnindices of A in ELLPACK |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
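A padded ELLPACK-style product can be sketched on the CPU as below. The memory layout is an assumption of this sketch, not something stated above: every row is padded to `nnz_per_row` entries, rows are stored one after another (entry k of row i sits at index `i * nnz_per_row + k`), and padding entries carry the value 0 with any valid column index. The name `ref_zgeellmv` is hypothetical and the transposition parameter is ignored.

```c
#include <assert.h>
#include <complex.h>

/* CPU reference sketch (not MAGMA code): y = alpha*A*x + beta*y with A in a
   padded ELLPACK-style layout.
   Assumption: row-by-row storage, padding value 0 with a valid column index. */
void ref_zgeellmv(int m, int nnz_per_row, double complex alpha,
                  const double complex *val, const int *colind,
                  const double complex *x, double complex beta,
                  double complex *y)
{
    assert(m >= 0 && nnz_per_row >= 0);
    for (int i = 0; i < m; ++i) {
        double complex dot = 0.0;
        for (int k = 0; k < nnz_per_row; ++k) {
            int idx = i * nnz_per_row + k;
            dot += val[idx] * x[colind[idx]]; /* padding adds zero */
        }
        y[i] = alpha * dot + beta * y[i];
    }
}
```

Compared with CSR, no row pointer is needed because every row has the same stored length; the price is the padding for short rows.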
magma_int_t magma_zgeellmv_shift | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | nnz_per_row, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex | lambda, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magma_int_t | offset, | ||
magma_int_t | blocksize, | ||
magmaIndex_ptr | addrows, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.
Input format is ELLPACK. It is the shifted version of the ELLPACK SpMV.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | lambda | magmaDoubleComplex scalar shift lambda |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in ELLPACK |
[in] | dcolind | magmaIndex_ptr columnindices of A in ELLPACK |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in] | offset | magma_int_t in case not the main diagonal is scaled |
[in] | blocksize | magma_int_t in case of processing multiple vectors |
[in] | addrows | magmaIndex_ptr in case the matrixpowerskernel is used |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zgeellrtmv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | nnz_per_row, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | dcolind, | ||
magmaIndex_ptr | drowlength, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_int_t | alignment, | ||
magma_int_t | blocksize, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
Input format is ELLRT. The ideas are taken from "Improving the performance of the sparse matrix vector product with GPUs", (CIT 2010), and modified to provide correct values.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows |
[in] | n | magma_int_t number of columns |
[in] | nnz_per_row | magma_int_t max number of nonzeros in a row |
[in] | alpha | magmaDoubleComplex scalar alpha |
[in] | dval | magmaDoubleComplex_ptr val array |
[in] | dcolind | magmaIndex_ptr col indices |
[in] | drowlength | magmaIndex_ptr number of elements in each row |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar beta |
[out] | dy | magmaDoubleComplex_ptr output vector y |
[in] | blocksize | magma_int_t threads per block |
[in] | alignment | magma_int_t threads assigned to each row |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zgeelltmv_shift | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | nnz_per_row, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex | lambda, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magma_int_t | offset, | ||
magma_int_t | blocksize, | ||
magmaIndex_ptr | addrows, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.
Input format is ELL.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | lambda | magmaDoubleComplex scalar shift lambda |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in ELL |
[in] | dcolind | magmaIndex_ptr columnindices of A in ELL |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in] | offset | magma_int_t in case not the main diagonal is scaled |
[in] | blocksize | magma_int_t in case of processing multiple vectors |
[in] | addrows | magmaIndex_ptr in case the matrixpowerskernel is used |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmdotc | ( | magma_int_t | n, |
magma_int_t | k, | ||
magmaDoubleComplex_ptr | v, | ||
magmaDoubleComplex_ptr | r, | ||
magmaDoubleComplex_ptr | d1, | ||
magmaDoubleComplex_ptr | d2, | ||
magmaDoubleComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Computes the scalar products of a set of vectors v_i with a vector r such that
skp = ( <v_0,r>, <v_1,r>, .. )
Returns the vector skp.
[in] | n | magma_int_t length of v_i and r |
[in] | k | magma_int_t number of vectors v_i |
[in] | v | magmaDoubleComplex_ptr v = (v_0 .. v_i .. v_{k-1}) |
[in] | r | magmaDoubleComplex_ptr r |
[in] | d1 | magmaDoubleComplex_ptr workspace |
[in] | d2 | magmaDoubleComplex_ptr workspace |
[out] | skp | magmaDoubleComplex_ptr vector[k] of scalar products (<v_i,r>...) |
[in] | queue | magma_queue_t Queue to execute in. |
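The result skp = ( <v_0,r>, <v_1,r>, .. ) can be sketched as a CPU reference. Two points are assumptions of this sketch rather than facts from the text above: the k vectors are stored back to back (v_i starts at `v + i*n`), and the scalar product conjugates its first argument, as the BLAS routine zdotc does. The name `ref_zmdotc` is hypothetical, and the workspaces d1 and d2 are not needed on the CPU.

```c
#include <assert.h>
#include <complex.h>

/* CPU reference sketch (not MAGMA code): skp[i] = <v_i, r> for i = 0..k-1.
   Assumptions: v_i stored contiguously at v + i*n; <a,b> = sum conj(a_j)*b_j. */
void ref_zmdotc(int n, int k, const double complex *v,
                const double complex *r, double complex *skp)
{
    assert(n >= 0 && k >= 0);
    for (int i = 0; i < k; ++i) {
        double complex sum = 0.0;
        for (int j = 0; j < n; ++j)
            sum += conj(v[i * n + j]) * r[j];
        skp[i] = sum;
    }
}
```

Computing all k products in one call lets r be read once per batch instead of once per vector, which is the usual reason for fusing the dot products.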
magma_int_t magma_zgesellpmv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | blocksize, | ||
magma_int_t | slices, | ||
magma_int_t | alignment, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | dcolind, | ||
magmaIndex_ptr | drowptr, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A^t * x + beta * y on the GPU.
Input format is SELLP.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | blocksize | magma_int_t number of rows in one ELL-slice |
[in] | slices | magma_int_t number of slices in matrix |
[in] | alignment | magma_int_t number of threads assigned to one row |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in SELLP |
[in] | dcolind | magmaIndex_ptr column indices of A in SELLP |
[in] | drowptr | magmaIndex_ptr row pointer of SELLP |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
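To make the roles of blocksize, slices, and drowptr more concrete, here is a minimal CPU sketch of a sliced-ELL product y = alpha * A * x + beta * y. The layout is an assumption, not taken from the source: slice s holds blocksize consecutive rows, rowptr[s] is the offset of slice s in the value array, and entry j of local row r sits at rowptr[s] + j*blocksize + r. Padding entries are assumed to hold a zero value and a valid column index. The name `ref_sellp_spmv` and the use of real `double` values are illustrative. The alignment parameter only affects how wide each slice is padded, so the sketch does not need it.

```c
/* CPU reference for y = alpha*A*x + beta*y with A in a sliced-ELL layout.
 * Assumed layout (hypothetical, for illustration):
 *   - slice s holds rows s*blocksize .. s*blocksize + blocksize - 1;
 *   - rowptr[s] is the offset of slice s, rowptr[slices] the total length;
 *   - entry j of local row r is stored at rowptr[s] + j*blocksize + r;
 *   - padding entries have value 0 and a valid column index. */
void ref_sellp_spmv(int m, int blocksize, int slices, double alpha,
                    const double *val, const int *colind, const int *rowptr,
                    const double *x, double beta, double *y)
{
    for (int s = 0; s < slices; ++s) {
        /* number of stored entries per row in this slice */
        int width = (rowptr[s + 1] - rowptr[s]) / blocksize;
        for (int r = 0; r < blocksize; ++r) {
            int row = s * blocksize + r;
            if (row >= m)
                break; /* last slice may contain padding rows */
            double dot = 0.0;
            for (int j = 0; j < width; ++j) {
                int idx = rowptr[s] + j * blocksize + r;
                dot += val[idx] * x[colind[idx]];
            }
            y[row] = alpha * dot + beta * y[row];
        }
    }
}
```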
magma_int_t magma_zgesellcmv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | blocksize, | ||
magma_int_t | slices, | ||
magma_int_t | alignment, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | dcolind, | ||
magmaIndex_ptr | drowptr, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A^t * x + beta * y on the GPU.
Input format is SELLC/SELLP.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | blocksize | magma_int_t number of rows in one ELL-slice |
[in] | slices | magma_int_t number of slices in matrix |
[in] | alignment | magma_int_t number of threads assigned to one row (=1) |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in SELLC/P |
[in] | dcolind | magmaIndex_ptr column indices of A in SELLC/P |
[in] | drowptr | magmaIndex_ptr row pointer of SELLP |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmdotc_shfl | ( | magma_int_t | n, |
magma_int_t | k, | ||
magmaDoubleComplex_ptr | v, | ||
magmaDoubleComplex_ptr | r, | ||
magmaDoubleComplex_ptr | d1, | ||
magmaDoubleComplex_ptr | d2, | ||
magmaDoubleComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Computes the scalar products of a set of vectors v_i with the vector r, such that:
skp = ( <v_0,r>, <v_1,r>, .. )
Returns the vector skp.
[in] | n | int length of v_i and r |
[in] | k | int number of vectors v_i |
[in] | v | magmaDoubleComplex_ptr v = (v_0 .. v_i.. v_k) |
[in] | r | magmaDoubleComplex_ptr r |
[in] | d1 | magmaDoubleComplex_ptr workspace |
[in] | d2 | magmaDoubleComplex_ptr workspace |
[out] | skp | magmaDoubleComplex_ptr vector[k] of scalar products (<v_i,r>...) |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmdotc4 | ( | magma_int_t | n, |
magmaDoubleComplex_ptr | v0, | ||
magmaDoubleComplex_ptr | w0, | ||
magmaDoubleComplex_ptr | v1, | ||
magmaDoubleComplex_ptr | w1, | ||
magmaDoubleComplex_ptr | v2, | ||
magmaDoubleComplex_ptr | w2, | ||
magmaDoubleComplex_ptr | v3, | ||
magmaDoubleComplex_ptr | w3, | ||
magmaDoubleComplex_ptr | d1, | ||
magmaDoubleComplex_ptr | d2, | ||
magmaDoubleComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Computes the scalar products of a set of 4 vector pairs, such that:
skp[0,1,2,3] = [ <v_0,w_0>, <v_1,w_1>, <v_2,w_2>, <v_3,w_3> ]
Returns the vector skp. If fewer than four dot products are required, a simple workaround is to pass some of the input vectors twice.
[in] | n | int length of v_i and w_i |
[in] | v0 | magmaDoubleComplex_ptr input vector |
[in] | w0 | magmaDoubleComplex_ptr input vector |
[in] | v1 | magmaDoubleComplex_ptr input vector |
[in] | w1 | magmaDoubleComplex_ptr input vector |
[in] | v2 | magmaDoubleComplex_ptr input vector |
[in] | w2 | magmaDoubleComplex_ptr input vector |
[in] | v3 | magmaDoubleComplex_ptr input vector |
[in] | w3 | magmaDoubleComplex_ptr input vector |
[in] | d1 | magmaDoubleComplex_ptr workspace |
[in] | d2 | magmaDoubleComplex_ptr workspace |
[out] | skp | magmaDoubleComplex_ptr vector[4] of scalar products [<v_i,w_i>]; this vector is located on the host |
[in] | queue | magma_queue_t Queue to execute in. |
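The fused computation and the workaround for fewer than four products can be sketched on the CPU as follows. This is an illustration under assumptions, not the MAGMA kernel: the names `ref_mdot4` and `ref_mdot2_via_mdot4` are hypothetical, real `double` values stand in for magmaDoubleComplex, and "doubling the input" is read here as passing an existing vector pair again for the unused slots.

```c
#include <stddef.h>

/* CPU reference: skp[i] = <v_i, w_i> for four vector pairs in one pass. */
void ref_mdot4(size_t n,
               const double *v0, const double *w0,
               const double *v1, const double *w1,
               const double *v2, const double *w2,
               const double *v3, const double *w3,
               double *skp)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    for (size_t j = 0; j < n; ++j) {
        s0 += v0[j] * w0[j];
        s1 += v1[j] * w1[j];
        s2 += v2[j] * w2[j];
        s3 += v3[j] * w3[j];
    }
    skp[0] = s0;
    skp[1] = s1;
    skp[2] = s2;
    skp[3] = s3;
}

/* Only two products needed: reuse the second pair for slots 2 and 3
 * and ignore the duplicated results. */
void ref_mdot2_via_mdot4(size_t n,
                         const double *v0, const double *w0,
                         const double *v1, const double *w1,
                         double *skp2)
{
    double skp[4];
    ref_mdot4(n, v0, w0, v1, w1, v1, w1, v1, w1, skp);
    skp2[0] = skp[0];
    skp2[1] = skp[1];
}
```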
magma_int_t magma_zmgecsrmv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | num_vecs, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes Y = alpha * A * X + beta * Y on the GPU, where X and Y are sets of num_vecs vectors.
Input format is CSR.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | num_vecs | magma_int_t number of vectors |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr row pointer of A in CSR |
[in] | dcolind | magmaIndex_ptr column indices of A in CSR |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
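A small CPU sketch of the multi-vector CSR product may help when validating results. It is not the MAGMA implementation. The name `ref_mcsrmv`, the real `double` arithmetic, and in particular the storage of X and Y (vector i of X starting at x + i*n, vector i of Y at y + i*m) are assumptions made for this example.

```c
/* CPU reference for Y = alpha*A*X + beta*Y with A (m x n) in CSR.
 * Assumption: the num_vecs vectors are stored one after another,
 * i.e. X(col, i) = x[col + i*n] and Y(row, i) = y[row + i*m]. */
void ref_mcsrmv(int m, int n, int num_vecs, double alpha,
                const double *val, const int *rowptr, const int *colind,
                const double *x, double beta, double *y)
{
    for (int row = 0; row < m; ++row) {
        for (int i = 0; i < num_vecs; ++i) {
            double dot = 0.0;
            /* nonzeros of this row are val[rowptr[row] .. rowptr[row+1]-1] */
            for (int j = rowptr[row]; j < rowptr[row + 1]; ++j)
                dot += val[j] * x[colind[j] + i * n];
            y[row + i * m] = alpha * dot + beta * y[row + i * m];
        }
    }
}
```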
magma_int_t magma_zmgeellmv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | num_vecs, | ||
magma_int_t | nnz_per_row, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes Y = alpha * A * X + beta * Y on the GPU, where X and Y are sets of num_vecs vectors.
Input format is ELLPACK.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | num_vecs | magma_int_t number of vectors |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in ELLPACK |
[in] | dcolind | magmaIndex_ptr column indices of A in ELLPACK |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
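The role of nnz_per_row can be seen in the following CPU sketch of a padded-row product. It assumes, without confirmation from the source, that ELLPACK here means row-major padded storage: entry j of row r sits at r*nnz_per_row + j, and rows shorter than nnz_per_row are padded with zero values and a valid column index. The name `ref_mellpack_mv`, the real `double` arithmetic, and the back-to-back storage of the vectors in X and Y are likewise illustrative.

```c
/* CPU reference for Y = alpha*A*X + beta*Y with A (m x n) in a
 * row-major padded layout (assumed): entry j of row r is at
 * r*nnz_per_row + j; padding has value 0 and a valid column index.
 * Assumption: X(col, i) = x[col + i*n] and Y(row, i) = y[row + i*m]. */
void ref_mellpack_mv(int m, int n, int num_vecs, int nnz_per_row,
                     double alpha, const double *val, const int *colind,
                     const double *x, double beta, double *y)
{
    for (int row = 0; row < m; ++row) {
        for (int i = 0; i < num_vecs; ++i) {
            double dot = 0.0;
            for (int j = 0; j < nnz_per_row; ++j) {
                int idx = row * nnz_per_row + j;
                dot += val[idx] * x[colind[idx] + i * n];
            }
            y[row + i * m] = alpha * dot + beta * y[row + i * m];
        }
    }
}
```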
magma_int_t magma_zmgeelltmv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | num_vecs, | ||
magma_int_t | nnz_per_row, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes Y = alpha * A * X + beta * Y on the GPU, where X and Y are sets of num_vecs vectors.
Input format is ELL.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | num_vecs | magma_int_t number of vectors |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in ELL |
[in] | dcolind | magmaIndex_ptr column indices of A in ELL |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
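The text names ELLPACK and ELL as separate input formats without stating how they differ. One common convention, assumed here and not confirmed by the source, is that both pad every row to nnz_per_row entries and differ only in storage order: row-major (entry j of row r at r*nnz_per_row + j) versus column-major (entry j of row r at j*m + r). Under that assumption, repacking one into the other is a transpose of the padded arrays. The function name `ellpack_to_ell` is hypothetical.

```c
/* Repack padded storage from row-major order (entry j of row r at
 * r*nnz_per_row + j) to column-major order (entry j of row r at j*m + r).
 * Which order MAGMA calls "ELLPACK" and which "ELL" is an assumption here. */
void ellpack_to_ell(int m, int nnz_per_row,
                    const double *val_rm, const int *col_rm,
                    double *val_cm, int *col_cm)
{
    for (int r = 0; r < m; ++r) {
        for (int j = 0; j < nnz_per_row; ++j) {
            val_cm[j * m + r] = val_rm[r * nnz_per_row + j];
            col_cm[j * m + r] = col_rm[r * nnz_per_row + j];
        }
    }
}
```

Column-major order places the j-th entries of consecutive rows next to each other in memory, which is the access pattern that suits one GPU thread per row.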
magma_int_t magma_zmgesellpmv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | num_vecs, | ||
magma_int_t | blocksize, | ||
magma_int_t | slices, | ||
magma_int_t | alignment, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | dcolind, | ||
magmaIndex_ptr | drowptr, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes Y = alpha * A^t * X + beta * Y on the GPU.
Input format is SELLP. Note that the input format for X is row-major, while the output format for Y is column-major.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | num_vecs | magma_int_t number of columns in X and Y |
[in] | blocksize | magma_int_t number of rows in one ELL-slice |
[in] | slices | magma_int_t number of slices in matrix |
[in] | alignment | magma_int_t number of threads assigned to one row |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in SELLP |
[in] | dcolind | magmaIndex_ptr column indices of A in SELLP |
[in] | drowptr | magmaIndex_ptr row pointer of SELLP |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
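Because X is expected in row-major order while Y is produced in column-major order, a caller holding vectors in column-major order has to repack X first. The sketch below shows the index mapping on the CPU. The function name `colmajor_to_rowmajor` and the use of real `double` values are illustrative, and the exact leading dimensions MAGMA expects are an assumption.

```c
/* X holds num_vecs vectors of length n.
 *   column-major: element r of vector i is at i*n + r
 *   row-major:    element r of vector i is at r*num_vecs + i
 * Copies a column-major block into row-major order. */
void colmajor_to_rowmajor(int n, int num_vecs,
                          const double *x_cm, double *x_rm)
{
    for (int i = 0; i < num_vecs; ++i)
        for (int r = 0; r < n; ++r)
            x_rm[r * num_vecs + i] = x_cm[i * n + r];
}
```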