MAGMA 1.6.1
Matrix Algebra for GPU and Multicore Architectures
Functions | |
magma_int_t | magma_dcuspaxpy (double *alpha, magma_d_sparse_matrix A, double *beta, magma_d_sparse_matrix B, magma_d_sparse_matrix *AB, magma_queue_t queue) |
This is an interface to the cuSPARSE routine csrgeam, which computes the sum of two sparse matrices stored in CSR format. | |
magma_int_t | magma_dcuspmm (magma_d_sparse_matrix A, magma_d_sparse_matrix B, magma_d_sparse_matrix *AB, magma_queue_t queue) |
This is an interface to the cuSPARSE routine csrgemm, which computes the product of two sparse matrices stored in CSR format. | |
magma_int_t | magma_dcustomspmv (double alpha, magma_d_vector x, double beta, magma_d_vector y, magma_queue_t queue) |
This is an interface to any custom sparse matrix vector product. | |
magma_int_t | magma_dgecsrmv (magma_trans_t transA, magma_int_t m, magma_int_t n, double alpha, magmaDouble_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_dgecsrmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, double alpha, double lambda, magmaDouble_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, int offset, int blocksize, magma_index_t *addrows, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU. | |
magma_int_t | magma_dgeellmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_dgeellmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, double alpha, double lambda, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, int offset, int blocksize, magmaIndex_ptr addrows, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU. | |
magma_int_t | magma_dgeellrtmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowlength, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_int_t alignment, magma_int_t blocksize, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_dgeelltmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, double alpha, double lambda, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, int offset, int blocksize, magmaIndex_ptr addrows, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU. | |
magma_int_t | magma_dgesellpmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A^t * x + beta * y on the GPU. | |
magma_int_t | magma_dgesellcmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A^t * x + beta * y on the GPU. | |
magma_int_t | magma_dmdotc (int n, int k, magmaDouble_ptr v, magmaDouble_ptr r, magmaDouble_ptr d1, magmaDouble_ptr d2, magmaDouble_ptr skp, magma_queue_t queue) |
Computes the scalar products of a set of vectors v_i with a vector r: skp = ( <v_0,r>, <v_1,r>, .. ). | |
magma_int_t | magma_dmgecsrmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, double alpha, magmaDouble_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU. | |
magma_int_t | magma_dmgeellmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t nnz_per_row, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU. | |
magma_int_t | magma_dmgeelltmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t nnz_per_row, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU. | |
magma_int_t | magma_dmgesellpmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes Y = alpha * A^t * X + beta * Y on the GPU. | |
magma_int_t magma_dcuspaxpy (double *alpha, magma_d_sparse_matrix A, double *beta, magma_d_sparse_matrix B, magma_d_sparse_matrix *AB, magma_queue_t queue)
This is an interface to the cuSPARSE routine csrgeam, which computes the sum of two sparse matrices stored in CSR format:
AB = alpha * A + beta * B
[in] | alpha | double* scalar |
[in] | A | magma_d_sparse_matrix input matrix |
[in] | beta | double* scalar |
[in] | B | magma_d_sparse_matrix input matrix |
[out] | AB | magma_d_sparse_matrix* output matrix AB = alpha * A + beta * B |
[in] | queue | magma_queue_t Queue to execute in. |
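As an illustration of the operation AB = alpha * A + beta * B, the following is a minimal CPU sketch in plain C. It is not the cuSPARSE or MAGMA implementation. The `ref_csr` struct and `ref_csr_add` function are hypothetical names introduced here, and the sketch assumes that column indices are sorted in ascending order within each row.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for illustration; not MAGMA's magma_d_sparse_matrix. */
typedef struct {
    int m, n, nnz;
    int *rowptr;   /* m + 1 entries */
    int *colind;   /* nnz entries, ascending within each row (assumption) */
    double *val;   /* nnz entries */
} ref_csr;

/* C = alpha * A + beta * B on the CPU.
 * Returns 0 on success, -1 if an allocation fails. */
int ref_csr_add(double alpha, const ref_csr *A,
                double beta, const ref_csr *B, ref_csr *C)
{
    assert(A->m == B->m && A->n == B->n);
    int cap = A->nnz + B->nnz;              /* upper bound on nnz(C) */
    size_t slots = (size_t)(cap > 0 ? cap : 1);
    C->m = A->m;
    C->n = A->n;
    C->rowptr = malloc((size_t)(A->m + 1) * sizeof *C->rowptr);
    C->colind = malloc(slots * sizeof *C->colind);
    C->val = malloc(slots * sizeof *C->val);
    if (!C->rowptr || !C->colind || !C->val) {
        free(C->rowptr); free(C->colind); free(C->val);
        return -1;
    }
    int nnz = 0;
    for (int row = 0; row < A->m; ++row) {
        int ia = A->rowptr[row], ea = A->rowptr[row + 1];
        int ib = B->rowptr[row], eb = B->rowptr[row + 1];
        C->rowptr[row] = nnz;
        /* Merge the two sorted rows; A->n acts as a past-the-end sentinel. */
        while (ia < ea || ib < eb) {
            int ca = ia < ea ? A->colind[ia] : A->n;
            int cb = ib < eb ? B->colind[ib] : A->n;
            int col = ca < cb ? ca : cb;
            double v = 0.0;
            if (ca == col) v += alpha * A->val[ia++];
            if (cb == col) v += beta * B->val[ib++];
            C->colind[nnz] = col;
            C->val[nnz] = v;
            ++nnz;
        }
    }
    C->rowptr[A->m] = nnz;
    C->nnz = nnz;
    return 0;
}
```

The sketch sizes the output with the upper bound nnz(A) + nnz(B) rather than counting the exact number of nonzeros first, which keeps it short at the cost of some unused memory.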
magma_int_t magma_dcuspmm (magma_d_sparse_matrix A, magma_d_sparse_matrix B, magma_d_sparse_matrix *AB, magma_queue_t queue)
This is an interface to the cuSPARSE routine csrgemm, which computes the product of two sparse matrices stored in CSR format.
[in] | A | magma_d_sparse_matrix input matrix |
[in] | B | magma_d_sparse_matrix input matrix |
[out] | AB | magma_d_sparse_matrix* output matrix AB = A * B |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dcustomspmv (double alpha, magma_d_vector x, double beta, magma_d_vector y, magma_queue_t queue)
This is an interface to any custom sparse matrix vector product.
It should compute y = alpha*FUNCTION(x) + beta*y. The vectors are located on the device, the scalars on the CPU.
[in] | alpha | double scalar alpha |
[in] | x | magma_d_vector input vector x |
[in] | beta | double scalar beta |
[out] | y | magma_d_vector output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
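The contract y = alpha*FUNCTION(x) + beta*y can be sketched on the CPU with a callback standing in for FUNCTION. This is only an illustration of the contract, not MAGMA code: the names `ref_operator_fn`, `ref_customspmv`, and `ref_laplace1d` are hypothetical, and the 1D Laplacian stencil is an arbitrary example operator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callback type: writes FUNCTION(x) into out. */
typedef void (*ref_operator_fn)(int n, const double *x, double *out, void *ctx);

/* Example operator (assumption): matrix-free 1D Laplacian stencil [-1 2 -1]. */
void ref_laplace1d(int n, const double *x, double *out, void *ctx)
{
    (void)ctx;  /* unused by this operator */
    for (int i = 0; i < n; ++i) {
        double left = i > 0 ? x[i - 1] : 0.0;
        double right = i + 1 < n ? x[i + 1] : 0.0;
        out[i] = 2.0 * x[i] - left - right;
    }
}

/* y = alpha * op(x) + beta * y. Returns 0 on success, -1 on allocation failure. */
int ref_customspmv(int n, double alpha, ref_operator_fn op, void *ctx,
                   const double *x, double beta, double *y)
{
    assert(n >= 0 && op != NULL);
    if (n == 0)
        return 0;
    double *tmp = malloc((size_t)n * sizeof *tmp);
    if (!tmp)
        return -1;
    op(n, x, tmp, ctx);                     /* tmp = FUNCTION(x) */
    for (int i = 0; i < n; ++i)
        y[i] = alpha * tmp[i] + beta * y[i];
    free(tmp);
    return 0;
}
```

With x = (1, 2, 3), y = (1, 1, 1), alpha = 1 and beta = 2, the stencil yields (0, 0, 4), so y becomes (2, 2, 6).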
magma_int_t magma_dgecsrmv (magma_trans_t transA, magma_int_t m, magma_int_t n, double alpha, magmaDouble_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes y = alpha * A * x + beta * y on the GPU.
The input format is CSR (val, row, col).
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | alpha | double scalar multiplier |
[in] | dval | magmaDouble_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
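For reference, the CSR product y = alpha * A * x + beta * y can be written as a short CPU loop. This is a sketch for checking results, not the GPU kernel; `ref_csrmv` is a hypothetical name, and the sketch covers only the non-transposed case with 0-based indices.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference sketch: y = alpha * A * x + beta * y, A stored in CSR.
 * val/colind hold the nonzeros row by row; rowptr has m + 1 entries. */
void ref_csrmv(int m, double alpha, const double *val,
               const int *rowptr, const int *colind,
               const double *x, double beta, double *y)
{
    assert(m >= 0 && val != NULL && rowptr != NULL && colind != NULL);
    for (int row = 0; row < m; ++row) {
        double dot = 0.0;
        /* Accumulate the sparse dot product of this row with x. */
        for (int k = rowptr[row]; k < rowptr[row + 1]; ++k)
            dot += val[k] * x[colind[k]];
        y[row] = alpha * dot + beta * y[row];
    }
}
```

For A = [[1, 0, 2], [0, 3, 0]], x = (1, 1, 1), y = (10, 20), alpha = 2 and beta = 1, the result is y = (16, 26).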
magma_int_t magma_dgecsrmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, double alpha, double lambda, magmaDouble_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, int offset, int blocksize, magma_index_t *addrows, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.
It is a shifted version of the CSR-SpMV.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | alpha | double scalar multiplier |
[in] | lambda | double scalar shift |
[in] | dval | magmaDouble_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in] | offset | int offset, used when a diagonal other than the main diagonal is scaled |
[in] | blocksize | int block size, used when processing multiple vectors |
[in] | addrows | magmaIndex_ptr used when the matrix powers kernel is employed |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
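The shifted product never needs A - lambda I to be formed explicitly, because (A - lambda I) x = A x - lambda x. A minimal CPU sketch of that identity follows. It assumes a square matrix and a shift on the main diagonal, and it deliberately ignores the `offset`, `blocksize`, and `addrows` parameters. `ref_csrmv_shift` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference sketch: y = alpha * (A - lambda * I) * x + beta * y.
 * A is square (m x m) in CSR; the shift is applied as A*x - lambda*x. */
void ref_csrmv_shift(int m, double alpha, double lambda,
                     const double *val, const int *rowptr, const int *colind,
                     const double *x, double beta, double *y)
{
    assert(m >= 0 && val != NULL && rowptr != NULL && colind != NULL);
    for (int row = 0; row < m; ++row) {
        double dot = 0.0;
        for (int k = rowptr[row]; k < rowptr[row + 1]; ++k)
            dot += val[k] * x[colind[k]];
        /* Subtract the diagonal shift instead of modifying the matrix. */
        y[row] = alpha * (dot - lambda * x[row]) + beta * y[row];
    }
}
```

For A = [[2, 1], [0, 3]], lambda = 1 and x = (1, 2), the shifted matrix is [[1, 1], [0, 2]], and with alpha = 1, beta = 0 the result is y = (3, 4).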
magma_int_t magma_dgeellmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes y = alpha * A * x + beta * y on the GPU.
Input format is ELLPACK.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | double scalar multiplier |
[in] | dval | magmaDouble_ptr array containing values of A in ELLPACK |
[in] | dcolind | magmaIndex_ptr columnindices of A in ELLPACK |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
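The ELLPACK variant pads every row to `nnz_per_row` entries, so no row pointer is needed. The sketch below is a CPU reference under two stated assumptions: entries are stored row by row (entry k of row i at index i * nnz_per_row + k), and padding entries carry the value 0 with any valid column index. `ref_ellpackmv` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference sketch: y = alpha * A * x + beta * y, A in ELLPACK.
 * Assumed layout: row-major, each row padded to nnz_per_row entries,
 * padding stored as value 0.0 with a valid column index. */
void ref_ellpackmv(int m, int nnz_per_row, double alpha,
                   const double *val, const int *colind,
                   const double *x, double beta, double *y)
{
    assert(m >= 0 && nnz_per_row >= 0 && val != NULL && colind != NULL);
    for (int row = 0; row < m; ++row) {
        double dot = 0.0;
        for (int k = 0; k < nnz_per_row; ++k) {
            int idx = row * nnz_per_row + k;
            dot += val[idx] * x[colind[idx]];  /* padding contributes 0 */
        }
        y[row] = alpha * dot + beta * y[row];
    }
}
```

Because every row has the same length, the inner loop has a fixed trip count, which is what makes the format attractive for GPU threads working in lockstep.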
magma_int_t magma_dgeellmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, double alpha, double lambda, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, int offset, int blocksize, magmaIndex_ptr addrows, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.
Input format is ELLPACK. It is the shifted version of the ELLPACK SpMV.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | double scalar multiplier |
[in] | lambda | double scalar shift |
[in] | dval | magmaDouble_ptr array containing values of A in ELLPACK |
[in] | dcolind | magmaIndex_ptr columnindices of A in ELLPACK |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in] | offset | int offset, used when a diagonal other than the main diagonal is scaled |
[in] | blocksize | int block size, used when processing multiple vectors |
[in] | addrows | magmaIndex_ptr used when the matrix powers kernel is employed |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dgeellrtmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowlength, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_int_t alignment, magma_int_t blocksize, magma_queue_t queue)
This routine computes y = alpha * A * x + beta * y on the GPU.
Input format is ELLRT. The ideas are taken from "Improving the performance of the sparse matrix vector product with GPUs" (CIT 2010) and modified to provide correct values.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows |
[in] | n | magma_int_t number of columns |
[in] | nnz_per_row | magma_int_t max number of nonzeros in a row |
[in] | alpha | double scalar alpha |
[in] | dval | magmaDouble_ptr val array |
[in] | dcolind | magmaIndex_ptr col indices |
[in] | drowlength | magmaIndex_ptr number of elements in each row |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar beta |
[out] | dy | magmaDouble_ptr output vector y |
[in] | alignment | magma_int_t threads assigned to each row |
[in] | blocksize | magma_int_t threads per block |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dgeelltmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, double alpha, double lambda, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, int offset, int blocksize, magmaIndex_ptr addrows, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.
Input format is ELL.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | double scalar multiplier |
[in] | lambda | double scalar shift |
[in] | dval | magmaDouble_ptr array containing values of A in ELL |
[in] | dcolind | magmaIndex_ptr columnindices of A in ELL |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in] | offset | int offset, used when a diagonal other than the main diagonal is scaled |
[in] | blocksize | int block size, used when processing multiple vectors |
[in] | addrows | magmaIndex_ptr used when the matrix powers kernel is employed |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
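A CPU sketch of the shifted product for a column-major padded layout is given below. It assumes that ELL here means entry k of row i is stored at index k * m + i, so that consecutive rows are adjacent in memory. That layout is an assumption of this sketch and should be checked against the MAGMA sources. The sketch also assumes a square matrix and ignores `offset`, `blocksize`, and `addrows`. `ref_elltmv_shift` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference sketch: y = alpha * (A - lambda * I) * x + beta * y.
 * Assumed layout: column-major padded storage, entry k of row i at
 * index k * m + i; padding is value 0.0 with a valid column index. */
void ref_elltmv_shift(int m, int nnz_per_row, double alpha, double lambda,
                      const double *val, const int *colind,
                      const double *x, double beta, double *y)
{
    assert(m >= 0 && nnz_per_row >= 0 && val != NULL && colind != NULL);
    for (int row = 0; row < m; ++row) {
        double dot = 0.0;
        for (int k = 0; k < nnz_per_row; ++k) {
            int idx = k * m + row;          /* stride m between row entries */
            dot += val[idx] * x[colind[idx]];
        }
        y[row] = alpha * (dot - lambda * x[row]) + beta * y[row];
    }
}
```

With this layout, threads handling neighbouring rows on a GPU would read neighbouring memory locations at each step k, which is the usual motivation for storing the padded matrix column by column.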
magma_int_t magma_dgesellcmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes y = alpha * A^t * x + beta * y on the GPU.
Input format is SELLC/SELLP.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | blocksize | magma_int_t number of rows in one ELL-slice |
[in] | slices | magma_int_t number of slices in matrix |
[in] | alignment | magma_int_t number of threads assigned to one row (=1) |
[in] | alpha | double scalar multiplier |
[in] | dval | magmaDouble_ptr array containing values of A in SELLC/P |
[in] | dcolind | magmaIndex_ptr columnindices of A in SELLC/P |
[in] | drowptr | magmaIndex_ptr rowpointer of SELLP |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
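The sliced layout can be illustrated with a CPU sketch for the case alignment = 1. The assumptions, none of which are confirmed by the text above, are: the matrix is cut into slices of `blocksize` consecutive rows, each slice is padded to its own longest row and stored column-major, and `rowptr[s]` holds the offset of slice s in `val` and `colind`. The sketch computes the non-transposed product A * x, and `ref_sellcmv` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference sketch: y = alpha * A * x + beta * y for a sliced ELL
 * layout with one thread per row (alignment = 1).
 * Assumed layout: slice s starts at rowptr[s]; inside a slice, entry k of
 * local row r sits at rowptr[s] + k * blocksize + r. */
void ref_sellcmv(int m, int blocksize, int slices, double alpha,
                 const double *val, const int *colind, const int *rowptr,
                 const double *x, double beta, double *y)
{
    assert(m >= 0 && blocksize > 0 && slices >= 0 && rowptr != NULL);
    for (int s = 0; s < slices; ++s) {
        int offset = rowptr[s];
        int width = (rowptr[s + 1] - offset) / blocksize;  /* padded row length */
        for (int r = 0; r < blocksize; ++r) {
            int row = s * blocksize + r;
            if (row >= m)
                continue;                   /* padding rows in the last slice */
            double dot = 0.0;
            for (int k = 0; k < width; ++k) {
                int idx = offset + k * blocksize + r;
                dot += val[idx] * x[colind[idx]];
            }
            y[row] = alpha * dot + beta * y[row];
        }
    }
}
```

Padding each slice only to its own longest row, rather than to the longest row of the whole matrix, is what keeps the storage overhead lower than plain ELLPACK when row lengths vary.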
magma_int_t magma_dgesellpmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes y = alpha * A^t * x + beta * y on the GPU.
Input format is SELLP.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | blocksize | magma_int_t number of rows in one ELL-slice |
[in] | slices | magma_int_t number of slices in matrix |
[in] | alignment | magma_int_t number of threads assigned to one row |
[in] | alpha | double scalar multiplier |
[in] | dval | magmaDouble_ptr array containing values of A in SELLP |
[in] | dcolind | magmaIndex_ptr columnindices of A in SELLP |
[in] | drowptr | magmaIndex_ptr rowpointer of SELLP |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmdotc (int n, int k, magmaDouble_ptr v, magmaDouble_ptr r, magmaDouble_ptr d1, magmaDouble_ptr d2, magmaDouble_ptr skp, magma_queue_t queue)
Computes the scalar products of a set of vectors v_i with a vector r, such that
skp = ( <v_0,r>, <v_1,r>, .. )
Returns the vector skp.
[in] | n | int length of v_i and r |
[in] | k | int number of vectors v_i |
[in] | v | magmaDouble_ptr v = (v_0 .. v_i.. v_k) |
[in] | r | magmaDouble_ptr r |
[in] | d1 | magmaDouble_ptr workspace |
[in] | d2 | magmaDouble_ptr workspace |
[out] | skp | magmaDouble_ptr vector[k] of scalar products (<v_i,r>...) |
[in] | queue | magma_queue_t Queue to execute in. |
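What the routine returns can be stated as a plain CPU loop. The sketch below assumes that the k vectors are stored one after another in `v`, each of length n, and it omits the workspaces `d1` and `d2`, which a sequential version does not need. For real double data the conjugation implied by the routine name has no effect. `ref_mdot` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference sketch: skp[i] = <v_i, r> for i = 0 .. k-1.
 * Assumed layout: v_i occupies v[i*n] .. v[i*n + n - 1]. */
void ref_mdot(int n, int k, const double *v, const double *r, double *skp)
{
    assert(n >= 0 && k >= 0 && v != NULL && r != NULL && skp != NULL);
    for (int i = 0; i < k; ++i) {
        double sum = 0.0;
        for (int j = 0; j < n; ++j)
            sum += v[i * n + j] * r[j];
        skp[i] = sum;
    }
}
```

Computing all k products in one call lets a GPU version read r once and share the reduction across the vectors, instead of launching k separate dot products.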
magma_int_t magma_dmgecsrmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, double alpha, magmaDouble_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU.
Input format is CSR.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | num_vecs | magma_int_t number of vectors |
[in] | alpha | double scalar multiplier |
[in] | dval | magmaDouble_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDouble_ptr input vectors X |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vectors Y |
[in] | queue | magma_queue_t Queue to execute in. |
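The multi-vector product applies the same sparse matrix to several right-hand sides at once. A minimal CPU sketch follows, assuming that both X and Y are stored column-major, so that vector j of X starts at x[j * n] and vector j of Y starts at y[j * m]. That layout is an assumption made for this sketch, and `ref_mcsrmv` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference sketch: Y = alpha * A * X + beta * Y, A in CSR (m x n).
 * Assumed layout: X and Y column-major, i.e. one vector after another. */
void ref_mcsrmv(int m, int n, int num_vecs, double alpha,
                const double *val, const int *rowptr, const int *colind,
                const double *x, double beta, double *y)
{
    assert(m >= 0 && n >= 0 && num_vecs >= 0 && rowptr != NULL);
    for (int row = 0; row < m; ++row) {
        for (int j = 0; j < num_vecs; ++j) {
            double dot = 0.0;
            /* The matrix row is reused for every vector j. */
            for (int k = rowptr[row]; k < rowptr[row + 1]; ++k)
                dot += val[k] * x[colind[k] + j * n];
            y[row + j * m] = alpha * dot + beta * y[row + j * m];
        }
    }
}
```

Processing the vectors together means each matrix entry is loaded once per block of vectors instead of once per vector, which is the main reason to prefer this routine over repeated single-vector products.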
magma_int_t magma_dmgeellmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t nnz_per_row, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU.
Input format is ELLPACK.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | num_vecs | magma_int_t number of vectors |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | double scalar multiplier |
[in] | dval | magmaDouble_ptr array containing values of A in ELLPACK |
[in] | dcolind | magmaIndex_ptr columnindices of A in ELLPACK |
[in] | dx | magmaDouble_ptr input vectors X |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vectors Y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmgeelltmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t nnz_per_row, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU.
Input format is ELL.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | num_vecs | magma_int_t number of vectors |
[in] | nnz_per_row | magma_int_t number of elements in the longest row |
[in] | alpha | double scalar multiplier |
[in] | dval | magmaDouble_ptr array containing values of A in ELL |
[in] | dcolind | magmaIndex_ptr columnindices of A in ELL |
[in] | dx | magmaDouble_ptr input vectors X |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vectors Y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmgesellpmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, double alpha, magmaDouble_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue)
This routine computes Y = alpha * A^t * X + beta * Y on the GPU.
Input format is SELLP. Note that X is read in row-major order, while Y is written in column-major order.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | num_vecs | magma_int_t number of columns in X and Y |
[in] | blocksize | magma_int_t number of rows in one ELL-slice |
[in] | slices | magma_int_t number of slices in matrix |
[in] | alignment | magma_int_t number of threads assigned to one row |
[in] | alpha | double scalar multiplier |
[in] | dval | magmaDouble_ptr array containing values of A in SELLP |
[in] | dcolind | magmaIndex_ptr columnindices of A in SELLP |
[in] | drowptr | magmaIndex_ptr rowpointer of SELLP |
[in] | dx | magmaDouble_ptr input vectors X (row-major) |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vectors Y (column-major) |
[in] | queue | magma_queue_t Queue to execute in. |
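The mixed layout of X and Y is easy to get wrong, so the index arithmetic is spelled out in the CPU sketch below. To stay short it stores A in CSR rather than SELLP and computes the non-transposed product, so it illustrates only the vector layouts, not the SELLP kernel. It assumes that row-major means element (i, j) of X sits at x[i * num_vecs + j] and that column-major means element (i, j) of Y sits at y[i + j * m]. `ref_mcsrmv_mixed` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* CPU sketch of the layouts only: Y = alpha * A * X + beta * Y with
 * X row-major (n x num_vecs) and Y column-major (m x num_vecs).
 * A is stored in CSR here for brevity, not in SELLP. */
void ref_mcsrmv_mixed(int m, int num_vecs, double alpha,
                      const double *val, const int *rowptr, const int *colind,
                      const double *x, double beta, double *y)
{
    assert(m >= 0 && num_vecs >= 0 && rowptr != NULL && colind != NULL);
    for (int row = 0; row < m; ++row) {
        for (int j = 0; j < num_vecs; ++j) {
            double dot = 0.0;
            for (int k = rowptr[row]; k < rowptr[row + 1]; ++k)
                dot += val[k] * x[colind[k] * num_vecs + j];  /* row-major X */
            int out = row + j * m;                            /* column-major Y */
            y[out] = alpha * dot + beta * y[out];
        }
    }
}
```

With row-major X, the num_vecs values that belong to one matrix column are adjacent in memory, so a single column index lookup serves all vectors in the block.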