MAGMA  1.6.3
Matrix Algebra for GPU and Multicore Architectures
single precision

Functions

magma_int_t magma_scuspaxpy (float *alpha, magma_s_matrix A, float *beta, magma_s_matrix B, magma_s_matrix *AB, magma_queue_t queue)
 This is an interface to the cuSPARSE routine csrgeam computing the sum of two sparse matrices stored in CSR format: AB = alpha * A + beta * B.
 
magma_int_t magma_scuspmm (magma_s_matrix A, magma_s_matrix B, magma_s_matrix *AB, magma_queue_t queue)
 This is an interface to the cuSPARSE routine csrgemm computing the product of two sparse matrices stored in CSR format.
 
magma_int_t magma_scustomspmv (float alpha, magma_s_matrix x, float beta, magma_s_matrix y, magma_queue_t queue)
 This is an interface to any custom sparse matrix vector product.
 
magma_int_t magma_s_spmv (float alpha, magma_s_matrix A, magma_s_matrix x, float beta, magma_s_matrix y, magma_queue_t queue)
 For a given input matrix A and vectors x, y and scalars alpha, beta the wrapper determines the suitable SpMV computing y = alpha * A * x + beta * y.
 
magma_int_t magma_s_spmv_shift (float alpha, magma_s_matrix A, float lambda, magma_s_matrix x, float beta, magma_int_t offset, magma_int_t blocksize, magma_index_t *add_rows, magma_s_matrix y, magma_queue_t queue)
 For a given input matrix A and vectors x, y and scalars alpha, beta the wrapper determines the suitable SpMV computing y = alpha * ( A - lambda I ) * x + beta * y.
 
magma_int_t magma_s_spmm (float alpha, magma_s_matrix A, magma_s_matrix B, magma_s_matrix *C, magma_queue_t queue)
 For given input matrices A and B and scalar alpha, the wrapper determines the suitable SpMM computing C = alpha * A * B.
 
magma_int_t magma_sgecsrmv (magma_trans_t transA, magma_int_t m, magma_int_t n, float alpha, magmaFloat_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes y = alpha * A * x + beta * y on the GPU.
 
magma_int_t magma_sgecsrmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, float alpha, float lambda, magmaFloat_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaFloat_ptr dx, float beta, int offset, int blocksize, magma_index_t *addrows, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.
 
magma_int_t magma_sgeellmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, float alpha, magmaFloat_ptr dval, magmaIndex_ptr dcolind, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes y = alpha * A * x + beta * y on the GPU.
 
magma_int_t magma_sgeellmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, float alpha, float lambda, magmaFloat_ptr dval, magmaIndex_ptr dcolind, magmaFloat_ptr dx, float beta, int offset, int blocksize, magmaIndex_ptr addrows, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.
 
magma_int_t magma_sgeellrtmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, float alpha, magmaFloat_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowlength, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_int_t alignment, magma_int_t blocksize, magma_queue_t queue)
 This routine computes y = alpha * A * x + beta * y on the GPU.
 
magma_int_t magma_sgeelltmv_shift (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t nnz_per_row, float alpha, float lambda, magmaFloat_ptr dval, magmaIndex_ptr dcolind, magmaFloat_ptr dx, float beta, int offset, int blocksize, magmaIndex_ptr addrows, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.
 
magma_int_t magma_sgesellpmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, float alpha, magmaFloat_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes y = alpha * A^t * x + beta * y on the GPU.
 
magma_int_t magma_sgesellcmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, float alpha, magmaFloat_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes y = alpha * A^t * x + beta * y on the GPU.
 
magma_int_t magma_smdotc (int n, int k, magmaFloat_ptr v, magmaFloat_ptr r, magmaFloat_ptr d1, magmaFloat_ptr d2, magmaFloat_ptr skp, magma_queue_t queue)
 Computes the scalar products of a set of vectors v_i with a vector r such that skp = ( <v_0,r>, <v_1,r>, .. ).
 
magma_int_t magma_smgecsrmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, float alpha, magmaFloat_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU.
 
magma_int_t magma_smgeellmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t nnz_per_row, float alpha, magmaFloat_ptr dval, magmaIndex_ptr dcolind, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU.
 
magma_int_t magma_smgeelltmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t nnz_per_row, float alpha, magmaFloat_ptr dval, magmaIndex_ptr dcolind, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU.
 
magma_int_t magma_smgesellpmv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t num_vecs, magma_int_t blocksize, magma_int_t slices, magma_int_t alignment, float alpha, magmaFloat_ptr dval, magmaIndex_ptr dcolind, magmaIndex_ptr drowptr, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_queue_t queue)
 This routine computes Y = alpha * A^t * X + beta * Y on the GPU.
 

Detailed Description

Function Documentation

magma_int_t magma_s_spmm ( float  alpha,
magma_s_matrix  A,
magma_s_matrix  B,
magma_s_matrix *  C,
magma_queue_t  queue 
)

For given input matrices A and B and scalar alpha, the wrapper determines the suitable SpMM computing C = alpha * A * B.

Parameters
[in]  alpha  float  scalar alpha
[in]  A  magma_s_matrix  sparse matrix A
[in]  B  magma_s_matrix  sparse matrix B
[out]  C  magma_s_matrix *  output sparse matrix C
[in]  queue  magma_queue_t  Queue to execute in.
magma_int_t magma_s_spmv ( float  alpha,
magma_s_matrix  A,
magma_s_matrix  x,
float  beta,
magma_s_matrix  y,
magma_queue_t  queue 
)

For a given input matrix A and vectors x, y and scalars alpha, beta the wrapper determines the suitable SpMV computing y = alpha * A * x + beta * y.

Parameters
[in]  alpha  float  scalar alpha
[in]  A  magma_s_matrix  sparse matrix A
[in]  x  magma_s_matrix  input vector x
[in]  beta  float  scalar beta
[out]  y  magma_s_matrix  output vector y
[in]  queue  magma_queue_t  Queue to execute in.
magma_int_t magma_s_spmv_shift ( float  alpha,
magma_s_matrix  A,
float  lambda,
magma_s_matrix  x,
float  beta,
magma_int_t  offset,
magma_int_t  blocksize,
magma_index_t *  add_rows,
magma_s_matrix  y,
magma_queue_t  queue 
)

For a given input matrix A and vectors x, y and scalars alpha, beta the wrapper determines the suitable SpMV computing y = alpha * ( A - lambda I ) * x + beta * y.

Parameters
[in]  alpha  float  scalar alpha
[in]  A  magma_s_matrix  sparse matrix A
[in]  lambda  float  scalar lambda
[in]  x  magma_s_matrix  input vector x
[in]  beta  float  scalar beta
[in]  offset  magma_int_t  offset, in case a diagonal other than the main diagonal is scaled
[in]  blocksize  magma_int_t  block size, in case multiple vectors are processed
[in]  add_rows  magma_index_t *  additional rows, in case the matrix powers kernel is used
[out]  y  magma_s_matrix  output vector y
[in]  queue  magma_queue_t  Queue to execute in.
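As a worked step, the shifted product can be expanded into an ordinary SpMV plus a correction term in x. This is only the algebra behind the formula above, not a statement about how the wrapper is implemented:

```latex
\begin{aligned}
y &\leftarrow \alpha\,(A - \lambda I)\,x + \beta\,y \\
  &= \alpha\,A\,x \;-\; \alpha\lambda\,x \;+\; \beta\,y
\end{aligned}
```

With lambda = 0 this reduces to the unshifted update y = alpha * A * x + beta * y.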
magma_int_t magma_scuspaxpy ( float *  alpha,
magma_s_matrix  A,
float *  beta,
magma_s_matrix  B,
magma_s_matrix *  AB,
magma_queue_t  queue 
)

This is an interface to the cuSPARSE routine csrgeam computing the sum of two sparse matrices stored in CSR format:

AB = alpha * A + beta * B

Parameters
[in]  alpha  float *  scalar
[in]  A  magma_s_matrix  input matrix
[in]  beta  float *  scalar
[in]  B  magma_s_matrix  input matrix
[out]  AB  magma_s_matrix *  output matrix AB = alpha * A + beta * B
[in]  queue  magma_queue_t  Queue to execute in.
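To make the operation concrete, the following is a minimal host-side sketch of a CSR sum, not the cuSPARSE or MAGMA implementation. The function name `csr_axpy_ref` and all array names are hypothetical. It assumes that column indices are sorted within each row and that the caller has allocated the output arrays with room for nnz(A) + nnz(B) entries.

```c
/* Hypothetical reference: C = alpha * A + beta * B for m-row CSR matrices.
   Assumes sorted column indices per row; returns the number of stored
   entries written to cval/ccol. crow must hold m + 1 integers. */
static int csr_axpy_ref(int m,
                        float alpha, const float *aval, const int *arow, const int *acol,
                        float beta,  const float *bval, const int *brow, const int *bcol,
                        float *cval, int *crow, int *ccol)
{
    int nnz = 0;
    crow[0] = 0;
    for (int i = 0; i < m; ++i) {
        int ja = arow[i], ea = arow[i + 1];
        int jb = brow[i], eb = brow[i + 1];
        /* Merge the two sorted rows, combining entries that share a column. */
        while (ja < ea || jb < eb) {
            if (jb >= eb || (ja < ea && acol[ja] < bcol[jb])) {
                ccol[nnz] = acol[ja];
                cval[nnz] = alpha * aval[ja];
                ++ja;
            } else if (ja >= ea || bcol[jb] < acol[ja]) {
                ccol[nnz] = bcol[jb];
                cval[nnz] = beta * bval[jb];
                ++jb;
            } else {
                ccol[nnz] = acol[ja];
                cval[nnz] = alpha * aval[ja] + beta * bval[jb];
                ++ja;
                ++jb;
            }
            ++nnz;
        }
        crow[i + 1] = nnz;
    }
    return nnz;
}
```

The sparsity pattern of the result is the union of the two input patterns, which is why the output size is only known after the merge.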
magma_int_t magma_scuspmm ( magma_s_matrix  A,
magma_s_matrix  B,
magma_s_matrix *  AB,
magma_queue_t  queue 
)

This is an interface to the cuSPARSE routine csrgemm computing the product of two sparse matrices stored in CSR format.

Parameters
[in]  A  magma_s_matrix  input matrix
[in]  B  magma_s_matrix  input matrix
[out]  AB  magma_s_matrix *  output matrix AB = A * B
[in]  queue  magma_queue_t  Queue to execute in.
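For orientation, a sparse-times-sparse product in CSR can be sketched on the host with a dense row accumulator. This is a simple illustration under stated assumptions, not the cuSPARSE algorithm: the name `csr_spgemm_ref` and the `capacity` argument are hypothetical, and the final scan over all n columns makes it O(m * n), which is only acceptable for small examples.

```c
#include <stdlib.h>

/* Hypothetical reference: C = A * B with A (m rows) and B (n columns) in CSR.
   cval/ccol have room for `capacity` entries, crow for m + 1 integers.
   Returns the number of stored entries, or -1 on allocation failure or
   if capacity is too small. */
static int csr_spgemm_ref(int m, int n,
                          const float *aval, const int *arow, const int *acol,
                          const float *bval, const int *brow, const int *bcol,
                          float *cval, int *crow, int *ccol, int capacity)
{
    float *acc = calloc((size_t)n, sizeof *acc);  /* dense accumulator for one row */
    char  *hit = calloc((size_t)n, sizeof *hit);  /* marks columns touched in this row */
    int nnz = 0;
    if (!acc || !hit) { free(acc); free(hit); return -1; }

    crow[0] = 0;
    for (int i = 0; i < m; ++i) {
        /* Row i of C is a linear combination of the rows of B selected by row i of A. */
        for (int ja = arow[i]; ja < arow[i + 1]; ++ja) {
            int k = acol[ja];
            for (int jb = brow[k]; jb < brow[k + 1]; ++jb) {
                acc[bcol[jb]] += aval[ja] * bval[jb];
                hit[bcol[jb]] = 1;
            }
        }
        /* Gather touched columns in ascending order and reset the scratch arrays. */
        for (int c = 0; c < n; ++c) {
            if (!hit[c]) continue;
            if (nnz == capacity) { free(acc); free(hit); return -1; }
            ccol[nnz] = c;
            cval[nnz] = acc[c];
            ++nnz;
            acc[c] = 0.0f;
            hit[c] = 0;
        }
        crow[i + 1] = nnz;
    }
    free(acc);
    free(hit);
    return nnz;
}
```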
magma_int_t magma_scustomspmv ( float  alpha,
magma_s_matrix  x,
float  beta,
magma_s_matrix  y,
magma_queue_t  queue 
)

This is an interface to any custom sparse matrix vector product.

It should compute y = alpha * FUNCTION(x) + beta * y. The vectors are located on the device, the scalars on the CPU.

Parameters
[in]  alpha  float  scalar alpha
[in]  x  magma_s_matrix  input vector x
[in]  beta  float  scalar beta
[out]  y  magma_s_matrix  output vector y
[in]  queue  magma_queue_t  Queue to execute in.
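The contract y = alpha * FUNCTION(x) + beta * y can be illustrated on the host with a callback. Everything in this sketch is hypothetical (the `custom_op_fn` type, the `laplace1d` operator, and the scratch buffer `tmp`); it shows only the shape of the update, with plain host arrays standing in for device vectors.

```c
#include <stddef.h>

/* Hypothetical callback type: writes FUNCTION(x) into out (both of length n). */
typedef void (*custom_op_fn)(size_t n, const float *x, float *out);

/* Example operator (an assumption for illustration): the matrix-free
   1D stencil [-1 2 -1] with zero boundary values. */
static void laplace1d(size_t n, const float *x, float *out)
{
    for (size_t i = 0; i < n; ++i) {
        float left  = (i > 0)     ? x[i - 1] : 0.0f;
        float right = (i + 1 < n) ? x[i + 1] : 0.0f;
        out[i] = 2.0f * x[i] - left - right;
    }
}

/* y = alpha * op(x) + beta * y; tmp is caller-provided scratch of length n. */
static void custom_spmv_ref(size_t n, float alpha, custom_op_fn op,
                            const float *x, float beta, float *y, float *tmp)
{
    op(n, x, tmp);
    for (size_t i = 0; i < n; ++i)
        y[i] = alpha * tmp[i] + beta * y[i];
}
```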
magma_int_t magma_sgecsrmv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
float  alpha,
magmaFloat_ptr  dval,
magmaIndex_ptr  drowptr,
magmaIndex_ptr  dcolind,
magmaFloat_ptr  dx,
float  beta,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes y = alpha * A * x + beta * y on the GPU.

The input format is CSR (val, row, col).

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  alpha  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in CSR
[in]  drowptr  magmaIndex_ptr  row pointer of A in CSR
[in]  dcolind  magmaIndex_ptr  column indices of A in CSR
[in]  dx  magmaFloat_ptr  input vector x
[in]  beta  float  scalar multiplier
[in,out]  dy  magmaFloat_ptr  input/output vector y
[in]  queue  magma_queue_t  Queue to execute in.
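A host-side reference for the CSR update is useful when checking GPU results. The sketch below is not the MAGMA kernel; `csr_spmv_ref` is a hypothetical name, the arrays are ordinary host arrays mirroring dval, drowptr and dcolind, and only the non-transposed case is covered.

```c
/* Hypothetical reference: y = alpha * A * x + beta * y with A (m rows) in CSR.
   rowptr has m + 1 entries; row i owns entries rowptr[i] .. rowptr[i+1]-1. */
static void csr_spmv_ref(int m, float alpha,
                         const float *val, const int *rowptr, const int *colind,
                         const float *x, float beta, float *y)
{
    for (int row = 0; row < m; ++row) {
        float dot = 0.0f;
        for (int j = rowptr[row]; j < rowptr[row + 1]; ++j)
            dot += val[j] * x[colind[j]];
        y[row] = alpha * dot + beta * y[row];
    }
}
```

For the 2 x 3 matrix with rows (1 0 2) and (0 3 0), the arrays are val = {1, 2, 3}, rowptr = {0, 2, 3} and colind = {0, 2, 1}.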
magma_int_t magma_sgecsrmv_shift ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
float  alpha,
float  lambda,
magmaFloat_ptr  dval,
magmaIndex_ptr  drowptr,
magmaIndex_ptr  dcolind,
magmaFloat_ptr  dx,
float  beta,
int  offset,
int  blocksize,
magma_index_t *  addrows,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.

It is a shifted version of the CSR-SpMV.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  alpha  float  scalar multiplier
[in]  lambda  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in CSR
[in]  drowptr  magmaIndex_ptr  row pointer of A in CSR
[in]  dcolind  magmaIndex_ptr  column indices of A in CSR
[in]  dx  magmaFloat_ptr  input vector x
[in]  beta  float  scalar multiplier
[in]  offset  int  offset, in case a diagonal other than the main diagonal is scaled
[in]  blocksize  int  block size, in case multiple vectors are processed
[in]  addrows  magma_index_t *  additional rows, in case the matrix powers kernel is used
[in,out]  dy  magmaFloat_ptr  input/output vector y
[in]  queue  magma_queue_t  Queue to execute in.
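The shift can be folded into the row loop of a CSR SpMV without modifying the stored matrix. The following host-side sketch covers only the simplest case and is not the MAGMA kernel: it assumes a square matrix, a shift on the main diagonal (offset 0), a single vector, and no additional rows. The name `csr_spmv_shift_ref` is hypothetical.

```c
/* Hypothetical reference: y = alpha * (A - lambda * I) * x + beta * y
   for a square m x m CSR matrix, shifting the main diagonal only. */
static void csr_spmv_shift_ref(int m, float alpha, float lambda,
                               const float *val, const int *rowptr, const int *colind,
                               const float *x, float beta, float *y)
{
    for (int row = 0; row < m; ++row) {
        float dot = 0.0f;
        for (int j = rowptr[row]; j < rowptr[row + 1]; ++j)
            dot += val[j] * x[colind[j]];
        /* (A - lambda I) x = A x - lambda x, applied row by row. */
        y[row] = alpha * (dot - lambda * x[row]) + beta * y[row];
    }
}
```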
magma_int_t magma_sgeellmv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nnz_per_row,
float  alpha,
magmaFloat_ptr  dval,
magmaIndex_ptr  dcolind,
magmaFloat_ptr  dx,
float  beta,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes y = alpha * A * x + beta * y on the GPU.

Input format is ELLPACK.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  nnz_per_row  magma_int_t  number of elements in the longest row
[in]  alpha  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in ELLPACK
[in]  dcolind  magmaIndex_ptr  column indices of A in ELLPACK
[in]  dx  magmaFloat_ptr  input vector x
[in]  beta  float  scalar multiplier
[in,out]  dy  magmaFloat_ptr  input/output vector y
[in]  queue  magma_queue_t  Queue to execute in.
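In a padded ELLPACK layout every row stores exactly nnz_per_row entries, with short rows filled by explicit zeros. The host-side sketch below assumes a row-major layout, where entry k of row i sits at index i * nnz_per_row + k, and assumes that padding entries carry the value 0 and a valid column index. Both the layout and the name `ellpack_spmv_ref` are assumptions for illustration.

```c
/* Hypothetical reference: y = alpha * A * x + beta * y with A (m rows) in a
   row-major padded ELLPACK layout. Padding entries must have value 0. */
static void ellpack_spmv_ref(int m, int nnz_per_row, float alpha,
                             const float *val, const int *colind,
                             const float *x, float beta, float *y)
{
    for (int row = 0; row < m; ++row) {
        float dot = 0.0f;
        for (int k = 0; k < nnz_per_row; ++k) {
            float v = val[row * nnz_per_row + k];
            if (v != 0.0f)  /* skip padding */
                dot += v * x[colind[row * nnz_per_row + k]];
        }
        y[row] = alpha * dot + beta * y[row];
    }
}
```

For the 2 x 3 matrix with rows (1 0 2) and (0 3 0) and nnz_per_row = 2, the arrays are val = {1, 2, 3, 0} and colind = {0, 2, 1, 0}.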
magma_int_t magma_sgeellmv_shift ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nnz_per_row,
float  alpha,
float  lambda,
magmaFloat_ptr  dval,
magmaIndex_ptr  dcolind,
magmaFloat_ptr  dx,
float  beta,
int  offset,
int  blocksize,
magmaIndex_ptr  addrows,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.

Input format is ELLPACK. It is the shifted version of the ELLPACK SpMV.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  nnz_per_row  magma_int_t  number of elements in the longest row
[in]  alpha  float  scalar multiplier
[in]  lambda  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in ELLPACK
[in]  dcolind  magmaIndex_ptr  column indices of A in ELLPACK
[in]  dx  magmaFloat_ptr  input vector x
[in]  beta  float  scalar multiplier
[in]  offset  int  offset, in case a diagonal other than the main diagonal is scaled
[in]  blocksize  int  block size, in case multiple vectors are processed
[in]  addrows  magmaIndex_ptr  additional rows, in case the matrix powers kernel is used
[in,out]  dy  magmaFloat_ptr  input/output vector y
[in]  queue  magma_queue_t  Queue to execute in.
magma_int_t magma_sgeellrtmv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nnz_per_row,
float  alpha,
magmaFloat_ptr  dval,
magmaIndex_ptr  dcolind,
magmaIndex_ptr  drowlength,
magmaFloat_ptr  dx,
float  beta,
magmaFloat_ptr  dy,
magma_int_t  alignment,
magma_int_t  blocksize,
magma_queue_t  queue 
)

This routine computes y = alpha * A * x + beta * y on the GPU.

Input format is ELLRT. The ideas are taken from "Improving the performance of the sparse matrix vector product with GPUs", (CIT 2010), and modified to provide correct values.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows
[in]  n  magma_int_t  number of columns
[in]  nnz_per_row  magma_int_t  maximum number of nonzeros in a row
[in]  alpha  float  scalar alpha
[in]  dval  magmaFloat_ptr  val array
[in]  dcolind  magmaIndex_ptr  column indices
[in]  drowlength  magmaIndex_ptr  number of elements in each row
[in]  dx  magmaFloat_ptr  input vector x
[in]  beta  float  scalar beta
[out]  dy  magmaFloat_ptr  output vector y
[in]  alignment  magma_int_t  threads assigned to each row
[in]  blocksize  magma_int_t  threads per block
[in]  queue  magma_queue_t  Queue to execute in.
magma_int_t magma_sgeelltmv_shift ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nnz_per_row,
float  alpha,
float  lambda,
magmaFloat_ptr  dval,
magmaIndex_ptr  dcolind,
magmaFloat_ptr  dx,
float  beta,
int  offset,
int  blocksize,
magmaIndex_ptr  addrows,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes y = alpha * ( A - lambda I ) * x + beta * y on the GPU.

Input format is ELL.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  nnz_per_row  magma_int_t  number of elements in the longest row
[in]  alpha  float  scalar multiplier
[in]  lambda  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in ELL
[in]  dcolind  magmaIndex_ptr  column indices of A in ELL
[in]  dx  magmaFloat_ptr  input vector x
[in]  beta  float  scalar multiplier
[in]  offset  int  offset, in case a diagonal other than the main diagonal is scaled
[in]  blocksize  int  block size, in case multiple vectors are processed
[in]  addrows  magmaIndex_ptr  additional rows, in case the matrix powers kernel is used
[in,out]  dy  magmaFloat_ptr  input/output vector y
[in]  queue  magma_queue_t  Queue to execute in.
magma_int_t magma_sgesellcmv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  blocksize,
magma_int_t  slices,
magma_int_t  alignment,
float  alpha,
magmaFloat_ptr  dval,
magmaIndex_ptr  dcolind,
magmaIndex_ptr  drowptr,
magmaFloat_ptr  dx,
float  beta,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes y = alpha * A^t * x + beta * y on the GPU.

Input format is SELLC/SELLP.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  blocksize  magma_int_t  number of rows in one ELL-slice
[in]  slices  magma_int_t  number of slices in matrix
[in]  alignment  magma_int_t  number of threads assigned to one row (=1)
[in]  alpha  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in SELLC/P
[in]  dcolind  magmaIndex_ptr  column indices of A in SELLC/P
[in]  drowptr  magmaIndex_ptr  row pointer of SELLP
[in]  dx  magmaFloat_ptr  input vector x
[in]  beta  float  scalar multiplier
[in,out]  dy  magmaFloat_ptr  input/output vector y
[in]  queue  magma_queue_t  Queue to execute in.
magma_int_t magma_sgesellpmv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  blocksize,
magma_int_t  slices,
magma_int_t  alignment,
float  alpha,
magmaFloat_ptr  dval,
magmaIndex_ptr  dcolind,
magmaIndex_ptr  drowptr,
magmaFloat_ptr  dx,
float  beta,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes y = alpha * A^t * x + beta * y on the GPU.

Input format is SELLP.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  blocksize  magma_int_t  number of rows in one ELL-slice
[in]  slices  magma_int_t  number of slices in matrix
[in]  alignment  magma_int_t  number of threads assigned to one row
[in]  alpha  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in SELLP
[in]  dcolind  magmaIndex_ptr  column indices of A in SELLP
[in]  drowptr  magmaIndex_ptr  row pointer of SELLP
[in]  dx  magmaFloat_ptr  input vector x
[in]  beta  float  scalar multiplier
[in,out]  dy  magmaFloat_ptr  input/output vector y
[in]  queue  magma_queue_t  Queue to execute in.
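A sliced ELL layout groups blocksize consecutive rows into a slice and pads each slice only to the length of its own longest row. The host-side sketch below illustrates one plausible indexing and is not the MAGMA kernel. It assumes alignment = 1, that rowptr[s] is the offset of slice s in val and colind, that each slice is stored column-major (entry k of local row r at rowptr[s] + k * blocksize + r), and that padding entries have value 0 with a valid column index. It computes the non-transposed product; `sellp_spmv_ref` is a hypothetical name.

```c
/* Hypothetical reference: y = alpha * A * x + beta * y with A (m rows) stored
   in a sliced ELL layout. rowptr has slices + 1 entries. */
static void sellp_spmv_ref(int m, int blocksize, int slices, float alpha,
                           const float *val, const int *colind, const int *rowptr,
                           const float *x, float beta, float *y)
{
    for (int s = 0; s < slices; ++s) {
        int offset = rowptr[s];
        /* Padded row length of this slice. */
        int width = (rowptr[s + 1] - offset) / blocksize;
        for (int r = 0; r < blocksize; ++r) {
            int row = s * blocksize + r;
            if (row >= m)
                break;  /* the last slice may contain padding rows */
            float dot = 0.0f;
            for (int k = 0; k < width; ++k) {
                int idx = offset + k * blocksize + r;
                dot += val[idx] * x[colind[idx]];
            }
            y[row] = alpha * dot + beta * y[row];
        }
    }
}
```

Compared with plain ELLPACK, the padding overhead is bounded per slice rather than by the longest row of the whole matrix.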
magma_int_t magma_smdotc ( int  n,
int  k,
magmaFloat_ptr  v,
magmaFloat_ptr  r,
magmaFloat_ptr  d1,
magmaFloat_ptr  d2,
magmaFloat_ptr  skp,
magma_queue_t  queue 
)

Computes the scalar products of a set of vectors v_i with a vector r such that

skp = ( <v_0,r>, <v_1,r>, .. )

Returns the vector skp.

Parameters
[in]  n  int  length of v_i and r
[in]  k  int  number of vectors v_i
[in]  v  magmaFloat_ptr  v = (v_0 .. v_i .. v_k)
[in]  r  magmaFloat_ptr  r
[in]  d1  magmaFloat_ptr  workspace
[in]  d2  magmaFloat_ptr  workspace
[out]  skp  magmaFloat_ptr  vector[k] of scalar products (<v_i,r>...)
[in]  queue  magma_queue_t  Queue to execute in.
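The result can be checked against a plain host loop. This sketch assumes that the k vectors v_i are stored one after another in a single array, so that v_i starts at v + i * n; that layout and the name `multi_dot_ref` are assumptions. The workspaces d1 and d2 have no counterpart here because no parallel reduction is needed on the host.

```c
/* Hypothetical reference: skp[i] = <v_i, r> for i = 0 .. k-1, with the
   vectors v_i (each of length n) stored contiguously in v. */
static void multi_dot_ref(int n, int k, const float *v, const float *r, float *skp)
{
    for (int i = 0; i < k; ++i) {
        float sum = 0.0f;
        for (int j = 0; j < n; ++j)
            sum += v[i * n + j] * r[j];
        skp[i] = sum;
    }
}
```

Computing all k products in one call reads r once per product instead of launching k separate dot products.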
magma_int_t magma_smgecsrmv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  num_vecs,
float  alpha,
magmaFloat_ptr  dval,
magmaIndex_ptr  drowptr,
magmaIndex_ptr  dcolind,
magmaFloat_ptr  dx,
float  beta,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU.

Input format is CSR.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  num_vecs  magma_int_t  number of vectors
[in]  alpha  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in CSR
[in]  drowptr  magmaIndex_ptr  row pointer of A in CSR
[in]  dcolind  magmaIndex_ptr  column indices of A in CSR
[in]  dx  magmaFloat_ptr  input vectors X
[in]  beta  float  scalar multiplier
[in,out]  dy  magmaFloat_ptr  input/output vectors Y
[in]  queue  magma_queue_t  Queue to execute in.
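With several right-hand sides, each matrix entry can be reused for all vectors once it has been loaded. The host-side sketch below assumes that X and Y are stored column-major, so that vector j of X starts at x + j * n and vector j of Y at y + j * m. That layout, like the name `csr_spmm_multi_ref`, is an assumption for illustration and covers only the non-transposed case.

```c
/* Hypothetical reference: Y = alpha * A * X + beta * Y with A (m x n) in CSR
   and X (n x num_vecs), Y (m x num_vecs) stored column-major. */
static void csr_spmm_multi_ref(int m, int n, int num_vecs, float alpha,
                               const float *val, const int *rowptr, const int *colind,
                               const float *x, float beta, float *y)
{
    for (int row = 0; row < m; ++row) {
        for (int v = 0; v < num_vecs; ++v) {
            float dot = 0.0f;
            for (int j = rowptr[row]; j < rowptr[row + 1]; ++j)
                dot += val[j] * x[v * n + colind[j]];
            y[v * m + row] = alpha * dot + beta * y[v * m + row];
        }
    }
}
```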
magma_int_t magma_smgeellmv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  num_vecs,
magma_int_t  nnz_per_row,
float  alpha,
magmaFloat_ptr  dval,
magmaIndex_ptr  dcolind,
magmaFloat_ptr  dx,
float  beta,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU.

Input format is ELLPACK.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  num_vecs  magma_int_t  number of vectors
[in]  nnz_per_row  magma_int_t  number of elements in the longest row
[in]  alpha  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in ELLPACK
[in]  dcolind  magmaIndex_ptr  column indices of A in ELLPACK
[in]  dx  magmaFloat_ptr  input vectors X
[in]  beta  float  scalar multiplier
[in,out]  dy  magmaFloat_ptr  input/output vectors Y
[in]  queue  magma_queue_t  Queue to execute in.
magma_int_t magma_smgeelltmv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  num_vecs,
magma_int_t  nnz_per_row,
float  alpha,
magmaFloat_ptr  dval,
magmaIndex_ptr  dcolind,
magmaFloat_ptr  dx,
float  beta,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes Y = alpha * A * X + beta * Y for X and Y sets of num_vecs vectors on the GPU.

Input format is ELL.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  num_vecs  magma_int_t  number of vectors
[in]  nnz_per_row  magma_int_t  number of elements in the longest row
[in]  alpha  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in ELL
[in]  dcolind  magmaIndex_ptr  column indices of A in ELL
[in]  dx  magmaFloat_ptr  input vectors X
[in]  beta  float  scalar multiplier
[in,out]  dy  magmaFloat_ptr  input/output vectors Y
[in]  queue  magma_queue_t  Queue to execute in.
magma_int_t magma_smgesellpmv ( magma_trans_t  transA,
magma_int_t  m,
magma_int_t  n,
magma_int_t  num_vecs,
magma_int_t  blocksize,
magma_int_t  slices,
magma_int_t  alignment,
float  alpha,
magmaFloat_ptr  dval,
magmaIndex_ptr  dcolind,
magmaIndex_ptr  drowptr,
magmaFloat_ptr  dx,
float  beta,
magmaFloat_ptr  dy,
magma_queue_t  queue 
)

This routine computes Y = alpha * A^t * X + beta * Y on the GPU.

Input format is SELLP. Note that the input format for X is row-major while the output format for Y is column-major.

Parameters
[in]  transA  magma_trans_t  transposition parameter for A
[in]  m  magma_int_t  number of rows in A
[in]  n  magma_int_t  number of columns in A
[in]  num_vecs  magma_int_t  number of columns in X and Y
[in]  blocksize  magma_int_t  number of rows in one ELL-slice
[in]  slices  magma_int_t  number of slices in matrix
[in]  alignment  magma_int_t  number of threads assigned to one row
[in]  alpha  float  scalar multiplier
[in]  dval  magmaFloat_ptr  array containing values of A in SELLP
[in]  dcolind  magmaIndex_ptr  column indices of A in SELLP
[in]  drowptr  magmaIndex_ptr  row pointer of SELLP
[in]  dx  magmaFloat_ptr  input vectors X
[in]  beta  float  scalar multiplier
[in,out]  dy  magmaFloat_ptr  input/output vectors Y
[in]  queue  magma_queue_t  Queue to execute in.