MAGMA  1.5.0
Matrix Algebra for GPU and Multicore Architectures
 All Functions Groups

Functions

magma_int_t magma_cailu_csr_a (magma_c_sparse_matrix A, magma_c_sparse_matrix L, magma_c_sparse_matrix U)
 This routine computes the ILU approximation of a matrix iteratively. More...
 
magma_int_t magma_cailu_csr_s (magma_c_sparse_matrix A_L, magma_c_sparse_matrix A_U, magma_c_sparse_matrix L, magma_c_sparse_matrix U)
 This routine computes the ILU approximation of a matrix iteratively. More...
 
magma_int_t magma_cbajac_csr (magma_int_t localiters, magma_c_sparse_matrix D, magma_c_sparse_matrix R, magma_c_vector b, magma_c_vector *x)
 This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block. More...
 
magma_int_t magma_cbcsrvalcpy (magma_int_t size_b, magma_int_t num_blocks, magma_int_t num_zblocks, magmaFloatComplex **Aval, magmaFloatComplex **Bval, magmaFloatComplex **Bval2)
 For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros. More...
 
magma_int_t magma_cbcsrluegemm (magma_int_t size_b, magma_int_t num_brows, magma_int_t kblocks, magmaFloatComplex **dA, magmaFloatComplex **dB, magmaFloatComplex **dC)
 For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix. More...
 
magma_int_t magma_cbcsrlupivloc (magma_int_t size_b, magma_int_t kblocks, magmaFloatComplex **dA, magma_int_t *ipiv)
 For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix. More...
 
magma_int_t magma_cbcsrswp (magma_int_t r_blocks, magma_int_t size_b, magma_int_t *ipiv, magmaFloatComplex *x)
 For a Block-CSR ILU factorization, this routine swaps rows in the vector *x according to the pivoting in *ipiv. More...
 
magma_int_t magma_cbcsrtrsv (magma_uplo_t uplo, magma_int_t r_blocks, magma_int_t c_blocks, magma_int_t size_b, magmaFloatComplex *A, magma_index_t *blockinfo, magmaFloatComplex *x)
 For a Block-CSR ILU factorization, this routine performs the triangular solves. More...
 
void magma_ccompact (magma_int_t m, magma_int_t n, magmaFloatComplex *dA, magma_int_t ldda, float *dnorms, float tol, magma_index_t *active, magma_index_t *cBlock)
 ZCOMPACT takes a set of n vectors of size m (in dA) and their norms and compacts them into the cBlock size<=n vectors that have norms > tol. More...
 
magma_int_t magma_cjacobisetup_vector_gpu (int num_rows, magmaFloatComplex *b, magmaFloatComplex *d, magmaFloatComplex *c)
 Prepares the Jacobi Iteration according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k x^(k+1) = c - M * x^k. More...
 
magma_int_t magma_clobpcg_maxpy (magma_int_t num_rows, magma_int_t num_vecs, magmaFloatComplex *X, magmaFloatComplex *Y)
 This routine computes a axpy for a mxn matrix: More...
 
int magma_cbicgmerge1 (int n, magmaFloatComplex *skp, magmaFloatComplex *v, magmaFloatComplex *r, magmaFloatComplex *p)
 Mergels multiple operations into one kernel: More...
 
int magma_cbicgmerge2 (int n, magmaFloatComplex *skp, magmaFloatComplex *r, magmaFloatComplex *v, magmaFloatComplex *s)
 Mergels multiple operations into one kernel: More...
 
int magma_cbicgmerge3 (int n, magmaFloatComplex *skp, magmaFloatComplex *p, magmaFloatComplex *s, magmaFloatComplex *t, magmaFloatComplex *x, magmaFloatComplex *r)
 Mergels multiple operations into one kernel: More...
 
int magma_cbicgmerge4 (int type, magmaFloatComplex *skp)
 Performs some parameter operations for the BiCGSTAB with scalars on GPU. More...
 
magma_int_t magma_cbicgmerge_spmv1 (magma_c_sparse_matrix A, magmaFloatComplex *d1, magmaFloatComplex *d2, magmaFloatComplex *d_p, magmaFloatComplex *d_r, magmaFloatComplex *d_v, magmaFloatComplex *skp)
 Merges the first SpmV using CSR with the dot product and the computation of alpha. More...
 
magma_int_t magma_cbicgmerge_spmv2 (magma_c_sparse_matrix A, magmaFloatComplex *d1, magmaFloatComplex *d2, magmaFloatComplex *d_s, magmaFloatComplex *d_t, magmaFloatComplex *skp)
 Merges the second SpmV using CSR with the dot product and the computation of omega. More...
 
magma_int_t magma_cbicgmerge_xrbeta (int n, magmaFloatComplex *d1, magmaFloatComplex *d2, magmaFloatComplex *rr, magmaFloatComplex *r, magmaFloatComplex *p, magmaFloatComplex *s, magmaFloatComplex *t, magmaFloatComplex *x, magmaFloatComplex *skp)
 Merges the second SpmV using CSR with the dot product and the computation of omega. More...
 

Detailed Description

Function Documentation

magma_int_t magma_cailu_csr_a ( magma_c_sparse_matrix  A,
magma_c_sparse_matrix  L,
magma_c_sparse_matrix  U 
)

This routine computes the ILU approximation of a matrix iteratively.

The idea is according to Edmond Chow's presentation at SIAM 2014. The input format of the matrix is Magma_CSRCOO for the upper and lower triangular parts. Note however, that we flip col and rowidx for the U-part. Every component of L and U is handled by one thread.

Parameters
A_Lmagma_c_sparse_matrix input matrix L
A_Umagma_c_sparse_matrix input matrix U
Lmagma_c_sparse_matrix input/output matrix L containing the ILU approximation
Umagma_c_sparse_matrix input/output matrix U containing the ILU approximation
magma_int_t magma_cailu_csr_s ( magma_c_sparse_matrix  A_L,
magma_c_sparse_matrix  A_U,
magma_c_sparse_matrix  L,
magma_c_sparse_matrix  U 
)

This routine computes the ILU approximation of a matrix iteratively.

The idea is according to Edmond Chow's presentation at SIAM 2014. The input format of the matrix is Magma_CSRCOO for the upper and lower triangular parts. Note however, that we flip col and rowidx for the U-part. Every component of L and U is handled by one thread.

Parameters
A_Lmagma_c_sparse_matrix input matrix L
A_Umagma_c_sparse_matrix input matrix U
Lmagma_c_sparse_matrix input/output matrix L containing the ILU approximation
Umagma_c_sparse_matrix input/output matrix U containing the ILU approximation
magma_int_t magma_cbajac_csr ( magma_int_t  localiters,
magma_c_sparse_matrix  D,
magma_c_sparse_matrix  R,
magma_c_vector  b,
magma_c_vector *  x 
)

This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block.

Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.

Parameters
localitersmagma_int_t number of local Jacobi-like updates
Dmagma_c_sparse_matrix input matrix with diagonal blocks
Rmagma_c_sparse_matrix input matrix with non-diagonal parts
bmagma_c_vector RHS
xmagma_c_vector* iterate/solution
magma_int_t magma_cbcsrluegemm ( magma_int_t  size_b,
magma_int_t  num_brows,
magma_int_t  kblocks,
magmaFloatComplex **  dA,
magmaFloatComplex **  dB,
magmaFloatComplex **  dC 
)

For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix.

Parameters
size_bmagma_int_t blocksize in BCSR
num_browsmagma_int_t number of block rows
kblocksmagma_int_t number of blocks in row
dAmagmaFloatComplex** input blocks of matrix A
dBmagmaFloatComplex** input blocks of matrix B
dCmagmaFloatComplex** output blocks of matrix C
magma_int_t magma_cbcsrlupivloc ( magma_int_t  size_b,
magma_int_t  kblocks,
magmaFloatComplex **  dA,
magma_int_t *  ipiv 
)

For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix.

Parameters
size_bmagma_int_t blocksize in BCSR
kblocksmagma_int_t number of blocks
dAmagmaFloatComplex** matrix in BCSR
ipivmagma_int_t* array containing pivots
magma_int_t magma_cbcsrswp ( magma_int_t  r_blocks,
magma_int_t  size_b,
magma_int_t *  ipiv,
magmaFloatComplex *  x 
)

For a Block-CSR ILU factorization, this routine swaps rows in the vector *x according to the pivoting in *ipiv.

Parameters
r_blocksmagma_int_t number of blocks
size_bmagma_int_t blocksize in BCSR
ipivmagma_int_t* array containing pivots
xmagmaFloatComplex* input/output vector x
magma_int_t magma_cbcsrtrsv ( magma_uplo_t  uplo,
magma_int_t  r_blocks,
magma_int_t  c_blocks,
magma_int_t  size_b,
magmaFloatComplex *  A,
magma_index_t *  blockinfo,
magmaFloatComplex *  x 
)

For a Block-CSR ILU factorization, this routine performs the triangular solves.

Parameters
uplomagma_uplo_t upper/lower fill structure
r_blocksmagma_int_t number of blocks in row
c_blocksmagma_int_t number of blocks in column
size_bmagma_int_t blocksize in BCSR
AmagmaFloatComplex* upper/lower factor
blockinfomagma_int_t* array containing matrix information
xmagmaFloatComplex* input/output vector x
magma_int_t magma_cbcsrvalcpy ( magma_int_t  size_b,
magma_int_t  num_blocks,
magma_int_t  num_zblocks,
magmaFloatComplex **  Aval,
magmaFloatComplex **  Bval,
magmaFloatComplex **  Bval2 
)

For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros.

Parameters
size_bmagma_int_t blocksize in BCSR
num_blocksmagma_int_t number of nonzero blocks
num_zblocksmagma_int_t number of zero-blocks (will later be filled)
AvalmagmaFloatComplex** pointers to the nonzero blocks in A
BvalmagmaFloatComplex** pointers to the nonzero blocks in B
Bval2magmaFloatComplex** pointers to the zero blocks in B
int magma_cbicgmerge1 ( int  n,
magmaFloatComplex *  skp,
magmaFloatComplex *  v,
magmaFloatComplex *  r,
magmaFloatComplex *  p 
)

Mergels multiple operations into one kernel:

p = beta*p p = p-omega*beta*v p = p+r

-> p = r + beta * ( p - omega * v )

Parameters
nint dimension n
skpmagmaFloatComplex* set of scalar parameters
vmagmaFloatComplex* input v
rmagmaFloatComplex* input r
pmagmaFloatComplex* input/output p
int magma_cbicgmerge2 ( int  n,
magmaFloatComplex *  skp,
magmaFloatComplex *  r,
magmaFloatComplex *  v,
magmaFloatComplex *  s 
)

Mergels multiple operations into one kernel:

s=r s=s-alpha*v

-> s = r - alpha * v

Parameters
nint dimension n
skpmagmaFloatComplex* set of scalar parameters
rmagmaFloatComplex* input r
vmagmaFloatComplex* input v
smagmaFloatComplex* input/output s
int magma_cbicgmerge3 ( int  n,
magmaFloatComplex *  skp,
magmaFloatComplex *  p,
magmaFloatComplex *  s,
magmaFloatComplex *  t,
magmaFloatComplex *  x,
magmaFloatComplex *  r 
)

Mergels multiple operations into one kernel:

x=x+alpha*p x=x+omega*s r=s r=r-omega*t

-> x = x + alpha * p + omega * s -> r = s - omega * t

Parameters
nint dimension n
skpmagmaFloatComplex* set of scalar parameters
pmagmaFloatComplex* input p
smagmaFloatComplex* input s
tmagmaFloatComplex* input t
xmagmaFloatComplex* input/output x
rmagmaFloatComplex* input/output r
int magma_cbicgmerge4 ( int  type,
magmaFloatComplex *  skp 
)

Performs some parameter operations for the BiCGSTAB with scalars on GPU.

Parameters
typeint kernel type
skpmagmaFloatComplex* vector with parameters
magma_int_t magma_cbicgmerge_spmv1 ( magma_c_sparse_matrix  A,
magmaFloatComplex *  d1,
magmaFloatComplex *  d2,
magmaFloatComplex *  d_p,
magmaFloatComplex *  d_r,
magmaFloatComplex *  d_v,
magmaFloatComplex *  skp 
)

Merges the first SpmV using CSR with the dot product and the computation of alpha.

Parameters
Amagma_c_sparse_matrix system matrix
d1magmaFloatComplex* temporary vector
d2magmaFloatComplex* temporary vector
d_pmagmaFloatComplex* input vector p
d_rmagmaFloatComplex* input vector r
d_vmagmaFloatComplex* output vector v
skpmagmaFloatComplex* array for parameters ( skp[0]=alpha )
magma_int_t magma_cbicgmerge_spmv2 ( magma_c_sparse_matrix  A,
magmaFloatComplex *  d1,
magmaFloatComplex *  d2,
magmaFloatComplex *  d_s,
magmaFloatComplex *  d_t,
magmaFloatComplex *  skp 
)

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters
Amagma_c_sparse_matrix input matrix
d1magmaFloatComplex* temporary vector
d2magmaFloatComplex* temporary vector
d_smagmaFloatComplex* input vector s
d_tmagmaFloatComplex* output vector t
skpmagmaFloatComplex* array for parameters
magma_int_t magma_cbicgmerge_xrbeta ( int  n,
magmaFloatComplex *  d1,
magmaFloatComplex *  d2,
magmaFloatComplex *  rr,
magmaFloatComplex *  r,
magmaFloatComplex *  p,
magmaFloatComplex *  s,
magmaFloatComplex *  t,
magmaFloatComplex *  x,
magmaFloatComplex *  skp 
)

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters
nint dimension n
d1magmaFloatComplex* temporary vector
d2magmaFloatComplex* temporary vector
rrmagmaFloatComplex* input vector rr
rmagmaFloatComplex* input/output vector r
pmagmaFloatComplex* input vector p
smagmaFloatComplex* input vector s
tmagmaFloatComplex* input vector t
xmagmaFloatComplex* output vector x
skpmagmaFloatComplex* array for parameters
void magma_ccompact ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex *  dA,
magma_int_t  ldda,
float *  dnorms,
float  tol,
magma_index_t *  active,
magma_index_t *  cBlock 
)

ZCOMPACT takes a set of n vectors of size m (in dA) and their norms and compacts them into the cBlock size<=n vectors that have norms > tol.

The active mask array has 1 or 0, showing if a vector remained or not in the compacted resulting set of vectors.

Parameters
[in]mINTEGER The number of rows of the matrix dA. M >= 0.
[in]nINTEGER The number of columns of the matrix dA. N >= 0.
[in,out]dACOMPLEX REAL array, dimension (LDDA,N) The m by n matrix dA.
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M).
[in]dnormsREAL array, dimension N The norms of the N vectors in dA
[in]tolDOUBLE PRECISON The tolerance value used in the criteria to compact or not.
[out]activeINTEGER array, dimension N A mask of 1s and 0s showing if a vector remains or has been removed
[out]cBlockmagma_index_t* The number of vectors that remain in dA (i.e., with norms > tol).
magma_int_t magma_cjacobisetup_vector_gpu ( int  num_rows,
magmaFloatComplex *  b,
magmaFloatComplex *  d,
magmaFloatComplex *  c 
)

Prepares the Jacobi Iteration according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k x^(k+1) = c - M * x^k.

Returns the vector c. It calls a GPU kernel

Parameters
num_rowsmagma_int_t number of rows
bmagma_c_vector RHS b
dmagma_c_vector vector with diagonal entries
cmagma_c_vector* c = D^(-1) * b
magma_int_t magma_clobpcg_maxpy ( magma_int_t  num_rows,
magma_int_t  num_vecs,
magmaFloatComplex *  X,
magmaFloatComplex *  Y 
)

This routine computes a axpy for a mxn matrix:

Y = X + Y

It replaces: magma_caxpy(m*n, c_one, Y, 1, X, 1);

/ x1[0] x2[0] x3[0] \
| x1[1] x2[1] x3[1] |

X = | x1[2] x2[2] x3[2] | = x1[0] x1[1] x1[2] x1[3] x1[4] x2[0] x2[1] . | x1[3] x2[3] x3[3] | \ x1[4] x2[4] x3[4] /

Parameters
num_rowsmagma_int_t number of rows
num_vecsmagma_int_t number of vectors
XmagmaFloatComplex* input vector X
YmagmaFloatComplex* input/output vector Y