double-complex precision
[GPU kernels for sparse LA]

Functions

magma_int_t magma_zbajac_csr (magma_int_t localiters, magma_z_matrix D, magma_z_matrix R, magma_z_matrix b, magma_z_matrix *x, magma_queue_t queue)
 This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block.
magma_int_t magma_zbcsrblockinfo5 (magma_int_t lustep, magma_int_t num_blocks, magma_int_t c_blocks, magma_int_t size_b, magma_index_t *blockinfo, magmaDoubleComplex_ptr val, magmaDoubleComplex_ptr *AII, magma_queue_t queue)
 For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros.
magma_int_t magma_zbcsrvalcpy (magma_int_t size_b, magma_int_t num_blocks, magma_int_t num_zblocks, magmaDoubleComplex_ptr *Aval, magmaDoubleComplex_ptr *Bval, magmaDoubleComplex_ptr *Bval2, magma_queue_t queue)
 For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros.
magma_int_t magma_zbcsrluegemm (magma_int_t size_b, magma_int_t num_brows, magma_int_t kblocks, magmaDoubleComplex_ptr *dA, magmaDoubleComplex_ptr *dB, magmaDoubleComplex_ptr *dC, magma_queue_t queue)
 For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix.
magma_int_t magma_zbcsrlupivloc (magma_int_t size_b, magma_int_t kblocks, magmaDoubleComplex_ptr *dA, magmaInt_ptr ipiv, magma_queue_t queue)
 For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix.
magma_int_t magma_zbcsrswp (magma_int_t r_blocks, magma_int_t size_b, magmaInt_ptr ipiv, magmaDoubleComplex_ptr x, magma_queue_t queue)
 For a Block-CSR ILU factorization, this routine swaps rows in the vector *x according to the pivoting in *ipiv.
magma_int_t magma_zbcsrtrsv (magma_uplo_t uplo, magma_int_t r_blocks, magma_int_t c_blocks, magma_int_t size_b, magmaDoubleComplex_ptr A, magma_index_t *blockinfo, magmaDoubleComplex_ptr x, magma_queue_t queue)
 For a Block-CSR ILU factorization, this routine performs the triangular solves.
magma_int_t magma_zcompact (magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDouble_ptr dnorms, double tol, magmaInt_ptr active, magmaInt_ptr cBlock, magma_queue_t queue)
 ZCOMPACT takes a set of n vectors of size m (in dA) and their norms and compacts them into the cBlock size<=n vectors that have norms > tol.
magma_int_t magma_ziteric_csr (magma_z_matrix A, magma_z_matrix A_CSR, magma_queue_t queue)
 This routine iteratively computes an incomplete Cholesky factorization.
magma_int_t magma_ziterilu_csr (magma_z_matrix A, magma_z_matrix L, magma_z_matrix U, magma_queue_t queue)
 This routine iteratively computes an incomplete LU factorization.
magma_int_t magma_zjacobisetup_vector_gpu (int num_rows, magma_z_matrix b, magma_z_matrix d, magma_z_matrix c, magma_z_matrix *x, magma_queue_t queue)
 Prepares the Jacobi Iteration according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k x^(k+1) = c - M * x^k.
magma_int_t magma_zlobpcg_maxpy (magma_int_t num_rows, magma_int_t num_vecs, magmaDoubleComplex_ptr X, magmaDoubleComplex_ptr Y, magma_queue_t queue)
 This routine computes a axpy for a mxn matrix:.
int magma_zbicgmerge1 (int n, magmaDoubleComplex_ptr skp, magmaDoubleComplex_ptr v, magmaDoubleComplex_ptr r, magmaDoubleComplex_ptr p)
 Mergels multiple operations into one kernel:.
int magma_zbicgmerge2 (int n, magmaDoubleComplex_ptr skp, magmaDoubleComplex_ptr r, magmaDoubleComplex_ptr v, magmaDoubleComplex_ptr s)
 Mergels multiple operations into one kernel:.
int magma_zbicgmerge3 (int n, magmaDoubleComplex_ptr skp, magmaDoubleComplex_ptr p, magmaDoubleComplex_ptr s, magmaDoubleComplex_ptr t, magmaDoubleComplex_ptr x, magmaDoubleComplex_ptr r)
 Mergels multiple operations into one kernel:.
int magma_zbicgmerge4 (int type, magmaDoubleComplex_ptr skp)
 Performs some parameter operations for the BiCGSTAB with scalars on GPU.
magma_int_t magma_zbicgmerge_spmv1 (magma_z_matrix A, magmaDoubleComplex_ptr d1, magmaDoubleComplex_ptr d2, magmaDoubleComplex_ptr dp, magmaDoubleComplex_ptr dr, magmaDoubleComplex_ptr dv, magmaDoubleComplex_ptr skp, magma_queue_t queue)
 Merges the first SpmV using CSR with the dot product and the computation of alpha.
magma_int_t magma_zbicgmerge_spmv2 (magma_z_matrix A, magmaDoubleComplex_ptr d1, magmaDoubleComplex_ptr d2, magmaDoubleComplex_ptr ds, magmaDoubleComplex_ptr dt, magmaDoubleComplex_ptr skp, magma_queue_t queue)
 Merges the second SpmV using CSR with the dot product and the computation of omega.
magma_int_t magma_zbicgmerge_xrbeta (int n, magmaDoubleComplex_ptr d1, magmaDoubleComplex_ptr d2, magmaDoubleComplex_ptr rr, magmaDoubleComplex_ptr r, magmaDoubleComplex_ptr p, magmaDoubleComplex_ptr s, magmaDoubleComplex_ptr t, magmaDoubleComplex_ptr x, magmaDoubleComplex_ptr skp, magma_queue_t queue)
 Merges the second SpmV using CSR with the dot product and the computation of omega.
magma_int_t magma_zcgmerge_spmv1 (magma_z_matrix A, magmaDoubleComplex_ptr d1, magmaDoubleComplex_ptr d2, magmaDoubleComplex_ptr dd, magmaDoubleComplex_ptr dz, magmaDoubleComplex_ptr skp, magma_queue_t queue)
 Merges the first SpmV using different formats with the dot product and the computation of rho.

Function Documentation

magma_int_t magma_zbajac_csr ( magma_int_t  localiters,
magma_z_matrix  D,
magma_z_matrix  R,
magma_z_matrix  b,
magma_z_matrix *  x,
magma_queue_t  queue 
)

This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block.

Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.

Parameters:
[in] localiters magma_int_t number of local Jacobi-like updates
[in] D magma_z_matrix input matrix with diagonal blocks
[in] R magma_z_matrix input matrix with non-diagonal parts
[in] b magma_z_matrix RHS
[in] x magma_z_matrix* iterate/solution
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zbcsrblockinfo5 ( magma_int_t  lustep,
magma_int_t  num_blocks,
magma_int_t  c_blocks,
magma_int_t  size_b,
magma_index_t *  blockinfo,
magmaDoubleComplex_ptr  val,
magmaDoubleComplex_ptr *  AII,
magma_queue_t  queue 
)

For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros.

Parameters:
[in] lustep magma_int_t lustep
[in] num_blocks magma_int_t number of nonzero blocks
[in] c_blocks magma_int_t number of column-blocks
[in] size_b magma_int_t blocksize
[in] blockinfo magma_int_t* block filled? location?
[in] val magmaDoubleComplex* pointers to the nonzero blocks in A
[in] AII magmaDoubleComplex** pointers to the respective nonzero blocks in B
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zbcsrluegemm ( magma_int_t  size_b,
magma_int_t  num_brows,
magma_int_t  kblocks,
magmaDoubleComplex_ptr *  dA,
magmaDoubleComplex_ptr *  dB,
magmaDoubleComplex_ptr *  dC,
magma_queue_t  queue 
)

For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix.

Parameters:
[in] size_b magma_int_t blocksize in BCSR
[in] num_brows magma_int_t number of block rows
[in] kblocks magma_int_t number of blocks in row
[in] dA magmaDoubleComplex** input blocks of matrix A
[in] dB magmaDoubleComplex** input blocks of matrix B
[in] dC magmaDoubleComplex** output blocks of matrix C
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zbcsrlupivloc ( magma_int_t  size_b,
magma_int_t  kblocks,
magmaDoubleComplex_ptr *  dA,
magmaInt_ptr  ipiv,
magma_queue_t  queue 
)

For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix.

Parameters:
[in] size_b magma_int_t blocksize in BCSR
[in] kblocks magma_int_t number of blocks
[in] dA magmaDoubleComplex_ptr * matrix in BCSR
[in] ipiv magmaInt_ptr array containing pivots
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zbcsrswp ( magma_int_t  r_blocks,
magma_int_t  size_b,
magmaInt_ptr  ipiv,
magmaDoubleComplex_ptr  x,
magma_queue_t  queue 
)

For a Block-CSR ILU factorization, this routine swaps rows in the vector *x according to the pivoting in *ipiv.

Parameters:
[in] r_blocks magma_int_t number of blocks
[in] size_b magma_int_t blocksize in BCSR
[in] ipiv magma_int_t* array containing pivots
[in] x magmaDoubleComplex_ptr input/output vector x
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zbcsrtrsv ( magma_uplo_t  uplo,
magma_int_t  r_blocks,
magma_int_t  c_blocks,
magma_int_t  size_b,
magmaDoubleComplex_ptr  A,
magma_index_t *  blockinfo,
magmaDoubleComplex_ptr  x,
magma_queue_t  queue 
)

For a Block-CSR ILU factorization, this routine performs the triangular solves.

Parameters:
[in] uplo magma_uplo_t upper/lower fill structure
[in] r_blocks magma_int_t number of blocks in row
[in] c_blocks magma_int_t number of blocks in column
[in] size_b magma_int_t blocksize in BCSR
[in] A magmaDoubleComplex_ptr upper/lower factor
[in] blockinfo magma_int_t* array containing matrix information
[in] x magmaDoubleComplex_ptr input/output vector x
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zbcsrvalcpy ( magma_int_t  size_b,
magma_int_t  num_blocks,
magma_int_t  num_zblocks,
magmaDoubleComplex_ptr *  Aval,
magmaDoubleComplex_ptr *  Bval,
magmaDoubleComplex_ptr *  Bval2,
magma_queue_t  queue 
)

For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros.

Parameters:
[in] size_b magma_int_t blocksize in BCSR
[in] num_blocks magma_int_t number of nonzero blocks
[in] num_zblocks magma_int_t number of zero-blocks (will later be filled)
[in] Aval magmaDoubleComplex_ptr * pointers to the nonzero blocks in A
[in] Bval magmaDoubleComplex_ptr * pointers to the nonzero blocks in B
[in] Bval2 magmaDoubleComplex_ptr * pointers to the zero blocks in B
[in] queue magma_queue_t Queue to execute in.
int magma_zbicgmerge1 ( int  n,
magmaDoubleComplex_ptr  skp,
magmaDoubleComplex_ptr  v,
magmaDoubleComplex_ptr  r,
magmaDoubleComplex_ptr  p 
)

Mergels multiple operations into one kernel:.

p = beta*p p = p-omega*beta*v p = p+r

-> p = r + beta * ( p - omega * v )

Parameters:
[in] n int dimension n
[in] skp magmaDoubleComplex_ptr set of scalar parameters
[in] v magmaDoubleComplex_ptr input v
[in] r magmaDoubleComplex_ptr input r
in/out] p magmaDoubleComplex_ptr input/output p
[in] queue magma_queue_t Queue to execute in.
int magma_zbicgmerge2 ( int  n,
magmaDoubleComplex_ptr  skp,
magmaDoubleComplex_ptr  r,
magmaDoubleComplex_ptr  v,
magmaDoubleComplex_ptr  s 
)

Mergels multiple operations into one kernel:.

s=r s=s-alpha*v

-> s = r - alpha * v

Parameters:
[in] n int dimension n
[in] skp magmaDoubleComplex_ptr set of scalar parameters
[in] r magmaDoubleComplex_ptr input r
[in] v magmaDoubleComplex_ptr input v
s] s magmaDoubleComplex_ptr output s
[in] queue magma_queue_t Queue to execute in.
int magma_zbicgmerge3 ( int  n,
magmaDoubleComplex_ptr  skp,
magmaDoubleComplex_ptr  p,
magmaDoubleComplex_ptr  s,
magmaDoubleComplex_ptr  t,
magmaDoubleComplex_ptr  x,
magmaDoubleComplex_ptr  r 
)

Mergels multiple operations into one kernel:.

x=x+alpha*p x=x+omega*s r=s r=r-omega*t

-> x = x + alpha * p + omega * s -> r = s - omega * t

Parameters:
[in] n int dimension n
[in] skp magmaDoubleComplex_ptr set of scalar parameters
[in] p magmaDoubleComplex_ptr input p
[in] s magmaDoubleComplex_ptr input s
[in] t magmaDoubleComplex_ptr input t
in/out] x magmaDoubleComplex_ptr input/output x
in/out] r magmaDoubleComplex_ptr input/output r
[in] queue magma_queue_t Queue to execute in.
int magma_zbicgmerge4 ( int  type,
magmaDoubleComplex_ptr  skp 
)

Performs some parameter operations for the BiCGSTAB with scalars on GPU.

Parameters:
[in] type int kernel type
in/out] skp magmaDoubleComplex_ptr vector with parameters
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zbicgmerge_spmv1 ( magma_z_matrix  A,
magmaDoubleComplex_ptr  d1,
magmaDoubleComplex_ptr  d2,
magmaDoubleComplex_ptr  dp,
magmaDoubleComplex_ptr  dr,
magmaDoubleComplex_ptr  dv,
magmaDoubleComplex_ptr  skp,
magma_queue_t  queue 
)

Merges the first SpmV using CSR with the dot product and the computation of alpha.

Parameters:
[in] A magma_z_matrix system matrix
[in] d1 magmaDoubleComplex_ptr temporary vector
[in] d2 magmaDoubleComplex_ptr temporary vector
[in] dp magmaDoubleComplex_ptr input vector p
[in] dr magmaDoubleComplex_ptr input vector r
[in] dv magmaDoubleComplex_ptr output vector v
in/out] skp magmaDoubleComplex_ptr array for parameters ( skp[0]=alpha )
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zbicgmerge_spmv2 ( magma_z_matrix  A,
magmaDoubleComplex_ptr  d1,
magmaDoubleComplex_ptr  d2,
magmaDoubleComplex_ptr  ds,
magmaDoubleComplex_ptr  dt,
magmaDoubleComplex_ptr  skp,
magma_queue_t  queue 
)

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters:
[in] A magma_z_matrix input matrix
[in] d1 magmaDoubleComplex_ptr temporary vector
[in] d2 magmaDoubleComplex_ptr temporary vector
[in] ds magmaDoubleComplex_ptr input vector s
[in] dt magmaDoubleComplex_ptr output vector t
in/out] skp magmaDoubleComplex_ptr array for parameters
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zbicgmerge_xrbeta ( int  n,
magmaDoubleComplex_ptr  d1,
magmaDoubleComplex_ptr  d2,
magmaDoubleComplex_ptr  rr,
magmaDoubleComplex_ptr  r,
magmaDoubleComplex_ptr  p,
magmaDoubleComplex_ptr  s,
magmaDoubleComplex_ptr  t,
magmaDoubleComplex_ptr  x,
magmaDoubleComplex_ptr  skp,
magma_queue_t  queue 
)

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters:
[in] n int dimension n
[in] d1 magmaDoubleComplex_ptr temporary vector
[in] d2 magmaDoubleComplex_ptr temporary vector
[in] rr magmaDoubleComplex_ptr input vector rr
[in] r magmaDoubleComplex_ptr input/output vector r
[in] p magmaDoubleComplex_ptr input vector p
[in] s magmaDoubleComplex_ptr input vector s
[in] t magmaDoubleComplex_ptr input vector t
[out] x magmaDoubleComplex_ptr output vector x
[in] skp magmaDoubleComplex_ptr array for parameters
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zcgmerge_spmv1 ( magma_z_matrix  A,
magmaDoubleComplex_ptr  d1,
magmaDoubleComplex_ptr  d2,
magmaDoubleComplex_ptr  dd,
magmaDoubleComplex_ptr  dz,
magmaDoubleComplex_ptr  skp,
magma_queue_t  queue 
)

Merges the first SpmV using different formats with the dot product and the computation of rho.

Parameters:
[in] A magma_z_matrix input matrix
[in] d1 magmaDoubleComplex_ptr temporary vector
[in] d2 magmaDoubleComplex_ptr temporary vector
[in] dd magmaDoubleComplex_ptr input vector d
[out] dz magmaDoubleComplex_ptr input vector z
[out] skp magmaDoubleComplex_ptr array for parameters ( skp[3]=rho )
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zcompact ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex_ptr  dA,
magma_int_t  ldda,
magmaDouble_ptr  dnorms,
double  tol,
magmaInt_ptr  active,
magmaInt_ptr  cBlock,
magma_queue_t  queue 
)

ZCOMPACT takes a set of n vectors of size m (in dA) and their norms and compacts them into the cBlock size<=n vectors that have norms > tol.

The active mask array has 1 or 0, showing if a vector remained or not in the compacted resulting set of vectors.

Parameters:
[in] m INTEGER The number of rows of the matrix dA. M >= 0.
[in] n INTEGER The number of columns of the matrix dA. N >= 0.
[in] in,out] dA COMPLEX DOUBLE PRECISION array, dimension (LDDA,N) The m by n matrix dA.
[in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M).
[in] dnorms DOUBLE PRECISION array, dimension N The norms of the N vectors in dA
[in] tol DOUBLE PRECISON The tolerance value used in the criteria to compact or not.
[in] out] active INTEGER array, dimension N A mask of 1s and 0s showing if a vector remains or has been removed
[in] out] cBlock magmaInt_ptr The number of vectors that remain in dA (i.e., with norms > tol).
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_ziteric_csr ( magma_z_matrix  A,
magma_z_matrix  A_CSR,
magma_queue_t  queue 
)

This routine iteratively computes an incomplete Cholesky factorization.

The idea is according to Edmond Chow's presentation at SIAM 2014. This routine was used in the ISC 2015 paper: E. Chow et al.: 'Study of an Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs'

The input format of the initial guess matrix A is Magma_CSRCOO, A_CSR is CSR or CSRCOO format.

Parameters:
[in] A magma_z_matrix input matrix A - initial guess (lower triangular)
[in] out] A_CSR magma_z_matrix input/output matrix containing the IC approximation
[in] A_CSR magma_z_matrix input/output matrix containing the IC approximation
magma_int_t magma_ziterilu_csr ( magma_z_matrix  A,
magma_z_matrix  L,
magma_z_matrix  U,
magma_queue_t  queue 
)

This routine iteratively computes an incomplete LU factorization.

The idea is according to Edmond Chow's presentation at SIAM 2014. This routine was used in the ISC 2015 paper: E. Chow et al.: 'Study of an Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs'

The input format of the matrix is Magma_CSRCOO for the upper and lower triangular parts. Note however, that we flip col and rowidx for the U-part. Every component of L and U is handled by one thread.

Parameters:
[in] A magma_z_matrix input matrix A determing initial guess & processing order
[in] out] L magma_z_matrix input/output matrix L containing the ILU approximation
[in] out] U magma_z_matrix input/output matrix U containing the ILU approximation
[in] A_CSR magma_z_matrix input/output matrix containing the IC approximation
magma_int_t magma_zjacobisetup_vector_gpu ( int  num_rows,
magma_z_matrix  b,
magma_z_matrix  d,
magma_z_matrix  c,
magma_z_matrix *  x,
magma_queue_t  queue 
)

Prepares the Jacobi Iteration according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k x^(k+1) = c - M * x^k.

Returns the vector c. It calls a GPU kernel

Parameters:
[in] num_rows magma_int_t number of rows
[in] b magma_z_matrix RHS b
[in] d magma_z_matrix vector with diagonal entries
[out] c magma_z_matrix* c = D^(-1) * b
[out] x magma_z_matrix* iteration vector
[in] queue magma_queue_t Queue to execute in.
magma_int_t magma_zlobpcg_maxpy ( magma_int_t  num_rows,
magma_int_t  num_vecs,
magmaDoubleComplex_ptr  X,
magmaDoubleComplex_ptr  Y,
magma_queue_t  queue 
)

This routine computes a axpy for a mxn matrix:.

Y = X + Y

It replaces: magma_zaxpy(m*n, c_one, Y, 1, X, 1);

/ x1[0] x2[0] x3[0] \ | x1[1] x2[1] x3[1] | X = | x1[2] x2[2] x3[2] | = x1[0] x1[1] x1[2] x1[3] x1[4] x2[0] x2[1] . | x1[3] x2[3] x3[3] | \ x1[4] x2[4] x3[4] /

Parameters:
[in] num_rows magma_int_t number of rows
[in] num_vecs magma_int_t number of vectors
[in] X magmaDoubleComplex_ptr input vector X
in/out] Y magmaDoubleComplex_ptr input/output vector Y
[in] queue magma_queue_t Queue to execute in.

Generated on 3 May 2015 for MAGMA by  doxygen 1.6.1