MAGMA  2.7.1
Matrix Algebra for GPU and Multicore Architectures

Functions

magma_int_t magma_cbajac_csr (magma_int_t localiters, magma_c_matrix D, magma_c_matrix R, magma_c_matrix b, magma_c_matrix *x, magma_queue_t queue)
 This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block. More...
 
magma_int_t magma_cbajac_csr_overlap (magma_int_t localiters, magma_int_t matrices, magma_int_t overlap, magma_c_matrix *D, magma_c_matrix *R, magma_c_matrix b, magma_c_matrix *x, magma_queue_t queue)
 This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block. More...
 
magma_int_t magma_ccompact (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloat_ptr dnorms, float tol, magmaInt_ptr active, magmaInt_ptr cBlock, magma_queue_t queue)
 CCOMPACT takes a set of n vectors of size m (in dA) together with their norms and compacts them into the cBlock (<= n) vectors that have norms > tol. More...
 
magma_int_t magma_cjacobisetup_vector_gpu (magma_int_t num_rows, magma_c_matrix b, magma_c_matrix d, magma_c_matrix c, magma_c_matrix *x, magma_queue_t queue)
 Prepares the Jacobi Iteration according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k x^(k+1) = c - M * x^k. More...
 
magma_int_t magma_clobpcg_maxpy (magma_int_t num_rows, magma_int_t num_vecs, magmaFloatComplex_ptr X, magmaFloatComplex_ptr Y, magma_queue_t queue)
 This routine computes an axpy for an m-by-n matrix: More...
 
magma_int_t magma_cbicgstab_1 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex beta, magmaFloatComplex omega, magmaFloatComplex_ptr r, magmaFloatComplex_ptr v, magmaFloatComplex_ptr p, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cbicgstab_2 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex alpha, magmaFloatComplex_ptr r, magmaFloatComplex_ptr v, magmaFloatComplex_ptr s, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cbicgstab_3 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex alpha, magmaFloatComplex omega, magmaFloatComplex_ptr p, magmaFloatComplex_ptr s, magmaFloatComplex_ptr t, magmaFloatComplex_ptr x, magmaFloatComplex_ptr r, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cbicgstab_4 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex alpha, magmaFloatComplex omega, magmaFloatComplex_ptr y, magmaFloatComplex_ptr z, magmaFloatComplex_ptr s, magmaFloatComplex_ptr t, magmaFloatComplex_ptr x, magmaFloatComplex_ptr r, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cbicgmerge_spmv1 (magma_c_matrix A, magmaFloatComplex_ptr d1, magmaFloatComplex_ptr d2, magmaFloatComplex_ptr dp, magmaFloatComplex_ptr dr, magmaFloatComplex_ptr dv, magmaFloatComplex_ptr skp, magma_queue_t queue)
 Merges the first SpmV using CSR with the dot product and the computation of alpha. More...
 
magma_int_t magma_cbicgmerge_spmv2 (magma_c_matrix A, magmaFloatComplex_ptr d1, magmaFloatComplex_ptr d2, magmaFloatComplex_ptr ds, magmaFloatComplex_ptr dt, magmaFloatComplex_ptr skp, magma_queue_t queue)
 Merges the second SpmV using CSR with the dot product and the computation of omega. More...
 
magma_int_t magma_cbicgmerge_xrbeta (magma_int_t n, magmaFloatComplex_ptr d1, magmaFloatComplex_ptr d2, magmaFloatComplex_ptr rr, magmaFloatComplex_ptr r, magmaFloatComplex_ptr p, magmaFloatComplex_ptr s, magmaFloatComplex_ptr t, magmaFloatComplex_ptr x, magmaFloatComplex_ptr skp, magma_queue_t queue)
 Merges the second SpmV using CSR with the dot product and the computation of omega. More...
 
magma_int_t magma_cbicgmerge1 (magma_int_t n, magmaFloatComplex_ptr skp, magmaFloatComplex_ptr v, magmaFloatComplex_ptr r, magmaFloatComplex_ptr p, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cbicgmerge2 (magma_int_t n, magmaFloatComplex_ptr skp, magmaFloatComplex_ptr r, magmaFloatComplex_ptr v, magmaFloatComplex_ptr s, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cbicgmerge3 (magma_int_t n, magmaFloatComplex_ptr skp, magmaFloatComplex_ptr p, magmaFloatComplex_ptr s, magmaFloatComplex_ptr t, magmaFloatComplex_ptr x, magmaFloatComplex_ptr r, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cbicgmerge4 (magma_int_t type, magmaFloatComplex_ptr skp, magma_queue_t queue)
 Performs some parameter operations for the BiCGSTAB with scalars on GPU. More...
 
magma_int_t magma_cmergeblockkrylov (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex_ptr alpha, magmaFloatComplex_ptr p, magmaFloatComplex_ptr x, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_ccgmerge_spmv1 (magma_c_matrix A, magmaFloatComplex_ptr d1, magmaFloatComplex_ptr d2, magmaFloatComplex_ptr dd, magmaFloatComplex_ptr dz, magmaFloatComplex_ptr skp, magma_queue_t queue)
 Merges the first SpmV using different formats with the dot product and the computation of rho. More...
 
magma_int_t magma_ccgs_1 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex beta, magmaFloatComplex_ptr r, magmaFloatComplex_ptr q, magmaFloatComplex_ptr u, magmaFloatComplex_ptr p, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_ccgs_2 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex_ptr r, magmaFloatComplex_ptr u, magmaFloatComplex_ptr p, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_ccgs_3 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex alpha, magmaFloatComplex_ptr v_hat, magmaFloatComplex_ptr u, magmaFloatComplex_ptr q, magmaFloatComplex_ptr t, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_ccgs_4 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex alpha, magmaFloatComplex_ptr u_hat, magmaFloatComplex_ptr t, magmaFloatComplex_ptr x, magmaFloatComplex_ptr r, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cidr_smoothing_1 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex_ptr drs, magmaFloatComplex_ptr dr, magmaFloatComplex_ptr dt, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cidr_smoothing_2 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex omega, magmaFloatComplex_ptr dx, magmaFloatComplex_ptr dxs, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cqmr_1 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex rho, magmaFloatComplex psi, magmaFloatComplex_ptr y, magmaFloatComplex_ptr z, magmaFloatComplex_ptr v, magmaFloatComplex_ptr w, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cqmr_2 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex pde, magmaFloatComplex rde, magmaFloatComplex_ptr y, magmaFloatComplex_ptr z, magmaFloatComplex_ptr p, magmaFloatComplex_ptr q, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cqmr_3 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex beta, magmaFloatComplex_ptr pt, magmaFloatComplex_ptr v, magmaFloatComplex_ptr y, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cqmr_4 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex eta, magmaFloatComplex_ptr p, magmaFloatComplex_ptr pt, magmaFloatComplex_ptr d, magmaFloatComplex_ptr s, magmaFloatComplex_ptr x, magmaFloatComplex_ptr r, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cqmr_5 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex eta, magmaFloatComplex pds, magmaFloatComplex_ptr p, magmaFloatComplex_ptr pt, magmaFloatComplex_ptr d, magmaFloatComplex_ptr s, magmaFloatComplex_ptr x, magmaFloatComplex_ptr r, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cqmr_6 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex beta, magmaFloatComplex rho, magmaFloatComplex psi, magmaFloatComplex_ptr y, magmaFloatComplex_ptr z, magmaFloatComplex_ptr v, magmaFloatComplex_ptr w, magmaFloatComplex_ptr wt, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cqmr_7 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex beta, magmaFloatComplex_ptr pt, magmaFloatComplex_ptr v, magmaFloatComplex_ptr vt, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cqmr_8 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex rho, magmaFloatComplex psi, magmaFloatComplex_ptr vt, magmaFloatComplex_ptr wt, magmaFloatComplex_ptr y, magmaFloatComplex_ptr z, magmaFloatComplex_ptr v, magmaFloatComplex_ptr w, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_ctfqmr_1 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex alpha, magmaFloatComplex sigma, magmaFloatComplex_ptr v, magmaFloatComplex_ptr Au, magmaFloatComplex_ptr u_m, magmaFloatComplex_ptr pu_m, magmaFloatComplex_ptr u_mp1, magmaFloatComplex_ptr w, magmaFloatComplex_ptr d, magmaFloatComplex_ptr Ad, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_ctfqmr_2 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex eta, magmaFloatComplex_ptr d, magmaFloatComplex_ptr Ad, magmaFloatComplex_ptr x, magmaFloatComplex_ptr r, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_ctfqmr_3 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex beta, magmaFloatComplex_ptr w, magmaFloatComplex_ptr u_m, magmaFloatComplex_ptr u_mp1, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_ctfqmr_4 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex beta, magmaFloatComplex_ptr Au_new, magmaFloatComplex_ptr v, magmaFloatComplex_ptr Au, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_ctfqmr_5 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex alpha, magmaFloatComplex sigma, magmaFloatComplex_ptr v, magmaFloatComplex_ptr Au, magmaFloatComplex_ptr u_mp1, magmaFloatComplex_ptr w, magmaFloatComplex_ptr d, magmaFloatComplex_ptr Ad, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_cparic_csr (magma_c_matrix A, magma_c_matrix A_CSR, magma_queue_t queue)
 This routine iteratively computes an incomplete LU factorization. More...
 
magma_int_t magma_cparilu_csr (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_queue_t queue)
 This routine iteratively computes an incomplete LU factorization. More...
 
magma_int_t magma_cparilut_sweep_gpu (magma_c_matrix *A, magma_c_matrix *L, magma_c_matrix *U, magma_queue_t queue)
 This function performs a ParILUT sweep. More...
 
magma_int_t magma_cparilut_residuals_gpu (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *R, magma_queue_t queue)
 This function computes the ILU residual in the locations included in the sparsity pattern of R. More...
 
magma_int_t magma_cvalinit_gpu (magma_int_t num_el, magmaFloatComplex_ptr dval, magma_queue_t queue)
 Initializes a device array with zero. More...
 
magma_int_t magma_cindexinit_gpu (magma_int_t num_el, magmaIndex_ptr dind, magma_queue_t queue)
 Initializes a device array with zero. More...
 

Detailed Description

Function Documentation

magma_int_t magma_cbajac_csr ( magma_int_t  localiters,
magma_c_matrix  D,
magma_c_matrix  R,
magma_c_matrix  b,
magma_c_matrix *  x,
magma_queue_t  queue 
)

This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block.

Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.

Parameters
[in]  localiters  magma_int_t number of local Jacobi-like updates
[in]  D  magma_c_matrix input matrix with diagonal blocks
[in]  R  magma_c_matrix input matrix with non-diagonal parts
[in]  b  magma_c_matrix RHS
[in,out]  x  magma_c_matrix* iterate/solution
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cbajac_csr_overlap ( magma_int_t  localiters,
magma_int_t  matrices,
magma_int_t  overlap,
magma_c_matrix *  D,
magma_c_matrix *  R,
magma_c_matrix  b,
magma_c_matrix *  x,
magma_queue_t  queue 
)

This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block.

Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.

Parameters
[in]  localiters  magma_int_t number of local Jacobi-like updates
[in]  matrices  magma_int_t number of sub-matrices
[in]  overlap  magma_int_t size of the overlap
[in]  D  magma_c_matrix* set of matrices with diagonal blocks
[in]  R  magma_c_matrix* set of matrices with non-diagonal parts
[in]  b  magma_c_matrix RHS
[in,out]  x  magma_c_matrix* iterate/solution
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_ccompact ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex_ptr  dA,
magma_int_t  ldda,
magmaFloat_ptr  dnorms,
float  tol,
magmaInt_ptr  active,
magmaInt_ptr  cBlock,
magma_queue_t  queue 
)

CCOMPACT takes a set of n vectors of size m (in dA) together with their norms and compacts them into the cBlock (<= n) vectors that have norms > tol.

The active mask array holds 1 or 0 for each vector, showing whether that vector remained in the compacted set.

Parameters
[in]  m  INTEGER The number of rows of the matrix dA. M >= 0.
[in]  n  INTEGER The number of columns of the matrix dA. N >= 0.
[in,out]  dA  COMPLEX array, dimension (LDDA,N) The m by n matrix dA.
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M).
[in]  dnorms  REAL array, dimension N The norms of the N vectors in dA
[in]  tol  REAL The tolerance value used in the criterion to compact or not.
[in,out]  active  INTEGER array, dimension N A mask of 1s and 0s showing if a vector remains or has been removed
[in,out]  cBlock  magmaInt_ptr The number of vectors that remain in dA (i.e., with norms > tol).
[in]  queue  magma_queue_t Queue to execute in.
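
The compaction described above can be sketched on the CPU as follows. This is only an illustrative reference, not MAGMA's GPU kernel: the function name compact_columns_ref is hypothetical, the kept count is returned by value instead of being written through cBlock, and keeping the surviving columns in their original order is an assumption.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical CPU reference for the compaction (not MAGMA's implementation).
 * dA holds n column vectors of length m, column-major, leading dimension ldda.
 * Columns with dnorms[j] > tol are moved to the front in their original order.
 * active[j] is set to 1 (kept) or 0 (dropped); the number kept is returned. */
int compact_columns_ref(int m, int n, float complex *dA, int ldda,
                        const float *dnorms, float tol, int *active)
{
    assert(m >= 0 && n >= 0 && ldda >= (m > 1 ? m : 1));
    int kept = 0;
    for (int j = 0; j < n; ++j) {
        if (dnorms[j] > tol) {
            active[j] = 1;
            if (kept != j) {
                /* Shift column j into the next free slot. */
                for (int i = 0; i < m; ++i)
                    dA[i + kept * ldda] = dA[i + j * ldda];
            }
            ++kept;
        } else {
            active[j] = 0;
        }
    }
    return kept;
}
```

With three columns whose norms are 5, 0.1 and 7 and tol = 1, the sketch keeps the first and third columns, stores them contiguously, and reports two remaining vectors.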
magma_int_t magma_cjacobisetup_vector_gpu ( magma_int_t  num_rows,
magma_c_matrix  b,
magma_c_matrix  d,
magma_c_matrix  c,
magma_c_matrix *  x,
magma_queue_t  queue 
)

Prepares the Jacobi iteration according to

x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k
x^(k+1) = c - M * x^k.

Returns the vector c. It calls a GPU kernel.

Parameters
[in]  num_rows  magma_int_t number of rows
[in]  b  magma_c_matrix RHS b
[in]  d  magma_c_matrix vector with diagonal entries
[out]  c  magma_c_matrix c = D^(-1) * b
[out]  x  magma_c_matrix* iteration vector
[in]  queue  magma_queue_t Queue to execute in.
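
The two formulas above can be illustrated with a small CPU sketch. It is not MAGMA's kernel: the names jacobi_setup_ref and jacobi_step_ref are hypothetical, M = D^(-1) * (L+U) is stored as a dense row-major array purely for brevity, and all diagonal entries are assumed to be nonzero.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical CPU reference: c = D^(-1) * b, with the diagonal of D stored in d.
 * Assumes every d[i] is nonzero. */
void jacobi_setup_ref(int num_rows, const float complex *b,
                      const float complex *d, float complex *c)
{
    assert(num_rows >= 0);
    for (int i = 0; i < num_rows; ++i)
        c[i] = b[i] / d[i];
}

/* One Jacobi step x_new = c - M * x_old, where M = D^(-1) * (L+U) is given as a
 * dense row-major num_rows x num_rows array (dense storage is an assumption). */
void jacobi_step_ref(int num_rows, const float complex *M, const float complex *c,
                     const float complex *x_old, float complex *x_new)
{
    assert(num_rows >= 0);
    for (int i = 0; i < num_rows; ++i) {
        float complex acc = c[i];
        for (int j = 0; j < num_rows; ++j)
            acc -= M[i * num_rows + j] * x_old[j];
        x_new[i] = acc;
    }
}
```

Computing c once up front means each later iteration only needs the product M * x^k and a subtraction.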
magma_int_t magma_clobpcg_maxpy ( magma_int_t  num_rows,
magma_int_t  num_vecs,
magmaFloatComplex_ptr  X,
magmaFloatComplex_ptr  Y,
magma_queue_t  queue 
)

This routine computes an axpy for an m-by-n matrix:

Y = X + Y

It replaces: magma_caxpy(m*n, c_one, Y, 1, X, 1);

    / x1[0] x2[0] x3[0] \
    | x1[1] x2[1] x3[1] |
X = | x1[2] x2[2] x3[2] | = x1[0] x1[1] x1[2] x1[3] x1[4] x2[0] x2[1] .
    | x1[3] x2[3] x3[3] |
    \ x1[4] x2[4] x3[4] /

Parameters
[in]  num_rows  magma_int_t number of rows
[in]  num_vecs  magma_int_t number of vectors
[in]  X  magmaFloatComplex_ptr input vector X
[in,out]  Y  magmaFloatComplex_ptr input/output vector Y
[in]  queue  magma_queue_t Queue to execute in.
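
A minimal CPU sketch of the update Y = X + Y over the storage layout shown above, where each vector occupies num_rows consecutive entries. The name lobpcg_maxpy_ref is hypothetical and the loop is only a reference for the arithmetic, not the GPU implementation.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical CPU reference for Y = X + Y.
 * X and Y each hold num_vecs vectors of length num_rows, stored one after another,
 * so entry i of vector j lives at index i + j * num_rows. */
void lobpcg_maxpy_ref(int num_rows, int num_vecs,
                      const float complex *X, float complex *Y)
{
    assert(num_rows >= 0 && num_vecs >= 0);
    for (int j = 0; j < num_vecs; ++j)
        for (int i = 0; i < num_rows; ++i)
            Y[i + j * num_rows] += X[i + j * num_rows];
}
```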
magma_int_t magma_cbicgstab_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  beta,
magmaFloatComplex  omega,
magmaFloatComplex_ptr  r,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  p,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

p = r + beta * ( p - omega * v )

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  beta  magmaFloatComplex scalar
[in]  omega  magmaFloatComplex scalar
[in]  r  magmaFloatComplex_ptr vector
[in]  v  magmaFloatComplex_ptr vector
[in,out]  p  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cbicgstab_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  alpha,
magmaFloatComplex_ptr  r,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  s,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

s = r - alpha * v

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  alpha  magmaFloatComplex scalar
[in]  r  magmaFloatComplex_ptr vector
[in]  v  magmaFloatComplex_ptr vector
[in,out]  s  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cbicgstab_3 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  alpha,
magmaFloatComplex  omega,
magmaFloatComplex_ptr  p,
magmaFloatComplex_ptr  s,
magmaFloatComplex_ptr  t,
magmaFloatComplex_ptr  x,
magmaFloatComplex_ptr  r,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

x = x + alpha * p + omega * s
r = s - omega * t

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  alpha  magmaFloatComplex scalar
[in]  omega  magmaFloatComplex scalar
[in]  p  magmaFloatComplex_ptr vector
[in]  s  magmaFloatComplex_ptr vector
[in]  t  magmaFloatComplex_ptr vector
[in,out]  x  magmaFloatComplex_ptr vector
[in,out]  r  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
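
To make the fused update concrete, here is a CPU sketch that performs both assignments in a single pass over the data. The name bicgstab_3_ref is hypothetical, and treating the vectors as one flat array of length n (for example n = num_rows * num_cols) is an assumption.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical CPU reference for the fused BiCGSTAB update:
 *   x = x + alpha * p + omega * s
 *   r = s - omega * t
 * Both results are produced in one loop, so each input entry is read once. */
void bicgstab_3_ref(int n, float complex alpha, float complex omega,
                    const float complex *p, const float complex *s,
                    const float complex *t, float complex *x, float complex *r)
{
    assert(n >= 0);
    for (int k = 0; k < n; ++k) {
        x[k] = x[k] + alpha * p[k] + omega * s[k];
        r[k] = s[k] - omega * t[k];
    }
}
```

Written as separate BLAS-style calls, the same result would take several passes over the vectors; merging them is what saves memory traffic.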
magma_int_t magma_cbicgstab_4 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  alpha,
magmaFloatComplex  omega,
magmaFloatComplex_ptr  y,
magmaFloatComplex_ptr  z,
magmaFloatComplex_ptr  s,
magmaFloatComplex_ptr  t,
magmaFloatComplex_ptr  x,
magmaFloatComplex_ptr  r,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

x = x + alpha * y + omega * z
r = s - omega * t

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  alpha  magmaFloatComplex scalar
[in]  omega  magmaFloatComplex scalar
[in]  y  magmaFloatComplex_ptr vector
[in]  z  magmaFloatComplex_ptr vector
[in]  s  magmaFloatComplex_ptr vector
[in]  t  magmaFloatComplex_ptr vector
[in,out]  x  magmaFloatComplex_ptr vector
[in,out]  r  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cbicgmerge_spmv1 ( magma_c_matrix  A,
magmaFloatComplex_ptr  d1,
magmaFloatComplex_ptr  d2,
magmaFloatComplex_ptr  dp,
magmaFloatComplex_ptr  dr,
magmaFloatComplex_ptr  dv,
magmaFloatComplex_ptr  skp,
magma_queue_t  queue 
)

Merges the first SpmV using CSR with the dot product and the computation of alpha.

Parameters
[in]  A  magma_c_matrix system matrix
[in]  d1  magmaFloatComplex_ptr temporary vector
[in]  d2  magmaFloatComplex_ptr temporary vector
[in]  dp  magmaFloatComplex_ptr input vector p
[in]  dr  magmaFloatComplex_ptr input vector r
[out]  dv  magmaFloatComplex_ptr output vector v
[in,out]  skp  magmaFloatComplex_ptr array for parameters ( skp[0]=alpha )
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cbicgmerge_spmv2 ( magma_c_matrix  A,
magmaFloatComplex_ptr  d1,
magmaFloatComplex_ptr  d2,
magmaFloatComplex_ptr  ds,
magmaFloatComplex_ptr  dt,
magmaFloatComplex_ptr  skp,
magma_queue_t  queue 
)

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters
[in]  A  magma_c_matrix input matrix
[in]  d1  magmaFloatComplex_ptr temporary vector
[in]  d2  magmaFloatComplex_ptr temporary vector
[in]  ds  magmaFloatComplex_ptr input vector s
[out]  dt  magmaFloatComplex_ptr output vector t
[in,out]  skp  magmaFloatComplex_ptr array for parameters
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cbicgmerge_xrbeta ( magma_int_t  n,
magmaFloatComplex_ptr  d1,
magmaFloatComplex_ptr  d2,
magmaFloatComplex_ptr  rr,
magmaFloatComplex_ptr  r,
magmaFloatComplex_ptr  p,
magmaFloatComplex_ptr  s,
magmaFloatComplex_ptr  t,
magmaFloatComplex_ptr  x,
magmaFloatComplex_ptr  skp,
magma_queue_t  queue 
)

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters
[in]  n  int dimension n
[in]  d1  magmaFloatComplex_ptr temporary vector
[in]  d2  magmaFloatComplex_ptr temporary vector
[in]  rr  magmaFloatComplex_ptr input vector rr
[in,out]  r  magmaFloatComplex_ptr input/output vector r
[in]  p  magmaFloatComplex_ptr input vector p
[in]  s  magmaFloatComplex_ptr input vector s
[in]  t  magmaFloatComplex_ptr input vector t
[out]  x  magmaFloatComplex_ptr output vector x
[in]  skp  magmaFloatComplex_ptr array for parameters
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cbicgmerge1 ( magma_int_t  n,
magmaFloatComplex_ptr  skp,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  r,
magmaFloatComplex_ptr  p,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

p = beta*p
p = p-omega*beta*v
p = p+r

-> p = r + beta * ( p - omega * v )

Parameters
[in]  n  int dimension n
[in]  skp  magmaFloatComplex_ptr set of scalar parameters
[in]  v  magmaFloatComplex_ptr input vector v
[in]  r  magmaFloatComplex_ptr input vector r
[in,out]  p  magmaFloatComplex_ptr input/output vector p
[in]  queue  magma_queue_t Queue to execute in.
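
The equivalence between the three separate updates and the single merged expression can be checked with a small CPU sketch. Both function names are hypothetical, and beta and omega are passed as explicit arguments here, whereas the routine above reads its scalars from skp.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical reference: the three separate vector updates, one pass each. */
void p_update_three_step(int n, float complex beta, float complex omega,
                         const float complex *v, const float complex *r,
                         float complex *p)
{
    assert(n >= 0);
    for (int k = 0; k < n; ++k) p[k] = beta * p[k];                /* p = beta*p         */
    for (int k = 0; k < n; ++k) p[k] = p[k] - omega * beta * v[k]; /* p = p-omega*beta*v */
    for (int k = 0; k < n; ++k) p[k] = p[k] + r[k];                /* p = p+r            */
}

/* Hypothetical reference: the merged form, a single pass over p, v and r. */
void p_update_fused(int n, float complex beta, float complex omega,
                    const float complex *v, const float complex *r,
                    float complex *p)
{
    assert(n >= 0);
    for (int k = 0; k < n; ++k)
        p[k] = r[k] + beta * (p[k] - omega * v[k]);
}
```

In exact arithmetic both versions give the same p; in floating point they may differ by rounding, since the operations are grouped differently.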
magma_int_t magma_cbicgmerge2 ( magma_int_t  n,
magmaFloatComplex_ptr  skp,
magmaFloatComplex_ptr  r,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  s,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

s = r
s = s-alpha*v

-> s = r - alpha * v

Parameters
[in]  n  int dimension n
[in]  skp  magmaFloatComplex_ptr set of scalar parameters
[in]  r  magmaFloatComplex_ptr input vector r
[in]  v  magmaFloatComplex_ptr input vector v
[out]  s  magmaFloatComplex_ptr output vector s
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cbicgmerge3 ( magma_int_t  n,
magmaFloatComplex_ptr  skp,
magmaFloatComplex_ptr  p,
magmaFloatComplex_ptr  s,
magmaFloatComplex_ptr  t,
magmaFloatComplex_ptr  x,
magmaFloatComplex_ptr  r,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

x = x+alpha*p
x = x+omega*s
r = s
r = r-omega*t

-> x = x + alpha * p + omega * s
-> r = s - omega * t

Parameters
[in]  n  int dimension n
[in]  skp  magmaFloatComplex_ptr set of scalar parameters
[in]  p  magmaFloatComplex_ptr input p
[in]  s  magmaFloatComplex_ptr input s
[in]  t  magmaFloatComplex_ptr input t
[in,out]  x  magmaFloatComplex_ptr input/output x
[in,out]  r  magmaFloatComplex_ptr input/output r
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cbicgmerge4 ( magma_int_t  type,
magmaFloatComplex_ptr  skp,
magma_queue_t  queue 
)

Performs some parameter operations for the BiCGSTAB with scalars on GPU.

Parameters
[in]  type  int kernel type
[in,out]  skp  magmaFloatComplex_ptr vector with parameters
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cmergeblockkrylov ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex_ptr  alpha,
magmaFloatComplex_ptr  p,
magmaFloatComplex_ptr  x,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

v = y / rho
y = y / rho
w = wt / psi
z = z / psi

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  alpha  magmaFloatComplex_ptr matrix containing all SKP
[in]  p  magmaFloatComplex_ptr search directions
[in,out]  x  magmaFloatComplex_ptr approximation vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_ccgmerge_spmv1 ( magma_c_matrix  A,
magmaFloatComplex_ptr  d1,
magmaFloatComplex_ptr  d2,
magmaFloatComplex_ptr  dd,
magmaFloatComplex_ptr  dz,
magmaFloatComplex_ptr  skp,
magma_queue_t  queue 
)

Merges the first SpmV using different formats with the dot product and the computation of rho.

Parameters
[in]  A  magma_c_matrix input matrix
[in]  d1  magmaFloatComplex_ptr temporary vector
[in]  d2  magmaFloatComplex_ptr temporary vector
[in]  dd  magmaFloatComplex_ptr input vector d
[out]  dz  magmaFloatComplex_ptr output vector z
[out]  skp  magmaFloatComplex_ptr array for parameters ( skp[3]=rho )
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_ccgs_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  beta,
magmaFloatComplex_ptr  r,
magmaFloatComplex_ptr  q,
magmaFloatComplex_ptr  u,
magmaFloatComplex_ptr  p,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

u = r + beta * q
p = u + beta * (q + beta * p)

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  beta  magmaFloatComplex scalar
[in]  r  magmaFloatComplex_ptr vector
[in]  q  magmaFloatComplex_ptr vector
[in,out]  u  magmaFloatComplex_ptr vector
[in,out]  p  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
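
A CPU sketch of this CGS update, given only as a reference for the arithmetic. The name cgs_1_ref is hypothetical and the vectors are treated as flat arrays of length n, which is an assumption. Note that the second assignment uses the freshly computed u.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical CPU reference for the merged CGS update:
 *   u = r + beta * q
 *   p = u + beta * (q + beta * p)
 * The new value of u is used when updating p. */
void cgs_1_ref(int n, float complex beta, const float complex *r,
               const float complex *q, float complex *u, float complex *p)
{
    assert(n >= 0);
    for (int k = 0; k < n; ++k) {
        u[k] = r[k] + beta * q[k];
        p[k] = u[k] + beta * (q[k] + beta * p[k]);
    }
}
```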
magma_int_t magma_ccgs_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex_ptr  r,
magmaFloatComplex_ptr  u,
magmaFloatComplex_ptr  p,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

u = r
p = r

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  r  magmaFloatComplex_ptr vector
[in,out]  u  magmaFloatComplex_ptr vector
[in,out]  p  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_ccgs_3 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  alpha,
magmaFloatComplex_ptr  v_hat,
magmaFloatComplex_ptr  u,
magmaFloatComplex_ptr  q,
magmaFloatComplex_ptr  t,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

q = u - alpha * v_hat
t = u + q

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  alpha  magmaFloatComplex scalar
[in]  v_hat  magmaFloatComplex_ptr vector
[in]  u  magmaFloatComplex_ptr vector
[in,out]  q  magmaFloatComplex_ptr vector
[in,out]  t  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_ccgs_4 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  alpha,
magmaFloatComplex_ptr  u_hat,
magmaFloatComplex_ptr  t,
magmaFloatComplex_ptr  x,
magmaFloatComplex_ptr  r,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

x = x + alpha * u_hat
r = r - alpha * A * u_hat = r - alpha * t

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  alpha  magmaFloatComplex scalar
[in]  u_hat  magmaFloatComplex_ptr vector
[in]  t  magmaFloatComplex_ptr vector
[in,out]  x  magmaFloatComplex_ptr vector
[in,out]  r  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cidr_smoothing_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex_ptr  drs,
magmaFloatComplex_ptr  dr,
magmaFloatComplex_ptr  dt,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

dt = drs - dr

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  drs  magmaFloatComplex_ptr vector
[in]  dr  magmaFloatComplex_ptr vector
[in,out]  dt  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cidr_smoothing_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  omega,
magmaFloatComplex_ptr  dx,
magmaFloatComplex_ptr  dxs,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

dxs = dxs - omega * (dxs - dx)

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  omega  magmaFloatComplex scalar
[in]  dx  magmaFloatComplex_ptr vector
[in,out]  dxs  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
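
A CPU sketch of the smoothing update, using the scalar parameter omega as the weight. The name idr_smoothing_2_ref is hypothetical, and the flat length n is an assumption; this is a reference for the arithmetic only.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical CPU reference for the smoothing step:
 *   dxs = dxs - omega * (dxs - dx)
 * i.e. dxs moves from its old value toward dx by the weight omega. */
void idr_smoothing_2_ref(int n, float complex omega,
                         const float complex *dx, float complex *dxs)
{
    assert(n >= 0);
    for (int k = 0; k < n; ++k)
        dxs[k] = dxs[k] - omega * (dxs[k] - dx[k]);
}
```

With omega = 0 the smoothed vector is left unchanged, and with omega = 1 it is replaced by dx.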
magma_int_t magma_cqmr_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  rho,
magmaFloatComplex  psi,
magmaFloatComplex_ptr  y,
magmaFloatComplex_ptr  z,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  w,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

v = y / rho
y = y / rho
w = wt / psi
z = z / psi

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  rho  magmaFloatComplex scalar
[in]  psi  magmaFloatComplex scalar
[in,out]  y  magmaFloatComplex_ptr vector
[in,out]  z  magmaFloatComplex_ptr vector
[in,out]  v  magmaFloatComplex_ptr vector
[in,out]  w  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cqmr_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  pde,
magmaFloatComplex  rde,
magmaFloatComplex_ptr  y,
magmaFloatComplex_ptr  z,
magmaFloatComplex_ptr  p,
magmaFloatComplex_ptr  q,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

p = y - pde * p
q = z - rde * q

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  pde  magmaFloatComplex scalar
[in]  rde  magmaFloatComplex scalar
[in]  y  magmaFloatComplex_ptr vector
[in]  z  magmaFloatComplex_ptr vector
[in,out]  p  magmaFloatComplex_ptr vector
[in,out]  q  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cqmr_3 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  beta,
magmaFloatComplex_ptr  pt,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  y,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

v = pt - beta * v
y = v

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  beta  magmaFloatComplex scalar
[in]  pt  magmaFloatComplex_ptr vector
[in,out]  v  magmaFloatComplex_ptr vector
[in,out]  y  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cqmr_4 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  eta,
magmaFloatComplex_ptr  p,
magmaFloatComplex_ptr  pt,
magmaFloatComplex_ptr  d,
magmaFloatComplex_ptr  s,
magmaFloatComplex_ptr  x,
magmaFloatComplex_ptr  r,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

d = eta * p;
s = eta * pt;
x = x + d;
r = r - s;

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  eta  magmaFloatComplex scalar
[in]  p  magmaFloatComplex_ptr vector
[in]  pt  magmaFloatComplex_ptr vector
[in,out]  d  magmaFloatComplex_ptr vector
[in,out]  s  magmaFloatComplex_ptr vector
[in,out]  x  magmaFloatComplex_ptr vector
[in,out]  r  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cqmr_5 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  eta,
magmaFloatComplex  pds,
magmaFloatComplex_ptr  p,
magmaFloatComplex_ptr  pt,
magmaFloatComplex_ptr  d,
magmaFloatComplex_ptr  s,
magmaFloatComplex_ptr  x,
magmaFloatComplex_ptr  r,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

d = eta * p + pds * d;
s = eta * pt + pds * s;
x = x + d;
r = r - s;

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  eta  magmaFloatComplex scalar
[in]  pds  magmaFloatComplex scalar
[in]  p  magmaFloatComplex_ptr vector
[in]  pt  magmaFloatComplex_ptr vector
[in,out]  d  magmaFloatComplex_ptr vector
[in,out]  s  magmaFloatComplex_ptr vector
[in,out]  x  magmaFloatComplex_ptr vector
[in,out]  r  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
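
The four assignments can be sketched on the CPU as one loop in which x and r consume the freshly updated d and s. The name qmr_5_ref is hypothetical and the flat length n is an assumption; this is not MAGMA's GPU kernel.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical CPU reference for the merged QMR update:
 *   d = eta * p  + pds * d
 *   s = eta * pt + pds * s
 *   x = x + d
 *   r = r - s
 * x and r are updated with the new d and s from the same iteration. */
void qmr_5_ref(int n, float complex eta, float complex pds,
               const float complex *p, const float complex *pt,
               float complex *d, float complex *s,
               float complex *x, float complex *r)
{
    assert(n >= 0);
    for (int k = 0; k < n; ++k) {
        d[k] = eta * p[k] + pds * d[k];
        s[k] = eta * pt[k] + pds * s[k];
        x[k] = x[k] + d[k];
        r[k] = r[k] - s[k];
    }
}
```

Setting pds = 0 in this sketch gives d = eta * p and s = eta * pt, which is the update documented for magma_cqmr_4.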
magma_int_t magma_cqmr_6 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  beta,
magmaFloatComplex  rho,
magmaFloatComplex  psi,
magmaFloatComplex_ptr  y,
magmaFloatComplex_ptr  z,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  w,
magmaFloatComplex_ptr  wt,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

wt = wt - conj(beta) * w
v = y / rho
y = y / rho
w = wt / psi
z = wt / psi

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  beta  magmaFloatComplex scalar
[in]  rho  magmaFloatComplex scalar
[in]  psi  magmaFloatComplex scalar
[in,out]  y  magmaFloatComplex_ptr vector
[in,out]  z  magmaFloatComplex_ptr vector
[in,out]  v  magmaFloatComplex_ptr vector
[in,out]  w  magmaFloatComplex_ptr vector
[in,out]  wt  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cqmr_7 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  beta,
magmaFloatComplex_ptr  pt,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  vt,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

vt = pt - beta * v

Parameters
[in]  num_rows  magma_int_t dimension m
[in]  num_cols  magma_int_t dimension n
[in]  beta  magmaFloatComplex scalar
[in]  pt  magmaFloatComplex_ptr vector
[in,out]  v  magmaFloatComplex_ptr vector
[in,out]  vt  magmaFloatComplex_ptr vector
[in]  queue  magma_queue_t Queue to execute in.
magma_int_t magma_cqmr_8 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  rho,
magmaFloatComplex  psi,
magmaFloatComplex_ptr  vt,
magmaFloatComplex_ptr  wt,
magmaFloatComplex_ptr  y,
magmaFloatComplex_ptr  z,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  w,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

v = y / rho; y = y / rho; w = wt / psi; z = z / psi;

Parameters
[in]      num_rows  magma_int_t dimension m
[in]      num_cols  magma_int_t dimension n
[in]      rho       magmaFloatComplex scalar
[in]      psi       magmaFloatComplex scalar
[in]      vt        magmaFloatComplex_ptr vector
[in]      wt        magmaFloatComplex_ptr vector
[in,out]  y         magmaFloatComplex_ptr vector
[in,out]  z         magmaFloatComplex_ptr vector
[in,out]  v         magmaFloatComplex_ptr vector
[in,out]  w         magmaFloatComplex_ptr vector
[in]      queue     magma_queue_t Queue to execute in.
magma_int_t magma_ctfqmr_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  alpha,
magmaFloatComplex  sigma,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  Au,
magmaFloatComplex_ptr  u_m,
magmaFloatComplex_ptr  pu_m,
magmaFloatComplex_ptr  u_mp1,
magmaFloatComplex_ptr  w,
magmaFloatComplex_ptr  d,
magmaFloatComplex_ptr  Ad,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

u_mp1 = u_mp1 - alpha*v; w = w - alpha*Au; d = pu_m + sigma*d; Ad = Au + sigma*Ad;

Parameters
[in]      num_rows  magma_int_t dimension m
[in]      num_cols  magma_int_t dimension n
[in]      alpha     magmaFloatComplex scalar
[in]      sigma     magmaFloatComplex scalar
[in]      v         magmaFloatComplex_ptr vector
[in]      Au        magmaFloatComplex_ptr vector
[in,out]  u_m       magmaFloatComplex_ptr vector
[in,out]  pu_m      magmaFloatComplex_ptr vector
[in,out]  u_mp1     magmaFloatComplex_ptr vector
[in,out]  w         magmaFloatComplex_ptr vector
[in,out]  d         magmaFloatComplex_ptr vector
[in,out]  Ad        magmaFloatComplex_ptr vector
[in]      queue     magma_queue_t Queue to execute in.
magma_int_t magma_ctfqmr_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  eta,
magmaFloatComplex_ptr  d,
magmaFloatComplex_ptr  Ad,
magmaFloatComplex_ptr  x,
magmaFloatComplex_ptr  r,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

x = x + eta * d; r = r - eta * Ad;

Parameters
[in]      num_rows  magma_int_t dimension m
[in]      num_cols  magma_int_t dimension n
[in]      eta       magmaFloatComplex scalar
[in]      d         magmaFloatComplex_ptr vector
[in]      Ad        magmaFloatComplex_ptr vector
[in,out]  x         magmaFloatComplex_ptr vector
[in,out]  r         magmaFloatComplex_ptr vector
[in]      queue     magma_queue_t Queue to execute in.
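
The pair of updates documented for magma_ctfqmr_2 amounts to two axpy operations sharing the scalar eta. A minimal CPU sketch follows, intended only as a reference for comparing results. The name tfqmr2_reference is hypothetical and the code is not taken from MAGMA; it assumes contiguous storage with n = num_rows * num_cols entries per vector.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical CPU reference for the update documented for magma_ctfqmr_2.
 * Assumption: n = num_rows * num_cols contiguous entries per vector. */
void tfqmr2_reference(size_t n, float complex eta,
                      const float complex *d, const float complex *Ad,
                      float complex *x, float complex *r)
{
    assert(d && Ad && x && r);
    for (size_t i = 0; i < n; ++i) {
        x[i] += eta * d[i];   /* x = x + eta * d  */
        r[i] -= eta * Ad[i];  /* r = r - eta * Ad */
    }
}
```

The two updates are independent of each other, so their order inside the loop does not affect the result.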
magma_int_t magma_ctfqmr_3 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  beta,
magmaFloatComplex_ptr  w,
magmaFloatComplex_ptr  u_m,
magmaFloatComplex_ptr  u_mp1,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

u_mp1 = w + beta*u_mp1

Parameters
[in]      num_rows  magma_int_t dimension m
[in]      num_cols  magma_int_t dimension n
[in]      beta      magmaFloatComplex scalar
[in]      w         magmaFloatComplex_ptr vector
[in]      u_m       magmaFloatComplex_ptr vector
[in,out]  u_mp1     magmaFloatComplex_ptr vector
[in]      queue     magma_queue_t Queue to execute in.
magma_int_t magma_ctfqmr_4 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  beta,
magmaFloatComplex_ptr  Au_new,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  Au,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

v = Au_new + beta*(Au+beta*v); Au = Au_new

Parameters
[in]      num_rows  magma_int_t dimension m
[in]      num_cols  magma_int_t dimension n
[in]      beta      magmaFloatComplex scalar
[in]      Au_new    magmaFloatComplex_ptr vector
[in,out]  v         magmaFloatComplex_ptr vector
[in,out]  Au        magmaFloatComplex_ptr vector
[in]      queue     magma_queue_t Queue to execute in.
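
In the update documented for magma_ctfqmr_4, the old value of Au enters the expression for v before Au is overwritten with Au_new. The sketch below makes that ordering explicit on the CPU. It is an illustrative reference under stated assumptions, not MAGMA's kernel: tfqmr4_reference is a hypothetical name, and the vectors are assumed contiguous with n = num_rows * num_cols entries.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical CPU reference for the update documented for magma_ctfqmr_4.
 * Assumption: n = num_rows * num_cols contiguous entries per vector. */
void tfqmr4_reference(size_t n, float complex beta,
                      const float complex *Au_new,
                      float complex *v, float complex *Au)
{
    assert(Au_new && v && Au);
    for (size_t i = 0; i < n; ++i) {
        /* v = Au_new + beta * (Au + beta * v), using the old Au */
        v[i] = Au_new[i] + beta * (Au[i] + beta * v[i]);
        /* Au = Au_new, only after the old value has been consumed */
        Au[i] = Au_new[i];
    }
}
```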
magma_int_t magma_ctfqmr_5 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloatComplex  alpha,
magmaFloatComplex  sigma,
magmaFloatComplex_ptr  v,
magmaFloatComplex_ptr  Au,
magmaFloatComplex_ptr  u_mp1,
magmaFloatComplex_ptr  w,
magmaFloatComplex_ptr  d,
magmaFloatComplex_ptr  Ad,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

w = w - alpha*Au; d = pu_m + sigma*d; Ad = Au + sigma*Ad;

Parameters
[in]      num_rows  magma_int_t dimension m
[in]      num_cols  magma_int_t dimension n
[in]      alpha     magmaFloatComplex scalar
[in]      sigma     magmaFloatComplex scalar
[in]      v         magmaFloatComplex_ptr vector
[in]      Au        magmaFloatComplex_ptr vector
[in,out]  u_mp1     magmaFloatComplex_ptr vector
[in,out]  w         magmaFloatComplex_ptr vector
[in,out]  d         magmaFloatComplex_ptr vector
[in,out]  Ad        magmaFloatComplex_ptr vector
[in]      queue     magma_queue_t Queue to execute in.
magma_int_t magma_cparic_csr ( magma_c_matrix  A,
magma_c_matrix  A_CSR,
magma_queue_t  queue 
)

This routine iteratively computes an incomplete Cholesky (IC) factorization.

For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.

The initial guess matrix A must be in Magma_CSRCOO format; A_CSR may be in CSR or CSRCOO format.

Parameters
[in]      A      magma_c_matrix input matrix A - initial guess (lower triangular)
[in,out]  A_CSR  magma_c_matrix input/output matrix containing the IC approximation
[in]      queue  magma_queue_t Queue to execute in.
magma_int_t magma_cparilu_csr ( magma_c_matrix  A,
magma_c_matrix  L,
magma_c_matrix  U,
magma_queue_t  queue 
)

This routine iteratively computes an incomplete LU factorization.

For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.

The system matrix is expected in COO format. The lower triangular factor L is stored in CSR; the upper triangular factor U is transposed and then also stored in CSR (equivalent to CSC format for the non-transposed U). Each component of L and U is handled by one thread.

Parameters
[in]      A      magma_c_matrix input matrix A determining initial guess & processing order
[in,out]  L      magma_c_matrix input/output matrix L containing the lower triangular factor
[in,out]  U      magma_c_matrix input/output matrix U containing the upper triangular factor
[in]      queue  magma_queue_t Queue to execute in.
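
The fixed-point iteration from the Chow and Patel paper cited above can be sketched on the CPU. This is an illustrative simplification, not MAGMA's implementation: it uses dense row-major storage instead of COO/CSR, takes the nonzeros of A as the sparsity pattern, visits the entries sequentially instead of assigning one GPU thread per entry, and the name parilu_sweep_reference is hypothetical. Each entry on the pattern is updated as l_ij = (a_ij - sum_{k<j} l_ik u_kj) / u_jj for i > j, and u_ij = a_ij - sum_{k<i} l_ik u_kj otherwise.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical CPU sketch of one ParILU fixed-point sweep.
 * A, L, U are dense n x n, row-major. L carries an explicit unit
 * diagonal that is never modified. Entries where A is zero are
 * treated as outside the sparsity pattern and are skipped. */
void parilu_sweep_reference(size_t n, const float complex *A,
                            float complex *L, float complex *U)
{
    assert(A && L && U);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            if (A[i * n + j] == 0.0f)
                continue;                      /* not in the pattern */
            size_t kmax = (i < j) ? i : j;     /* k < min(i, j)      */
            float complex sum = A[i * n + j];
            for (size_t k = 0; k < kmax; ++k)
                sum -= L[i * n + k] * U[k * n + j];
            if (i > j)
                L[i * n + j] = sum / U[j * n + j];  /* strictly lower */
            else
                U[i * n + j] = sum;                 /* upper + diagonal */
        }
    }
}
```

A typical initial guess takes L from the strictly lower part of A (plus the unit diagonal) and U from the upper part of A. Repeating the sweep drives L and U toward a fixed point; with asynchronous GPU threads the update order is not fixed, so several sweeps are generally needed.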
magma_int_t magma_cparilut_sweep_gpu ( magma_c_matrix *  A,
magma_c_matrix *  L,
magma_c_matrix *  U,
magma_queue_t  queue 
)

This function performs a ParILUT sweep.

Unlike in the ParILU sweep, the nonzero pattern of A and the patterns of the incomplete factors L and U can differ. The elements that are iterated over are therefore determined by the patterns of L and U, not by the pattern of A. L has a unit diagonal.

This is the GPU version of the asynchronous ParILUT sweep.

Parameters
[in]      A      magma_c_matrix* System matrix. The format is sorted CSR.
[in,out]  L      magma_c_matrix* Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition.
[in,out]  U      magma_c_matrix* Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition.
[in]      queue  magma_queue_t Queue to execute in.
magma_int_t magma_cparilut_residuals_gpu ( magma_c_matrix  A,
magma_c_matrix  L,
magma_c_matrix  U,
magma_c_matrix *  R,
magma_queue_t  queue 
)

This function computes the ILU residual in the locations included in the sparsity pattern of R.

Parameters
[in]      A      magma_c_matrix System matrix. The format is sorted CSR.
[in]      L      magma_c_matrix Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition.
[in]      U      magma_c_matrix Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition.
[in,out]  R      magma_c_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual.
[in]      queue  magma_queue_t Queue to execute in.
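
Evaluating the residual only at the locations listed in R can be sketched as follows. This is a CPU illustration under several assumptions, not MAGMA's kernel: the residual is taken as A - L*U, the matrices A, L and U are held in dense row-major storage for brevity (the routine itself works on sorted CSR and CSRCOO), R is given as plain COO arrays, and the name ilu_residual_reference is hypothetical.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical CPU sketch: ILU residual restricted to a COO pattern.
 * A, L, U: dense n x n, row-major (a simplification of the sparse formats).
 * row/col: COO indices of the nnz locations in R.
 * val:     on return, val[e] = (A - L*U)(row[e], col[e]). */
void ilu_residual_reference(size_t n, const float complex *A,
                            const float complex *L, const float complex *U,
                            size_t nnz, const int *row, const int *col,
                            float complex *val)
{
    assert(A && L && U && row && col && val);
    for (size_t e = 0; e < nnz; ++e) {
        size_t i = (size_t)row[e];
        size_t j = (size_t)col[e];
        float complex sum = A[i * n + j];
        for (size_t k = 0; k < n; ++k)
            sum -= L[i * n + k] * U[k * n + j];  /* subtract (L*U)(i,j) */
        val[e] = sum;
    }
}
```

Only the nnz listed locations are touched, so the cost scales with the size of the pattern of R, not with n squared.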
magma_int_t magma_cvalinit_gpu ( magma_int_t  num_el,
magmaFloatComplex_ptr  dval,
magma_queue_t  queue 
)

Initializes a device array with zero.

Parameters
[in]      num_el  magma_int_t size of array
[in,out]  dval    magmaFloatComplex_ptr array to initialize
[in]      queue   magma_queue_t Queue to execute in.
magma_int_t magma_cindexinit_gpu ( magma_int_t  num_el,
magmaIndex_ptr  dind,
magma_queue_t  queue 
)

Initializes a device array with zero.

Parameters
[in]      num_el  magma_int_t size of array
[in,out]  dind    magmaIndex_ptr array to initialize
[in]      queue   magma_queue_t Queue to execute in.