MAGMA
1.6.3
Matrix Algebra for GPU and Multicore Architectures
Functions | |
magma_int_t | magma_cbajac_csr (magma_int_t localiters, magma_c_matrix D, magma_c_matrix R, magma_c_matrix b, magma_c_matrix *x, magma_queue_t queue) |
This routine is a block-asynchronous Jacobi iteration that performs localiters local Jacobi updates within each block. More... | |
magma_int_t | magma_ccompact (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloat_ptr dnorms, float tol, magmaInt_ptr active, magmaInt_ptr cBlock, magma_queue_t queue) |
CCOMPACT takes a set of n vectors of size m (in dA) and their norms, and compacts them into the cBlock <= n vectors whose norms are > tol. More... | |
magma_int_t | magma_citeric_csr (magma_c_matrix A, magma_c_matrix A_CSR, magma_queue_t queue) |
This routine iteratively computes an incomplete Cholesky factorization. More... | |
magma_int_t | magma_citerilu_csr (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_queue_t queue) |
This routine iteratively computes an incomplete LU factorization. More... | |
magma_int_t | magma_cjacobisetup_vector_gpu (int num_rows, magma_c_matrix b, magma_c_matrix d, magma_c_matrix c, magma_c_matrix *x, magma_queue_t queue) |
Prepares the Jacobi iteration x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, written as x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U). More... | |
magma_int_t | magma_clobpcg_maxpy (magma_int_t num_rows, magma_int_t num_vecs, magmaFloatComplex_ptr X, magmaFloatComplex_ptr Y, magma_queue_t queue) |
This routine computes an axpy for an m-by-n matrix: More... | |
int | magma_cbicgmerge1 (int n, magmaFloatComplex_ptr skp, magmaFloatComplex_ptr v, magmaFloatComplex_ptr r, magmaFloatComplex_ptr p, magma_queue_t queue) |
Merges multiple operations into one kernel: More... | |
int | magma_cbicgmerge2 (int n, magmaFloatComplex_ptr skp, magmaFloatComplex_ptr r, magmaFloatComplex_ptr v, magmaFloatComplex_ptr s, magma_queue_t queue) |
Merges multiple operations into one kernel: More... | |
int | magma_cbicgmerge3 (int n, magmaFloatComplex_ptr skp, magmaFloatComplex_ptr p, magmaFloatComplex_ptr s, magmaFloatComplex_ptr t, magmaFloatComplex_ptr x, magmaFloatComplex_ptr r, magma_queue_t queue) |
Merges multiple operations into one kernel: More... | |
int | magma_cbicgmerge4 (int type, magmaFloatComplex_ptr skp) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU. More... | |
magma_int_t | magma_cbicgmerge_spmv1 (magma_c_matrix A, magmaFloatComplex_ptr d1, magmaFloatComplex_ptr d2, magmaFloatComplex_ptr dp, magmaFloatComplex_ptr dr, magmaFloatComplex_ptr dv, magmaFloatComplex_ptr skp, magma_queue_t queue) |
Merges the first SpmV using CSR with the dot product and the computation of alpha. More... | |
magma_int_t | magma_cbicgmerge_spmv2 (magma_c_matrix A, magmaFloatComplex_ptr d1, magmaFloatComplex_ptr d2, magmaFloatComplex_ptr ds, magmaFloatComplex_ptr dt, magmaFloatComplex_ptr skp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. More... | |
magma_int_t | magma_cbicgmerge_xrbeta (int n, magmaFloatComplex_ptr d1, magmaFloatComplex_ptr d2, magmaFloatComplex_ptr rr, magmaFloatComplex_ptr r, magmaFloatComplex_ptr p, magmaFloatComplex_ptr s, magmaFloatComplex_ptr t, magmaFloatComplex_ptr x, magmaFloatComplex_ptr skp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. More... | |
magma_int_t | magma_ccgmerge_spmv1 (magma_c_matrix A, magmaFloatComplex_ptr d1, magmaFloatComplex_ptr d2, magmaFloatComplex_ptr dd, magmaFloatComplex_ptr dz, magmaFloatComplex_ptr skp, magma_queue_t queue) |
Merges the first SpmV using different formats with the dot product and the computation of rho. More... | |
magma_int_t | magma_cidr_smoothing_1 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex_ptr drs, magmaFloatComplex_ptr dr, magmaFloatComplex_ptr dt, magma_queue_t queue) |
Merges multiple operations into one kernel: More... | |
magma_int_t | magma_cidr_smoothing_2 (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex omega, magmaFloatComplex_ptr dx, magmaFloatComplex_ptr dxs, magma_queue_t queue) |
Merges multiple operations into one kernel: More... | |
magma_int_t magma_cbajac_csr | ( | magma_int_t | localiters, |
magma_c_matrix | D, | ||
magma_c_matrix | R, | ||
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_queue_t | queue | ||
) |
This routine is a block-asynchronous Jacobi iteration that performs localiters local Jacobi updates within each block.
Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.
[in] | localiters | magma_int_t number of local Jacobi-like updates |
[in] | D | magma_c_matrix input matrix with diagonal blocks |
[in] | R | magma_c_matrix input matrix with non-diagonal parts |
[in] | b | magma_c_matrix RHS |
[in,out] | x | magma_c_matrix* iterate/solution |
[in] | queue | magma_queue_t Queue to execute in. |
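The scheme described above can be sketched on the CPU as follows. This is a minimal illustration, not MAGMA's kernel: `csr_t`, `ref_block_jacobi_sweep`, `MAX_BLOCK`, and the `block_size` parameter are hypothetical names introduced here, and the blocks are visited one after another, whereas the GPU routine updates them asynchronously.

```c
#include <assert.h>
#include <complex.h>

#define MAX_BLOCK 64  /* illustrative upper bound on the block size */

/* Hypothetical minimal CSR container; not MAGMA's magma_c_matrix. */
typedef struct {
    int n;                     /* number of rows */
    const int *row;            /* row pointers, length n + 1 */
    const int *col;            /* column indices */
    const float complex *val;  /* nonzero values */
} csr_t;

/* One sweep over all blocks. D holds the diagonal blocks (including the
   main diagonal), R holds everything else. Blocks are visited one after
   another here; a GPU would process them concurrently. */
void ref_block_jacobi_sweep(int localiters, int block_size,
                            const csr_t *D, const csr_t *R,
                            const float complex *b, float complex *x)
{
    assert(block_size > 0 && block_size <= MAX_BLOCK);
    assert(D->n == R->n);
    for (int start = 0; start < D->n; start += block_size) {
        int end = start + block_size < D->n ? start + block_size : D->n;
        float complex rhs[MAX_BLOCK], xnew[MAX_BLOCK];

        /* Freeze the off-block contribution: rhs = b - R * x. */
        for (int i = start; i < end; ++i) {
            float complex acc = b[i];
            for (int k = R->row[i]; k < R->row[i + 1]; ++k)
                acc -= R->val[k] * x[R->col[k]];
            rhs[i - start] = acc;
        }

        /* localiters Jacobi updates that only touch this block. */
        for (int it = 0; it < localiters; ++it) {
            for (int i = start; i < end; ++i) {
                float complex acc = rhs[i - start];
                float complex diag = 1.0f;  /* used only if row i has no diagonal entry */
                for (int k = D->row[i]; k < D->row[i + 1]; ++k) {
                    if (D->col[k] == i)
                        diag = D->val[k];
                    else
                        acc -= D->val[k] * x[D->col[k]];
                }
                xnew[i - start] = acc / diag;
            }
            for (int i = start; i < end; ++i)
                x[i] = xnew[i - start];
        }
    }
}
```

Keeping `rhs` fixed during the local updates is what makes them cheap: only the entries of D inside the block are read again.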
int magma_cbicgmerge1 | ( | int | n, |
magmaFloatComplex_ptr | skp, | ||
magmaFloatComplex_ptr | v, | ||
magmaFloatComplex_ptr | r, | ||
magmaFloatComplex_ptr | p, | ||
magma_queue_t | queue | ||
) |
Merges multiple operations into one kernel:
p = beta*p; p = p - omega*beta*v; p = p + r
-> p = r + beta * ( p - omega * v )
[in] | n | int dimension n |
[in] | skp | magmaFloatComplex_ptr set of scalar parameters |
[in] | v | magmaFloatComplex_ptr input vector v |
[in] | r | magmaFloatComplex_ptr input vector r |
[in,out] | p | magmaFloatComplex_ptr input/output vector p |
[in] | queue | magma_queue_t queue to execute in. |
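As a plain-C reference for the fused update, the sketch below applies p = r + beta * (p - omega * v) in a single pass. The function name `ref_bicg_update_p` is hypothetical, and beta and omega are passed directly as an assumption, because the layout of the skp array is not stated in this section.

```c
#include <assert.h>
#include <complex.h>

/* CPU reference for the fused update p = r + beta * (p - omega * v).
   beta and omega are explicit arguments here (an assumption); the GPU
   routine reads its scalars from the skp array instead. */
void ref_bicg_update_p(int n, float complex beta, float complex omega,
                       const float complex *v, const float complex *r,
                       float complex *p)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        p[i] = r[i] + beta * (p[i] - omega * v[i]);
}
```

A single pass reads and writes p once, where the three separate vector operations would each traverse it.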
int magma_cbicgmerge2 | ( | int | n, |
magmaFloatComplex_ptr | skp, | ||
magmaFloatComplex_ptr | r, | ||
magmaFloatComplex_ptr | v, | ||
magmaFloatComplex_ptr | s, | ||
magma_queue_t | queue | ||
) |
Merges multiple operations into one kernel:
s = r; s = s - alpha*v
-> s = r - alpha * v
[in] | n | int dimension n |
[in] | skp | magmaFloatComplex_ptr set of scalar parameters |
[in] | r | magmaFloatComplex_ptr input vector r |
[in] | v | magmaFloatComplex_ptr input vector v |
[out] | s | magmaFloatComplex_ptr output vector s |
[in] | queue | magma_queue_t queue to execute in. |
int magma_cbicgmerge3 | ( | int | n, |
magmaFloatComplex_ptr | skp, | ||
magmaFloatComplex_ptr | p, | ||
magmaFloatComplex_ptr | s, | ||
magmaFloatComplex_ptr | t, | ||
magmaFloatComplex_ptr | x, | ||
magmaFloatComplex_ptr | r, | ||
magma_queue_t | queue | ||
) |
Merges multiple operations into one kernel:
x = x + alpha*p; x = x + omega*s; r = s; r = r - omega*t
-> x = x + alpha * p + omega * s; r = s - omega * t
[in] | n | int dimension n |
[in] | skp | magmaFloatComplex_ptr set of scalar parameters |
[in] | p | magmaFloatComplex_ptr input p |
[in] | s | magmaFloatComplex_ptr input s |
[in] | t | magmaFloatComplex_ptr input t |
[in,out] | x | magmaFloatComplex_ptr input/output x |
[in,out] | r | magmaFloatComplex_ptr input/output r |
[in] | queue | magma_queue_t Queue to execute in. |
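The two merged updates can be written as one loop on the CPU. This is an illustrative sketch only: `ref_bicg_update_xr` is a hypothetical name, and alpha and omega are passed as explicit arguments (an assumption) instead of being read from skp.

```c
#include <assert.h>
#include <complex.h>

/* CPU reference for the fused updates
     x = x + alpha * p + omega * s
     r = s - omega * t
   alpha and omega are explicit arguments here (an assumption). */
void ref_bicg_update_xr(int n, float complex alpha, float complex omega,
                        const float complex *p, const float complex *s,
                        const float complex *t,
                        float complex *x, float complex *r)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        x[i] = x[i] + alpha * p[i] + omega * s[i];
        r[i] = s[i] - omega * t[i];
    }
}
```

Note that the old contents of r never enter the result, since the first step of the unfused sequence overwrites r with s.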
int magma_cbicgmerge4 | ( | int | type, |
magmaFloatComplex_ptr | skp | ||
) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU.
[in] | type | int kernel type |
[in,out] | skp | magmaFloatComplex_ptr vector with parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbicgmerge_spmv1 | ( | magma_c_matrix | A, |
magmaFloatComplex_ptr | d1, | ||
magmaFloatComplex_ptr | d2, | ||
magmaFloatComplex_ptr | dp, | ||
magmaFloatComplex_ptr | dr, | ||
magmaFloatComplex_ptr | dv, | ||
magmaFloatComplex_ptr | skp, | ||
magma_queue_t | queue | ||
) |
Merges the first SpmV using CSR with the dot product and the computation of alpha.
[in] | A | magma_c_matrix system matrix |
[in] | d1 | magmaFloatComplex_ptr temporary vector |
[in] | d2 | magmaFloatComplex_ptr temporary vector |
[in] | dp | magmaFloatComplex_ptr input vector p |
[in] | dr | magmaFloatComplex_ptr input vector r |
[out] | dv | magmaFloatComplex_ptr output vector v |
[in,out] | skp | magmaFloatComplex_ptr array for parameters ( skp[0]=alpha ) |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbicgmerge_spmv2 | ( | magma_c_matrix | A, |
magmaFloatComplex_ptr | d1, | ||
magmaFloatComplex_ptr | d2, | ||
magmaFloatComplex_ptr | ds, | ||
magmaFloatComplex_ptr | dt, | ||
magmaFloatComplex_ptr | skp, | ||
magma_queue_t | queue | ||
) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | A | magma_c_matrix input matrix |
[in] | d1 | magmaFloatComplex_ptr temporary vector |
[in] | d2 | magmaFloatComplex_ptr temporary vector |
[in] | ds | magmaFloatComplex_ptr input vector s |
[out] | dt | magmaFloatComplex_ptr output vector t |
[in,out] | skp | magmaFloatComplex_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbicgmerge_xrbeta | ( | int | n, |
magmaFloatComplex_ptr | d1, | ||
magmaFloatComplex_ptr | d2, | ||
magmaFloatComplex_ptr | rr, | ||
magmaFloatComplex_ptr | r, | ||
magmaFloatComplex_ptr | p, | ||
magmaFloatComplex_ptr | s, | ||
magmaFloatComplex_ptr | t, | ||
magmaFloatComplex_ptr | x, | ||
magmaFloatComplex_ptr | skp, | ||
magma_queue_t | queue | ||
) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | n | int dimension n |
[in] | d1 | magmaFloatComplex_ptr temporary vector |
[in] | d2 | magmaFloatComplex_ptr temporary vector |
[in] | rr | magmaFloatComplex_ptr input vector rr |
[in,out] | r | magmaFloatComplex_ptr input/output vector r |
[in] | p | magmaFloatComplex_ptr input vector p |
[in] | s | magmaFloatComplex_ptr input vector s |
[in] | t | magmaFloatComplex_ptr input vector t |
[out] | x | magmaFloatComplex_ptr output vector x |
[in] | skp | magmaFloatComplex_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ccgmerge_spmv1 | ( | magma_c_matrix | A, |
magmaFloatComplex_ptr | d1, | ||
magmaFloatComplex_ptr | d2, | ||
magmaFloatComplex_ptr | dd, | ||
magmaFloatComplex_ptr | dz, | ||
magmaFloatComplex_ptr | skp, | ||
magma_queue_t | queue | ||
) |
Merges the first SpmV using different formats with the dot product and the computation of rho.
[in] | A | magma_c_matrix input matrix |
[in] | d1 | magmaFloatComplex_ptr temporary vector |
[in] | d2 | magmaFloatComplex_ptr temporary vector |
[in] | dd | magmaFloatComplex_ptr input vector d |
[out] | dz | magmaFloatComplex_ptr output vector z |
[out] | skp | magmaFloatComplex_ptr array for parameters ( skp[3]=rho ) |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ccompact | ( | magma_int_t | m, |
magma_int_t | n, | ||
magmaFloatComplex_ptr | dA, | ||
magma_int_t | ldda, | ||
magmaFloat_ptr | dnorms, | ||
float | tol, | ||
magmaInt_ptr | active, | ||
magmaInt_ptr | cBlock, | ||
magma_queue_t | queue | ||
) |
CCOMPACT takes a set of n vectors of size m (in dA) and their norms, and compacts them into the cBlock <= n vectors whose norms are > tol.
The active mask array has 1 or 0, showing if a vector remained or not in the compacted resulting set of vectors.
[in] | m | INTEGER The number of rows of the matrix dA. M >= 0. |
[in] | n | INTEGER The number of columns of the matrix dA. N >= 0. |
[in,out] | dA | COMPLEX array, dimension (LDDA,N) The m by n matrix dA. |
[in] | ldda | INTEGER The leading dimension of the array dA. LDDA >= max(1,M). |
[in] | dnorms | REAL array, dimension N The norms of the N vectors in dA |
[in] | tol | REAL The tolerance value used in the criterion to compact or not. |
[in,out] | active | INTEGER array, dimension N A mask of 1s and 0s showing if a vector remains or has been removed |
[in,out] | cBlock | magmaInt_ptr The number of vectors that remain in dA (i.e., with norms > tol). |
[in] | queue | magma_queue_t Queue to execute in. |
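A sequential sketch of the compaction may help to fix the semantics. It assumes column-major storage with leading dimension `lda`, keeps the surviving columns in their original order, and leaves the norms array untouched; `ref_compact` is a hypothetical name, and the ordering of the kept vectors is an assumption not stated above.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* Keep only the columns of the m-by-n column-major matrix A whose norm
   exceeds tol, moving them to the front. active[j] records whether
   column j was kept; *cblock receives the number of kept columns. */
void ref_compact(int m, int n, float complex *A, int lda,
                 const float *norms, float tol,
                 int *active, int *cblock)
{
    assert(m >= 0 && n >= 0 && lda >= (m > 1 ? m : 1));
    int kept = 0;
    for (int j = 0; j < n; ++j) {
        active[j] = norms[j] > tol ? 1 : 0;
        if (!active[j])
            continue;
        if (kept != j)  /* source and destination columns never overlap */
            memcpy(&A[(size_t)kept * lda], &A[(size_t)j * lda],
                   (size_t)m * sizeof *A);
        ++kept;
    }
    *cblock = kept;
}
```

For example, with norms {5, 0.1, 7} and tol = 1 the middle column is dropped, the third column moves into the second position, and the mask becomes {1, 0, 1}.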
magma_int_t magma_cidr_smoothing_1 | ( | magma_int_t | num_rows, |
magma_int_t | num_cols, | ||
magmaFloatComplex_ptr | drs, | ||
magmaFloatComplex_ptr | dr, | ||
magmaFloatComplex_ptr | dt, | ||
magma_queue_t | queue | ||
) |
Merges multiple operations into one kernel:
dt = drs - dr
[in] | num_rows | magma_int_t dimension m |
[in] | num_cols | magma_int_t dimension n |
[in] | drs | magmaFloatComplex_ptr vector |
[in] | dr | magmaFloatComplex_ptr vector |
[in,out] | dt | magmaFloatComplex_ptr vector |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cidr_smoothing_2 | ( | magma_int_t | num_rows, |
magma_int_t | num_cols, | ||
magmaFloatComplex | omega, | ||
magmaFloatComplex_ptr | dx, | ||
magmaFloatComplex_ptr | dxs, | ||
magma_queue_t | queue | ||
) |
Merges multiple operations into one kernel:
dxs = dxs - omega*(dxs-dx)
[in] | num_rows | magma_int_t dimension m |
[in] | num_cols | magma_int_t dimension n |
[in] | omega | magmaFloatComplex scalar |
[in] | dx | magmaFloatComplex_ptr vector |
[in,out] | dxs | magmaFloatComplex_ptr vector |
[in] | queue | magma_queue_t Queue to execute in. |
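Both IDR smoothing kernels are elementwise, so a CPU reference is short. The sketch below assumes the vectors are stored contiguously as num_rows * num_cols entries and that the scalar multiplying (dxs - dx) is the `omega` argument; both points are assumptions, and the function names are hypothetical.

```c
#include <assert.h>
#include <complex.h>

/* dt = drs - dr over num_rows * num_cols contiguous entries. */
void ref_idr_smoothing_1(int num_rows, int num_cols,
                         const float complex *drs, const float complex *dr,
                         float complex *dt)
{
    assert(num_rows >= 0 && num_cols >= 0);
    for (int i = 0; i < num_rows * num_cols; ++i)
        dt[i] = drs[i] - dr[i];
}

/* dxs = dxs - omega * (dxs - dx) over the same layout. */
void ref_idr_smoothing_2(int num_rows, int num_cols, float complex omega,
                         const float complex *dx, float complex *dxs)
{
    assert(num_rows >= 0 && num_cols >= 0);
    for (int i = 0; i < num_rows * num_cols; ++i)
        dxs[i] = dxs[i] - omega * (dxs[i] - dx[i]);
}
```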
magma_int_t magma_citeric_csr | ( | magma_c_matrix | A, |
magma_c_matrix | A_CSR, | ||
magma_queue_t | queue | ||
) |
This routine iteratively computes an incomplete Cholesky factorization.
The idea is according to Edmond Chow's presentation at SIAM 2014. This routine was used in the ISC 2015 paper: E. Chow et al.: 'Study of an Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs'
The input format of the initial guess matrix A is Magma_CSRCOO, A_CSR is CSR or CSRCOO format.
[in] | A | magma_c_matrix input matrix A - initial guess (lower triangular) |
[in,out] | A_CSR | magma_c_matrix input/output matrix containing the IC approximation |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_citerilu_csr | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_queue_t | queue | ||
) |
This routine iteratively computes an incomplete LU factorization.
The idea is according to Edmond Chow's presentation at SIAM 2014. This routine was used in the ISC 2015 paper: E. Chow et al.: 'Study of an Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs'
The input format of the matrix is Magma_CSRCOO for the upper and lower triangular parts. Note however, that we flip col and rowidx for the U-part. Every component of L and U is handled by one thread.
[in] | A | magma_c_matrix input matrix A determining initial guess & processing order |
[in,out] | L | magma_c_matrix input/output matrix L containing the ILU approximation |
[in,out] | U | magma_c_matrix input/output matrix U containing the ILU approximation |
[in] | queue | magma_queue_t Queue to execute in. |
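To make the idea of an iterative ILU concrete, here is a minimal sketch of one fixed-point sweep, assuming the commonly described update form: l_ij = (a_ij - sum_{k<j} l_ik * u_kj) / u_jj below the diagonal and u_ij = a_ij - sum_{k<i} l_ik * u_kj on and above it. Dense row-major storage, the use of nonzero a_ij as the sparsity pattern, the unit diagonal of L, and the name `ref_iterilu_sweep` are all simplifications introduced for illustration; the routine itself works on Magma_CSRCOO data with one thread per component.

```c
#include <assert.h>
#include <complex.h>

/* One fixed-point sweep over dense row-major n-by-n arrays.
   Entries with a[i][j] == 0 are treated as outside the sparsity pattern.
   L is assumed to carry a unit diagonal that is never updated. */
void ref_iterilu_sweep(int n, const float complex *a,
                       float complex *l, float complex *u)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (a[i * n + j] == 0.0f)
                continue;
            int kmax = i < j ? i : j;  /* sum runs over k < min(i, j) */
            float complex s = a[i * n + j];
            for (int k = 0; k < kmax; ++k)
                s -= l[i * n + k] * u[k * n + j];
            if (i > j)
                l[i * n + j] = s / u[j * n + j];  /* strictly lower part */
            else
                u[i * n + j] = s;                 /* diagonal and upper part */
        }
    }
}
```

Starting from L and U taken from the lower and upper triangles of A, repeated sweeps refine the factors; on a GPU the entries would be updated concurrently using whatever neighbouring values are currently available.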
magma_int_t magma_cjacobisetup_vector_gpu | ( | int | num_rows, |
magma_c_matrix | b, | ||
magma_c_matrix | d, | ||
magma_c_matrix | c, | ||
magma_c_matrix * | x, | ||
magma_queue_t | queue | ||
) |
Prepares the Jacobi iteration x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, written as x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U).
Returns the vector c. It calls a GPU kernel.
[in] | num_rows | magma_int_t number of rows |
[in] | b | magma_c_matrix RHS b |
[in] | d | magma_c_matrix vector with diagonal entries |
[out] | c | magma_c_matrix* c = D^(-1) * b |
[out] | x | magma_c_matrix* iteration vector |
[in] | queue | magma_queue_t Queue to execute in. |
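The setup step amounts to an elementwise division by the diagonal. The sketch below is a CPU illustration under stated assumptions: `ref_jacobi_setup` and `ref_jacobi_step` are hypothetical names, initialising x to c is an assumption, and the iteration step uses a dense row-major matrix purely to show how c is consumed afterwards.

```c
#include <assert.h>
#include <complex.h>

/* c = D^(-1) * b, with d holding the diagonal entries of the matrix.
   x is initialised to c here, which is an assumption of this sketch. */
void ref_jacobi_setup(int num_rows, const float complex *b,
                      const float complex *d,
                      float complex *c, float complex *x)
{
    assert(num_rows >= 0);
    for (int i = 0; i < num_rows; ++i) {
        c[i] = b[i] / d[i];
        x[i] = c[i];
    }
}

/* One Jacobi step x_new = c - D^(-1) * (L+U) * x for a dense row-major
   n-by-n matrix A, shown only to illustrate the role of c. */
void ref_jacobi_step(int n, const float complex *A, const float complex *d,
                     const float complex *c, const float complex *x,
                     float complex *x_new)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        float complex off = 0.0f;
        for (int j = 0; j < n; ++j)
            if (j != i)
                off += A[i * n + j] * x[j];
        x_new[i] = c[i] - off / d[i];
    }
}
```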
magma_int_t magma_clobpcg_maxpy | ( | magma_int_t | num_rows, |
magma_int_t | num_vecs, | ||
magmaFloatComplex_ptr | X, | ||
magmaFloatComplex_ptr | Y, | ||
magma_queue_t | queue | ||
) |
This routine computes an axpy for an m-by-n matrix:
Y = X + Y
It replaces: magma_caxpy(m*n, c_one, Y, 1, X, 1);
    / x1[0] x2[0] x3[0] \
    | x1[1] x2[1] x3[1] |
X = | x1[2] x2[2] x3[2] | = x1[0] x1[1] x1[2] x1[3] x1[4] x2[0] x2[1] .
    | x1[3] x2[3] x3[3] |
    \ x1[4] x2[4] x3[4] /
[in] | num_rows | magma_int_t number of rows |
[in] | num_vecs | magma_int_t number of vectors |
[in] | X | magmaFloatComplex_ptr input vector X |
[in,out] | Y | magmaFloatComplex_ptr input/output vector Y |
[in] | queue | magma_queue_t Queue to execute in. |
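Because the vectors are stored one after another (column by column), the update Y = X + Y reduces to a single loop over num_rows * num_vecs entries. A minimal CPU sketch, with the hypothetical name `ref_lobpcg_maxpy`:

```c
#include <assert.h>
#include <complex.h>

/* Y = X + Y for num_vecs vectors of length num_rows stored back to back,
   so entry i of vector j lives at index j * num_rows + i. */
void ref_lobpcg_maxpy(int num_rows, int num_vecs,
                      const float complex *X, float complex *Y)
{
    assert(num_rows >= 0 && num_vecs >= 0);
    for (int j = 0; j < num_vecs; ++j)
        for (int i = 0; i < num_rows; ++i)
            Y[j * num_rows + i] += X[j * num_rows + i];
}
```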