PLASMA
2.4.5
PLASMA - Parallel Linear Algebra for Scalable Multi-core Architectures
Macros | |
#define | A(m, n) BLKADDR(A, PLASMA_Complex64_t, m, n) |
Functions | |
void | CORE_ztrdalg_v2 (PLASMA_enum uplo, PLASMA_desc *pA, PLASMA_Complex64_t *V, PLASMA_Complex64_t *TAU, int grsiz, int lcsweep, int id, int blksweep) |
void | QUARK_CORE_ztrdalg_v2 (Quark *quark, Quark_Task_Flags *task_flags, int uplo, PLASMA_desc *pA, PLASMA_Complex64_t *V, PLASMA_Complex64_t *TAU, int grsiz, int lcsweep, int id, int blksweep) |
void | CORE_ztrdalg_v2_quark (Quark *quark) |
PLASMA core_blas kernel. PLASMA is a software package provided by Univ. of Tennessee, Univ. of California Berkeley and Univ. of Colorado Denver.
Definition in file core_ztrdalg_v2.c.
#define A(m, n) BLKADDR(A, PLASMA_Complex64_t, m, n)
Definition at line 126 of file core_ztrdalg_v2.c.
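The A(m, n) macro resolves to the address of tile (m, n) of the matrix. As a rough illustration of what such a lookup involves, the sketch below computes a tile address under a deliberately simplified layout: all tiles are full-sized and stored contiguously in column-major order of tiles. The names toy_desc and toy_tile_addr are hypothetical, and double complex stands in for PLASMA_Complex64_t; the real BLKADDR works on a PLASMA_desc and may handle offsets and partial tiles differently.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical, simplified tile descriptor; this is NOT PLASMA_desc. */
typedef struct {
    double complex *mat; /* base address of the tile storage */
    int mb, nb;          /* rows and columns per tile */
    int mt, nt;          /* number of tile rows and tile columns */
} toy_desc;

/* Address of the first element of tile (m, n).
   Assumption: every tile is full (mb x nb) and tiles are stored one after
   another in column-major order of tiles. */
double complex *toy_tile_addr(const toy_desc *d, int m, int n)
{
    assert(m >= 0 && m < d->mt && n >= 0 && n < d->nt);
    size_t tile_elems = (size_t)d->mb * (size_t)d->nb;
    return d->mat + ((size_t)n * (size_t)d->mt + (size_t)m) * tile_elems;
}
```

Under these assumptions, a 3 x 3 grid of 2 x 2 tiles places tile (1, 2) at an offset of (2 * 3 + 1) * 4 = 28 elements from the base address.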
void CORE_ztrdalg_v2 (PLASMA_enum uplo, PLASMA_desc *pA, PLASMA_Complex64_t *V, PLASMA_Complex64_t *TAU, int grsiz, int lcsweep, int id, int blksweep)
CORE_ztrdalg_v2 is part of the tridiagonal reduction algorithm (bulge chasing). It corresponds to a local driver of the kernels that should be executed on a single core.
[in] | uplo |
|
[in] | N | The order of the matrix A. N >= 0. |
[in] | NB | The bandwidth of the matrix A, which corresponds to the tile size. NB >= 0. |
[in] | pA | A pointer to the descriptor of the matrix A. |
[out] | V | PLASMA_Complex64_t array, dimension (N). The scalar elementary reflectors are written to this array, so it serves as a workspace for V at each step of the bulge chasing algorithm. |
[out] | TAU | PLASMA_Complex64_t array, dimension (N). The scalar factors of the elementary reflectors are written to this array, so it serves as a workspace for TAU at each step of the bulge chasing algorithm. |
[in] | i | Integer that refers to the current sweep (outer loop). |
[in] | j | Integer that refers to the sweep to chase (inner loop). |
[in] | m | Integer that refers to a sweep step, to ensure order dependencies. |
[in] | grsiz | Integer that refers to the size of a group. A group is the number of kernels that should be executed sequentially on the same core. The group size is a trade-off between locality (cache reuse) and parallelism: a small group size increases parallelism, while a large group size increases cache reuse. |
PLASMA_SUCCESS | successful exit |
<0 | if -i, the i-th argument had an illegal value |
Definition at line 82 of file core_ztrdalg_v2.c.
References A, CORE_zhbelr(), CORE_zhblrx(), CORE_zhbrce(), plasma_desc_t::dtyp, plasma_desc_t::m, plasma_desc_t::mb, min, plasma_desc_t::nt, and plasma_element_size().
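The grsiz trade-off described above can be made concrete with a little arithmetic. The sketch below is only an illustration, not the loop structure of CORE_ztrdalg_v2: it assumes a hypothetical flat list of nkernels kernels that is cut into groups of grsiz consecutive kernels, each group running sequentially on one core. The function names toy_group_count and toy_group_range are made up for this example.

```c
#include <assert.h>

/* Number of groups needed to cover nkernels kernels with grsiz kernels per
   group (ceiling division). Fewer groups means fewer independent units of
   work, but more consecutive kernels sharing one core's cache. */
int toy_group_count(int nkernels, int grsiz)
{
    assert(nkernels >= 0 && grsiz > 0);
    return (nkernels + grsiz - 1) / grsiz;
}

/* Half-open range [first, last) of kernel indices that belong to group g.
   The final group may hold fewer than grsiz kernels. */
void toy_group_range(int g, int nkernels, int grsiz, int *first, int *last)
{
    assert(g >= 0 && g < toy_group_count(nkernels, grsiz));
    *first = g * grsiz;
    *last = (*first + grsiz < nkernels) ? *first + grsiz : nkernels;
}
```

With 10 kernels, grsiz = 1 yields 10 single-kernel groups (maximum parallelism), grsiz = 4 yields 3 groups, and grsiz = 10 yields one group that runs entirely on a single core (maximum cache reuse).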
void CORE_ztrdalg_v2_quark (Quark *quark)
Definition at line 175 of file core_ztrdalg_v2.c.
References CORE_ztrdalg_v2(), quark_unpack_args_8, TAU, uplo, and V.
void QUARK_CORE_ztrdalg_v2 (Quark *quark, Quark_Task_Flags *task_flags, int uplo, PLASMA_desc *pA, PLASMA_Complex64_t *V, PLASMA_Complex64_t *TAU, int grsiz, int lcsweep, int id, int blksweep)
Definition at line 127 of file core_ztrdalg_v2.c.
References A, CORE_ztrdalg_v2_quark(), INOUT, NODEP, plasma_desc_t::nt, QUARK_Insert_Task_Packed(), QUARK_Task_Init(), QUARK_Task_Pack_Arg(), and VALUE.
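The three functions in this file follow a common split: a plain kernel (CORE_ztrdalg_v2), an insertion routine that packs the arguments into a task (QUARK_CORE_ztrdalg_v2), and a task wrapper that unpacks them and calls the kernel (CORE_ztrdalg_v2_quark). The sketch below mimics that split with a toy task structure so the flow of arguments is visible. Every name in it (toy_task, toy_kernel, toy_insert_kernel, toy_kernel_task, toy_run) is hypothetical; it does not use or reproduce the QUARK API, and it ignores dependency tracking entirely.

```c
#include <assert.h>

/* Hypothetical task record; a real runtime stores arguments generically. */
typedef struct toy_task {
    void (*func)(struct toy_task *); /* wrapper to call when the task runs */
    int values[4];                   /* by-value arguments, copied at insertion */
    long *out;                       /* pointer argument, shared with the caller */
} toy_task;

/* 1. Kernel: takes plain arguments and knows nothing about the runtime. */
void toy_kernel(int grsiz, int lcsweep, int id, int blksweep, long *out)
{
    assert(out != 0);
    *out = (long)grsiz * 1000 + lcsweep * 100 + id * 10 + blksweep;
}

/* 3. Task wrapper: unpacks the stored arguments and calls the kernel. */
void toy_kernel_task(toy_task *t)
{
    toy_kernel(t->values[0], t->values[1], t->values[2], t->values[3], t->out);
}

/* 2. Insertion routine: packs the arguments and names the wrapper. */
toy_task toy_insert_kernel(int grsiz, int lcsweep, int id, int blksweep, long *out)
{
    toy_task t;
    t.func = toy_kernel_task;
    t.values[0] = grsiz;
    t.values[1] = lcsweep;
    t.values[2] = id;
    t.values[3] = blksweep;
    t.out = out;
    return t;
}

/* Stand-in for the scheduler: executes the task immediately. */
void toy_run(toy_task *t)
{
    t->func(t);
}
```

Keeping the kernel free of runtime types means it can also be called directly, outside any task scheduler, which is the usual reason for this three-way split.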