PLASMA
2.4.5
PLASMA - Parallel Linear Algebra for Scalable Multi-core Architectures
|
Go to the source code of this file.
Macros | |
#define | REAL |
Functions | |
int | CORE_stsmlq_sytra1 (int side, int trans, int m1, int n1, int m2, int n2, int k, int ib, float *A1, int lda1, float *A2, int lda2, float *V, int ldv, float *T, int ldt, float *WORK, int ldwork) |
void | QUARK_CORE_stsmlq_sytra1 (Quark *quark, Quark_Task_Flags *task_flags, int side, int trans, int m1, int n1, int m2, int n2, int k, int ib, int nb, float *A1, int lda1, float *A2, int lda2, float *V, int ldv, float *T, int ldt) |
void | CORE_stsmlq_sytra1_quark (Quark *quark) |
PLASMA core_blas kernel PLASMA is a software package provided by Univ. of Tennessee, Univ. of California Berkeley and Univ. of Colorado Denver
Definition in file core_stsmlq_sytra1.c.
#define REAL |
Definition at line 20 of file core_stsmlq_sytra1.c.
int CORE_stsmlq_sytra1 | ( | int | side, |
int | trans, | ||
int | m1, | ||
int | n1, | ||
int | m2, | ||
int | n2, | ||
int | k, | ||
int | ib, | ||
float * | A1, | ||
int | lda1, | ||
float * | A2, | ||
int | lda2, | ||
float * | V, | ||
int | ldv, | ||
float * | T, | ||
int | ldt, | ||
float * | WORK, | ||
int | ldwork | ||
) |
CORE_stsmlq_sytra1: see CORE_stsmlq
This kernel applies a Right transformation on | A1' A2 | and does not handle the transpose of A1. Needs therefore to make the explicit transpose of A1 before and after the application of the block of reflectors Can be further optimized by changing accordingly the underneath kernel ztsrfb!
[in] | side |
|
[in] | trans |
|
[in] | M1 | The number of rows of the tile A1. M1 >= 0. |
[in] | N1 | The number of columns of the tile A1. N1 >= 0. |
[in] | M2 | The number of rows of the tile A2. M2 >= 0. M2 = M1 if side == PlasmaRight. |
[in] | N2 | The number of columns of the tile A2. N2 >= 0. N2 = N1 if side == PlasmaLeft. |
[in] | K | The number of elementary reflectors whose product defines the matrix Q. |
[in] | IB | The inner-blocking size. IB >= 0. |
[in,out] | A1 | On entry, the M1-by-N1 tile A1. On exit, A1 is overwritten by the application of Q. |
[in] | LDA1 | The leading dimension of the array A1. LDA1 >= max(1,M1). |
[in,out] | A2 | On entry, the M2-by-N2 tile A2. On exit, A2 is overwritten by the application of Q. |
[in] | LDA2 | The leading dimension of the tile A2. LDA2 >= max(1,M2). |
[in] | V | The i-th row must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CORE_STSLQT in the first k rows of its array argument V. |
[in] | LDV | The leading dimension of the array V. LDV >= max(1,K). |
[out] | T | The IB-by-N1 triangular factor T of the block reflector. T is upper triangular by block (economic storage); The rest of the array is not referenced. |
[in] | LDT | The leading dimension of the array T. LDT >= IB. |
[out] | WORK | Workspace array of size LDWORK-by-M1 if side == PlasmaLeft LDWORK-by-IB if side == PlasmaRight |
[in] | LDWORK | The leading dimension of the array WORK. LDWORK >= max(1,IB) if side == PlasmaLeft LDWORK >= max(1,N1) if side == PlasmaRight |
PLASMA_SUCCESS | successful exit |
<0 | if -i, the i-th argument had an illegal value |
Definition at line 125 of file core_stsmlq_sytra1.c.
References CORE_stsmlq(), coreblas_error, and PLASMA_SUCCESS.
void CORE_stsmlq_sytra1_quark | ( | Quark * | quark | ) |
This kernel applies a Right transformation on | A1' A2 | and does not handle the transpose of A1. Needs therefore to make the explicit transpose of A1 before and after the application of the block of reflectors Can be further optimized by changing accordingly the underneath kernel ztsrfb!
Definition at line 218 of file core_stsmlq_sytra1.c.
References CORE_stsmlq_sytra1(), quark_unpack_args_18, side, T, trans, and V.
void QUARK_CORE_stsmlq_sytra1 | ( | Quark * | quark, |
Quark_Task_Flags * | task_flags, | ||
int | side, | ||
int | trans, | ||
int | m1, | ||
int | n1, | ||
int | m2, | ||
int | n2, | ||
int | k, | ||
int | ib, | ||
int | nb, | ||
float * | A1, | ||
int | lda1, | ||
float * | A2, | ||
int | lda2, | ||
float * | V, | ||
int | ldv, | ||
float * | T, | ||
int | ldt | ||
) |
Definition at line 174 of file core_stsmlq_sytra1.c.
References CORE_stsmlq_sytra1_quark(), INOUT, INPUT, PlasmaLeft, QUARK_Insert_Task(), QUARK_REGION_D, QUARK_REGION_U, SCRATCH, and VALUE.