PLASMA
2.4.5
PLASMA - Parallel Linear Algebra for Scalable Multi-core Architectures
#include "common.h"
Functions
| int | CORE_sttmlq (int side, int trans, int M1, int N1, int M2, int N2, int K, int IB, float *A1, int LDA1, float *A2, int LDA2, float *V, int LDV, float *T, int LDT, float *WORK, int LDWORK) |
| void | QUARK_CORE_sttmlq (Quark *quark, Quark_Task_Flags *task_flags, int side, int trans, int m1, int n1, int m2, int n2, int k, int ib, int nb, float *A1, int lda1, float *A2, int lda2, float *V, int ldv, float *T, int ldt) |
| void | CORE_sttmlq_quark (Quark *quark) |
PLASMA core_blas kernel. PLASMA is a software package provided by Univ. of Tennessee, Univ. of California Berkeley and Univ. of Colorado Denver.
Definition in file core_sttmlq.c.
int CORE_sttmlq (int side, int trans, int M1, int N1, int M2, int N2, int K, int IB, float *A1, int LDA1, float *A2, int LDA2, float *V, int LDV, float *T, int LDT, float *WORK, int LDWORK)
CORE_sttmlq overwrites the general real M1-by-N1 tile A1 and M2-by-N2 tile A2 (N1 == N2) with

                    SIDE = 'L'           SIDE = 'R'
    TRANS = 'N':    Q * | A1 |           | A1 | * Q
                        | A2 |           | A2 |

    TRANS = 'C':    Q**T * | A1 |        | A1 | * Q**T
                           | A2 |        | A2 |

where Q is a real orthogonal matrix defined as the product of k elementary reflectors

    Q = H(1) H(2) . . . H(k)

as returned by CORE_sttqrt.
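The effect of applying Q = H(1) H(2) . . . H(k) from the left can be illustrated with a minimal unblocked sketch. It is not the PLASMA implementation: the real kernel works on the A1/A2 tile pair and uses the blocked factor T, whereas this sketch assumes each reflector has the textbook form H(i) = I - tau(i) * v(i) * v(i)**T with an explicit `tau` array. The names `apply_reflector_left` and `apply_q_left` are hypothetical. Storage is assumed column-major, with the i-th row of V holding v(i).

```c
#include <assert.h>
#include <stddef.h>

/* Apply one reflector H = I - tau * v * v^T from the left to the m-by-n
 * column-major matrix C (leading dimension ldc), in place.
 * v has m entries spaced incv apart, so it can be a row of V. */
void apply_reflector_left(int m, int n, const float *v, int incv, float tau,
                          float *C, int ldc)
{
    for (int j = 0; j < n; j++) {
        float *col = C + (size_t)j * ldc;
        float dot = 0.0f;
        for (int i = 0; i < m; i++)
            dot += v[(size_t)i * incv] * col[i];   /* v^T * C(:,j) */
        for (int i = 0; i < m; i++)
            col[i] -= tau * dot * v[(size_t)i * incv];
    }
}

/* Overwrite C with Q*C (transpose == 0) or Q^T*C (transpose != 0),
 * where Q = H(1) H(2) ... H(k). Row i (0-based) of V, stored column-major
 * with leading dimension ldv, holds the vector of reflector H(i+1).
 * The explicit tau array is an assumption of this sketch. */
void apply_q_left(int transpose, int m, int n, int k,
                  const float *V, int ldv, const float *tau,
                  float *C, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0 && ldv >= (k > 1 ? k : 1));
    for (int s = 0; s < k; s++) {
        /* Q*C applies H(k) first; Q^T*C = H(k)...H(1)*C applies H(1) first. */
        int i = transpose ? s : k - 1 - s;
        apply_reflector_left(m, n, V + i, ldv, tau[i], C, ldc);
    }
}
```

Because each H(i) is symmetric and orthogonal, calling `apply_q_left` with `transpose == 0` and then with `transpose != 0` on the same data restores the original matrix, which mirrors the TRANS = 'N' and TRANS = 'C' rows of the table above.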
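| [in] | side | PlasmaLeft: apply Q or Q**T from the left. PlasmaRight: apply Q or Q**T from the right. |
| [in] | trans | PlasmaNoTrans: no transpose, apply Q. PlasmaTrans: transpose, apply Q**T. |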
| [in] | M1 | The number of rows of the tile A1. M1 >= 0. |
| [in] | N1 | The number of columns of the tile A1. N1 >= 0. |
| [in] | M2 | The number of rows of the tile A2. M2 >= 0. |
| [in] | N2 | The number of columns of the tile A2. N2 >= 0. |
| [in] | K | The number of elementary reflectors whose product defines the matrix Q. |
| [in] | IB | The inner-blocking size. IB >= 0. |
| [in,out] | A1 | On entry, the M1-by-N1 tile A1. On exit, A1 is overwritten by the application of Q. |
| [in] | LDA1 | The leading dimension of the array A1. LDA1 >= max(1,M1). |
| [in,out] | A2 | On entry, the M2-by-N2 tile A2. On exit, A2 is overwritten by the application of Q. |
| [in] | LDA2 | The leading dimension of the tile A2. LDA2 >= max(1,M2). |
| [in] | V | The i-th row must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CORE_sttqrt in the first k rows of its array argument V. |
| [in] | LDV | The leading dimension of the array V. LDV >= max(1,K). |
| [in] | T | The IB-by-N1 triangular factor T of the block reflector. T is upper triangular by block (economic storage); the rest of the array is not referenced. |
| [in] | LDT | The leading dimension of the array T. LDT >= IB. |
| [out] | WORK | Workspace array of size LDWORK-by-N1. |
| [in] | LDWORK | The dimension of the array WORK. LDWORK >= max(1,IB). |
| PLASMA_SUCCESS | successful exit |
| <0 | if -i, the i-th argument had an illegal value |
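The dimension constraints listed above, together with the "-i for the i-th argument" return convention, can be sketched as a standalone check. This is an illustration under stated assumptions, not the kernel's actual validation code: `check_sttmlq_dims` and `SKETCH_SUCCESS` are hypothetical names, the success value 0 is assumed, and only the constraints stated in the parameter table are tested (no bound is documented for K, so K is not checked, and `side`/`trans` are left out). Argument positions follow the order of the CORE_sttmlq signature.

```c
/* Stand-in for PLASMA_SUCCESS; the value 0 is an assumption of this sketch. */
#define SKETCH_SUCCESS 0

static int sketch_max(int a, int b) { return a > b ? a : b; }

/* Return SKETCH_SUCCESS if the documented dimension constraints hold,
 * otherwise -i, where i is the 1-based position of the offending
 * argument in the CORE_sttmlq argument list. */
int check_sttmlq_dims(int M1, int N1, int M2, int N2, int K, int IB,
                      int LDA1, int LDA2, int LDV, int LDT, int LDWORK)
{
    if (M1 < 0)                       return -3;   /* M1 >= 0 */
    if (N1 < 0)                       return -4;   /* N1 >= 0 */
    if (M2 < 0)                       return -5;   /* M2 >= 0 */
    if (N2 < 0)                       return -6;   /* N2 >= 0 */
    if (IB < 0)                       return -8;   /* IB >= 0 */
    if (LDA1 < sketch_max(1, M1))     return -10;  /* LDA1 >= max(1,M1) */
    if (LDA2 < sketch_max(1, M2))     return -12;  /* LDA2 >= max(1,M2) */
    if (LDV < sketch_max(1, K))       return -14;  /* LDV >= max(1,K) */
    if (LDT < IB)                     return -16;  /* LDT >= IB */
    if (LDWORK < sketch_max(1, IB))   return -18;  /* LDWORK >= max(1,IB) */
    return SKETCH_SUCCESS;
}
```

A caller could run such a check before allocating the LDWORK-by-N1 workspace, so that an undersized leading dimension is reported by argument position rather than surfacing as memory corruption.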
Definition at line 116 of file core_sttmlq.c.
References CORE_sparfb(), coreblas_error, max, min, PLASMA_SUCCESS, PlasmaForward, PlasmaLeft, PlasmaNoTrans, PlasmaRight, PlasmaRowwise, and PlasmaTrans.


void CORE_sttmlq_quark (Quark *quark)
Definition at line 299 of file core_sttmlq.c.
References CORE_sttmlq(), quark_unpack_args_18, side, T, trans, and V.


void QUARK_CORE_sttmlq (Quark *quark, Quark_Task_Flags *task_flags, int side, int trans, int m1, int n1, int m2, int n2, int k, int ib, int nb, float *A1, int lda1, float *A2, int lda2, float *V, int ldv, float *T, int ldt)
Definition at line 259 of file core_sttmlq.c.
References CORE_sttmlq_quark(), DAG_CORE_TTMLQ, INOUT, INPUT, PlasmaLeft, QUARK_Insert_Task(), QUARK_REGION_D, QUARK_REGION_L, SCRATCH, and VALUE.

