PLASMA  2.4.5
PLASMA - Parallel Linear Algebra for Scalable Multi-core Architectures
core_dtsmqr_corner.c File Reference
#include <lapacke.h>
#include "common.h"

Macros

#define REAL

Functions

int CORE_dtsmqr_corner (int m1, int n1, int m2, int n2, int m3, int n3, int k, int ib, int nb, double *A1, int lda1, double *A2, int lda2, double *A3, int lda3, double *V, int ldv, double *T, int ldt, double *WORK, int ldwork)
void QUARK_CORE_dtsmqr_corner (Quark *quark, Quark_Task_Flags *task_flags, int m1, int n1, int m2, int n2, int m3, int n3, int k, int ib, int nb, double *A1, int lda1, double *A2, int lda2, double *A3, int lda3, double *V, int ldv, double *T, int ldt)
void CORE_dtsmqr_corner_quark (Quark *quark)

Detailed Description

PLASMA core_blas kernel. PLASMA is a software package provided by Univ. of Tennessee, Univ. of California Berkeley and Univ. of Colorado Denver.

Version:
2.4.5
Author:
Hatem Ltaief
Mathieu Faverge
Azzam Haidar
Date:
2010-11-15 (generated d Tue Nov 22 14:35:23 2011)

Definition in file core_dtsmqr_corner.c.


Macro Definition Documentation

#define REAL

Definition at line 20 of file core_dtsmqr_corner.c.


Function Documentation

int CORE_dtsmqr_corner ( int  m1,
int  n1,
int  m2,
int  n2,
int  m3,
int  n3,
int  k,
int  ib,
int  nb,
double *  A1,
int  lda1,
double *  A2,
int  lda2,
double *  A3,
int  lda3,
double *  V,
int  ldv,
double *  T,
int  ldt,
double *  WORK,
int  ldwork 
)

CORE_dtsmqr_corner: see CORE_dtsmqr

This kernel applies left and right transformations as depicted below:

  |I - VT'V'| * | A1  A2' | * |I - VTV'|
                | A2  A3  |

where A1 and A3 are symmetric matrices. Only the lower part is referenced. This is an ad hoc implementation that can be further optimized.
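The depicted update can be restated compactly as a two-sided transformation. The block below is only a rewriting of the formula, assuming that the prime denotes the transpose (this is the real-precision kernel) and introducing the symbol Q as shorthand; it is not an additional statement about the implementation.

```latex
\begin{pmatrix} A_1 & A_2^{T} \\ A_2 & A_3 \end{pmatrix}
\;\leftarrow\;
Q^{T}
\begin{pmatrix} A_1 & A_2^{T} \\ A_2 & A_3 \end{pmatrix}
Q,
\qquad
Q = I - V T V^{T},
\quad
Q^{T} = I - V T^{T} V^{T}.
```

For any symmetric A, (Q^T A Q)^T = Q^T A^T Q = Q^T A Q, so the transformed matrix is symmetric again. This is consistent with the kernel storing back only the lower parts of A1 and A3.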

Parameters:
The parameter descriptions below follow CORE_dtsmqr. side and trans are not arguments of CORE_dtsmqr_corner; the kernel sets them internally before each CORE_dtsmqr call.
[in] side
  • PlasmaLeft : apply Q or Q**T from the Left;
  • PlasmaRight : apply Q or Q**T from the Right.
[in] trans
  • PlasmaNoTrans : No transpose, apply Q;
  • PlasmaTrans : Transpose, apply Q**T.
[in] M1: The number of rows of the tile A1. M1 >= 0.
[in] N1: The number of columns of the tile A1. N1 >= 0.
[in] M2: The number of rows of the tile A2. M2 >= 0. M2 = M1 if side == PlasmaRight.
[in] N2: The number of columns of the tile A2. N2 >= 0. N2 = N1 if side == PlasmaLeft.
[in] K: The number of elementary reflectors whose product defines the matrix Q.
[in] IB: The inner-blocking size. IB >= 0.
[in,out] A1: On entry, the M1-by-N1 tile A1. On exit, A1 is overwritten by the application of Q.
[in] LDA1: The leading dimension of the array A1. LDA1 >= max(1,M1).
[in,out] A2: On entry, the M2-by-N2 tile A2. On exit, A2 is overwritten by the application of Q.
[in] LDA2: The leading dimension of the tile A2. LDA2 >= max(1,M2).
[in] V: The i-th row must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CORE_DTSQRT in the first k columns of its array argument V.
[in] LDV: The leading dimension of the array V. LDV >= max(1,K).
[in] T: The IB-by-N1 triangular factor T of the block reflector. T is upper triangular by block (economic storage); the rest of the array is not referenced.
[in] LDT: The leading dimension of the array T. LDT >= IB.
[out] WORK: Workspace array of size LDWORK-by-N1 if side == PlasmaLeft; LDWORK-by-IB if side == PlasmaRight.
[in] LDWORK: The leading dimension of the array WORK. LDWORK >= max(1,IB) if side == PlasmaLeft; LDWORK >= max(1,M1) if side == PlasmaRight.
Returns:
Return values:
  PLASMA_SUCCESS  successful exit
  <0              if -i, the i-th argument had an illegal value

Definition at line 125 of file core_dtsmqr_corner.c.

References CORE_dtsmqr(), coreblas_error, PLASMA_SUCCESS, PlasmaLeft, PlasmaNoTrans, PlasmaRight, PlasmaTrans, side, and trans.

{
    PLASMA_enum side;
    PLASMA_enum trans;
    int i, j;

    /* A1 is symmetric, so its tile must be square */
    if ( m1 != n1 ) {
        coreblas_error(1, "Illegal value of M1, N1");
        return -1;
    }

    /* Rebuild the symmetric block: WORK <- A1 */
    for (j = 0; j < n1; j++)
        for (i = j; i < m1; i++){
            *(WORK + i + j*ldwork) = *(A1 + i + j*lda1);
            if (i > j){
                *(WORK + j + i*ldwork) = ( *(WORK + i + j*ldwork) );
            }
        }

    /* Copy the transpose of A2: WORK+nb*ldwork <- A2' */
    for (j = 0; j < n2; j++)
        for (i = 0; i < m2; i++){
            *(WORK + j + (i + nb) * ldwork) = ( *(A2 + i + j*lda2) );
        }

    side  = PlasmaLeft;
    trans = PlasmaTrans;

    /* Left application on |A1| */
    /*                     |A2| */
    CORE_dtsmqr(side, trans, m1, n1, m2, n2, k, ib,
                WORK, ldwork, A2, lda2,
                V, ldv, T, ldt,
                WORK + 3*nb*ldwork, ldwork);

    /* Rebuild the symmetric block: WORK+2*nb*ldwork <- A3 */
    for (j = 0; j < n3; j++)
        for (i = j; i < m3; i++){
            *(WORK + i + (j + 2*nb) * ldwork) = *(A3 + i + j*lda3);
            if (i != j){
                *(WORK + j + (i + 2*nb) * ldwork) = ( *(WORK + i + (j + 2*nb) * ldwork) );
            }
        }

    /* Left application on | A2'| */
    /*                     | A3 | */
    CORE_dtsmqr(side, trans, n2, m2, m3, n3, k, ib,
                WORK+nb*ldwork, ldwork, WORK+2*nb*ldwork, ldwork,
                V, ldv, T, ldt,
                WORK + 3*nb*ldwork, ldwork);

    side  = PlasmaRight;
    trans = PlasmaNoTrans;

    /* Right application on | A1 A2' | */
    CORE_dtsmqr(side, trans, m1, n1, n2, m2, k, ib,
                WORK, ldwork, WORK+nb*ldwork, ldwork,
                V, ldv, T, ldt,
                WORK + 3*nb*ldwork, ldwork);

    /* Copy back the final result to the lower part of A1 */
    /* A1 = WORK */
    for (j = 0; j < n1; j++)
        for (i = j; i < m1; i++)
            *(A1 + i + j*lda1) = *(WORK + i + j*ldwork);

    /* Right application on | A2 A3 | */
    CORE_dtsmqr(side, trans, m2, n2, m3, n3, k, ib,
                A2, lda2, WORK+2*nb*ldwork, ldwork,
                V, ldv, T, ldt,
                WORK + 3*nb*ldwork, ldwork);

    /* Copy back the final result to the lower part of A3 */
    /* A3 = WORK+2*nb*ldwork */
    for (j = 0; j < n3; j++)
        for (i = j; i < m3; i++)
            *(A3 + i + j*lda3) = *(WORK + i + (j+ 2*nb) * ldwork);

    return PLASMA_SUCCESS;
}
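The "rebuild the symmetric block" and "copy back the lower part" loops in the listing above can be isolated as two small routines. The following is a minimal sketch, not PLASMA code: the names rebuild_symmetric and copy_lower are hypothetical, and column-major storage with a leading dimension is assumed, as in the listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PLASMA): expand the lower triangle of a
 * column-major n-by-n tile A into a full symmetric tile W. The strictly
 * upper part of A is never read. */
void rebuild_symmetric(int n, const double *A, int lda, double *W, int ldw)
{
    assert(n >= 0 && lda >= n && ldw >= n);
    for (int j = 0; j < n; j++) {
        for (int i = j; i < n; i++) {
            double a = A[i + (size_t)j * lda];
            W[i + (size_t)j * ldw] = a;        /* lower part and diagonal */
            if (i > j)
                W[j + (size_t)i * ldw] = a;    /* mirrored upper entry */
        }
    }
}

/* Hypothetical helper (not part of PLASMA): copy only the lower triangle of
 * W back into A, leaving the strictly upper part of A untouched. */
void copy_lower(int n, const double *W, int ldw, double *A, int lda)
{
    assert(n >= 0 && lda >= n && ldw >= n);
    for (int j = 0; j < n; j++)
        for (int i = j; i < n; i++)
            A[i + (size_t)j * lda] = W[i + (size_t)j * ldw];
}
```

Working on a full copy lets the general CORE_dtsmqr kernel be reused without a symmetric-aware variant, at the cost of extra copies and workspace.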


void CORE_dtsmqr_corner_quark ( Quark *  quark )

Definition at line 254 of file core_dtsmqr_corner.c.

References CORE_dtsmqr_corner(), quark_unpack_args_21, T, and V.

{
    int m1;
    int n1;
    int m2;
    int n2;
    int m3;
    int n3;
    int k;
    int ib;
    int nb;
    double *A1;
    int lda1;
    double *A2;
    int lda2;
    double *A3;
    int lda3;
    double *V;
    int ldv;
    double *T;
    int ldt;
    double *WORK;
    int ldwork;

    quark_unpack_args_21(quark, m1, n1, m2, n2, m3, n3, k, ib, nb,
                         A1, lda1, A2, lda2, A3, lda3, V, ldv, T, ldt, WORK, ldwork);
    CORE_dtsmqr_corner(m1, n1, m2, n2, m3, n3, k, ib, nb,
                       A1, lda1, A2, lda2, A3, lda3, V, ldv, T, ldt, WORK, ldwork);
}


void QUARK_CORE_dtsmqr_corner ( Quark *  quark,
Quark_Task_Flags *  task_flags,
int  m1,
int  n1,
int  m2,
int  n2,
int  m3,
int  n3,
int  k,
int  ib,
int  nb,
double *  A1,
int  lda1,
double *  A2,
int  lda2,
double *  A3,
int  lda3,
double *  V,
int  ldv,
double *  T,
int  ldt 
)

Definition at line 214 of file core_dtsmqr_corner.c.

References CORE_dtsmqr_corner_quark(), INOUT, INPUT, QUARK_Insert_Task(), QUARK_REGION_D, QUARK_REGION_L, SCRATCH, and VALUE.

{
    int ldwork = nb;

    QUARK_Insert_Task(quark, CORE_dtsmqr_corner_quark, task_flags,
        sizeof(int),            &m1,     VALUE,
        sizeof(int),            &n1,     VALUE,
        sizeof(int),            &m2,     VALUE,
        sizeof(int),            &n2,     VALUE,
        sizeof(int),            &m3,     VALUE,
        sizeof(int),            &n3,     VALUE,
        sizeof(int),            &k,      VALUE,
        sizeof(int),            &ib,     VALUE,
        sizeof(int),            &nb,     VALUE,
        sizeof(double)*nb*nb,   A1,      INOUT|QUARK_REGION_D|QUARK_REGION_L,
        sizeof(int),            &lda1,   VALUE,
        sizeof(double)*nb*nb,   A2,      INOUT,
        sizeof(int),            &lda2,   VALUE,
        sizeof(double)*nb*nb,   A3,      INOUT|QUARK_REGION_D|QUARK_REGION_L,
        sizeof(int),            &lda3,   VALUE,
        sizeof(double)*nb*nb,   V,       INPUT,
        sizeof(int),            &ldv,    VALUE,
        sizeof(double)*ib*nb,   T,       INPUT,
        sizeof(int),            &ldt,    VALUE,
        sizeof(double)*4*nb*nb, NULL,    SCRATCH,
        sizeof(int),            &ldwork, VALUE,
        0);
}
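The SCRATCH argument above reserves 4*nb*nb doubles with ldwork = nb, which matches the four offsets that CORE_dtsmqr_corner uses inside WORK (0, nb*ldwork, 2*nb*ldwork and 3*nb*ldwork). The sketch below makes that partitioning explicit. The struct corner_work_view and both functions are hypothetical names introduced for illustration only; they are not part of PLASMA or QUARK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of the scratch buffer used by CORE_dtsmqr_corner. */
typedef struct {
    double *a1_full;   /* WORK               : A1 rebuilt as a full symmetric tile */
    double *a2_trans;  /* WORK +   nb*ldwork : transpose of A2 */
    double *a3_full;   /* WORK + 2*nb*ldwork : A3 rebuilt as a full symmetric tile */
    double *scratch;   /* WORK + 3*nb*ldwork : workspace handed to CORE_dtsmqr */
} corner_work_view;

/* Number of doubles covered by the four panels. */
size_t corner_work_elems(int nb, int ldwork)
{
    assert(nb >= 0 && ldwork >= nb);
    return (size_t)4 * (size_t)nb * (size_t)ldwork;
}

/* Split WORK into four consecutive panels of nb*ldwork doubles each. */
corner_work_view corner_work_split(double *WORK, int nb, int ldwork)
{
    assert(nb >= 0 && ldwork >= nb);
    size_t panel = (size_t)nb * (size_t)ldwork;
    corner_work_view v;
    v.a1_full  = WORK;
    v.a2_trans = WORK + panel;
    v.a3_full  = WORK + 2 * panel;
    v.scratch  = WORK + 3 * panel;
    return v;
}
```

With nb = 4 and ldwork = 4, for example, corner_work_elems returns 64 and the panels start at offsets 0, 16, 32 and 48. A caller that invokes CORE_dtsmqr_corner directly, without the QUARK wrapper, would need to supply a buffer of at least that size.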
