We have found the following problem in PDGEMM. Consider a PDGEMM example with these parameters: m=120M=120000000, n=80, nrhs=80, nrows=4, ncols=80.
In this case, the local matrices are: A(30M x 1), B(20 x 1), C(30M x 1).
In PBLAS/SRC/PTOOLS/PB_CpgemmAB.c we have (line 360):
kb = pilaenv_( &ctxt, C2F_CHAR( &TYPE->type ) );
So kb==32. Then (line 429):
PB_COutV( TYPE, COLUMN, NOINIT, M, N, Cd0, kb, &WA, WAd0, &WAfr, &WAsum );
There, an attempt is made to allocate WA (PB_COutV.c:299):
*YAPTR = PB_Cmalloc( Amp * K * TYPE->size );
The problem is that (Amp * K * TYPE->size) == (20M * 32 *
So, in this test case, there is no need for kb=32; kb=1 is enough. I propose truncating kb if it is larger than needed, or if we know that the range of 'int' would be exceeded.
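To illustrate the proposed truncation, here is a minimal sketch. It is not the attached hot-fix and not PBLAS code: the helper name clamp_kb and its parameters (rows for the local row count, needed for the number of columns the operation actually requires, elem_size for the element size in bytes) are hypothetical and introduced only for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not part of PBLAS.
 * Returns a block width that is no larger than kb, no larger than `needed`,
 * and small enough that rows * width * elem_size still fits in an int. */
static int clamp_kb(int kb, int rows, int needed, size_t elem_size)
{
    assert(kb > 0 && rows > 0 && needed > 0 && elem_size > 0);

    /* Never use more columns than the operation needs. */
    if (kb > needed)
        kb = needed;

    /* Largest width whose byte count is still representable as an int.
     * Division is used so that the check itself cannot overflow. */
    size_t max_width = (size_t)INT_MAX / elem_size / (size_t)rows;
    if (max_width < 1)
        max_width = 1; /* even one column overflows; the caller must handle it */

    if ((size_t)kb > max_width)
        kb = (int)max_width;

    return kb;
}
```

With rows = 30000000 and 8-byte elements, clamp_kb(32, 30000000, 1, 8) yields 1. If all 32 columns were needed, the same call with needed = 32 would yield 8, the widest block whose size still fits in an int.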
Please review the attached hot-fix. More generally, it is not correct that PB_Cmalloc accepts an int rather than a size_t:
char * PB_Cmalloc ( int );
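As a rough illustration of why a size_t argument would be safer, the sketch below checks whether a byte count fits in an int and shows an allocation wrapper that takes size_t. Both function names (byte_count_overflows_int, checked_alloc) are hypothetical and are not part of PBLAS; the row count of 30000000 used in the note afterwards is the local row count of A and C given above, and 8 bytes per element is an assumption for double.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical check: returns 1 if rows * cols * elem_size exceeds INT_MAX,
 * i.e. if the byte count could not be passed through an int parameter. */
static int byte_count_overflows_int(size_t rows, size_t cols, size_t elem_size)
{
    assert(elem_size > 0);
    if (rows == 0 || cols == 0)
        return 0;
    /* Compare by division so the test itself never overflows. */
    size_t limit = (size_t)INT_MAX / elem_size;
    return cols > limit / rows;
}

/* Hypothetical size_t-based allocation wrapper (not the PBLAS PB_Cmalloc).
 * Returns NULL if count * elem_size would overflow size_t. */
static void *checked_alloc(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;
    return malloc(count * elem_size);
}
```

For 30000000 rows and 8-byte elements, a width of 32 overflows int while a width of 1 does not, which matches the observation that kb=1 is sufficient here.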
Regards,
Konstantin

