I am using the ScaLAPACK example sample_pssyev_call.f to solve the eigenproblem for large matrices. So far I have used the code successfully on a matrix of size 23400x23400. However, it fails when I try a larger matrix of size 39600x39600. According to the message below, the problem seems to be with LWORK. I define it as LWORK=MAXN*MAXN*8, which has worked fine for smaller matrices.

I have tried several things, such as an LWORK query, increasing the number of processors, and changing the value of NB, but none of them solved the problem. I also thought that something might be going wrong in the definition of the input matrix A (for example, elements in the term diag([... can take on very small values that might later make PSSYEV crash, perhaps through a division by zero or a maximum iteration count being reached). I then tried my own data, but the error is exactly the same: the problem with LWORK persists.

As you can see in the message below, the code runs to completion, but the computation is incorrect. I am only printing the 10 largest eigenvalues. I am not exhausting the memory; I have plenty (24 GB) for this type of calculation. Does anyone have an idea why the code is failing?
helios 619% mpirun -np 4 sample_pssyev_call
....READING..OK..
{ 0, 1}: On entry to PSSYEV parameter number 14 had an illegal value
{ 1, 0}: On entry to PSSYEV parameter number 14 had an illegal value
{ 1, 1}: On entry to PSSYEV parameter number 14 had an illegal value
{ 0, 0}: On entry to PSSYEV parameter number 14 had an illegal value
N = 39600
A = hilb(N) + diag([1:-1/N:1/N])
W( 39600 )= 0.0000000 ;
W( 39599 )= 0.0000000 ;
W( 39598 )= 0.0000000 ;
W( 39597 )= 0.0000000 ;
W( 39596 )= 0.0000000 ;
W( 39595 )= 0.0000000 ;
W( 39594 )= 0.0000000 ;
W( 39593 )= 0.0000000 ;
W( 39592 )= 0.0000000 ;
W( 39591 )= 0.0000000 ;
W( 39590 )= 0.0000000 ;
backerror = A - Z * diag(W) * Z'
resid = A * Z - Z * diag(W)
ortho = Z' * Z - eye(N)
norm(backerror)
norm(resid)
norm(ortho)
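
One thing that may be worth checking is the arithmetic behind LWORK=MAXN*MAXN*8. The sketch below is only an illustration in Python, not part of sample_pssyev_call.f. It assumes LWORK is a default 32-bit INTEGER and that overflow wraps around in two's complement. Neither assumption is confirmed by the output above, and the Fortran standard does not guarantee wrapping. The helper names wrap_int32 and lwork_report are made up for this example.

```python
# Sketch: what LWORK = MAXN*MAXN*8 would hold if stored in a 32-bit signed integer.
# Assumptions (not confirmed by the post): LWORK is a default 32-bit INTEGER and
# integer overflow wraps around (two's complement).

INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Reduce an exact integer to the value a wrapping 32-bit signed int would hold."""
    return (value + 2**31) % 2**32 - 2**31


def lwork_report(maxn):
    """Return (exact product, wrapped 32-bit value) for MAXN*MAXN*8."""
    exact = maxn * maxn * 8
    return exact, wrap_int32(exact)


if __name__ == "__main__":
    for maxn in (23400, 39600):
        exact, wrapped = lwork_report(maxn)
        fits = exact <= INT32_MAX
        print(f"MAXN={maxn}: exact={exact}, fits in int32={fits}, wrapped={wrapped}")
```

Under these assumptions, the exact product exceeds the 32-bit range for both matrix sizes. The wrapped value happens to stay positive for MAXN=23400 but becomes negative for MAXN=39600, and a negative value is one way an LWORK argument could be rejected as illegal. Whether this is what actually happens depends on how LWORK and WORK are declared in the sample and on the compiler options used.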

