inconsistent results with zgeev+pthreads
I have a program that computes eigensystems in separate POSIX threads using zgeev, and I have noticed inconsistent (and incorrect) results. If I put a mutex around the call to zgeev, the results are consistently correct; without the mutex, the results differ on every run. For example, the output is:
Code:
$ uname -a
Linux hostname 2.6.32-22-server #36-Ubuntu SMP Thu Jun 3 20:38:33 UTC 2010 x86_64 GNU/Linux
$ make good
g++ -DGOOD main.cpp /usr/lib/liblapack.a /usr/lib/libblas.a -lgfortran -z muldefs -pthread
$ ./a.out
Info1,2 = 0,0
-7050.245589,-8479.237956
-8580.257527,-9209.117716
-6301.087406,-6399.428666
$ ./a.out
Info1,2 = 0,0
-7050.245589,-8479.237956
-8580.257527,-9209.117716
-6301.087406,-6399.428666
$ make bad
g++ main.cpp /usr/lib/liblapack.a /usr/lib/libblas.a -lgfortran -z muldefs -pthread
$ ./a.out
Info1,2 = 0,0
-7050.355400,-8478.864949
-8457.115555,-8501.374698
1703.121368,-3150.274644
$ ./a.out
Info1,2 = 0,0
-7106.927073,-8501.507914
-8580.257527,-9209.117716
-6299.486160,-6398.683343
The compiler flag
Code:
-z muldefs
I can run the good scenario many times and it always produces the right output, whereas the bad scenario (without a mutex around the call) produces different results on every run.
I have attached the simplest skeleton code, with a test matrix, that shows this behavior. The makefile and example output are embedded in the comments in the code. As you can see from the above output, I am running this on a 64-bit Ubuntu server. I have double-checked the integer sizes (int is 32-bit, long int is 64-bit), and it makes no difference which integer type I use. I ran a simple test program that calls ilaver, and it returned 3.2.1. I believe the libraries are the default ones packaged by Ubuntu. Can anyone reproduce this, or point out what I am doing wrong? The LAPACK 3.2 release notes say that everything should be thread-safe, and I initialize dlamch/slamch before the threads start.
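For readers without the attachment, here is a minimal sketch of the mutex workaround described above. It does not call LAPACK: `sum_of_squares_static` is a hypothetical stand-in that keeps its scratch data in a static array, which is the kind of shared state that makes a routine unsafe to call from two threads at once. The names `Job`, `locked_worker`, and `run_two_locked_jobs` are also made up for illustration.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a routine that keeps scratch data in static
// storage (this is NOT zgeev; it only mimics the shared-scratch pattern).
static const std::size_t kScratchSize = 1024;
static double shared_scratch[kScratchSize];

double sum_of_squares_static(const double* x, std::size_t n) {
    assert(n <= kScratchSize);
    for (std::size_t i = 0; i < n; ++i) shared_scratch[i] = x[i] * x[i];
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += shared_scratch[i];
    return sum;
}

// One process-wide mutex serializes every call into the routine.
static pthread_mutex_t solver_mutex = PTHREAD_MUTEX_INITIALIZER;

struct Job {
    const double* x;   // input data
    std::size_t n;     // number of elements in x
    double result;     // filled in by the worker thread
};

void* locked_worker(void* arg) {
    Job* job = static_cast<Job*>(arg);
    pthread_mutex_lock(&solver_mutex);
    job->result = sum_of_squares_static(job->x, job->n);
    pthread_mutex_unlock(&solver_mutex);
    return nullptr;
}

// Starts one thread per job and waits for both to finish.
void run_two_locked_jobs(Job& a, Job& b) {
    pthread_t ta, tb;
    int rc_a = pthread_create(&ta, nullptr, locked_worker, &a);
    int rc_b = pthread_create(&tb, nullptr, locked_worker, &b);
    assert(rc_a == 0 && rc_b == 0);
    pthread_join(ta, nullptr);
    pthread_join(tb, nullptr);
}
```

The lock makes the results reproducible, but it also removes any parallelism inside the protected call, which is why it is only a workaround.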
Edit: I should add that if I skip the workspace query and use the minimal workspace, I always get consistently good results, so the problem appears to be in the blocked code. I found this by running Valgrind's DRD tool, which reported (vague) errors that indicate the problem is in zlahr2.
Edit: It appears that the local array T is probably allocated statically (it exceeds the maximum gfortran stack variable size). Is it possible to make T part of the workspace?
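To illustrate what "making T part of the workspace" would mean, here is a minimal sketch in which the scratch array is an argument supplied by the caller, so each thread owns a private copy and no mutex is needed. This is not LAPACK code: `sum_of_squares_ws`, `WsJob`, `ws_worker`, and `run_two_ws_jobs` are hypothetical names, and the routine is only a stand-in for one that would otherwise use a static local array.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical routine: the scratch space is passed in by the caller
// (in the spirit of a WORK argument) instead of living in static storage.
// The caller must provide at least n elements in work.
double sum_of_squares_ws(const double* x, std::size_t n, double* work) {
    for (std::size_t i = 0; i < n; ++i) work[i] = x[i] * x[i];
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += work[i];
    return sum;
}

struct WsJob {
    const double* x;   // input data
    std::size_t n;     // number of elements in x
    double result;     // filled in by the worker thread
};

void* ws_worker(void* arg) {
    WsJob* job = static_cast<WsJob*>(arg);
    // Each thread allocates its own workspace, so nothing is shared.
    std::vector<double> work(job->n);
    job->result = sum_of_squares_ws(job->x, job->n, work.data());
    return nullptr;
}

// Starts one thread per job and waits for both; no locking is required.
void run_two_ws_jobs(WsJob& a, WsJob& b) {
    pthread_t ta, tb;
    int rc_a = pthread_create(&ta, nullptr, ws_worker, &a);
    int rc_b = pthread_create(&tb, nullptr, ws_worker, &b);
    assert(rc_a == 0 && rc_b == 0);
    pthread_join(ta, nullptr);
    pthread_join(tb, nullptr);
}
```

With this layout the two calls can run truly in parallel, at the cost of the caller having to size and allocate the extra scratch space.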