1.6.1, CUDA 7.0, GOTO2, SGESV, and Matlab R2014b (on linux)

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)
Boxed Cylon
Posts: 36
Joined: Sat Nov 21, 2009 6:03 pm


Post by Boxed Cylon » Mon Mar 02, 2015 6:14 am

Just to note that I compiled MAGMA 1.6.1 without any issues using the GOTO2 BLAS and the beta CUDA 7.0.
I also managed to compile my original mex test code to run sgesv on the GPU, with minimal effort.
(The only change needed was a magma_init(); call.)

I've mostly been using Matlab's GPU capabilities lately because they are far simpler. But benchmarking
A\B in Matlab vs. MAGMA's sgesv shows that MAGMA is about 30% faster for 5000x5000
matrices. That's a 20X speedup compared to a single i7-3820 core, and it seems to me a
significant speedup over earlier versions of MAGMA. I use an NVIDIA Titan GPU.

My inertia is still to use matlab's GPU stuff, but it is nice that the original mex files I wrote some years
ago still work. The one trick was that I had to LD_PRELOAD the GotoBLAS2 library before starting matlab,
otherwise the matlab mkl library steps on the mex program and one gets a matlab crash.
(Looks like BLAS_VERSION may not work any more with matlab?)
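
For anyone scripting the launch rather than typing it each time, the LD_PRELOAD trick can be sketched roughly as below. This is only an illustration, not what I actually run: the helper name `matlab_env` and the library path are hypothetical, and it assumes `matlab` is on the PATH.

```python
import os
import subprocess


def matlab_env(blas_path, base_env=None):
    """Return a copy of the environment with blas_path first in LD_PRELOAD.

    blas_path is whatever shared library should be loaded ahead of
    Matlab's bundled MKL (hypothetical path in the example below).
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The dynamic linker accepts a colon-separated list; keep prior entries.
    env["LD_PRELOAD"] = f"{blas_path}:{existing}" if existing else blas_path
    return env


if __name__ == "__main__":
    # Hypothetical install location of GotoBLAS2; adjust to the real one.
    env = matlab_env("/opt/GotoBLAS2/libgoto2.so")
    subprocess.run(["matlab"], env=env)
```

The preloaded library has to be in place before Matlab starts, which is why the environment is built first and passed to the process at launch.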

One suggestion would be to figure out how to compile magma against matlab's own mkl library, if that were
possible. That way magma would, by default, play more nicely with matlab by using the same library. Perhaps I'll try
to do that sometime.

Thanks for the nice code!

Boxed Cylon
Posts: 36
Joined: Sat Nov 21, 2009 6:03 pm

Re: 1.6.1, CUDA 7.0, GOTO2, SGESV, and Matlab R2014b (on lin

Post by Boxed Cylon » Tue Mar 03, 2015 2:17 am

A correction to my benchmark discussion above. These benchmarks are rather slippery to get right. I'll just
give the computation times for sgesv on 5000x5000 matrices:

i7-3820 (3.6 GHz) CPU, Titan Black GPU
----------------------------------------------------
Matlab - single CPU:              6.28 s
Matlab - GPU:                     0.41 s

Matlab - single CPU, GOTO2 BLAS: 11.68 s
Matlab/MAGMA - GPU, GOTO2 BLAS:   0.59 s

Matlab - single CPU, OpenBLAS:    7.59 s (Sandybridge)
Matlab/MAGMA - GPU, OpenBLAS:     0.49 s

Matlab - EIGHT CPU, OpenBLAS:     2.38 s (Sandybridge)
Matlab/MAGMA - GPU, OpenBLAS:     0.40 s

i7-4710HQ (2.5 GHz, mobile) CPU, 970M (6 GB) GPU
----------------------------------------------------
Matlab - single CPU:              5.28 s
Matlab - GPU:                     0.52 s

Matlab - single CPU, GOTO2 BLAS: 12.56 s
Matlab/MAGMA - GPU, GOTO2 BLAS:   0.64 s

Matlab - single CPU, OpenBLAS:    5.09 s (Haswell)
Matlab/MAGMA - GPU, OpenBLAS:     0.48 s

I had reported relative computation times above, which of course makes little sense if the CPU times are
also changing. Clearly the GOTO2 BLAS is not doing so well in this case. More motivation for attempting to
use matlab's BLAS! The results highlight the need for MAGMA to use an optimal BLAS for best results.
(My new laptop is quite a computing device... it's Haswell/Maxwell, of course.)

Note that, apart from the EIGHT CPU runs, only a single core is used with these benchmarks, while one can
use the 8 hyperthreaded cores to great advantage with single precision.
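
To make the point about relative times concrete, here is a small sketch that turns the i7-3820 numbers above into CPU/GPU ratios. The timings are copied from the table; the dictionary and function names are just for illustration.

```python
# Times in seconds, copied from the i7-3820 / Titan Black table above.
# Each entry maps a configuration to (cpu_seconds, gpu_seconds).
TIMES_I7_3820 = {
    "Matlab built-in": (6.28, 0.41),
    "GOTO2 BLAS, 1 core": (11.68, 0.59),
    "OpenBLAS, 1 core": (7.59, 0.49),
    "OpenBLAS, 8 cores": (2.38, 0.40),
}


def speedup(cpu_s, gpu_s):
    """Ratio of CPU time to GPU time for the same solve."""
    return cpu_s / gpu_s


def best_ratio(times):
    """Configuration with the largest CPU/GPU ratio."""
    return max(times, key=lambda k: speedup(*times[k]))


def fastest_gpu(times):
    """Configuration with the smallest absolute GPU time."""
    return min(times, key=lambda k: times[k][1])


if __name__ == "__main__":
    for label, (cpu_s, gpu_s) in TIMES_I7_3820.items():
        print(f"{label:20s} CPU/GPU ratio: {speedup(cpu_s, gpu_s):.1f}")
    print("largest ratio:", best_ratio(TIMES_I7_3820))
    print("fastest GPU  :", fastest_gpu(TIMES_I7_3820))
```

The GOTO2 row gives the largest ratio (close to the 20X I quoted earlier) only because its CPU baseline is the slowest; its GPU time is actually the worst of the four. That is why the absolute times are the more useful numbers here.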

mgates3
Posts: 918
Joined: Fri Jan 06, 2012 2:13 pm

Re: 1.6.1, CUDA 7.0, GOTO2, SGESV, and Matlab R2014b (on lin

Post by mgates3 » Thu Mar 05, 2015 12:02 pm

Thanks for sharing.
-mark
