single-threaded / multi-threaded mkl performance difference

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)
dni
Posts: 5
Joined: Tue Aug 09, 2011 3:42 am

single-threaded / multi-threaded mkl performance difference

Post by dni » Mon Sep 12, 2011 3:27 am

I built MAGMA twice, once with single-threaded MKL and once with multi-threaded MKL. Interestingly, the test program linked against multi-threaded MKL is almost twice as fast as the single-threaded version. I only tested sgeqrf. I wonder why MAGMA's performance depends so much on MKL. Is it true that most of the computation is done on the GPU?

Stan Tomov
Posts: 283
Joined: Fri Aug 21, 2009 10:39 pm

Re: single-threaded / multi-threaded mkl performance differe

Post by Stan Tomov » Wed Sep 14, 2011 10:36 pm

Yes, most of the computation is done on the GPU, but the critical path of many of the algorithms runs on the CPU. Therefore, the CPU code has to be as fast as possible. Currently, the code is tuned for best performance when you use all the cores of one socket on your host. If you would like to use only one core, you must re-tune the algorithms to get better performance (e.g., reduce the blocking sizes in the file control/get_nb.cpp).
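To illustrate why a slower CPU favors a smaller blocking size, here is a rough toy model in Python. It is only a sketch, not MAGMA's actual scheduling or tuning code. It assumes a blocked QR of an n x n matrix where each step costs the larger of the CPU panel time and the GPU trailing-update time, and where GPU efficiency grows with the block size. The function names, the flop-rate numbers, and the gpu_half_nb parameter are all hypothetical, made up for illustration.

```python
def modeled_time(n, nb, cpu_flops, gpu_peak_flops, gpu_half_nb=32):
    """Toy estimate (seconds) for a hybrid blocked QR of an n x n matrix.

    Assumptions (illustrative only, not taken from MAGMA):
      * the panel factorization runs on the CPU, the trailing update on the GPU;
      * the two overlap perfectly, so each step costs max(cpu, gpu);
      * the GPU reaches b / (b + gpu_half_nb) of its peak for block width b.
    """
    total = 0.0
    for j in range(0, n, nb):
        m = n - j                 # rows remaining in the panel
        b = min(nb, m)            # width of the current panel
        t = m - b                 # columns in the trailing matrix
        panel_flops = 2.0 * m * b * b        # approximate QR cost of an m x b panel
        update_flops = 4.0 * m * b * t       # approximate block-reflector update cost
        gpu_rate = gpu_peak_flops * b / (b + gpu_half_nb)
        total += max(panel_flops / cpu_flops, update_flops / gpu_rate)
    return total


def best_nb(n, candidates, cpu_flops, gpu_peak_flops):
    """Return the candidate block size with the smallest modeled time."""
    return min(candidates, key=lambda nb: modeled_time(n, nb, cpu_flops, gpu_peak_flops))


if __name__ == "__main__":
    n = 8192
    candidates = [32, 64, 128, 256]
    gpu = 500e9                   # hypothetical GPU peak, flop/s
    for label, cpu in [("all cores (hypothetical 80 GFLOP/s)", 80e9),
                       ("one core (hypothetical 5 GFLOP/s)", 5e9)]:
        print(label, "-> preferred nb:", best_nb(n, candidates, cpu, gpu))
```

With these made-up rates, the model prefers a smaller block size for the single-core case than for the all-cores case: a wide panel keeps the GPU efficient, but it also puts more work on the CPU's critical path. Real tuning should be done by timing the actual routine with different values in control/get_nb.cpp.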
