Hello,
I ran some benchmark comparisons between MAGMA BLAS 0.2 sgemm and CUBLAS sgemm and found that MAGMA outperforms CUBLAS several-fold. I assume the improvement comes from splitting the matrix and processing the subtasks in parallel on both the GPU and the CPU cores. Can anyone offer some general insight into whether this assumption is correct for the current version of MAGMA?
Regards
MAGMA versus CUBLAS performance
Stan Tomov
Re: MAGMA versus CUBLAS performance
Hi,
MAGMA BLAS implements only part of the BLAS specification and is meant as a complement to CUBLAS. It improves on certain CUBLAS routines in specific situations (argument combinations) that the MAGMA routines need. In general the improvements are up to 2 times, and only in some cases. If you see several-fold improvements, most probably something has gone wrong. Which routine did you test, and on what GPU? If you ran something through a MAGMA testing routine, are the errors as expected, e.g., zero, as in the testing_sgemv run below?
Stan
[tomov@cumin testing]$ ./testing_sgemv
Usage
testing_sgemv N
device 0: GeForce GTX 280, 1296.0 MHz clock, 1023.8 MB memory
device 1: Quadro NVS 290, 918.0 MHz clock, 255.3 MB memory
n      CUBLAS,Gflop/s  MAGMABLAS0.2,Gflop/s  "error"
==============================================================
64               0.20                  0.39        0
128              0.55                  1.21        0
192              0.87                  2.17        0
256              1.20                  3.20        0
320              1.55                  4.36        0
384              1.90                  5.46        0
448              2.27                  6.27        0
512              2.52                  7.71        0
576              2.82                  9.09        0
704              3.59                 11.39        0
832              4.31                 13.84        0
960              4.93                 16.17        0
1088             5.56                 18.50        0
1216             6.29                 20.68        0
1408             7.29                 24.63        0
1600             8.34                 27.83        0
1792             9.40                 31.33        0
1984            10.39                 34.68        0
2240            11.74                 37.87        0
2496            13.13                 42.82        0
2816            14.78                 47.34        0
3136            16.45                 50.56        0
3520            18.49                 54.22        0
3904            20.40                 55.22        0
4352            22.76                 60.70        0
4800            24.81                 60.24        0
5312            27.41                 61.95        0
5888            30.23                 64.80        0
6528            33.20                 64.08        0
7232            36.64                 65.17        0
8000            40.16                 64.58        0
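
For anyone who wants to check their own numbers against a run like the one above, here is a minimal Python sketch that turns the output rows into MAGMA BLAS / CUBLAS speedup ratios and confirms that the "error" column is zero everywhere. It is only an illustration and not part of MAGMA: the helper names (parse_rows, speedups, all_errors_zero) are made up for this post, and the four rows in ROWS are copied from the testing_sgemv output above.

```python
# Hypothetical helper script, not part of MAGMA or CUBLAS.
# Each row: n, CUBLAS Gflop/s, MAGMA BLAS 0.2 Gflop/s, "error"
# (values copied from the testing_sgemv output above).
ROWS = """
64 0.20 0.39 0
1088 5.56 18.50 0
4352 22.76 60.70 0
8000 40.16 64.58 0
"""


def parse_rows(text):
    """Parse whitespace-separated benchmark rows into typed tuples."""
    rows = []
    for line in text.strip().splitlines():
        n, cublas, magma, err = line.split()
        rows.append((int(n), float(cublas), float(magma), float(err)))
    return rows


def speedups(rows):
    """Return (n, MAGMA Gflop/s divided by CUBLAS Gflop/s) for each row."""
    return [(n, magma / cublas) for n, cublas, magma, _ in rows]


def all_errors_zero(rows):
    """True if every row reports an error of exactly zero."""
    return all(err == 0.0 for *_, err in rows)


if __name__ == "__main__":
    rows = parse_rows(ROWS)
    for n, ratio in speedups(rows):
        print(f"n={n:5d}  MAGMA/CUBLAS = {ratio:.2f}x")
    print("all errors zero:", all_errors_zero(rows))
```

If the error column is not zero, or the ratio is far outside what a run like this shows, the timing or the result is probably not trustworthy, and it is worth posting the routine, the GPU, and the full output.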