These are the MAGMA 0.3 DGEMM and SGEMM routines for Fermi GPUs.
In this version, matrix sizes have to be divisible by 64.
Usage:
./testing_dgemm N
N     magmablas0.3 GFLops/s   cudablas-3.2 GFlops/s   error
==================================================================
512   130.18208               127.70478               0.000000e+00
1088  153.85420               153.24094               0.000000e+00
1664  157.40103               156.64116               0.000000e+00
2240  160.20845               159.43350               0.000000e+00
2816  159.83791               159.06029               0.000000e+00
3392  163.98298               164.37494               0.000000e+00
3968  164.20745               164.60266               0.000000e+00
4544  164.12606               164.75651               0.000000e+00
5120  164.55581               165.02481               0.000000e+00
5696  164.41005               164.91638               0.000000e+00
6272  164.49734               165.06085               0.000000e+00
6848  164.47954               165.15991               0.000000e+00
7424  164.53056               165.10375               0.000000e+00
This is a MAGMA 0.3 SGEMM Routine for Fermi GPUs.
In this version sizes have to be divisible by 96.
Usage:
./testing_sgemm N
N     magmablas0.3 GFLops/s   cudablas-3.2 GFlops/s   error
==================================================================
864   665.26307               702.20201               0.000000e+00
1440  801.71406               824.51581               0.000000e+00
2016  794.79485               814.87221               0.000000e+00
2592  817.72439               832.54093               0.000000e+00
3168  824.51008               836.61608               0.000000e+00
3744  825.33249               837.03466               0.000000e+00
4320  832.99221               842.95151               0.000000e+00
4896  830.81927               840.00669               0.000000e+00
5472  833.72471               842.49372               0.000000e+00
6048  834.53324               842.74165               0.000000e+00
6624  834.80414               842.60437               0.000000e+00
7200  837.26658               844.44771               0.000000e+00
7776  836.65414               843.91701               0.000000e+00
8352  837.34876               844.12340               0.000000e+00
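
If your problem size does not meet the divisibility requirement (64 for DGEMM, 96 for SGEMM), one option is to pad it up to the next valid size before calling the tester. Below is a minimal Python sketch of that arithmetic. It is not part of MAGMA: `round_up` and `gemm_gflops` are hypothetical helper names, and the 2*N^3 flop count is the usual convention for a square GEMM, assumed here rather than taken from the testing programs.

```python
def round_up(n, multiple):
    """Return the smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def gemm_gflops(n, seconds):
    """GFlop/s for a square N x N GEMM, assuming 2*N^3 floating-point operations."""
    return 2.0 * n**3 / seconds / 1e9


if __name__ == "__main__":
    # Show the padded size for DGEMM (multiple of 64) and SGEMM (multiple of 96).
    for n in (500, 512, 1000):
        print(n, round_up(n, 64), round_up(n, 96))
```

For example, N = 1000 would be padded to 1024 for testing_dgemm and to 1056 for testing_sgemm.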
Allan