LAPACK/ScaLAPACK Development

Posted: **Mon Mar 31, 2008 6:35 pm**

When the matrices are small, I found that the reference BLAS is much faster than the optimized BLAS libraries. By small I mean matrices smaller than about 30x30. I tested several libraries: reference BLAS, GotoBLAS, ACML, and ATLAS, and found that the reference BLAS from netlib is the fastest. Can anybody here tell me why? Thank you.

Posted: **Wed Apr 02, 2008 3:18 am**

neodreamer wrote: When the matrices are small, I found that the reference BLAS is much faster than the optimized BLAS libraries. By small I mean matrices smaller than about 30x30. I tested several libraries: reference BLAS, GotoBLAS, ACML, and ATLAS, and found that the reference BLAS from netlib is the fastest. Can anybody here tell me why? Thank you.

Neodreamer,
Hmm, this doesn't make much sense to me. However, measuring the performance of an operation on such small inputs is quite tricky: you have to be aware of timer resolution and cache effects. Can you tell me more about how you measure it?

alfredo
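
For readers wondering what "timer resolution and cache effects" mean in practice, here is a minimal sketch of a timing harness for very small operations. It is not the measurement method used by anyone in this thread. The names `naive_matmul` and `time_per_call` are hypothetical, and the pure-Python multiply is only a stand-in for a real BLAS `dgemm` call. The idea is to run a few untimed warm-up calls so caches are in a steady state, then repeat the operation in a loop until the total elapsed time is far above the timer's resolution, and report the average per call.

```python
import time


def naive_matmul(a, b, n):
    """Stand-in workload: multiply two n x n matrices stored as lists of lists."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]
    return c


def time_per_call(fn, min_seconds=0.05, warmup=3):
    """Return (average seconds per call, repetitions used).

    Warm-up calls are not timed. The repetition count is doubled until
    one timed batch lasts at least min_seconds, so the measured interval
    is much longer than the timer's resolution.
    """
    for _ in range(warmup):
        fn()
    reps = 1
    while True:
        start = time.perf_counter()
        for _ in range(reps):
            fn()
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return elapsed / reps, reps
        reps *= 2


# Example: a 30x30 problem, the size discussed in this thread.
n = 30
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]
seconds, reps = time_per_call(lambda: naive_matmul(a, b, n))
print(f"{seconds * 1e6:.1f} microseconds per call, averaged over {reps} calls")
```

Note that reusing the same matrices on every repetition keeps them in cache, so this measures the warm-cache case. Whether that matches the real application is a separate question and is worth stating alongside any numbers.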

Why reference blas on netlib is faster in some cases