Hi,
the LAPACK Users' Guide says that "xGEQP3 is considerably faster than xGEQPF",
but I found that this is not always the case and I am looking for an explanation.
On my laptop (Dell Latitude D620, running Linux Mandriva Spring 2007), using
DGEQP3 + DORGQR (to form the matrix Q explicitly) or DGEQPF + DORGQR
gives the following timings:
matrix size | DGEQP3 + DORGQR | DGEQPF + DORGQR
 600 x  400 |     0.24 s      |     0.19 s
 800 x  600 |     0.62 s      |     0.47 s
1000 x  900 |     1.40 s      |     1.46 s
1500 x 1000 |     3.45 s      |     4.1 s
1800 x 1400 |     6.42 s      |     9.4 s
(Note that the BLAS library used is the latest ATLAS, built on my machine.)
So yes, DGEQP3 becomes faster as the size increases and is a little slower
for the smaller sizes, but this does not resemble what I expected after
reading the comment in the LAPACK Users' Guide. (For DGEQP3 I first make a
workspace query call to get the optimal work size.) To check that I have not
made a mistake, I compared with other software:
1/ with Octave, which uses DGEQPF + DORGQR (and the same ATLAS library),
I got nearly the same results as those in the right-hand column;
2/ with Matlab 7.5, which claims to use DGEQP3 (and comes with its own
optimised BLAS), I got: 0.21 s, 0.5 s, 1.29 s, 3.55 s, 7.44 s.
In brief, these results show the same tendency: DGEQPF seems a little
faster than DGEQP3 as long as the array is not big enough (but 800 x 600
is not such a small array, at least for a laptop), and beyond that the
order is reversed, but DGEQP3 does not appear to be "considerably faster"
(maybe this becomes true as the sizes become larger and larger).
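To put a number on this tendency, here is a minimal Python sketch that
computes the ratio (DGEQPF time) / (DGEQP3 time) for each row of the table
above. A ratio above 1 means DGEQP3 is faster. The timings are copied from
the table; the names TIMINGS, speedup_of_qp3 and speedups are only
illustrative and are not part of LAPACK.

```python
# Timings in seconds, copied from the table above:
# (rows, cols, DGEQP3 + DORGQR, DGEQPF + DORGQR)
TIMINGS = [
    (600, 400, 0.24, 0.19),
    (800, 600, 0.62, 0.47),
    (1000, 900, 1.40, 1.46),
    (1500, 1000, 3.45, 4.1),
    (1800, 1400, 6.42, 9.4),
]


def speedup_of_qp3(t_qp3, t_qpf):
    """Return t_qpf / t_qp3; a value above 1 means DGEQP3 is faster."""
    return t_qpf / t_qp3


def speedups(timings):
    """Return a list of ((rows, cols), ratio) pairs, one per table row."""
    return [((m, n), speedup_of_qp3(t3, tf)) for m, n, t3, tf in timings]


if __name__ == "__main__":
    for (m, n), ratio in speedups(TIMINGS):
        print(f"{m:5d} x {n:5d}   DGEQPF/DGEQP3 = {ratio:.2f}")
```

With these timings the ratio goes from about 0.79 at 600 x 400 to about
1.46 at 1800 x 1400, so on this machine the gain stays below a factor of 1.5.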
So my questions are:
- has anyone experienced the same behavior with
DGEQP3 and DGEQPF?
- and what explains these results?
Bruno

