LAPACK/ScaLAPACK Development

by **haydiamet** » Thu Feb 05, 2015 1:06 pm

Hi!,
I'm looking for the function that resolve the matrix vector multiplication, meaning that the result should be at this form: x <--Ax which x is a vector and A a matrix.
DGEMV requires the result of the multiplication by a new vector y (x <-- x+ alpha*A*y).
thx !

by **zerothi** » Tue May 19, 2015 4:28 am

That is not existing in BLAS.
You have to create it yourself (or use dgemv with another vector).

by **franki** » Mon Aug 24, 2015 6:48 pm

zerothi wrote:That is not existing in BLAS.
You have to create it yourself (or use dgemv with another vector).

Hello
How would I create it myself? not sure what you mean by dgemv "with another vector"?

by **zerothi** » Tue Sep 01, 2015 5:07 pm

Use dgemv, by far this is the easiest, and fastest solution.

by **Julien Langou** » Tue Sep 01, 2015 9:26 pm

Quick explanation of why there is no subroutine in the BLAS so that x <- A*x
for a dense matrix A.

Observation: We cannot do the operation x <- A*x in place. We have to copy x in a temporaray
vector y and then perform x <- A*y. This is because, once we have computed the output value x(i) (for any i), we cannot reuse the input value x(i)
to compute the other output x(j) (for j not i). So we cannot do x <- A*x in place.

So, this leaves us to two choices to design a BLAS subroutine. (1) Either stick
with the idea that we need to create a subroutine such that x <- A*x. This
means that the BLAS will have to have a routine that allocates a temporary
workspace for y, copies y <- x, then calls the existing and useful x <- A*y,
then frees y. (2) Or we focus on performing x <- A*y and leave the tasks of
allocation, copy, free to user. The choice was (2) over (1) since it is easy
enough to get (1) from (2). The design choice for BLAS was to only implement
(2) and leave (1) to the user.

Comment #1: If A is triangular, then it is possible to implement x <- A*x in
place. If A is upper triangular, you first compute x(1) using x(1), x(2) . . .
x(n). And this is OK since x(1) is not needed to compute x(2) . . . x(n). Then
you compute x(2), then x(3), etc. This explains why the TRMV routine is by
design in-place.

Comment #2: This explanation about GEMV and TRMV holds for GEMM and TRMM. There
is no subroutine to perform A <- A*B in the BLAS. There is only GEMM of the
type A <- B*C. But there is TRMM such that A <- A*B when B is triangular or
A <- B*A when B is triangular. TRMM is in place by design.

Comment #3: If you really really are short in memory and have to implement say
A <- A*B where A is m-by-n, B is n-by-n, and n << m, you can perform the LU
factorization of B: B = P*L*U and then compute A <- (A*P)*L)*U). This way you
do not need to allocate an extra m-by-n workspace for a temporary C. You can
actually do A <- A*B in place. This is an extreme trick. I did use it a few
times in some iterative solvers where memory was scarce. For example, n is of
the order of 10, and m is of the order of 10^9, so the trade off is either
perform the LU factorization of 10-by-10 matrix B or allocate a 10^9-by-10
temporary array. Yes you may worry a little about stability with LU, so you can
use a more stable factorization for B like the QR factorization. That would
work similarly.

Julien

LAPACK/ScaLAPACK Development

matrix vector multiplication

matrix vector multiplication

Re: matrix vector multiplication

Re: matrix vector multiplication

Re: matrix vector multiplication

Re: matrix vector multiplication

Who is online