MAGMA + pycuda + my own CUDA kernels

Post by **mrader1248** » Mon Oct 16, 2017 4:07 am

Just as a reminder: I want to obtain both Q and R.

When I use magma_zgeqrf2_gpu, I have direct access to R, but there is no matching function to form Q: magma_zungqr and magma_zungqr2 both require A to be in host memory, and magma_zungqr_gpu requires the dT array, which magma_zgeqrf2_gpu does not give me.

If I use magma_zgeqrf3_gpu instead, can I use magma_zungqr_gpu to obtain Q and the code from testing_zgeqrf_gpu.cpp to restore R?

A small side question: what are the computational complexities of *geqrf* and *ungqr*? Is the cost of *ungqr* negligible compared to *geqrf*, and is that the reason why there is only a CPU *ungqr* to go with magma_zgeqrf2_gpu?

Post by **mgates3** » Wed Oct 18, 2017 5:37 pm

With the currently available functions, you can use either

magma_zgeqrf2_gpu( dA ), copy dA to A on host, magma_zungqr2( A )
magma_zgeqrf3_gpu( dA, dT ), copy dA to dQ, magma_zungqr_gpu( dQ, dT ), reconstruct R in dA using bits from dT

Another option might be

magma_zgeqrf2_gpu( dA ), copy dA to wA on host, set dQ = identity on GPU [magmablas_zlaset( zero, one, dQ )], magma_zunmqr2_gpu( dA, dQ, wA )

That magma_zunmqr2_gpu was written for a particular use in the eigenvalue codes, so it's weird in taking both dA (on GPU) and wA (on host).
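The "set Q to the identity, then multiply by Q" idea can be illustrated without MAGMA. The following is a minimal pure-Python sketch, not MAGMA code: `householder_qr`, `apply_q`, and `form_q` are hypothetical helper names, and the reflectors are stored as plain unit vectors rather than in LAPACK's compact (V, tau) format. It only shows that applying the stored reflectors to an identity matrix (the role unmqr plays above) yields the explicit Q.

```python
import math


def apply_reflector(C, v, j):
    """Apply H = I - 2 v v^H to rows j.. of C, in place (v has unit norm)."""
    for col in range(len(C[0])):
        dot = sum(v[i].conjugate() * C[j + i][col] for i in range(len(v)))
        for i in range(len(v)):
            C[j + i][col] -= 2.0 * v[i] * dot


def householder_qr(A):
    """Factor A = Q R; return R and the reflector vectors (geqrf analogue)."""
    m, n = len(A), len(A[0])
    R = [[complex(t) for t in row] for row in A]
    vs = []
    for j in range(min(m, n)):
        x = [R[i][j] for i in range(j, m)]
        norm_x = math.sqrt(sum(abs(t) ** 2 for t in x))
        if norm_x == 0.0:
            vs.append([0j] * len(x))  # nothing to eliminate: H = I
            continue
        # Give alpha the phase of -x[0], so v^H x is real and H x = alpha e1.
        phase = x[0] / abs(x[0]) if x[0] != 0 else 1.0
        alpha = -phase * norm_x
        v = x[:]
        v[0] -= alpha
        norm_v = math.sqrt(sum(abs(t) ** 2 for t in v))
        v = [t / norm_v for t in v]
        vs.append(v)
        apply_reflector(R, v, j)
        for i in range(j + 1, m):
            R[i][j] = 0j  # exact zeros below the diagonal
    return R, vs


def apply_q(vs, C):
    """Overwrite C with Q C, where Q = H_1 H_2 ... H_k (unmqr analogue)."""
    for j in reversed(range(len(vs))):
        apply_reflector(C, vs[j], j)
    return C


def form_q(vs, m):
    """Form the explicit m-by-m Q by applying Q to the identity."""
    eye = [[1 + 0j if i == j else 0j for j in range(m)] for i in range(m)]
    return apply_q(vs, eye)


A = [[1 + 2j, 2, 0], [3, 4 - 1j, 1], [0, 1j, 5]]
R, vs = householder_qr(A)
Q = form_q(vs, 3)
residual = max(
    abs(sum(Q[i][k] * R[k][j] for k in range(3)) - A[i][j])
    for i in range(3) for j in range(3)
)
print("max |QR - A| =", residual)
```

If only a product such as Q b is needed, `apply_q(vs, b_as_column)` gives it directly, without ever forming Q.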

There's no particular reason that magma_zungqr2_gpu doesn't exist. We've just never needed it.

For a real, square n-by-n matrix:
geqrf is 4/3 n^3 flops
ungqr is 4/3 n^3 flops
So ungqr is not negligible; it costs about as much as geqrf. In complex, those get multiplied by about 4.
For rectangular matrices, it depends on what part of Q you want. LAPACK Working Note (LAWN) 41 has detailed flop counts for most of the routines (listed under the single-precision names: sgeqrf, sorgqr, etc.).
http://www.netlib.org/lapack/lawnspdf/lawn41.pdf
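For rectangular cases, a small calculator makes the comparison concrete. This is a sketch that keeps only the leading-order terms of the LAWN 41 counts as I understand them (2mn^2 - 2/3 n^3 for geqrf with m >= n, and 4mnk - 2(m+n)k^2 + 4/3 k^3 for orgqr); the function names are made up for illustration, and the exact polynomials should be checked against the note.

```python
def geqrf_flops(m, n):
    """Leading-order real flops for QR of an m-by-n matrix (assumes m >= n)."""
    if m < n:
        raise ValueError("this sketch only covers m >= n")
    return 2.0 * m * n**2 - (2.0 / 3.0) * n**3


def orgqr_flops(m, n, k):
    """Leading-order real flops to form n columns of Q from k reflectors."""
    return 4.0 * m * n * k - 2.0 * (m + n) * k**2 + (4.0 / 3.0) * k**3


def complex_flops(real_flops):
    """Rough complex-arithmetic cost: about 4 real flops per real-case flop."""
    return 4.0 * real_flops


n = 1000
print("square geqrf :", geqrf_flops(n, n))
print("square orgqr :", orgqr_flops(n, n, n))
print("tall, thin Q :", orgqr_flops(2 * n, n, n))
print("tall, full Q :", orgqr_flops(2 * n, 2 * n, n))
print("complex geqrf:", complex_flops(geqrf_flops(n, n)))
```

For m = n = k both formulas reduce to 4/3 n^3, matching the square-matrix figures above; forming the full m-by-m Q of a tall matrix costs noticeably more than forming only its first n columns.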

Often, you can use unmqr (multiply by Q) instead of ungqr (generate explicit Q), but not always.

-mark
