MAGMA Forum

Posted: **Mon Jan 17, 2011 4:13 am**

I'm looking at magma_dgeqrf_gpu (dgeqrf_gpu.cpp), for example, and I see host allocations and data exchange between the CPU and the GPU. Can't we allocate all the memory (input, output, and workspace) on the GPU and call a sequence of GPU kernels to operate on it, without calling CPU LAPACK functions?
Igor

Posted: **Fri Jan 21, 2011 2:37 am**

This is possible, and people have tried it, but it is generally slower. The tasks that are offloaded to the CPU are small and cannot be executed efficiently in parallel on the GPU. Instead, they can run on the CPU and be overlapped with more efficient work (e.g., Level 3 BLAS) on the GPU. Asymptotically, for large matrices, the small CPU tasks are completely hidden behind the GPU work, so the overall algorithm runs at the speed at which the GPU can execute the Level 3 BLAS.
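To make the overlap argument concrete, here is a rough back-of-the-envelope cost model, not MAGMA code. It assumes a square n x n blocked QR with panel width nb, approximate flop counts (about 2*m*nb^2 for a panel, about 4*m*nb*cols for the trailing update), perfect overlap, and no transfer cost. The three rates and the names `Times`, `model_qr_times`, and `overhead_ratio` are made up for illustration and are not measurements of any hardware.

```cpp
#include <algorithm>

// Hypothetical sustained rates in flop/s. Illustrative assumptions only.
constexpr double kCpuPanelRate  = 5e9;    // CPU on small panel factorizations
constexpr double kGpuPanelRate  = 1e9;    // GPU on the same small, poorly parallel panels
constexpr double kGpuUpdateRate = 300e9;  // GPU on Level 3 BLAS trailing updates

struct Times {
    double gpu_only;     // panels and updates both on the GPU, run back to back
    double hybrid;       // CPU panel overlapped with GPU update
    double level3_only;  // GPU trailing updates alone (the lower bound)
};

// Sums per-step model times over all panels of an n x n matrix.
Times model_qr_times(int n, int nb) {
    Times t{0.0, 0.0, 0.0};
    for (int k = 0; k < n; k += nb) {
        double m    = n - k;                  // rows left in this panel
        double b    = std::min(nb, n - k);    // width of this panel
        double cols = m - b;                  // trailing columns to update
        double panel_flops  = 2.0 * m * b * b;     // rough Householder panel cost
        double update_flops = 4.0 * m * b * cols;  // rough block-reflector update cost
        double update = update_flops / kGpuUpdateRate;

        t.level3_only += update;
        t.gpu_only    += panel_flops / kGpuPanelRate + update;
        // With overlap, a step costs whichever of the two takes longer.
        t.hybrid      += std::max(panel_flops / kCpuPanelRate, update);
    }
    return t;
}

// How far the hybrid schedule is from pure Level 3 BLAS speed (1.0 = equal).
// Requires n > nb so that there is at least one trailing update.
double overhead_ratio(int n, int nb) {
    Times t = model_qr_times(n, nb);
    return t.hybrid / t.level3_only;
}
```

Under these assumed rates, `overhead_ratio(n, 64)` is well above 1 for a small matrix such as n = 1024, where the CPU panel dominates every step, and falls toward 1 for a large one such as n = 16384, where almost every panel is hidden behind the update. The exact crossover depends entirely on the assumed rates.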

Posted: **Thu Feb 17, 2011 1:31 am**

Hello everyone,
I am a student and new to GPGPU.
I have one question:
MAGMA runs on CUDA, and I have also heard of another library, ViennaCL.
Which one is better in terms of performance/speed?

why not do all the work on GPU?
