Where are ZSYTRF and ZSYTRI?
Dear Developer,
I'm planning to compute the inverse of a double-complex symmetric square matrix. I can't find MAGMA functions similar to the LAPACK subroutines zsytrf(...) and zsytri(...). I only found magma_zpotrf(...) and magma_zpotri(...), which work for Hermitian matrices. Are the LAPACK functions zsytrf(...) and zsytri(...) implemented in MAGMA? Please help.
Ning
Re: Where are ZSYTRF and ZSYTRI?
We don't yet have the complex symmetric version, zsytrf, implemented. Only the Hermitian (zhetrf) and Hermitian positive definite (zpotrf) versions are currently available. We'll look into providing the symmetric version in a future release.
Also, are you sure you need the inverse? If you are solving a system, Ax = b, or equivalently computing x = A^{-1} b, it is usually both faster and more accurate to compute a factorization and a solve (zsytrf & zsytrs), rather than a factorization, inverse, and multiply (zsytrf & zsytri & zsymm).
-mark
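To illustrate the factor-and-solve route described above, here is a minimal sketch in plain C on a tiny complex symmetric matrix. It is not MAGMA or LAPACK code: it uses a general LU factorization with partial pivoting rather than the symmetric indefinite factorization that zsytrf performs, and the names `lu_factor`, `lu_solve`, `max_residual`, and `DIM` are hypothetical helpers made up for this example.

```c
#include <assert.h>
#include <complex.h>
#include <math.h>

#define DIM 3 /* illustrative size only */
typedef double complex zmat[DIM][DIM];

/* Cheap magnitude |re| + |im|, used to choose pivots and measure residuals. */
static double abs1(double complex z)
{
    return fabs(creal(z)) + fabs(cimag(z));
}

/* In-place LU factorization with partial pivoting (row-major storage).
   piv[k] records the row swapped with row k. Returns 0 on success,
   -1 if an exactly zero pivot column is met. */
int lu_factor(zmat a, int piv[DIM])
{
    for (int k = 0; k < DIM; ++k) {
        int p = k;
        for (int i = k + 1; i < DIM; ++i)
            if (abs1(a[i][k]) > abs1(a[p][k]))
                p = i;
        if (abs1(a[p][k]) == 0.0)
            return -1;
        piv[k] = p;
        for (int j = 0; j < DIM && p != k; ++j) {
            double complex t = a[k][j];
            a[k][j] = a[p][j];
            a[p][j] = t;
        }
        for (int i = k + 1; i < DIM; ++i) {
            a[i][k] /= a[k][k];              /* multiplier, stored in L */
            for (int j = k + 1; j < DIM; ++j)
                a[i][j] -= a[i][k] * a[k][j];
        }
    }
    return 0;
}

/* Overwrite x (holding b on entry) with the solution of A x = b,
   using the factors from lu_factor. No inverse is ever formed. */
void lu_solve(zmat lu, const int piv[DIM], double complex x[DIM])
{
    for (int k = 0; k < DIM; ++k) {          /* apply the row swaps to b */
        assert(piv[k] >= k && piv[k] < DIM);
        double complex t = x[k];
        x[k] = x[piv[k]];
        x[piv[k]] = t;
    }
    for (int i = 1; i < DIM; ++i)            /* forward: L y = P b */
        for (int j = 0; j < i; ++j)
            x[i] -= lu[i][j] * x[j];
    for (int i = DIM - 1; i >= 0; --i) {     /* backward: U x = y */
        for (int j = i + 1; j < DIM; ++j)
            x[i] -= lu[i][j] * x[j];
        x[i] /= lu[i][i];
    }
}

/* Largest entry of A x - b, measured as |re| + |im|. */
double max_residual(zmat a, const double complex x[DIM],
                    const double complex b[DIM])
{
    double worst = 0.0;
    for (int i = 0; i < DIM; ++i) {
        double complex r = -b[i];
        for (int j = 0; j < DIM; ++j)
            r += a[i][j] * x[j];
        if (abs1(r) > worst)
            worst = abs1(r);
    }
    return worst;
}
```

A caller would copy A into a scratch matrix, call `lu_factor` once, and then call `lu_solve` for each right-hand side. Forming the inverse would amount to solving for all DIM columns of the identity and then multiplying, which is the extra work the solve route avoids.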
Re: Where are ZSYTRF and ZSYTRI?
Sorry, I clicked the wrong button. Please see the next post.
Last edited by ning_an on Wed Aug 05, 2015 3:43 pm, edited 1 time in total.
Re: Where are ZSYTRF and ZSYTRI?
Hi, Mark,
Thanks for your reply. Yes, I need to invert the dense matrix. This is unusual, but the theory requires it.
When the matrix size increases, I get the error below.
Code: Select all
D:\magma\build\testing\Release>testing_zpotri.exe --lapack
MAGMA 1.6.2 compiled for CUDA capability >= 2.0
CUDA runtime 7000, driver 7050. OpenMP threads 32. MKL 11.0.5, MKL threads 16.
ndevices 2
device 0: GeForce GTX 980, 1215.5 MHz clock, 4096.0 MB memory, capability 5.2
device 1: GeForce GTX 750 Ti, 1254.5 MHz clock, 2048.0 MB memory, capability 5.0
Usage: testing_zpotri.exe [options] [-h|--help]
uplo = Upper
N CPU GFlop/s (sec) GPU GFlop/s (sec) ||R||_F / ||A||_F
=================================================================
magma_zpotri returned error -113: cannot allocate memory on GPU device.
18072 159.34 ( 148.18) 1293.69 ( 18.25) 4.04e-001 failed
My machine configuration is below.
Processor: Intel(R) Xeon CPU E5-2687W 0 @ 3.10GHz (2 Processors)
Memory (RAM): 512GB
System Type: 64-bit OS, Windows 8.1 Pro
Graphics Card: 2
GeForce GTX 980, 1215.5 MHz clock, 4096.0 MB memory, capability 5.2 (for computing)
GeForce GTX 750 Ti, 1254.5 MHz clock, 2048.0 MB memory, capability 5.0 (for display)
CUDA: CUDA 7.0
MKL: Intel MKL 11.0.5
Compiler: Visual Studio 2013 Community version
I searched the MAGMA forum and found the thread "viewtopic.php?f=2&t=1042", in which you said:
"magma_cgesv tries to be smart about memory. If you use one GPU and the matrix fits on one GPU, it uses magma_cgetrf_gpu and magma_cgetrs_gpu (essentially magma_cgesv_gpu). If you request multiple GPUs or the matrix does not fit on one GPU, it uses the multi-GPU, out-of-GPU-core magma_cgetrf. This distributes the matrix across the GPUs, and can cycle portions of the matrix through the GPUs if it doesn't fit in GPU memory."
I have three questions.
Q1: I found that "testing_zpotri" does not take advantage of symmetry to reduce memory usage. That is fine for small matrices, but it consumes a large portion of memory for large ones. Do you plan to develop a pair of functions like ZSPTRF + ZSPTRI in LAPACK (symmetric indefinite, packed storage)?
Q2: The error message shows there is not enough memory on the GPU [4983.5MB(needed)>4096.0MB(installed)]. Are magma_zpotrf/magma_zpotri smart about managing memory? If not, are zgetrf/zgetri?
Q3: If "zgetrf/zgetri" are not smart about managing memory, please give me an example showing how to write code that manages memory as smartly as "magma_cgesv".
Thanks.
Ning
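The 4983.5 MB figure in Q2 is consistent with one full N x N array of 16-byte double-complex entries at N = 18072. The sketch below redoes that arithmetic and, for comparison, sizes the packed triangle asked about in Q1. It counts a single copy of the matrix only and assumes no padding or workspace, which a real routine may add; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One double-complex entry: two 8-byte doubles. */
enum { ZBYTES = 16 };

/* Bytes for a full n x n double-complex matrix (no padding assumed). */
uint64_t full_bytes(uint64_t n)
{
    assert(n < (UINT64_C(1) << 30));   /* keeps n * n * 16 within 64 bits */
    return n * n * ZBYTES;
}

/* Bytes for one packed triangle: n (n + 1) / 2 entries. */
uint64_t packed_bytes(uint64_t n)
{
    assert(n < (UINT64_C(1) << 30));
    return n * (n + 1) / 2 * ZBYTES;
}

/* Convert bytes to MB in the 1024 * 1024 sense. */
double to_mb(uint64_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}
```

For N = 18072, `full_bytes` gives 5,225,554,944 bytes, about 4983.5 MB, which exceeds the 4096 MB on the GTX 980. `packed_bytes` gives 2,612,922,048 bytes, about 2491.9 MB.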
Last edited by ning_an on Wed Aug 05, 2015 3:46 pm, edited 1 time in total.
Re: Where are ZSYTRF and ZSYTRI?
We do not currently have plans to make a packed version, due to the poor performance with that memory access pattern. Possibly we could use rectangular full packed storage, but there are no current plans for that. See www.netlib.org/lapack/lawnspdf/lawn199.pdf.
magma_zpotrf and magma_zgetrf are smart about running out of GPU memory. However, the inverse routines magma_zpotri and magma_zgetri require the entire matrix to fit in the GPU memory.
After using MAGMA to factor a matrix, you can use LAPACK's zpotri and zgetri to invert the matrix on the CPU.
-mark
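For readers unfamiliar with the packed layout discussed above, here is a small sketch of LAPACK-style upper packed storage: the upper triangle is stored column by column, so with 0-based indices entry (i, j), i <= j, sits at position i + j(j+1)/2. The helper names are hypothetical and this is plain C, not MAGMA code.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Position of entry (i, j), i <= j, in upper packed storage (0-based). */
size_t packed_upper_index(size_t i, size_t j)
{
    assert(i <= j);
    return i + j * (j + 1) / 2;
}

/* Read entry (i, j) of a symmetric matrix held in upper packed form.
   Entries below the diagonal are fetched from their mirror (j, i). */
double complex packed_sym_get(const double complex *ap, size_t i, size_t j)
{
    return (i <= j) ? ap[packed_upper_index(i, j)]
                    : ap[packed_upper_index(j, i)];
}

/* Copy the upper triangle of a column-major n x n matrix a
   (leading dimension lda) into the packed array ap,
   which must hold n (n + 1) / 2 entries. */
void pack_upper(size_t n, const double complex *a, size_t lda,
                double complex *ap)
{
    assert(lda >= n);
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i <= j; ++i)
            ap[packed_upper_index(i, j)] = a[i + j * lda];
}
```

Column j starts at offset j(j+1)/2, so the distance between consecutive columns grows with j and there is no fixed leading dimension. That is one likely reason blocked matrix-matrix kernels cannot be applied to packed data directly, though the exact cost depends on the implementation.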
Re: Where are ZSYTRF and ZSYTRI?
Thanks Mark,
I will try what you suggested, and report the results.
Have a great day.
Ning
Re: Where are ZSYTRF and ZSYTRI?
Hi, Mark,
I just commented out the call "magma_zpotri( opts.uplo, N, h_R, lda, &info );" and replaced it with "lapackf77_zpotri(lapack_uplo_const(opts.uplo), &N, h_R, &lda, &info);", as shown below.
Old
Code: Select all
gpu_time = magma_wtime();
/* factorize matrix */
magma_zpotrf( opts.uplo, N, h_R, lda, &info );
if (info != 0)
    printf("magma_zpotrf returned error %d: %s.\n",
           (int) info, magma_strerror( info ));
/* invert on the GPU using the factor left in h_R */
magma_zpotri( opts.uplo, N, h_R, lda, &info );
gpu_time = magma_wtime() - gpu_time;
gpu_perf = gflops / gpu_time;
if (info != 0)
    printf("magma_zpotri returned error %d: %s.\n",
           (int) info, magma_strerror( info ));
Change to
Code: Select all
gpu_time = magma_wtime();
/* factorize matrix */
magma_zpotrf( opts.uplo, N, h_R, lda, &info );
if (info != 0)
    printf("magma_zpotrf returned error %d: %s.\n",
           (int) info, magma_strerror( info ));
// magma_zpotri( opts.uplo, N, h_R, lda, &info );
/* invert on the CPU with LAPACK instead of on the GPU */
lapackf77_zpotri(lapack_uplo_const(opts.uplo), &N, h_R, &lda, &info);
gpu_time = magma_wtime() - gpu_time;
gpu_perf = gflops / gpu_time;
if (info != 0)
    printf("lapackf77_zpotri returned error %d: %s.\n",
           (int) info, magma_strerror( info ));
But the result is not okay, as shown below.
Code: Select all
D:\magma\build\testing\Release>testing_zpotri.exe --lapack
MAGMA 1.6.2 compiled for CUDA capability >= 2.0
CUDA runtime 7000, driver 7050. OpenMP threads 32. MKL 11.0.5, MKL threads 16.
ndevices 2
device 0: GeForce GTX 980, 1215.5 MHz clock, 4096.0 MB memory, capability 5.2
device 1: GeForce GTX 750 Ti, 1254.5 MHz clock, 2048.0 MB memory, capability 5.0
Usage: testing_zpotri.exe [options] [-h|--help]
N CPU GFlop/s (sec) GPU GFlop/s (sec) ||R||_F / ||A||_F
=================================================================
18072 151.66 ( 155.68) 179.80 ( 131.32) 3.55e+169 failed
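I'm new to MAGMA. Should I do something to the factorized matrix (h_R) before calling lapackf77_zpotri(...)?
I think that somehow lapackf77_zpotri(...) does not accept the factorized matrix (h_R) from magma_zpotrf(...) directly.
Any suggestions are welcome.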
Ning