MAGMA 2.10.0
Matrix Algebra for GPU and Multicore Architectures
First, create a make.inc file, using one of the examples as a template. Set environment variables for where external packages are installed, either in your .cshrc/.bashrc file, or in the make.inc file itself.
All the make.inc files assume $CUDADIR is set in your environment. For bash (sh), put in ~/.bashrc (with your system's path):
export CUDADIR=/usr/local/cuda
For csh/tcsh, put in ~/.cshrc:
setenv CUDADIR /usr/local/cuda
AOCL has adopted the BLIS and libFLAME libraries. These may be installed in separate directories. Set $BLIS_DIR and $FLAME_DIR in your environment or make.inc file. For bash (sh), put in ~/.bashrc (with your system's paths):
export BLIS_DIR=/opt/blis
export FLAME_DIR=/opt/libflame
For csh/tcsh, put in ~/.cshrc:
setenv BLIS_DIR /opt/blis
setenv FLAME_DIR /opt/libflame
The MKL make.inc files assume $MKLROOT is set in your environment. To set it, for bash (sh), put in ~/.bashrc (with your system's path):
source /opt/intel/bin/compilervars.sh intel64
For csh/tcsh, put in ~/.cshrc:
source /opt/intel/bin/compilervars.csh intel64
MAGMA is tested with both LP64 and ILP64.
The ATLAS make.inc file assumes $ATLASDIR and $LAPACKDIR are set in your environment. If LAPACK is not installed, install it from http://www.netlib.org/lapack/. For bash (sh), put in ~/.bashrc (with your system's paths):
export ATLASDIR=/opt/atlas
export LAPACKDIR=/opt/LAPACK
For csh/tcsh, put in ~/.cshrc:
setenv ATLASDIR /opt/atlas
setenv LAPACKDIR /opt/LAPACK
The OpenBLAS make.inc file assumes $OPENBLASDIR is set in your environment. For bash (sh), put in ~/.bashrc (with your system's path):
export OPENBLASDIR=/opt/openblas
For csh/tcsh, put in ~/.cshrc:
setenv OPENBLASDIR /opt/openblas
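Before running make, it can help to confirm that the variables your chosen make.inc expects are actually set. The following is a minimal sketch and is not part of MAGMA: the helper name missing_vars is hypothetical, and the variables to check depend on which make.inc template you started from.

```shell
# Hypothetical helper (not shipped with MAGMA): print the name of every
# listed environment variable that is unset or empty.
missing_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}

# Example: the variables assumed by the OpenBLAS make.inc file.
missing_vars CUDADIR OPENBLASDIR
```

If the call prints nothing, every listed variable is set. It does not verify that the directories exist or contain the expected libraries.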
Depending on the Fortran compiler used for your BLAS and LAPACK libraries, the linking convention is one of:
gemm() in Fortran becomes gemm_() in C.
gemm() in Fortran becomes GEMM() in C.
gemm() in Fortran stays gemm() in C.
Set -DADD_, -DUPCASE, or -DNOCHANGE, respectively, in all FLAGS in your make.inc file to select the appropriate one. Use nm to examine your BLAS library:
methane lib> nm libopenblas.so | grep -i dsyr2k
000000000017ee50 T cblas_dsyr2k
000000000017c8b0 T dsyr2k_    # Note this line
00000000001fa690 T dsyr2k_LN
00000000001fb2e0 T dsyr2k_LT
00000000001f8f70 T dsyr2k_UN
00000000001f9b70 T dsyr2k_UT
00000000001fcab0 T dsyr2k_kernel_L
00000000001fc750 T dsyr2k_kernel_U
In this case, it shows that -DADD_ (dsyr2k_) should work. The default in all example make.inc files is -DADD_, except for IBM ESSL, which uses -DNOCHANGE.
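The manual inspection above can be sketched as a small filter over nm output. This is an illustration only, not a MAGMA tool: the function name detect_blas_convention is hypothetical, and it assumes the three-column nm format shown above (address, type, symbol name) with the routine exported as a global text symbol of type T.

```shell
# Hypothetical helper (not part of MAGMA): read `nm` output on stdin and
# print the define that matches how the given BLAS routine is exported.
detect_blas_convention() {
    routine=$1
    upper=$(printf '%s' "$routine" | tr '[:lower:]' '[:upper:]')
    # Keep only the names of global text symbols (type "T").
    symbols=$(awk '$2 == "T" { print $3 }')
    if printf '%s\n' "$symbols" | grep -qxF "${routine}_"; then
        printf '%s\n' "-DADD_"        # e.g. dsyr2k_
    elif printf '%s\n' "$symbols" | grep -qxF "$upper"; then
        printf '%s\n' "-DUPCASE"      # e.g. DSYR2K
    elif printf '%s\n' "$symbols" | grep -qxF "$routine"; then
        printf '%s\n' "-DNOCHANGE"    # e.g. dsyr2k
    else
        printf '%s\n' "unknown"
        return 1
    fi
}

# Usage (with your own library path):
#   nm libopenblas.so | detect_blas_convention dsyr2k
```

The match is on the whole symbol name, so entries such as cblas_dsyr2k or dsyr2k_LN do not affect the result. Treat the answer as a hint and confirm it against the nm listing if linking still fails.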
Several compiler defines, below, affect how MAGMA is compiled and might have a large performance impact. These are set in make.inc files using the -D compiler flag, e.g., -DMAGMA_WITH_MKL in CFLAGS.
MAGMA_WITH_MKL
If linked with MKL, allows MAGMA to get MKL's version and set MKL's number of threads.
MAGMA_NOAFFINITY
Disables thread affinity, which is available in glibc 2.6 and later.
BATCH_DISABLE_CHECKING
For batched routines, disables the info_array that contains errors. For example, for Cholesky factorization if you are sure your matrix is SPD and want better performance, you can compile with this flag.
BATCH_DISABLE_CLEANUP
For batched routines, disables the cleanup code. For example, the {sy|he}rk called with "lower" will write data on the upper triangular portion of the matrix.
BATCHED_DISABLE_PARCPU
In the testing directory, disables the parallel implementation of the batched computation on CPU. Can be used to compare a naive versus a parallelized CPU batched computation.
These variables control MAGMA, BLAS, and LAPACK run-time behavior.
$MAGMA_NUM_GPUS
For multi-GPU functions, set $MAGMA_NUM_GPUS to the number of GPUs to use.
$OMP_NUM_THREADS
$MKL_NUM_THREADS
For multi-core BLAS libraries, set $OMP_NUM_THREADS or $MKL_NUM_THREADS to the number of CPU threads, depending on your BLAS library. See the documentation for your BLAS and LAPACK libraries.
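As a minimal sketch for bash (sh), these variables could be set in the shell before launching a MAGMA program. The values 2 and 8 are placeholders chosen for illustration, not recommendations; pick counts that match your machine.

```shell
# Placeholder values only; adjust to the GPUs and CPU cores you have.
export MAGMA_NUM_GPUS=2      # number of GPUs for multi-GPU functions
export OMP_NUM_THREADS=8     # CPU threads for a multi-core BLAS library
# export MKL_NUM_THREADS=8   # set this one if your BLAS library is MKL

# Programs started from this shell now inherit the settings, e.g.:
#   ./testing_dgehrd -N 100 -c
```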
If you do not have a Fortran compiler, comment out FORT in make.inc. MAGMA's Fortran 90 interface and Fortran testers will not be built. Also, many testers will not be able to check their results – they will print an error message, e.g.:
magma/testing> ./testing_dgehrd -N 100 -c
...
Cannot check results: dhst01_ unavailable, since there was no Fortran compiler.
  100     ---   (  ---  )      0.70 (   0.00)   0.00e+00        0.00e+00   ok
By default, all make.inc files (except ATLAS) add the -fPIC option to CFLAGS, FFLAGS, F90FLAGS, and NVCCFLAGS, required for building a shared library. Note in NVCCFLAGS that -fPIC is passed via the -Xcompiler option. Running:
make
or
make lib
make test
make sparse-lib
make sparse-test
will create shared libraries:
lib/libmagma.so
lib/libmagma_sparse.so
and static libraries:
lib/libmagma.a
lib/libmagma_sparse.a
and testing drivers in testing and sparse-iter/testing.
The current exception is for ATLAS, in make.inc.atlas, which in our install is a static library, thus requiring MAGMA to be a static library.
Static libraries are always built along with the shared libraries above. Alternatively, comment out FPIC in your make.inc file to compile only a static library. Then, running:
make
will create static libraries:
lib/libmagma.a
lib/libmagma_sparse.a
and testing drivers in testing and sparse-iter/testing.
To install libraries and include files in a given prefix, run:
make install prefix=/usr/local/magma
The default prefix is /usr/local/magma. You can also set prefix in make.inc. This installs MAGMA libraries in ${prefix}/lib, MAGMA header files in ${prefix}/include, and ${prefix}/lib/pkgconfig/magma.pc for pkg-config.
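Once installed, the magma.pc file lets pkg-config report the compile and link flags for your own programs. The following is a sketch for bash (sh) that assumes the default prefix and that the directory is not already on pkg-config's search path; change prefix if you installed elsewhere.

```shell
# Assumes the default install prefix; change it to match your install.
prefix=/usr/local/magma

# Prepend MAGMA's pkgconfig directory to pkg-config's search path.
PKG_CONFIG_PATH="$prefix/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
export PKG_CONFIG_PATH

# Print the flags only if pkg-config is installed and can find magma.pc.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists magma; then
    pkg-config --cflags --libs magma
else
    echo "magma.pc not found under $prefix/lib/pkgconfig" >&2
fi
```

The printed flags can then be passed to your compiler and linker when building against MAGMA.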
You can modify the blocking factors for the algorithms of interest in control/get_nb.cpp.
Performance results are included in results/vA.B.C/cudaX.Y-zzz/*.txt for MAGMA version A.B.C, CUDA version X.Y, and GPU zzz.