MAGMA  1.6.1
Matrix Algebra for GPU and Multicore Architectures
Utilities


Functions

magma_int_t magma_is_devptr (const void *A)
 For debugging purposes, determines whether a pointer points to CPU or GPU memory.

magma_int_t magma_get_parallel_numthreads ()
 Returns the number of threads to use for parallel sections of MAGMA.

magma_int_t magma_get_lapack_numthreads ()
 Returns the number of threads currently used for LAPACK and BLAS.

void magma_set_lapack_numthreads (magma_int_t threads)
 Sets the number of threads to use for LAPACK and BLAS.

void magma_xerbla (const char *srname, magma_int_t minfo)
 magma_xerbla is an error handler for the MAGMA routines.

cublasStatus_t magmablasSetKernelStream (magma_queue_t stream)
 magmablasSetKernelStream sets the CUDA stream that MAGMA BLAS and CUBLAS (v1) routines use (unless explicitly given a stream).

cublasStatus_t magmablasGetKernelStream (magma_queue_t *stream)
 magmablasGetKernelStream gets the CUDA stream that MAGMA BLAS routines use.
 

Detailed Description


Function Documentation

magma_int_t magma_get_lapack_numthreads ( )

Returns the number of threads currently used for LAPACK and BLAS.

Typically, the number of threads is initially set by the environment variables OMP_NUM_THREADS or MKL_NUM_THREADS.

If MAGMA is compiled with MAGMA_WITH_MKL, this queries MKL; else if MAGMA is compiled with OpenMP, this queries OpenMP; else this returns 1.

See also
magma_get_parallel_numthreads
magma_set_lapack_numthreads
magma_int_t magma_get_parallel_numthreads ( )

Returns the number of threads to use for parallel sections of MAGMA.

Typically, it is initially set by the environment variables OMP_NUM_THREADS or MAGMA_NUM_THREADS.

If MAGMA_NUM_THREADS is set, this returns min( num_cores, MAGMA_NUM_THREADS ); else if MAGMA is compiled with OpenMP, this queries OpenMP and returns min( num_cores, OMP_NUM_THREADS ); else this returns num_cores.

For the number of cores, if MAGMA is compiled with hwloc, this queries hwloc; else it queries sysconf (on Unix) or GetSystemInfo (on Windows).
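The selection rule above can be sketched in plain C. This is only an illustration of the documented behavior, not MAGMA's source: the function name sketch_parallel_numthreads, its parameters, and the clamp to a minimum of 1 are assumptions made for the example. The caller is assumed to supply the core count (however it was obtained), the raw value of MAGMA_NUM_THREADS (or NULL if unset), and the OpenMP thread count (or 0 when built without OpenMP).

```c
#include <stdlib.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Hypothetical helper, not part of MAGMA: mirrors the documented rule.
 *   num_cores   : number of cores (from hwloc, sysconf, or GetSystemInfo)
 *   magma_env   : value of MAGMA_NUM_THREADS, or NULL if it is not set
 *   omp_threads : OpenMP thread count, or 0 if built without OpenMP */
int sketch_parallel_numthreads(int num_cores, const char *magma_env, int omp_threads)
{
    int threads;
    if (magma_env != NULL) {
        /* MAGMA_NUM_THREADS takes priority, capped at the core count. */
        threads = min_int(num_cores, (int) strtol(magma_env, NULL, 10));
    } else if (omp_threads > 0) {
        /* Otherwise fall back to the OpenMP setting, capped the same way. */
        threads = min_int(num_cores, omp_threads);
    } else {
        /* No setting available: use every core. */
        threads = num_cores;
    }
    /* Assumption for this sketch: never report fewer than one thread. */
    return threads < 1 ? 1 : threads;
}
```

For instance, with 8 cores, MAGMA_NUM_THREADS=32 yields 8, while an unset MAGMA_NUM_THREADS with 2 OpenMP threads yields 2.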

See also
magma_get_lapack_numthreads
magma_set_lapack_numthreads
magma_int_t magma_is_devptr ( const void *  A)

For debugging purposes, determines whether a pointer points to CPU or GPU memory.

On cards with CUDA architecture 2.0 and unified addressing, CUDA can tell whether A is a device pointer or a pinned host pointer. For malloc'd host pointers, cudaPointerGetAttributes returns an error, which implies that A is a (non-pinned) host pointer.

On older cards, this routine cannot determine whether A points to CPU or GPU memory.

Parameters
A  Pointer to test.
Returns
  • 1: if A is a device pointer (definitely),
  • 0: if A is a host pointer (definitely or inferred from error),
  • -1: if unknown.
Author
Mark Gates
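
Because the result has three states, a debugging check should treat -1 as inconclusive rather than as a failure. The following is a minimal sketch of such a check; expect_device_ptr is a hypothetical helper, not part of MAGMA, and it takes the return value of magma_is_devptr as a plain int so that it compiles without the MAGMA headers.

```c
#include <stdio.h>

/* Hypothetical debugging helper, not part of MAGMA.
 * is_devptr_result is the value returned by magma_is_devptr:
 *    1 = device pointer, 0 = host pointer, -1 = unknown.
 * Returns 0 only when the pointer is known to be a host pointer;
 * returns 1 when it is a device pointer or when the check is inconclusive. */
int expect_device_ptr(int is_devptr_result, const char *name)
{
    if (is_devptr_result == 0) {
        fprintf(stderr, "%s: expected a device pointer, got a host pointer\n", name);
        return 0;
    }
    if (is_devptr_result < 0) {
        /* Older cards: location cannot be determined, so do not fail. */
        fprintf(stderr, "%s: pointer location unknown, check skipped\n", name);
    }
    return 1;
}
```

In real code the call would look like expect_device_ptr( (int) magma_is_devptr( dA ), "dA" ), where dA is whatever pointer is being checked.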
void magma_set_lapack_numthreads ( magma_int_t  threads)

Sets the number of threads to use for LAPACK and BLAS.

Internal routines.

This is often used to set BLAS to be single-threaded during sections where MAGMA uses explicit pthread parallelism. Example use:

magma_int_t nthread_save = magma_get_lapack_numthreads();
magma_set_lapack_numthreads( 1 );

... launch pthreads, do work, terminate pthreads ...

magma_set_lapack_numthreads( nthread_save );

If MAGMA is compiled with MAGMA_WITH_MKL, this sets MKL threads; else if MAGMA is compiled with OpenMP, this sets OpenMP threads; else this does nothing.

Parameters
[in] threads  INTEGER. Number of threads to use. Must satisfy threads >= 1; if threads < 1, this silently does nothing.
See also
magma_get_parallel_numthreads
magma_get_lapack_numthreads
void magma_xerbla ( const char *  srname,
magma_int_t  minfo 
)

magma_xerbla is an error handler for the MAGMA routines.

It is called by a MAGMA routine if an input parameter has an invalid value. It prints an error message.

Installers may consider modifying it to call system-specific exception-handling facilities.

Parameters
[in] srname  CHAR*. The name of the subroutine that called XERBLA. In C/C++ it is convenient to use "__func__".
[in] minfo  INTEGER. Note that minfo's sign is the opposite of info's normal sign.

Normally:

  • minfo > 0: The position of the invalid parameter in the parameter list of the calling routine.

These conditions are also reported, but normally code should not call xerbla for these runtime errors:

  • minfo < 0: Function-specific error.
  • minfo >= -MAGMA_ERR: Pre-defined MAGMA error, such as malloc failure.
  • minfo == 0: No error.
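
The sign convention is easiest to see in an argument check. The sketch below assumes the usual LAPACK-style pattern, in which a routine sets info = -i when its i-th argument is invalid and passes -info to the error handler. To stay compilable without MAGMA it uses a stand-in handler, xerbla_stub, in place of magma_xerbla; sketch_scale is likewise a hypothetical routine invented for the example.

```c
#include <stdio.h>

/* Stand-in for magma_xerbla so the sketch compiles without MAGMA.
 * It prints a message only; the wording is illustrative. */
static void xerbla_stub(const char *srname, int minfo)
{
    fprintf(stderr, "On entry to %s, parameter %d had an illegal value\n",
            srname, minfo);
}

/* Hypothetical routine: scales x by alpha, with LAPACK-style checking.
 * Returns 0 on success, or -i if the i-th argument is invalid. */
int sketch_scale(int n, double alpha, double *x, int incx)
{
    int info = 0;
    if (n < 0)
        info = -1;          /* argument 1 (n) is invalid    */
    else if (incx < 1)
        info = -4;          /* argument 4 (incx) is invalid */

    if (info != 0) {
        /* info is negative; the handler receives the positive position. */
        xerbla_stub(__func__, -info);
        return info;
    }

    for (int i = 0; i < n; ++i)
        x[i * incx] *= alpha;
    return info;
}
```

Calling sketch_scale( 3, 2.0, x, 0 ) returns -4 and reports parameter 4, which matches the "minfo > 0" case above.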
cublasStatus_t magmablasGetKernelStream ( magma_queue_t *  stream)

magmablasGetKernelStream gets the CUDA stream that MAGMA BLAS routines use.

Parameters
[out] stream  magma_queue_t. The CUDA stream.
cublasStatus_t magmablasSetKernelStream ( magma_queue_t  stream)

magmablasSetKernelStream sets the CUDA stream that MAGMA BLAS and CUBLAS (v1) routines use (unless explicitly given a stream).

In a multi-threaded application, be careful to avoid race conditions when using this. For instance, if calls are executed in this order:

    thread 1                            thread 2
    ------------------------------      ------------------------------
1.  magmablasSetKernelStream( s1 )         
2.                                      magmablasSetKernelStream( s2 )
3.  magma_dgemm( ... )
4.                                      magma_dgemm( ... )

then both magma_dgemm calls would run on stream s2. A lock should be used to prevent this, so that the dgemm in thread 1 uses stream s1 and the dgemm in thread 2 uses stream s2:

    thread 1                            thread 2
    ------------------------------      ------------------------------
1.  lock()                                  
2.  magmablasSetKernelStream( s1 )          
3.  magma_dgemm( ... )                      
4.  unlock()                                
5.                                      lock()
6.                                      magmablasSetKernelStream( s2 )
7.                                      magma_dgemm( ... )
8.                                      unlock()

Most BLAS calls in MAGMA, such as magma_dgemm, are asynchronous, so the lock needs to be held only until dgemm is queued, not until it finishes.
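
The locking pattern can be sketched with a pthread mutex. To keep the example compilable without MAGMA or CUDA, everything MAGMA-specific is replaced by stand-ins: set_kernel_stream_stub plays the role of magmablasSetKernelStream, enqueue_gemm_stub plays the role of an asynchronous magma_dgemm call and simply reports which stream was current when it was queued, and streams are plain integers. All of these names are assumptions made for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without MAGMA or CUDA. */
typedef int stream_stub_t;
static stream_stub_t current_stream;   /* models the shared kernel stream */
static pthread_mutex_t stream_lock = PTHREAD_MUTEX_INITIALIZER;

static void set_kernel_stream_stub(stream_stub_t s) { current_stream = s; }

/* Models queueing a BLAS call: returns the stream it was queued on. */
static stream_stub_t enqueue_gemm_stub(void) { return current_stream; }

typedef struct {
    stream_stub_t stream;   /* stream this thread wants to use       */
    stream_stub_t used;     /* stream the work was actually queued on */
} worker_arg_t;

static void *worker(void *p)
{
    worker_arg_t *arg = p;
    /* Hold the lock across "set stream" and "queue work" so that no
     * other thread can change the stream in between. */
    pthread_mutex_lock(&stream_lock);
    set_kernel_stream_stub(arg->stream);
    arg->used = enqueue_gemm_stub();
    pthread_mutex_unlock(&stream_lock);
    return NULL;
}

/* Runs two workers; returns 1 if each queued its work on its own stream. */
int run_two_workers(void)
{
    worker_arg_t a = { 1, 0 }, b = { 2, 0 };
    pthread_t ta, tb;
    if (pthread_create(&ta, NULL, worker, &a) != 0)
        return 0;
    if (pthread_create(&tb, NULL, worker, &b) != 0) {
        pthread_join(ta, NULL);
        return 0;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return a.used == 1 && b.used == 2;
}
```

The critical section covers only the stream selection and the queueing step, which is why the cost of the lock stays small when the underlying call is asynchronous. On some toolchains the program must be built with -pthread.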

Parameters
[in] stream  magma_queue_t. The CUDA stream.