MAGMA  1.5.0
Matrix Algebra for GPU and Multicore Architectures
single precision

Functions

void magmablas_sgemvn_fermi (magma_int_t m, magma_int_t n, float alpha, const float *A, magma_int_t lda, const float *x, float beta, float *y)
 This routine computes y = alpha A x + beta y on the GPU.
 
void magmablas_sgemvt_fermi (magma_int_t m, magma_int_t n, float alpha, const float *A, magma_int_t lda, const float *x, float beta, float *y)
 

This routine computes y = alpha * A^T * x + beta * y on the GPU.
 
void magmablas_sgemv (magma_trans_t trans, magma_int_t m, magma_int_t n, float alpha, const float *A, magma_int_t lda, const float *x, magma_int_t incx, float beta, float *y, magma_int_t incy)
 

This routine computes:
  1) y = A x if trans == 'N' or 'n', alpha == 1, beta == 0, and incx == incy == 1 (using magmablas code);
  2) y = alpha A^T x if trans == 'T' or 't', beta == 0, and incx == incy == 1 (using magmablas code);
  3) y = alpha A^trans x + beta y otherwise, using CUBLAS.
 
void magmablas_sgemv_tesla (magma_trans_t trans, magma_int_t m, magma_int_t n, float alpha, const float *A, magma_int_t lda, const float *x, magma_int_t incx, float beta, float *y, magma_int_t incy)
 This routine computes:
  1) y = A x if trans == 'N' or 'n', alpha == 1, beta == 0, and incx == incy == 1 (using magmablas code);
  2) y = alpha A^T x if trans == 'T' or 't', beta == 0, and incx == incy == 1 (using magmablas code);
  3) y = alpha A^trans x + beta y otherwise, using CUBLAS.
 
void magmablas_sgemv2_tesla (magma_int_t m, magma_int_t n, const float *A, magma_int_t lda, const float *x, magma_int_t incx, float *y)
 

This routine computes y = A x on the GPU.
 
void magmablas_sgemvt1_tesla (magma_int_t m, magma_int_t n, float alpha, const float *A, magma_int_t lda, const float *x, float *y)
 

This routine computes y = alpha A^T x on the GPU.
 
void magmablas_sgemvt2_tesla (magma_int_t m, magma_int_t n, float alpha, const float *A, magma_int_t lda, const float *x, float *y)
 

This routine computes y = alpha A^T x on the GPU.
 
void magmablas_sgemvt_tesla (magma_int_t m, magma_int_t n, float alpha, const float *A, magma_int_t lda, const float *x, float *y)
 

This routine computes y = alpha A^T x on the GPU.
 
magma_int_t magmablas_ssymv (magma_uplo_t uplo, magma_int_t n, float alpha, const float *A, magma_int_t lda, const float *x, magma_int_t incx, float beta, float *y, magma_int_t incy)
 magmablas_ssymv performs the matrix-vector operation y := alpha*A*x + beta*y for an n by n symmetric matrix A.
 
magma_int_t magmablas_ssymv_mgpu_offset (magma_uplo_t uplo, magma_int_t n, float alpha, float **A, magma_int_t lda, float **x, magma_int_t incx, float beta, float **y, magma_int_t incy, float **work, magma_int_t lwork, magma_int_t num_gpus, magma_int_t nb, magma_int_t offset, magma_queue_t stream[][10])
 magmablas_ssymv_mgpu_offset performs the matrix-vector operation y := alpha*A*x + beta*y for an n by n symmetric matrix A.
 

Detailed Description

Function Documentation

void magmablas_sgemv ( magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
float  alpha,
const float *  A,
magma_int_t  lda,
const float *  x,
magma_int_t  incx,
float  beta,
float *  y,
magma_int_t  incy 
)

Purpose

This routine computes:
  1) y = A x if trans == 'N' or 'n', alpha == 1, beta == 0, and incx == incy == 1 (using magmablas code);
  2) y = alpha A^T x if trans == 'T' or 't', beta == 0, and incx == incy == 1 (using magmablas code);
  3) y = alpha A^trans x + beta y otherwise, using CUBLAS.

Parameters
[in] trans  magma_trans_t. On entry, TRANS specifies the operation to be performed as follows:
  • = MagmaNoTrans: y := alpha*A*x + beta*y
  • = MagmaTrans: y := alpha*A^T*x + beta*y
[in] m  INTEGER. On entry, M specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, N specifies the number of columns of the matrix A.
[in] alpha  REAL. On entry, ALPHA specifies the scalar alpha.
[in] A  REAL array of dimension (LDA, n) on the GPU.
[in] lda  INTEGER. LDA specifies the leading dimension of A.
[in] x  REAL array of dimension n if trans == MagmaNoTrans, or m if trans == MagmaTrans.
[in] incx  INTEGER. Specifies the increment for the elements of X. INCX must not be zero.
[in] beta  REAL. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out] y  REAL array of dimension m if trans == MagmaNoTrans, or n if trans == MagmaTrans. On exit, Y is overwritten by the result.
[in] incy  INTEGER. Specifies the increment for the elements of Y. INCY must not be zero.
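The operation described above can be checked against a small host-side reference. The following is a minimal sketch, not MAGMA code: the name ref_sgemv, the use of plain int in place of magma_int_t, the integer trans_flag in place of magma_trans_t, and the restriction to positive increments are all assumptions made for illustration. It assumes column-major storage, so element (r, c) of A sits at A[r + c*lda].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for y := alpha*op(A)*x + beta*y.
   trans_flag == 0 selects op(A) = A, nonzero selects op(A) = A^T.
   Assumes column-major A and positive increments. */
void ref_sgemv(int trans_flag, int m, int n, float alpha,
               const float *A, int lda,
               const float *x, int incx,
               float beta, float *y, int incy)
{
    assert(lda >= (m > 1 ? m : 1));
    assert(incx > 0 && incy > 0);

    int leny = trans_flag ? n : m;   /* number of elements in y */
    int lenx = trans_flag ? m : n;   /* number of elements in x */

    for (int i = 0; i < leny; ++i) {
        float acc = 0.0f;
        for (int j = 0; j < lenx; ++j) {
            /* op(A)(i, j) is A(j, i) when transposed, A(i, j) otherwise */
            float a = trans_flag ? A[(size_t)j + (size_t)i * (size_t)lda]
                                 : A[(size_t)i + (size_t)j * (size_t)lda];
            acc += a * x[(size_t)j * (size_t)incx];
        }
        float *yi = &y[(size_t)i * (size_t)incy];
        /* When beta is zero, y is written without being read. */
        *yi = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * (*yi);
    }
}
```

Comparing a device result copied back to the host against such a reference is one way to validate a call; floating-point summation order on the GPU may differ, so a tolerance is normally needed.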
void magmablas_sgemv2_tesla ( magma_int_t  m,
magma_int_t  n,
const float *  A,
magma_int_t  lda,
const float *  x,
magma_int_t  incx,
float *  y 
)

Purpose

This routine computes y = A x on the GPU.

This version has INCX as an argument.

Parameters
[in] m  INTEGER. On entry, M specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, N specifies the number of columns of the matrix A.
[in] A  REAL array of dimension (LDA, N) on the GPU.
[in] lda  INTEGER. LDA specifies the leading dimension of A.
[in] x  REAL array of dimension N.
[in] incx  INTEGER. Specifies the increment for the elements of X. INCX must not be zero.
[out] y  REAL array of dimension M. On exit Y = A X.
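To illustrate how INCX selects the elements of X, here is a hedged host-side sketch of y = A x with a strided x. The name ref_sgemv2 is hypothetical, plain int stands in for magma_int_t, and only positive INCX is handled; whether the GPU routine accepts negative increments is not stated above. The loop sweeps A one column at a time, which matches column-major storage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for y = A*x, where logical element j of x
   is stored at x[j*incx]. Assumes column-major A and incx > 0. */
void ref_sgemv2(int m, int n, const float *A, int lda,
                const float *x, int incx, float *y)
{
    assert(lda >= (m > 1 ? m : 1));
    assert(incx > 0);

    for (int i = 0; i < m; ++i)
        y[i] = 0.0f;

    for (int j = 0; j < n; ++j) {
        float xj = x[(size_t)j * (size_t)incx];          /* strided read of x */
        const float *col = A + (size_t)j * (size_t)lda;  /* start of column j */
        for (int i = 0; i < m; ++i)
            y[i] += col[i] * xj;                         /* y += x_j * A(:, j) */
    }
}
```

With incx = 2, for example, the routine reads x[0], x[2], x[4], and so on, skipping the entries in between.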
void magmablas_sgemv_tesla ( magma_trans_t  trans,
magma_int_t  m,
magma_int_t  n,
float  alpha,
const float *  A,
magma_int_t  lda,
const float *  x,
magma_int_t  incx,
float  beta,
float *  y,
magma_int_t  incy 
)

This routine computes:
  1) y = A x if trans == 'N' or 'n', alpha == 1, beta == 0, and incx == incy == 1 (using magmablas code);
  2) y = alpha A^T x if trans == 'T' or 't', beta == 0, and incx == incy == 1 (using magmablas code);
  3) y = alpha A^trans x + beta y otherwise, using CUBLAS.

Parameters
[in] trans  magma_trans_t. On entry, TRANS specifies the operation to be performed as follows:
  • = MagmaNoTrans: y := alpha*A*x + beta*y
  • = MagmaTrans: y := alpha*A^T*x + beta*y
[in] m  INTEGER. On entry, M specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, N specifies the number of columns of the matrix A.
[in] alpha  REAL. On entry, ALPHA specifies the scalar alpha.
[in] A  REAL array of dimension (LDA, N) on the GPU.
[in] lda  INTEGER. LDA specifies the leading dimension of A.
[in] x  REAL array of dimension n if trans == MagmaNoTrans, or m if trans == MagmaTrans.
[in] incx  INTEGER. Specifies the increment for the elements of X. INCX must not be zero.
[in] beta  REAL. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out] y  REAL array of dimension m if trans == MagmaNoTrans, or n if trans == MagmaTrans. On exit, Y is overwritten by the result.
[in] incy  INTEGER. Specifies the increment for the elements of Y. INCY must not be zero.
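The three cases listed for this routine amount to a small decision rule. The sketch below restates that rule as a host-side predicate; it is an illustration of the documented conditions only, not the library's actual dispatch code. The enum gemv_path, the function choose_gemv_path, and the integer is_trans flag are hypothetical names introduced here.

```c
#include <assert.h>

/* Hypothetical labels for the three documented code paths. */
typedef enum {
    PATH_MAGMABLAS_N,   /* case 1: y = A x            */
    PATH_MAGMABLAS_T,   /* case 2: y = alpha A^T x    */
    PATH_CUBLAS         /* case 3: everything else    */
} gemv_path;

/* Returns which documented case applies to a given argument set.
   is_trans is 0 for no transpose, nonzero for transpose. */
gemv_path choose_gemv_path(int is_trans, float alpha, float beta,
                           int incx, int incy)
{
    assert(incx != 0 && incy != 0);   /* increments must not be zero */

    int unit_stride = (incx == 1 && incy == 1);

    if (!is_trans && alpha == 1.0f && beta == 0.0f && unit_stride)
        return PATH_MAGMABLAS_N;
    if (is_trans && beta == 0.0f && unit_stride)
        return PATH_MAGMABLAS_T;
    return PATH_CUBLAS;
}
```

Note that, as documented, the non-transposed fast path additionally requires alpha == 1, while the transposed fast path accepts any alpha.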
void magmablas_sgemvn_fermi ( magma_int_t  m,
magma_int_t  n,
float  alpha,
const float *  A,
magma_int_t  lda,
const float *  x,
float  beta,
float *  y 
)

This routine computes y = alpha A x + beta y on the GPU.

Parameters
[in] m  INTEGER. On entry, M specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, N specifies the number of columns of the matrix A.
[in] alpha  REAL. On entry, ALPHA specifies the scalar alpha.
[in] A  REAL array of dimension (LDA, n) on the GPU.
[in] lda  INTEGER. LDA specifies the leading dimension of A.
[in] x  REAL array of dimension n.
[in] beta  REAL. On entry, BETA specifies the scalar beta.
[in,out] y  REAL array of dimension m. On exit Y = alpha A X + beta Y.
void magmablas_sgemvt1_tesla ( magma_int_t  m,
magma_int_t  n,
float  alpha,
const float *  A,
magma_int_t  lda,
const float *  x,
float *  y 
)

Purpose

This routine computes y = alpha A^T x on the GPU.

Recommended for large M and N.

Parameters
[in] m  INTEGER. On entry, M specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, N specifies the number of columns of the matrix A.
[in] alpha  REAL. On entry, ALPHA specifies the scalar alpha.
[in] A  REAL array of dimension (LDA, N) on the GPU.
[in] lda  INTEGER. LDA specifies the leading dimension of A.
[in] x  REAL array of dimension m.
[out] y  REAL array of dimension n. On exit Y = alpha A^T X.
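For the transposed product, each output element is the dot product of one column of A with x, so the columns can be read contiguously. The following host-side sketch shows that formulation under the assumption of column-major storage; ref_sgemvt is a hypothetical name, plain int stands in for magma_int_t, and this is not the GPU kernel itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for y = alpha * A^T * x with unit strides.
   A is m x n, column-major, with leading dimension lda >= m. */
void ref_sgemvt(int m, int n, float alpha,
                const float *A, int lda,
                const float *x, float *y)
{
    assert(lda >= (m > 1 ? m : 1));

    for (int j = 0; j < n; ++j) {
        const float *col = A + (size_t)j * (size_t)lda;  /* column j of A */
        float acc = 0.0f;
        for (int i = 0; i < m; ++i)
            acc += col[i] * x[i];                        /* dot(A(:, j), x) */
        y[j] = alpha * acc;
    }
}
```

When lda is larger than m, the rows m through lda-1 of each column are padding and are never read.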
void magmablas_sgemvt2_tesla ( magma_int_t  m,
magma_int_t  n,
float  alpha,
const float *  A,
magma_int_t  lda,
const float *  x,
float *  y 
)

Purpose

This routine computes y = alpha A^T x on the GPU.

Used in the least squares solver when N is small (e.g. N = BS, a block size on the order of 64 or 128).

Parameters
[in] m  INTEGER. On entry, M specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, N specifies the number of columns of the matrix A.
[in] alpha  REAL. On entry, ALPHA specifies the scalar alpha.
[in] A  REAL array of dimension (LDA, N) on the GPU.
[in] lda  INTEGER. LDA specifies the leading dimension of A.
[in] x  REAL array of dimension m.
[out] y  REAL array of dimension n. On exit Y = alpha A^T X.
void magmablas_sgemvt_fermi ( magma_int_t  m,
magma_int_t  n,
float  alpha,
const float *  A,
magma_int_t  lda,
const float *  x,
float  beta,
float *  y 
)

Purpose

This routine computes y = alpha * A^T * x + beta * y, on the GPU.

Parameters
[in] m  INTEGER. On entry, M specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, N specifies the number of columns of the matrix A.
[in] alpha  REAL. On entry, ALPHA specifies the scalar alpha.
[in] A  REAL array of dimension (LDA, n) on the GPU.
[in] lda  INTEGER. LDA specifies the leading dimension of A.
[in] x  REAL array of dimension m.
[in] beta  REAL. On entry, BETA specifies the scalar beta.
[in,out] y  REAL array of dimension n. On exit Y = alpha A^T X + beta Y.
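Written element by element, the update performed by this routine is as follows. The zero-based indices and the storage map are assumptions based on the column-major (LDA, n) layout described above, not text taken from the library.

```latex
y_j \;\leftarrow\; \alpha \sum_{i=0}^{m-1} A_{ij}\, x_i \;+\; \beta\, y_j,
\qquad j = 0, \dots, n-1,
\qquad A_{ij} \equiv \mathtt{A}[\, i + j \cdot \mathtt{lda} \,].
```

The sum runs over the m rows of A, which is why x has m elements and y has n.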
void magmablas_sgemvt_tesla ( magma_int_t  m,
magma_int_t  n,
float  alpha,
const float *  A,
magma_int_t  lda,
const float *  x,
float *  y 
)

Purpose

This routine computes y = alpha A^T x on the GPU.

Parameters
[in] m  INTEGER. On entry, M specifies the number of rows of the matrix A.
[in] n  INTEGER. On entry, N specifies the number of columns of the matrix A.
[in] alpha  REAL. On entry, ALPHA specifies the scalar alpha.
[in] A  REAL array of dimension (LDA, N) on the GPU.
[in] lda  INTEGER. LDA specifies the leading dimension of A.
[in] x  REAL array of dimension m.
[out] y  REAL array of dimension n. On exit Y = alpha A^T X.
magma_int_t magmablas_ssymv ( magma_uplo_t  uplo,
magma_int_t  n,
float  alpha,
const float *  A,
magma_int_t  lda,
const float *  x,
magma_int_t  incx,
float  beta,
float *  y,
magma_int_t  incy 
)

magmablas_ssymv performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n symmetric matrix.

Parameters
[in] uplo  magma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in] n  INTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in] alpha  REAL. On entry, ALPHA specifies the scalar alpha.
[in] A  REAL array of dimension (LDA, n). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix, and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix, and the strictly upper triangular part of A is not referenced.
[in] lda  INTEGER. On entry, LDA specifies the first dimension of A as declared in the calling (sub)program. LDA must be at least max( 1, n ). It is recommended that LDA be a multiple of 16; otherwise performance degrades because the memory accesses are not fully coalesced.
[in] x  REAL array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in] incx  INTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in] beta  REAL. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out] y  REAL array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in] incy  INTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
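The UPLO convention means only one triangle of A is ever read; the other triangle may hold arbitrary values. A hedged host-side sketch of that behaviour follows. The name ref_ssymv, the integer upper flag in place of magma_uplo_t, plain int in place of magma_int_t, and the restriction to positive increments are assumptions for illustration, and this is not the GPU implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for y := alpha*A*x + beta*y with A symmetric.
   upper != 0 reads only the upper triangle, upper == 0 only the lower.
   Assumes column-major A and positive increments. */
void ref_ssymv(int upper, int n, float alpha,
               const float *A, int lda,
               const float *x, int incx,
               float beta, float *y, int incy)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1));
    assert(incx > 0 && incy > 0);

    for (int i = 0; i < n; ++i) {
        float acc = 0.0f;
        for (int j = 0; j < n; ++j) {
            int lo = i < j ? i : j;
            int hi = i < j ? j : i;
            /* Map (i, j) into the stored triangle: row <= col for upper,
               row >= col for lower. */
            int r = upper ? lo : hi;
            int c = upper ? hi : lo;
            acc += A[(size_t)r + (size_t)c * (size_t)lda]
                 * x[(size_t)j * (size_t)incx];
        }
        float *yi = &y[(size_t)i * (size_t)incy];
        *yi = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * (*yi);
    }
}
```

Filling the unreferenced triangle with a sentinel value, as a test might do, is a quick way to confirm that a given UPLO setting never touches it.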
magma_int_t magmablas_ssymv_mgpu_offset ( magma_uplo_t  uplo,
magma_int_t  n,
float  alpha,
float **  A,
magma_int_t  lda,
float **  x,
magma_int_t  incx,
float  beta,
float **  y,
magma_int_t  incy,
float **  work,
magma_int_t  lwork,
magma_int_t  num_gpus,
magma_int_t  nb,
magma_int_t  offset,
magma_queue_t  stream[][10] 
)

magmablas_ssymv_mgpu_offset performs the matrix-vector operation:

y := alpha*A*x + beta*y,

where alpha and beta are scalars, x and y are n element vectors and A is an n by n symmetric matrix.

Parameters
[in] uplo  magma_uplo_t. On entry, UPLO specifies whether the upper or lower triangular part of the array A is to be referenced as follows:
  • = MagmaUpper: Only the upper triangular part of A is to be referenced.
  • = MagmaLower: Only the lower triangular part of A is to be referenced.
[in] n  INTEGER. On entry, N specifies the order of the matrix A. N must be at least zero.
[in] alpha  REAL. On entry, ALPHA specifies the scalar alpha.
[in] A  REAL array of dimension (LDA, n). Before entry with UPLO = MagmaUpper, the leading n by n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix, and the strictly lower triangular part of A is not referenced. Before entry with UPLO = MagmaLower, the leading n by n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix, and the strictly upper triangular part of A is not referenced.
[in] lda  INTEGER. On entry, LDA specifies the first dimension of A as declared in the calling (sub)program. LDA must be at least max( 1, n ). It is recommended that LDA be a multiple of 16; otherwise performance degrades because the memory accesses are not fully coalesced.
[in] x  REAL array of dimension at least ( 1 + ( n - 1 )*abs( INCX ) ). Before entry, the incremented array X must contain the n element vector x.
[in] incx  INTEGER. On entry, INCX specifies the increment for the elements of X. INCX must not be zero.
[in] beta  REAL. On entry, BETA specifies the scalar beta. When BETA is supplied as zero then Y need not be set on input.
[in,out] y  REAL array of dimension at least ( 1 + ( n - 1 )*abs( INCY ) ). Before entry, the incremented array Y must contain the n element vector y. On exit, Y is overwritten by the updated vector y.
[in] incy  INTEGER. On entry, INCY specifies the increment for the elements of Y. INCY must not be zero.
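The LDA recommendation above (a multiple of 16, and at least max( 1, n )) can be met by rounding the matrix order up before allocating. The helper below is a minimal sketch of that arithmetic; padded_lda is a hypothetical name, not a MAGMA function, and plain int stands in for magma_int_t.

```c
#include <assert.h>

/* Hypothetical helper: smallest multiple of 16 that is >= max(1, n). */
int padded_lda(int n)
{
    assert(n >= 0);
    int lda = ((n + 15) / 16) * 16;   /* round n up to a multiple of 16 */
    return lda > 0 ? lda : 16;        /* n == 0 still needs lda >= 1   */
}
```

An n by n matrix stored with this leading dimension occupies padded_lda(n) * n floats, with rows n through lda-1 of each column left as unused padding.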