Problem with magmablas_dgemvt
Posted: Wed Jul 07, 2010 12:35 pm
I gather that magmablas_dgemvt is supposed to compute y = A^t x, where A is n x m, x is n x 1, and y is m x 1. However, the function appears to compute only the first n elements of y. Thus, results are correct if m <= n, but incorrect otherwise. Below is a program, and its output, to illustrate the problem.
The main reason for bringing this up is because page 55 of the manual states that the MAGMA BLAS library includes "only certain kernels that are crucial for the performance of MAGMA routines." Hence, I am wondering if there are problems with other MAGMA routines that could be attributable to their calling dgemvt, such as the problem I am having with magma_dgeqrs/dgeqrf_gpu2.
On the other hand, the documentation for dgemvt is a bit unclear to me, so there is a chance that I am not using the function as intended.
Program output:
Matrix A:
0.000000 2.000000 4.000000 6.000000 8.000000
1.000000 3.000000 5.000000 7.000000 9.000000
Vector x:
1.000000
1.000000
MAGMA Result
=============
A^t x (magmablas_dgemvt):
1.000000
5.000000
0.000000
0.000000
0.000000
CUBLAS Result
==============
A^t x (cublasDgemv):
1.000000
5.000000
9.000000
13.000000
17.000000
The main reason for bringing this up is because page 55 of the manual states that the MAGMA BLAS library includes "only certain kernels that are crucial for the performance of MAGMA routines." Hence, I am wondering if there are problems with other MAGMA routines that could be attributable to their calling dgemvt, such as the problem I am having with magma_dgeqrs/dgeqrf_gpu2.
On the other hand, the documentation for dgemvt is a bit unclear to me, so there is a chance that I am not using the function as intended.
Code: Select all
#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <cublas.h>
#include <magma.h>
/***************************************************************************
* Program to compute y = A^t x.
* - magmablas_dgemvt fails
* - cublasDgemv passes
***************************************************************************/
void mprint(int m, int n, double *x, int ldx) {
int i, j;
for(i = 0; i < m; i++) {
for(j = 0; j < n; j++) {
printf("%f ", x[i + j * ldx]);
}
printf("\n");
}
}
#define n 2 // Number of rows in A
#define m 5 // Number of columns in A
int main(int argc, char** argv)
{
int lda = n, i;
double A[n * m], x[n], y[m], zeros[m], *d_A, *d_x, *d_y;
for(i = 0; i < (n * m); i++) A[i] = i;
for(i = 0; i < n; i++) x[i] = 1.0;
for(i = 0; i < m; i++) zeros[i] = 0.0;
printf("Matrix A:\n");
mprint(n, m, A, lda);
printf("\nVector x:\n");
mprint(n, 1.0, x, n);
cublasInit();
cublasAlloc(n * m, sizeof(double), (void **)&d_A);
cublasAlloc(n, sizeof(double), (void **)&d_x);
cublasAlloc(m, sizeof(double), (void **)&d_y);
if(d_A == NULL || d_x == NULL || d_y == NULL) {
fprintf(stderr, "Error: CUDA memory allocation failed\n");
exit(1);
}
/***************************************************************************
* MAGMA Matrix-Vector Multiplication
* - Incorrectly returns only first m elements of the n-dimensional array z
***************************************************************************/
printf("\nMAGMA Result\n=============\n");
cublasSetVector(n * m, sizeof(double), A, 1, d_A, 1);
cublasSetVector(n, sizeof(double), x, 1, d_x, 1);
cublasSetVector(m, sizeof(double), zeros, 1, d_y, 1);
magmablas_dgemvt(n, m, 1.0, d_A, lda, d_x, d_y);
cublasGetVector(m, sizeof(double), d_y, 1, y, 1);
printf("\nA^t x (magmablas_dgemvt):\n");
mprint(m, 1.0, y, m);
/***************************************************************************
* CUBLAS Matrix-Vector Multiplication
* - Returns correct result
***************************************************************************/
printf("\nCUBLAS Result\n==============\n");
cublasSetVector(m, sizeof(double), zeros, 1, d_y, 1);
cublasDgemv('T', n, m, 1.0, d_A, lda, d_x, 1, 0.0, d_y, 1);
cublasGetVector(m, sizeof(double), d_y, 1, y, 1);
printf("\nA^t x (cublasDgemv):\n");
mprint(m, 1.0, y, m);
cublasFree(d_A);
cublasFree(d_x);
cublasFree(d_y);
cublasShutdown();
return(EXIT_SUCCESS);
}Matrix A:
0.000000 2.000000 4.000000 6.000000 8.000000
1.000000 3.000000 5.000000 7.000000 9.000000
Vector x:
1.000000
1.000000
MAGMA Result
=============
A^t x (magmablas_dgemvt):
1.000000
5.000000
0.000000
0.000000
0.000000
CUBLAS Result
==============
A^t x (cublasDgemv):
1.000000
5.000000
9.000000
13.000000
17.000000