LAPACK/ScaLAPACK Development

by **Saed Hussain** » Fri May 18, 2012 9:13 am

Hi, I am a beginner in LAPACK (using CLAPACK) and I am currently working on an algorithm (Levenberg-Marquardt) in C that requires me to do the following matrix calculation quite frequently:

((J^T)J + uI)^(-1), where u is a constant, I is the identity matrix, and J is (m x n).

After going through this forum and other websites, I understand that matrices are stored differently in C (row-major) and Fortran (column-major).

Since my main algorithm is in C, I am stuck with the following dilemma:

1. Store the data in the matrix in row major (default in C), convert it into column major using a custom algorithm in C, pass it to the CLAPACK routines, and convert the result back into row major.

2. Store the data in the matrix in row major (default in C), transpose it using some CLAPACK routine (what’s the routine to just transpose the matrix?), pass it to the required CLAPACK routines for the main operation, and then transpose the result (using CLAPACK) back into row major.

3. Store the data in the matrix as column major and pass it directly to the required CLAPACK routine for the main operation.

Methods (1) and (2) have some overhead involved in transposing the matrices before the main operation.
Method (3), however, has no transposition overhead, but I am slightly concerned about overhead from the way the data is written to and read from memory. Normally, accessing memory in sequence is much quicker than having to jump over certain addresses.
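As a rough illustration of the memory-access concern in method (3), here is a minimal sketch in plain C (no CLAPACK calls) that forms A = (J^T)J + uI from a J stored column-major. The macro `CM` and the function `normal_matrix` are hypothetical names made up for this example. It assumes J is m x n with leading dimension m. Each column of J is contiguous in this layout, so the inner loop walks memory sequentially.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major offset of element (i, j) in a matrix with leading dimension ld.
   Hypothetical helper, not part of CLAPACK. */
#define CM(i, j, ld) ((size_t)(j) * (size_t)(ld) + (size_t)(i))

/* Form A = J^T J + u I.
   J: m x n, column-major, leading dimension m (assumption of this sketch).
   A: n x n, column-major, leading dimension n. */
void normal_matrix(const double *J, int m, int n, double u, double *A)
{
    assert(m > 0 && n > 0);
    for (int c = 0; c < n; ++c) {
        for (int r = 0; r < n; ++r) {
            double s = 0.0;
            /* Columns r and c of J are each contiguous, so this loop
               reads memory in sequence. */
            for (int k = 0; k < m; ++k)
                s += J[CM(k, r, m)] * J[CM(k, c, m)];
            /* Add u on the diagonal only. */
            A[CM(r, c, n)] = s + (r == c ? u : 0.0);
        }
    }
}
```

Two observations that follow from the layout: the result A is symmetric, so it reads the same in row major and column major and needs no conversion back; and a row-major m x n array occupies exactly the same memory as the column-major n x m transpose, so a routine that accepts a transpose flag can often take the row-major buffer unchanged.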

Which method would be the best in terms of speed?

I am developing the algorithm on a MacBook Pro (i7, 2.3 GHz) using the Accelerate framework provided by Apple. Matrix J would have the following dimensions: (n) x (m), where n (rows) would be between 2 and 4 and m (columns) would be more than 150.

I would really appreciate it if anyone could advise me on this.

Saed

by **admin** » Fri Jun 08, 2012 11:48 am

You also have option 4: use LAPACKE, the new standard C interface to LAPACK, which would do the transpose for you.

Your matrix sizes are very small, so yes, the overhead may not be negligible.
Option 3 should be the fastest, but the best approach is to do a quick timing of those methods to be sure. It will certainly depend on the size of your matrix.

For large matrices, you shouldn't see the overhead.
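A quick timing of the conversion step in option 1 could look something like the sketch below. The function names `row_to_col_major` and `time_conversion` are hypothetical, and `clock()` measures CPU time at coarse resolution, so the copy has to be repeated many times for matrices this small. An optimizing compiler may also shorten the repeated loop, so treat the numbers as a rough guide only.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Copy a row-major m x n matrix into column-major storage
   (the conversion step of option 1). Hypothetical helper. */
void row_to_col_major(const double *rm, int m, int n, double *cm)
{
    assert(m > 0 && n > 0);
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j)
            cm[(size_t)j * m + i] = rm[(size_t)i * n + j];
}

/* Repeat the copy `reps` times and return the elapsed CPU time in seconds. */
double time_conversion(const double *rm, int m, int n, double *cm, int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; ++r)
        row_to_col_major(rm, m, n, cm);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Timing the CLAPACK call itself the same way, and comparing the two figures for the actual matrix sizes, shows whether the conversion is a meaningful share of the total cost.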

Passing C matrices to CLAPACK routines (Performance Concern)
