Publications
A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures,”
Parallel Computing (to appear), 00 2010.
(612.23 KB)
“
Accelerating Scientific Computations with Mixed Precision Algorithms,”
Computer Physics Communications, vol. 180, issue 12, pp. 2526-2533, December 2009.
(402.69 KB)
“
A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures,”
Parallel Computing, vol. 35, pp. 38-53, 00 2009.
(274.74 KB)
“
Parallel Dense Linear Algebra Software in the Multicore Era,”
in Cyberinfrastructure Technologies and Applications: Nova Science Publishers, Inc., pp. 9-24, 00 2009.
“Exploiting Mixed Precision Floating Point Hardware in Scientific Computations,”
in High Performance Computing and Grids in Action, Amsterdam, IOS Press, January 2008.
(92.95 KB)
“
Parallel Tiled QR Factorization for Multicore Architectures,”
Concurrency and Computation: Practice and Experience, vol. 20, pp. 1573-1590, January 2008.
(277.92 KB)
“
The PlayStation 3 for High Performance Scientific Computing,”
University of Tennessee Computer Science Technical Report, no. UT-CS-08-608, January 2008.
(2.45 MB)
“
The PlayStation 3 for High Performance Scientific Computing,”
Computing in Science and Engineering, pp. 80-83, January 2008.
(2.45 MB)
“
Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization,”
IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 9, pp. 1-11, January 2008.
(751.57 KB)
“
Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy,”
ACM Transactions on Mathematical Software, vol. 34, no. 4, pp. 17-22, 00 2008.
(364.48 KB)
“
A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures,”
University of Tennessee Computer Science Technical Report, no. UT-CS-07-600 (also LAPACK Working Note 191), January 2007.
(274.74 KB)
“
Exploiting Mixed Precision Floating Point Hardware in Scientific Computations,”
In High Performance Computing and Grids in Action (to appear), Amsterdam, IOS Press, 00 2007.
(122.01 KB)
“
Limitations of the Playstation 3 for High Performance Cluster Computing,”
University of Tennessee Computer Science Technical Report, UT-CS-07-597 (Also LAPACK Working Note 185), 00 2007.
(171.01 KB)
“
Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems,”
International Journal of High Performance Computer Applications (to appear), August 2007.
(157.4 KB)
“
Multithreading for synchronization tolerance in matrix factorization,”
Journal of Physics: Conference Series, SciDAC 2007, vol. 78, no. 2007, January 2007.
(577.73 KB)
“
Parallel Tiled QR Factorization for Multicore Architectures,”
University of Tennessee Computer Science Dept. Technical Report, UT-CS-07-598 (also LAPACK Working Note 190), 00 2007.
(277.92 KB)
“
SCOP3: A Rough Guide to Scientific Computing On the PlayStation 3,”
University of Tennessee Computer Science Dept. Technical Report, UT-CS-07-595, 00 2007.
(1.74 MB)
“
Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization,”
UT Computer Science Technical Report (Also LAPACK Working Note 184), no. UT-CS-07-596, January 2007.
(751.57 KB)
“
Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining 64 bit Accuracy,”
University of Tennessee Computer Science Tech Report, no. UT-CS-06-574, LAPACK Working Note #175, April 2006.
(221.39 KB)
“
The Impact of Multicore on Math Software,”
PARA 2006, Umea, Sweden, June 2006.
(223.53 KB)
“
Prospectus for the Next LAPACK and ScaLAPACK Libraries,”
PARA 2006, Umea, Sweden, June 2006.
(460.11 KB)
“
Performance Optimization and Modeling of Blocked Sparse Kernels,”
ICL Technical Report, no. ICL-UT-04-05, 00 2004.
(229.58 KB)
“