Publications
“Towards a Parallel Tile LDL Factorization for Multicore Architectures,”
ICL Technical Report, no. ICL-UT-11-03, Seattle, WA, April 2011.
“Accelerating GPU Kernels for Dense Linear Algebra,”
Proc. of VECPAR'10, Berkeley, CA, June 2010.
“Accelerating the Reduction to Upper Hessenberg, Tridiagonal, and Bidiagonal Forms through Hybrid GPU-Based Computing,”
Parallel Computing, vol. 36, no. 12, pp. 645-654, 2010.
“Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures,”
Submitted to Concurrency and Computation: Practice and Experience, November 2010.
“Autotuning Dense Linear Algebra Libraries on GPUs,”
Sixth International Workshop on Parallel Matrix Algorithms and Applications (PMAA 2010), Basel, Switzerland, June 2010.
“Can Hardware Performance Counters Produce Expected, Deterministic Results?,”
3rd Workshop on Functionality of Hardware Performance Monitoring, Atlanta, GA, December 2010.
“Collecting Performance Data with PAPI-C,”
Tools for High Performance Computing 2009, 3rd Parallel Tools Workshop, Dresden, Germany, Springer Berlin / Heidelberg, pp. 157-173, May 2010.
DOI: 10.1007/978-3-642-11261-4_11
“Distributed Dense Numerical Linear Algebra Algorithms on Massively Parallel Architectures: DPLASMA,”
University of Tennessee Computer Science Technical Report, UT-CS-10-660, September 2010.
“Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-10-02, 2010.
“Divide & Conquer on Hybrid GPU-Accelerated Multicore Systems,”
SIAM Journal on Scientific Computing (submitted), August 2010.
“Faster, Cheaper, Better - A Hybridization Methodology to Develop Linear Algebra Software for GPUs,”
LAPACK Working Note, no. 230, 2010.
“Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators,”
IEEE Transactions on Parallel and Distributed Systems (submitted), March 2010.
“An Improved MAGMA GEMM for Fermi GPUs,”
International Journal of High Performance Computing Applications, vol. 24, no. 4, pp. 511-515, 2010.
“An Improved MAGMA GEMM for Fermi GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-CS-10-655 (also LAPACK Working Note 227), July 2010.
“An Introduction to the MAGMA project - Acceleration of Dense Linear Algebra,”
NVIDIA Webinar, June 2010.
“Mixed-Tool Performance Analysis on Hybrid Multicore Architectures,”
First International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2010), San Diego, CA, September 2010.
“OpenCL Evaluation for Numerical Linear Algebra Library Development,”
Symposium on Application Accelerators in High-Performance Computing (SAAHPC '10), Knoxville, TN, July 2010.
“QR Factorization for the CELL Processor,”
Scientific Programming, vol. 17, no. 1-2, pp. 31-42, 2010.
“QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators,”
Proceedings of IPDPS 2011, no. ICL-UT-10-04, Anchorage, AK, October 2010.