Publications
Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,”
University of Tennessee Computer Science Technical Report (also Lawn 257), no. UT-CS-11-684, October 2011.
(405.71 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures,”
University of Tennessee Computer Science Technical Report, UT-CS-11-673, (also Lawn 247), May 2011.
(424.93 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Hydrodynamic Computation with Hybrid Programming on CPU-GPU Clusters,”
University of Tennessee Computer Science Technical Report, no. ut-cs-13-714, July 2013.
(866.68 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Implementation of the C++ API for Batch BLAS,”
SLATE Working Notes, no. 07, ICL-UT-18-04: Innovative Computing Laboratory, University of Tennessee, June 2018.
(1.07 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC,”
Lawn 277, no. UT-CS-13-709, May 2013.
(298.63 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
An Improved MAGMA GEMM for Fermi GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-CS-10-655 (also LAPACK working note 227), July 2010.
(486.71 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware,”
University of Tennessee Computer Science Technical Report (also LAWN 283), no. ut-eecs-13-720: University of Tennessee, October 2013.
(1.23 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Initial Integration and Evaluation of SLATE and STRUMPACK,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-11: University of Tennessee, December 2018.
(249.78 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Initial Integration and Evaluation of SLATE Parallel BLAS in LATTE,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-07: Innovative Computing Laboratory, University of Tennessee, June 2018.
(366.6 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Integrating Deep Learning in Domain Sciences at Exascale,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-10: University of Tennessee, August 2020.
(1.09 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Interim Report on Benchmarking FFT Libraries on High Performance Systems,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-21-03: University of Tennessee, July 2021.
(2.68 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Introduction to the HPCChallenge Benchmark Suite,”
ICL Technical Report, no. ICL-UT-05-01, January 2005.
(124.86 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
LAWN 294: Aasen's Symmetric Indenite Linear Solvers in LAPACK,”
LAPACK Working Note, no. LAWN 294, ICL-UT-17-13: University of Tennessee, December 2017.
(854.1 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Least Squares Performance Report,”
SLATE Working Notes, no. 09, ICL-UT-18-10: Innovative Computing Laboratory, University of Tennessee, December 2018.
(1.76 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Linear Systems Performance Report,”
SLATE Working Notes, no. 08, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.
(1.64 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-16-02: University of Tennessee, August 2016.
(929.79 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Mixed precision and approximate 3D FFTs: Speed for accuracy trade-off with GPU-aware MPI and run-time data compression,”
ICL Technical Report, no. ICL-UT-22-04, May 2022.
(706.14 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices,”
ICL Technical Report, no. ICL-UT-21-05, August 2021.
(3.93 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Mixed-Precision Solution of Linear Systems Using Accelerator-Based Computing,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-05: University of Tennessee, May 2020.
(1.03 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms,”
ICL Technical Report, no. ICL-UT-21-04: University of Tennessee, August 2021.
(493.17 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
PAQR: Pivoting Avoiding QR factorization,”
ICL Technical Report, no. ICL-UT-22-06, June 2022.
(364.85 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Parallel BLAS Performance Report,”
SLATE Working Notes, no. 05, ICL-UT-18-01: University of Tennessee, April 2018.
(4.39 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited,”
University of Tennessee Computer Science Technical Report, UT-CS-08-624 (also LAPACK Working Note 208), August 2008.
(420.31 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Parallel Norms Performance Report,”
SLATE Working Notes, no. 06, ICL-UT-18-06: Innovative Computing Laboratory, University of Tennessee, June 2018.
(1.13 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Parallel Tiled QR Factorization for Multicore Architectures,”
University of Tennessee Computer Science Dept. Technical Report, UT-CS-07-598 (also LAPACK Working Note 190), 00 2007.
(277.92 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
A parallel tiled solver for dense symmetric indefinite systems on multicore architectures,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-11-07, October 2011.
(544.2 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)