Publications
“Randomized Numerical Linear Algebra: A Perspective on the Field with an Eye to Software,”
University of California, Berkeley EECS Technical Report, no. UCB/EECS-2022-258: University of California, Berkeley, November 2022.
DOI: 10.48550/arXiv.2302.11474 (1.05 MB) (1.54 MB)
“QUARK Users' Guide: QUeueing And Runtime for Kernels,”
University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-02, 2011.
(247.12 KB)
“QR Factorization for the CELL Processor,”
University of Tennessee Computer Science Technical Report, UT-CS-08-616 (also LAPACK Working Note 201), May 2008.
(194.95 KB)
“PULSAR Users’ Guide, Parallel Ultra-Light Systolic Array Runtime,”
University of Tennessee EECS Technical Report, no. UT-EECS-14-733: University of Tennessee, November 2014.
(561.56 KB)
“Providing GPU Capability to LU and QR within the ScaLAPACK Framework,”
University of Tennessee Computer Science Technical Report (also LAWN 272), no. UT-CS-12-699, September 2012.
(7.48 MB)
“Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),”
LAPACK Working Notes, no. 297, ICL-UT-20-07: University of Tennessee.
(1.41 MB)
“A Proposal for User-Level Failure Mitigation in the MPI-3 Standard,”
University of Tennessee Electrical Engineering and Computer Science Technical Report, no. UT-CS-12-693: University of Tennessee, February 2012.
(159.46 KB)
“The Problem with the Linpack Benchmark Matrix Generator,”
University of Tennessee Computer Science Technical Report, UT-CS-08-621 (also LAPACK Working Note 206), June 2008.
(136.41 KB)
“Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems: Formal Proof,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-15-01, April 2015.
(570.97 KB)
“A Portable Programming Interface for Performance Evaluation on Modern Processors,”
University of Tennessee Computer Science Technical Report, UT-CS-00-444, July 2000.
(655.17 KB)
“POMPEI: Programming with OpenMP4 for Exascale Investigations,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-09: University of Tennessee, December 2017.
(1.1 MB)
“The PlayStation 3 for High Performance Scientific Computing,”
University of Tennessee Computer Science Technical Report, no. UT-CS-08-608, January 2008.
(2.45 MB)
“PLASMA 17.1 Functionality Report,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-10: University of Tennessee, June 2017.
(1.8 MB)
“PLASMA 17 Performance Report,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-11: University of Tennessee, June 2017.
(7.57 MB)
“Performance Tuning SLATE,”
SLATE Working Notes, no. 14, ICL-UT-20-01: Innovative Computing Laboratory, University of Tennessee, January 2020.
(1.29 MB)
“Performance of Various Computers Using Standard Linear Equations Software,”
University of Tennessee Computer Science Technical Report, no. CS-89-85, February 2013.
(539.24 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, no. CS-89-85, January 2000.
(354.1 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85, 2011.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, UT-CS-89-85, 2010.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85: University of Tennessee, June 2014.
(514.64 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, CS-89-85, January 2008.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85, January 2001.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, UT-CS-04-526, vol. –89-95, January 2006.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, CS-89-85, January 2004.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Dept. Technical Report, CS-89-85, 2007.
(6.42 MB)
“Performance evaluation of LU factorization through hardware counter measurements,”
University of Tennessee Computer Science Technical Report, no. UT-CS-12-700, October 2012.
(794.82 KB)
“Performance, Design, and Autotuning of Batched GEMM for GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-739: University of Tennessee, February 2016.
(1.27 MB)
“A parallel tiled solver for dense symmetric indefinite systems on multicore architectures,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-11-07, October 2011.
(544.2 KB)
“Parallel Tiled QR Factorization for Multicore Architectures,”
University of Tennessee Computer Science Dept. Technical Report, UT-CS-07-598 (also LAPACK Working Note 190), 2007.
(277.92 KB)
“Parallel Reduction to Condensed Forms for Symmetric Eigenvalue Problems using Aggregated Fine-Grained and Memory-Aware Kernels,”
University of Tennessee Computer Science Technical Report, UT-CS-11-677 (also LAWN 254), August 2011.
(636.01 KB)
“Parallel Norms Performance Report,”
SLATE Working Notes, no. 06, ICL-UT-18-06: Innovative Computing Laboratory, University of Tennessee, June 2018.
(1.13 MB)
“Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited,”
University of Tennessee Computer Science Technical Report, UT-CS-08-624 (also LAPACK Working Note 208), August 2008.
(420.31 KB)
“Parallel BLAS Performance Report,”
SLATE Working Notes, no. 05, ICL-UT-18-01: University of Tennessee, April 2018.
(4.39 MB)
“PAQR: Pivoting Avoiding QR factorization,”
ICL Technical Report, no. ICL-UT-22-06, June 2022.
(364.85 KB)
“Optimal Checkpointing Period: Time vs. Energy,”
University of Tennessee Computer Science Technical Report (also LAWN 281), no. UT-EECS-13-718: University of Tennessee, October 2013.
(440.13 KB)
“Numerically Stable Real-Number Codes Based on Random Matrices,”
University of Tennessee Computer Science Department Technical Report, vol. –04-526, October 2004.
(91.66 KB)
“Numerical Libraries and The Grid: The Grads Experiments with ScaLAPACK,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-460, January 2001.
(91.78 KB)
“NetBuild: Automated Installation and Use of Network-Accessible Software Libraries,”
ICL Technical Report, no. ICL-UT-04-02, January 2004.
(80.52 KB)
“NetBuild,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-461, January 2001.
(17.71 KB)
“Multi-criteria checkpointing strategies: optimizing response-time versus resource utilization,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-13-01, February 2013.
(497.64 KB)
“MPI Collective Algorithm Selection and Quadtree Encoding,”
ICL Technical Report, no. ICL-UT-06-11, 2006.
(308.39 KB)
“A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms,”
ICL Technical Report, no. ICL-UT-21-04: University of Tennessee, August 2021.
(493.17 KB)
“Modeling of L2 Cache Behavior for Thread-Parallel Scientific Programs on Chip Multi-Processors,”
University of Tennessee Computer Science Technical Report, no. UT-CS-06-583, January 2006.
(652.93 KB)
“Mixed-Precision Solution of Linear Systems Using Accelerator-Based Computing,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-05: University of Tennessee, May 2020.
(1.03 MB)
“Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices,”
ICL Technical Report, no. ICL-UT-21-05, August 2021.
(3.93 MB)
“Mixed precision and approximate 3D FFTs: Speed for accuracy trade-off with GPU-aware MPI and run-time data compression,”
ICL Technical Report, no. ICL-UT-22-04, May 2022.
(706.14 KB)
“MAGMA-sparse Interface Design Whitepaper,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-05, September 2017.
(1.28 MB)
“MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-16-02: University of Tennessee, August 2016.
(929.79 KB)
“Linear Systems Performance Report,”
SLATE Working Notes, no. 08, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.
(1.64 MB)
“Limitations of the PlayStation 3 for High Performance Cluster Computing,”
University of Tennessee Computer Science Technical Report, UT-CS-07-597 (also LAPACK Working Note 185), 2007.
(171.01 KB)