Publications
“PAQR: Pivoting Avoiding QR factorization,”
ICL Technical Report, no. ICL-UT-22-06, June 2022.
(364.85 KB)
“Parallel BLAS Performance Report,”
SLATE Working Notes, no. 05, ICL-UT-18-01: University of Tennessee, April 2018.
(4.39 MB)
“Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited,”
University of Tennessee Computer Science Technical Report, UT-CS-08-624 (also LAPACK Working Note 208), August 2008.
(420.31 KB)
“Parallel Norms Performance Report,”
SLATE Working Notes, no. 06, ICL-UT-18-06: Innovative Computing Laboratory, University of Tennessee, June 2018.
(1.13 MB)
“Parallel Reduction to Condensed Forms for Symmetric Eigenvalue Problems using Aggregated Fine-Grained and Memory-Aware Kernels,”
University of Tennessee Computer Science Technical Report, UT-CS-11-677 (also LAWN 254), August 2011.
(636.01 KB)
“Parallel Tiled QR Factorization for Multicore Architectures,”
University of Tennessee Computer Science Dept. Technical Report, UT-CS-07-598 (also LAPACK Working Note 190), 2007.
(277.92 KB)
“A parallel tiled solver for dense symmetric indefinite systems on multicore architectures,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-11-07, October 2011.
(544.2 KB)
“Performance, Design, and Autotuning of Batched GEMM for GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-739: University of Tennessee, February 2016.
(1.27 MB)
“Performance evaluation of LU factorization through hardware counter measurements,”
University of Tennessee Computer Science Technical Report, no. UT-CS-12-700, October 2012.
(794.82 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85, 2011.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software,”
University of Tennessee Computer Science Technical Report, no. CS-89-85, February 2013.
(539.24 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, UT-CS-89-85, 2010.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, CS-89-85, January 2008.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85, January 2001.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, UT-CS-04-526, vol. –89-95, January 2006.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85: University of Tennessee, June 2014.
(514.64 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, CS-89-85, January 2004.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Dept. Technical Report, CS-89-85, 2007.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, no. CS-89-85, January 2000.
(354.1 KB)
“Performance Tuning SLATE,”
SLATE Working Notes, no. 14, ICL-UT-20-01: Innovative Computing Laboratory, University of Tennessee, January 2020.
(1.29 MB)
“PLASMA 17 Performance Report,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-11: University of Tennessee, June 2017.
(7.57 MB)
“PLASMA 17.1 Functionality Report,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-10: University of Tennessee, June 2017.
(1.8 MB)
“The PlayStation 3 for High Performance Scientific Computing,”
University of Tennessee Computer Science Technical Report, no. UT-CS-08-608, January 2008.
(2.45 MB)
“POMPEI: Programming with OpenMP4 for Exascale Investigations,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-09: University of Tennessee, December 2017.
(1.1 MB)
“A Portable Programming Interface for Performance Evaluation on Modern Processors,”
University of Tennessee Computer Science Technical Report, UT-CS-00-444, July 2000.
(655.17 KB)
“Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems: Formal Proof,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-15-01, April 2015.
(570.97 KB)
“The Problem with the Linpack Benchmark Matrix Generator,”
University of Tennessee Computer Science Technical Report, UT-CS-08-621 (also LAPACK Working Note 206), June 2008.
(136.41 KB)
“A Proposal for User-Level Failure Mitigation in the MPI-3 Standard,”
University of Tennessee Electrical Engineering and Computer Science Technical Report, no. UT-CS-12-693: University of Tennessee, February 2012.
(159.46 KB)
“Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),”
LAPACK Working Notes, no. 297, ICL-UT-20-07: University of Tennessee.
(1.41 MB)
“Providing GPU Capability to LU and QR within the ScaLAPACK Framework,”
University of Tennessee Computer Science Technical Report (also LAWN 272), no. UT-CS-12-699, September 2012.
(7.48 MB)
“PULSAR Users’ Guide, Parallel Ultra-Light Systolic Array Runtime,”
University of Tennessee EECS Technical Report, no. UT-EECS-14-733: University of Tennessee, November 2014.
(561.56 KB)
“QR Factorization for the CELL Processor,”
University of Tennessee Computer Science Technical Report, UT-CS-08-616 (also LAPACK Working Note 201), May 2008.
(194.95 KB)
“QUARK Users' Guide: QUeueing And Runtime for Kernels,”
University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-02, 2011.
(247.12 KB)
“Randomized Numerical Linear Algebra: A Perspective on the Field with an Eye to Software,”
University of California, Berkeley EECS Technical Report, no. UCB/EECS-2022-258: University of California, Berkeley, November 2022.
(1.05 MB)
(1.54 MB)
“Recovery Patterns for Iterative Methods in a Parallel Unstable Environment,”
University of Tennessee Computer Science Department Technical Report, UT-CS-04-538, 2005.
(241.36 KB)
“Recovery Patterns for Iterative Methods in a Parallel Unstable Environment,”
ICL Technical Report, no. ICL-UT-04-04, January 2004.
(241.36 KB)
“Rectangular Full Packed Format for Cholesky's Algorithm: Factorization, Solution and Inversion,”
University of Tennessee Computer Science Technical Report, UT-CS-08-614 (also LAPACK Working Note 199), April 2008.
(896.03 KB)
“Reducing the Amount of Pivoting in Symmetric Indefinite Systems,”
University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-06, Knoxville, TN, Submitted to PPAM 2011, May 2011.
(145.76 KB)
“Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modelling,”
University of Tennessee Computer Science Technical Report, no. UT-CS-10-661, October 2010.
(287.87 KB)
“Remote Software Toolkit Installer,”
ICL Technical Report, no. ICL-UT-05-04, June 2005.
(490.6 KB)
“Report on the Fujitsu Fugaku System,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-06: University of Tennessee, June 2020.
(3.3 MB)
“Report on the Oak Ridge National Laboratory's Frontier System,”
ICL Technical Report, no. ICL-UT-22-05, May 2022.
(16.87 MB)
“Report on the Sunway TaihuLight System,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-742: University of Tennessee, June 2016.
“Report on the TianHe-2A System,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-04: University of Tennessee, September 2017.
(7.15 MB)
“Request Sequencing: Enabling Workflow for Efficient Parallel Problem Solving in GridSolve,”
ICL Technical Report, no. ICL-UT-08-01, April 2008.
(1.64 MB)
“Revisiting the Double Checkpointing Algorithm,”
University of Tennessee Computer Science Technical Report (LAWN 274), no. UT-CS-13-705, January 2013.
(682.22 KB)
“RIBAPI - Repository in a Box Application Programmer's Interface,”
University of Tennessee Computer Science Technical Report, no. UT-CS-00-438, 2001.
(57.5 KB)
“Roadmap for the Development of a Linear Algebra Library for Exascale Computing: SLATE: Software for Linear Algebra Targeting Exascale,”
SLATE Working Notes, no. 01, ICL-UT-17-02: Innovative Computing Laboratory, University of Tennessee, June 2017.
(2.8 MB)
“On Scalability for MPI Runtime Systems,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-11-05, Knoxville, TN, May 2011.
(898.76 KB)
“Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems,”
University of Tennessee Computer Science Technical Report, vol. –10-653, April 2010.
(3.42 MB)
“