Publications
P1673R3: A Free Function Linear algebra Interface Based on the BLAS,”
ISO JTC1 SC22 WG22, no. P1673R3: ISO, April 2021.
(858.89 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Optimal Checkpointing Period: Time vs. Energy,”
University of Tennessee Computer Science Technical Report (also LAWN 281), no. ut-eecs-13-718: University of Tennessee, October 2013.
(440.13 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Numerically Stable Real-Number Codes Based on Random Matrices,”
University of Tennessee Computer Science Department Technical Report, vol. –04-526, October 2004.
(91.66 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Numerical Metadata API Reference,”
Innovative Computing Laboratory Technical Report, February 2007.
(454.79 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Numerical Libraries and The Grid: The Grads Experiments with ScaLAPACK,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-460, January 2001.
(91.78 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
New Robust ScaLAPACK Routine for Computing the QR Factorization with Column Pivoting,”
LAPACK Working Note, no. LAWN 296, ICL-UT-19-14: University of Tennessee, October 2019.
(454.83 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
NetBuild: Automated Installation and Use of Network-Accessible Software Libraries,”
ICL Technical Report, no. ICL-UT-04-02, January 2004.
(80.52 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
NetBuild,”
University of Tennessee Computer Science Technical Report, no. UT-CS-O1-461, January 2001.
(17.71 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Multi-criteria checkpointing strategies: optimizing response-time versus resource utilization,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-13-01, February 2013.
(497.64 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
MPI Collective Algorithm Selection and Quadtree Encoding,”
ICL Technical Report, no. ICL-UT-06-11, 00 2006.
(308.39 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms,”
ICL Technical Report, no. ICL-UT-21-04: University of Tennessee, August 2021.
(493.17 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Modeling of L2 Cache Behavior for Thread-Parallel Scientific Programs on Chip Multi-Processors,”
University of Tennessee Computer Science Technical Report, no. UT-CS-06-583, January 2006.
(652.93 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Mixed-Precision Solution of Linear Systems Using Accelerator-Based Computing,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-05: University of Tennessee, May 2020.
(1.03 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices,”
ICL Technical Report, no. ICL-UT-21-05, August 2021.
(3.93 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Mixed Precision LU Factorization on GPU Tensor Cores: Reducing Data Movement and Memory Footprint,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-13: University of Tennessee, September 2020.
(409 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Mixed precision and approximate 3D FFTs: Speed for accuracy trade-off with GPU-aware MPI and run-time data compression,”
ICL Technical Report, no. ICL-UT-22-04, May 2022.
(706.14 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Metacomputing: An Evaluation of Emerging Systems,”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-00-445, July 2000.
(280.21 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
MAGMA-sparse Interface Design Whitepaper,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-05, September 2017.
(1.28 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-16-02: University of Tennessee, August 2016.
(929.79 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Linear Systems Performance Report,”
SLATE Working Notes, no. 08, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.
(1.64 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Limitations of the Playstation 3 for High Performance Cluster Computing,”
University of Tennessee Computer Science Technical Report, UT-CS-07-597 (Also LAPACK Working Note 185), 00 2007.
(171.01 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Least Squares Performance Report,”
SLATE Working Notes, no. 09, ICL-UT-18-10: Innovative Computing Laboratory, University of Tennessee, December 2018.
(1.76 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
LAWN 294: Aasen's Symmetric Indenite Linear Solvers in LAPACK,”
LAPACK Working Note, no. LAWN 294, ICL-UT-17-13: University of Tennessee, December 2017.
(854.1 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Kernel Assisted Collective Intra-node Communication Among Multicore and Manycore CPUs,”
University of Tennessee Computer Science Technical Report, UT-CS-10-663, November 2010.
(384.75 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Introduction to the HPCChallenge Benchmark Suite,”
ICL Technical Report, no. ICL-UT-05-01, January 2005.
(124.86 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Internet Backplane Protocol - Test Language v. 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
(22.43 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Internet Backplane Protocol: API 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
(55.33 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
International Exascale Software Project Roadmap v1.0,”
University of Tennessee Computer Science Technical Report, UT-CS-10-654, May 2010.
(719.74 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Interim Report on Benchmarking FFT Libraries on High Performance Systems,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-21-03: University of Tennessee, July 2021.
(2.68 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Integrating Deep Learning in Domain Sciences at Exascale,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-10: University of Tennessee, August 2020.
(1.09 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Initial Integration and Evaluation of SLATE Parallel BLAS in LATTE,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-07: Innovative Computing Laboratory, University of Tennessee, June 2018.
(366.6 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Initial Integration and Evaluation of SLATE and STRUMPACK,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-11: University of Tennessee, December 2018.
(249.78 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware,”
University of Tennessee Computer Science Technical Report (also LAWN 283), no. ut-eecs-13-720: University of Tennessee, October 2013.
(1.23 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
An Improved MAGMA GEMM for Fermi GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-CS-10-655 (also LAPACK working note 227), July 2010.
(486.71 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC,”
Lawn 277, no. UT-CS-13-709, May 2013.
(298.63 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Implementing a Sparse Matrix Vector Product for the SELL-C/SELL-C-σ formats on NVIDIA GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-14-727: University of Tennessee, April 2014.
(578.11 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Implementation of the C++ API for Batch BLAS,”
SLATE Working Notes, no. 07, ICL-UT-18-04: Innovative Computing Laboratory, University of Tennessee, June 2018.
(1.07 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
IBP - Internet Backplane Protocol: Infrastructure for Distributed Storage (V O.2),”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-430, February 1999.
(37.72 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Hydrodynamic Computation with Hybrid Programming on CPU-GPU Clusters,”
University of Tennessee Computer Science Technical Report, no. ut-cs-13-714, July 2013.
(866.68 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
HPCS Library Study Effort,”
University of Tennessee Computer Science Technical Report, UT-CS-08-617, January 2008.
(73.22 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
HPCG Benchmark: a New Metric for Ranking High Performance Computing Systems,”
University of Tennessee Computer Science Technical Report , no. ut-eecs-15-736: University of Tennessee, January 2015.
“How LAPACK library enables Microsoft Visual Studio support with CMake and LAPACKE,”
University of Tennessee Computer Science Technical Report (also LAWN 270), no. UT-CS-12-698, July 2012.
(501.53 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
High-Performance Tensor Contractions for GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-738: University of Tennessee, January 2016.
(2.36 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
High Performance Realtime Convex Solver for Embedded Systems,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-745, October 2016.
(225.43 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures,”
University of Tennessee Computer Science Technical Report, UT-CS-11-673, (also Lawn 247), May 2011.
(424.93 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,”
University of Tennessee Computer Science Technical Report (also Lawn 257), no. UT-CS-11-684, October 2011.
(405.71 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Hardware Software Server in NetSolve,”
ICL Technical Report, no. ICL-UT-02-02, January 2002.
(221.4 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
GridRPC: A Remote Procedure Call API for Grid Computing,”
ICL Technical Report, no. ICL-UT-02-06, November 2002.
(287.73 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
The GrADS Project: Software Support for High-Level Grid Application Development,”
Technical Report, February 2000.
(347.41 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,”
University of Tennessee Computer Science Technical Report UT-CS-11-690 (also Lawn 260), December 2011.
(662.98 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)