Publications
“P1673R3: A Free Function Linear Algebra Interface Based on the BLAS,”
ISO JTC1 SC22 WG21, no. P1673R3: ISO, April 2021.
(858.89 KB)
“Optimal Checkpointing Period: Time vs. Energy,”
University of Tennessee Computer Science Technical Report (also LAWN 281), no. ut-eecs-13-718: University of Tennessee, October 2013.
(440.13 KB)
“Numerically Stable Real-Number Codes Based on Random Matrices,”
University of Tennessee Computer Science Department Technical Report, vol. –04-526, October 2004.
(91.66 KB)
“Numerical Metadata API Reference,”
Innovative Computing Laboratory Technical Report, February 2007.
(454.79 KB)
“Numerical Libraries and The Grid: The Grads Experiments with ScaLAPACK,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-460, January 2001.
(91.78 KB)
“New Robust ScaLAPACK Routine for Computing the QR Factorization with Column Pivoting,”
LAPACK Working Note, no. LAWN 296, ICL-UT-19-14: University of Tennessee, October 2019.
(454.83 KB)
“NetBuild: Automated Installation and Use of Network-Accessible Software Libraries,”
ICL Technical Report, no. ICL-UT-04-02, January 2004.
(80.52 KB)
“NetBuild,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-461, January 2001.
(17.71 KB)
“Multi-criteria checkpointing strategies: optimizing response-time versus resource utilization,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-13-01, February 2013.
(497.64 KB)
“MPI Collective Algorithm Selection and Quadtree Encoding,”
ICL Technical Report, no. ICL-UT-06-11, 2006.
(308.39 KB)
“A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms,”
ICL Technical Report, no. ICL-UT-21-04: University of Tennessee, August 2021.
(493.17 KB)
“Modeling of L2 Cache Behavior for Thread-Parallel Scientific Programs on Chip Multi-Processors,”
University of Tennessee Computer Science Technical Report, no. UT-CS-06-583, January 2006.
(652.93 KB)
“Mixed-Precision Solution of Linear Systems Using Accelerator-Based Computing,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-05: University of Tennessee, May 2020.
(1.03 MB)
“Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices,”
ICL Technical Report, no. ICL-UT-21-05, August 2021.
(3.93 MB)
“Mixed Precision LU Factorization on GPU Tensor Cores: Reducing Data Movement and Memory Footprint,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-13: University of Tennessee, September 2020.
(409 KB)
“Mixed precision and approximate 3D FFTs: Speed for accuracy trade-off with GPU-aware MPI and run-time data compression,”
ICL Technical Report, no. ICL-UT-22-04, May 2022.
(706.14 KB)
“Metacomputing: An Evaluation of Emerging Systems,”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-00-445, July 2000.
(280.21 KB)
“MAGMA-sparse Interface Design Whitepaper,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-05, September 2017.
(1.28 MB)
“MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-16-02: University of Tennessee, August 2016.
(929.79 KB)
“Linear Systems Performance Report,”
SLATE Working Notes, no. 08, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.
(1.64 MB)
“Limitations of the Playstation 3 for High Performance Cluster Computing,”
University of Tennessee Computer Science Technical Report, UT-CS-07-597 (also LAPACK Working Note 185), 2007.
(171.01 KB)
“Least Squares Performance Report,”
SLATE Working Notes, no. 09, ICL-UT-18-10: Innovative Computing Laboratory, University of Tennessee, December 2018.
(1.76 MB)
“LAWN 294: Aasen's Symmetric Indefinite Linear Solvers in LAPACK,”
LAPACK Working Note, no. LAWN 294, ICL-UT-17-13: University of Tennessee, December 2017.
(854.1 KB)
“Kernel Assisted Collective Intra-node Communication Among Multicore and Manycore CPUs,”
University of Tennessee Computer Science Technical Report, UT-CS-10-663, November 2010.
(384.75 KB)
“Introduction to the HPCChallenge Benchmark Suite,”
ICL Technical Report, no. ICL-UT-05-01, January 2005.
(124.86 KB)
“Internet Backplane Protocol - Test Language v. 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
(22.43 KB)
“Internet Backplane Protocol: API 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
(55.33 KB)
“International Exascale Software Project Roadmap v1.0,”
University of Tennessee Computer Science Technical Report, UT-CS-10-654, May 2010.
(719.74 KB)
“Interim Report on Benchmarking FFT Libraries on High Performance Systems,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-21-03: University of Tennessee, July 2021.
(2.68 MB)
“Integrating Deep Learning in Domain Sciences at Exascale,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-10: University of Tennessee, August 2020.
(1.09 MB)
“Initial Integration and Evaluation of SLATE Parallel BLAS in LATTE,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-07: Innovative Computing Laboratory, University of Tennessee, June 2018.
(366.6 KB)
“Initial Integration and Evaluation of SLATE and STRUMPACK,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-11: University of Tennessee, December 2018.
(249.78 KB)
“An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware,”
University of Tennessee Computer Science Technical Report (also LAWN 283), no. ut-eecs-13-720: University of Tennessee, October 2013.
(1.23 MB)
“An Improved MAGMA GEMM for Fermi GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-CS-10-655 (also LAPACK Working Note 227), July 2010.
(486.71 KB)
“Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC,”
LAWN 277, no. UT-CS-13-709, May 2013.
(298.63 KB)
“Implementing a Sparse Matrix Vector Product for the SELL-C/SELL-C-σ formats on NVIDIA GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-14-727: University of Tennessee, April 2014.
(578.11 KB)
“Implementation of the C++ API for Batch BLAS,”
SLATE Working Notes, no. 07, ICL-UT-18-04: Innovative Computing Laboratory, University of Tennessee, June 2018.
(1.07 MB)
“IBP - Internet Backplane Protocol: Infrastructure for Distributed Storage (V 0.2),”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-430, February 1999.
(37.72 KB)
“Hydrodynamic Computation with Hybrid Programming on CPU-GPU Clusters,”
University of Tennessee Computer Science Technical Report, no. ut-cs-13-714, July 2013.
(866.68 KB)
“HPCS Library Study Effort,”
University of Tennessee Computer Science Technical Report, UT-CS-08-617, January 2008.
(73.22 KB)
“HPCG Benchmark: a New Metric for Ranking High Performance Computing Systems,”
University of Tennessee Computer Science Technical Report, no. ut-eecs-15-736: University of Tennessee, January 2015.
“How LAPACK library enables Microsoft Visual Studio support with CMake and LAPACKE,”
University of Tennessee Computer Science Technical Report (also LAWN 270), no. UT-CS-12-698, July 2012.
(501.53 KB)
“High-Performance Tensor Contractions for GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-738: University of Tennessee, January 2016.
(2.36 MB)
“High Performance Realtime Convex Solver for Embedded Systems,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-745, October 2016.
(225.43 KB)
“High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures,”
University of Tennessee Computer Science Technical Report, UT-CS-11-673 (also LAWN 247), May 2011.
(424.93 KB)
“Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,”
University of Tennessee Computer Science Technical Report (also LAWN 257), no. UT-CS-11-684, October 2011.
(405.71 KB)
“Hardware Software Server in NetSolve,”
ICL Technical Report, no. ICL-UT-02-02, January 2002.
(221.4 KB)
“GridRPC: A Remote Procedure Call API for Grid Computing,”
ICL Technical Report, no. ICL-UT-02-06, November 2002.
(287.73 KB)
“The GrADS Project: Software Support for High-Level Grid Application Development,”
Technical Report, February 2000.
(347.41 KB)
“GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement,”
University of Tennessee Computer Science Technical Report, UT-CS-11-690 (also LAWN 260), December 2011.
(662.98 KB)
“