Publications
Linear Algebra Software for High-Performance Computing (Part 2: Software for Hardware Accelerators and Coprocessors),
Frankfurt, Germany, ISC High Performance (ISC18), Tutorial Presentation, June 2015.

“libCEED: Fast algebra for high-order element-based discretizations,”
Journal of Open Source Software, vol. 6, no. 63, pp. 2945, 2021.
DOI: 10.21105/joss.02945
“Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems,”
35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021.
Lecture Notes in Computer Science: High Performance Computing,
vol. 12761: Springer International Publishing, 2021.
DOI: 10.1007/978-3-030-90539-2
“Least Squares Solvers for Distributed-Memory Machines with GPU Accelerators,”
ACM International Conference on Supercomputing (ICS '19), Phoenix, Arizona, ACM, pp. 117–126, June 2019.
DOI: 10.1145/3330345.3330356
“Least Squares Performance Report,”
SLATE Working Notes, no. 09, ICL-UT-18-10: Innovative Computing Laboratory, University of Tennessee, December 2018.
“LAWN 294: Aasen's Symmetric Indefinite Linear Solvers in LAPACK,”
LAPACK Working Note, no. LAWN 294, ICL-UT-17-13: University of Tennessee, December 2017.
“Keeneland: Bringing Heterogeneous GPU Computing to the Computational Science Community,”
IEEE Computing in Science & Engineering, vol. 13, issue 5, pp. 90-95, August 2011.
DOI: 10.1109/MCSE.2011.83
“A Jaccard Weights Kernel Leveraging Independent Thread Scheduling on GPUs,”
SBAC-PAD, Lyon, France, IEEE, 2018.
“Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices using GPUs,”
International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, Springer, Cham, June 2020.
DOI: 10.1007/978-3-030-50417-5_18
“Investigating Power Capping toward Energy-Efficient Scientific Applications,”
Concurrency and Computation: Practice and Experience, vol. 2018, issue e4485, pp. 1-14, April 2018.
DOI: 10.1002/cpe.4485
“Investigating Half Precision Arithmetic to Accelerate Dense Linear System Solvers,”
ScalA17: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, ACM.
An Introduction to the MAGMA project - Acceleration of Dense Linear Algebra,
NVIDIA Webinar, June 2010.
“Introduction to the HPCChallenge Benchmark Suite,”
ICL Technical Report, no. ICL-UT-05-01, January 2005.
“An Introduction to High Performance Computing and Its Intersection with Advances in Modeling Rare Earth Elements and Actinides,”
Rare Earth Elements and Actinides: Progress in Computational Science Applications, vol. 1388, Washington, DC, American Chemical Society, pp. 3-53, October 2021.
DOI: 10.1021/bk-2021-1388.ch001
“Interoperable Convergence of Storage, Networking, and Computation,”
Advances in Information and Communication: Proceedings of the 2019 Future of Information and Communication Conference (FICC), no. 2: Springer International Publishing, pp. 667-690, 2020.
“An international survey on MPI users,”
Parallel Computing, vol. 108, December 2021.
DOI: 10.1016/j.parco.2021.102853
“The International Exascale Software Project Roadmap,”
International Journal of High Performance Computing Applications, vol. 25, no. 1, pp. 3-60, January 2011.
DOI: 10.1177/1094342010391989
“Interim Report on Benchmarking FFT Libraries on High Performance Systems,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-21-03: University of Tennessee, July 2021.
Interface for Sparse Linear Algebra Operations,
November 2024.
DOI: 10.48550/arXiv.2411.13259