Publications
Reshaping Geostatistical Modeling and Prediction for Extreme-Scale Environmental Applications,”
2022 International Conference for High Performance Computing, Networking, Storage and Analysis (SC22), Dallas, TX, IEEE Press, November 2022.
“Resiliency in numerical algorithm design for extreme scale simulations,”
The International Journal of High Performance Computing Applications, vol. 36371337212766180823, issue 2, pp. 251 - 285, March 2022.
“Surrogate ML/AI Model Benchmarking for FAIR Principles' Conformance,”
2022 IEEE High Performance Extreme Computing Conference (HPEC): IEEE, September 2022.
“Threshold Pivoting for Dense LU Factorization,”
ScalAH22: 13th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems , Dallas, Texas, IEEE, November 2022.
(721.77 KB)
“
Using long vector extensions for MPI reductions,”
Parallel Computing, vol. 109, pp. 102871, March 2022.
“20 years of computational science: Selected papers from 2020 International Conference on Computational Science,”
Journal of Computational Science, vol. 53, pp. 101395–101395, 2021.
“Accelerating FFT towards Exascale Computing
: NVIDIA GPU Technology Conference (GTC2021), 2021.
(27.23 MB)

Accelerating Restarted GMRES with Mixed Precision Arithmetic,”
IEEE Transactions on Parallel and Distributed Systems, June 2021.
(572.4 KB)
“
Budget-aware scheduling algorithms for scientific workflows with stochastic task weights on IaaS Cloud platforms,”
Concurrency and Computation: Practice and Experience, vol. 33, no. 17, pp. e6065, 2021.
(1.99 MB)
“
Callback-based completion notification using MPI Continuations,”
Parallel Computing, vol. 21238566, issue 0225, pp. 102793, May Jan.
“Distributed-Memory Multi-GPU Block-Sparse Tensor Contraction for Electronic Structure,”
35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021.
“DTE: PaRSEC Enabled Libraries and Applications
: 2021 Exascale Computing Project Annual Meeting, April 2021.
(3.24 MB)

Dynamic DAG scheduling under memory constraints for shared-memory platforms,”
Int. J. of Networking and Computing, vol. 11, no. 1, pp. 27-49, 2021.
(574.64 KB)
“
Efficient exascale discretizations: High-order finite element methods,”
The International Journal of High Performance Computing Applications, pp. 10943420211020803, 2021.
“Effortless Monitoring of Arithmetic Intensity with PAPI’s Counter Analysis Toolkit,”
Tools for High Performance Computing 2018/2019: Springer, pp. 195–218, 2021.
“Evaluating Task Dropping Strategies for Overloaded Real-Time Systems (Work-In-Progress),”
42nd Real Time Systems Symposium (RTSS): IEEE Computer Society Press, 2021.
(217.13 KB)
“
Exploiting Block Structures of KKT Matrices for Efficient Solution of Convex Optimization Problems,”
IEEE Access, 2021.
(1.35 MB)
“
Gingko: A Sparse Linear Algebrea Library for HPC
: 2021 ECP Annual Meeting, April 2021.
(893.04 KB)

GPU algorithms for Efficient Exascale Discretizations,”
Parallel Computing, vol. 108, pp. 102841, 2021.
“Interim Report on Benchmarking FFT Libraries on High Performance Systems,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-21-03: University of Tennessee, July 2021.
(2.68 MB)
“
An international survey on MPI users,”
Parallel Computing, vol. 108, December 2021.
(1.49 MB)
“
An Introduction to High Performance Computing and Its Intersection with Advances in Modeling Rare Earth Elements and Actinides,”
Rare Earth Elements and Actinides: Progress in Computational Science Applications, vol. 1388, Washington, DC, American Chemical Society, pp. 3-53, October 2021.
“Lecture Notes in Computer Science: High Performance Computing
, vol. 12761: Springer International Publishing, 2021.
Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems,”
35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021.
(1.08 MB)
“
libCEED: Fast algebra for high-order element-based discretizations,”
Journal of Open Source Software, vol. 6, no. 63, pp. 2945, 2021.
“Linear Algebra Prepara.on for Emergent Neural Network Architectures: MAGMA, BLAS, and Batched GPU Computing
, Virtual, LAPENNA Workshop, November 2021.
(17.8 MB)

Materials fingerprinting classification,”
Computer Physics Communications, pp. 108019, May Jan.
(3.8 MB)
“
Max-Stretch Minimization on an Edge-Cloud Platform,”
IPDPS'2021, the 34th IEEE International Parallel and Distributed Processing Symposium: IEEE Computer Society Press, 2021.
(4.94 MB)
“
Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices,”
ICL Technical Report, no. ICL-UT-21-05, August 2021.
(3.93 MB)
“
A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms,”
ICL Technical Report, no. ICL-UT-21-04: University of Tennessee, August 2021.
(493.17 KB)
“
P1673R3: A Free Function Linear algebra Interface Based on the BLAS,”
ISO JTC1 SC22 WG22, no. P1673R3: ISO, April 2021.
(858.89 KB)
“
Quo Vadis MPI RMA? Towards a More Efficient Use of MPI One-Sided Communication,”
EuroMPI'21, Garching, Munich Germany, 2021.
(835.27 KB)
“
Rare Earth Elements and Actinides: Progress in Computational Science Applications,”
ACS Symposium Series, vol. 1388, Washington, DC, American Chemical Society, October 2021.
“Rare Earth Elements and Critical Materials: Uses and Availability,”
Rare Earth Elements and Actinides: Progress in Computational Science Applications, vol. 1388, Washington, DC, American Chemical Society, pp. 63-74, October 2021.
“Resilient scheduling heuristics for rigid parallel jobs,”
Int. J. of Networking and Computing, vol. 11, no. 1, pp. 2-26, 2021.
(8.67 MB)
“
Revisiting Credit Distribution Algorithms for Distributed Termination Detection,”
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW): IEEE, pp. 611–620, 2021.
“Scalability Issues in FFT Computation,”
International Conference on Parallel Computing Technologies: Springer, pp. 279–287, 2021.
“A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines,”
ACM Transactions on Mathematical Software (TOMS), vol. 47, no. 3, pp. 1–23, 2021.
“SLATE Performance Improvements: QR and Eigenvalues,”
SLATE Working Notes, no. 17, ICL-UT-21-02, April 2021.
(2 MB)
“
SLATE Port to AMD and Intel Platforms,”
SLATE Working Notes, no. 16, ICL-UT-21-01, April 2021.
(890.75 KB)
“
A survey of numerical linear algebra methods utilizing mixed-precision arithmetic,”
The International Journal of High Performance Computing Applications, vol. 35, no. 4, pp. 344–369, 2021.
“Task-graph scheduling extensions for efficient synchronization and communication,”
Proceedings of the ACM International Conference on Supercomputing, pp. 88–101, 2021.
“Translational process: Mathematical software perspective,”
Journal of Computational Science, vol. 52, pp. 101216, 2021.
“ASCR@40: Four Decades of Department of Energy Leadership in Advanced Scientific Computing Research
: Advanced Scientific Computing Advisory Committee (ASCAC), US Department of Energy, August 2020.
ASCR@40: Highlights and Impacts of ASCR’s Programs
: US Department of Energy’s Office of Advanced Scientific Computing Research, June 2020.
Asynchronous SGD for DNN Training on Shared-Memory Parallel Architectures,”
Workshop on Scalable Deep Learning over Parallel And Distributed Infrastructures (ScaDL 2020), May 2020.
(188.51 KB)
“
Asynchronous SGD for DNN Training on Shared-Memory Parallel Architectures,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-04: University of Tennessee, Knoxville, March 2020.
(188.51 KB)
“
CEED ECP Milestone Report: Improve Performance and Capabilities of CEED-Enabled ECP Applications on Summit/Sierra,”
ECP Milestone Reports: Zenodo, May 2020.
(28.12 MB)
“
Clover: Computational Libraries Optimized via Exascale Research
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(872 KB)

Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime,”
2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), New Orleans, LA, IEEE, May 2020.
(1.33 MB)
“