Publications
“Ginkgo - A math library designed to accelerate Exascale Computing Project science applications,”
The International Journal of High Performance Computing Applications, August 2024.
DOI: 10.1177/10943420241268323
“Ginkgo—A math library designed for platform portability,”
Parallel Computing, vol. 111, pp. 102902, February 2022.
DOI: 10.1016/j.parco.2022.102902
“Using Jacobi Iterations and Blocking for Solving Sparse Triangular Systems in Incomplete Factorization Preconditioning,”
Journal of Parallel and Distributed Computing, vol. 119, pp. 219–230, November 2018.
DOI: 10.1016/j.jpdc.2018.04.017
“ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance,”
Computer Physics Communications, vol. 97, issue 1-2, pp. 1-15, August 1996.
DOI: 10.1016/0010-4655(96)00017-3
“Accelerating 2D FFT: Exploit GPU Tensor Cores through Mixed-Precision,”
Dallas, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), ACM Student Research Poster, November 2018.

“Highly Scalable Self-Healing Algorithms for High Performance Scientific Computing,”
IEEE Transactions on Computers, vol. 58, issue 11, pp. 1512-1524, November 2009.
DOI: 10.1109/TC.2009.42
“Porting Sparse Linear Algebra to Intel GPUs,”
Euro-Par 2021: Parallel Processing Workshops, vol. 13098, Lisbon, Portugal, Springer International Publishing, pp. 57-68, June 2022.
DOI: 10.1007/978-3-031-06156-1_5
“SLATE Mixed Precision Performance Report,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-19-03: University of Tennessee, April 2019.
“SLATE Developers' Guide,”
SLATE Working Notes, no. 11, ICL-UT-19-02: Innovative Computing Laboratory, University of Tennessee, December 2019.
“Lossy all-to-all exchange for accelerating parallel 3-D FFTs on hybrid architectures with GPUs,”
2022 IEEE International Conference on Cluster Computing (CLUSTER), pp. 152-160, September 2022.
DOI: 10.1109/CLUSTER51413.2022.00029
“Mixed precision and approximate 3D FFTs: Speed for accuracy trade-off with GPU-aware MPI and run-time data compression,”
ICL Technical Report, no. ICL-UT-22-04, May 2022.
“PMIx: Process Management for Exascale Environments,”
Parallel Computing, vol. 79, pp. 9–29, January 2018.
DOI: 10.1016/j.parco.2018.08.002
“PMIx: Process Management for Exascale Environments,”
Proceedings of the 24th European MPI Users' Group Meeting, New York, NY, USA, ACM, pp. 14:1–14:10, 2017.
DOI: 10.1145/3127024.3127027
“Computing the Expected Makespan of Task Graphs in the Presence of Silent Errors,”
Parallel Computing, vol. 75, pp. 41–60, July 2018.
DOI: 10.1016/j.parco.2018.03.004
“Budget-aware scheduling algorithms for scientific workflows with stochastic task weights on IaaS Cloud platforms,”
Concurrency and Computation: Practice and Experience, vol. 33, no. 17, pp. e6065, 2021.
DOI: 10.1002/cpe.6065
“Evaluating PaRSEC Through Matrix Computations in Scientific Applications,”
Asynchronous Many-Task Systems and Applications - Second International Workshop, WAMTA 2024, Knoxville, TN, USA, February 14-16, 2024, Proceedings, vol. 14626: Springer, pp. 22–33, 2024.
DOI: 10.1007/978-3-031-61763-8_3
“Evaluating Data Redistribution in PaRSEC,”
IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 8, pp. 1856-1872, August 2022.
DOI: 10.1109/TPDS.2021.3131657
“Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems,”
35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021.
“Reshaping Geostatistical Modeling and Prediction for Extreme-Scale Environmental Applications,”
2022 International Conference for High Performance Computing, Networking, Storage and Analysis (SC22), Dallas, TX, IEEE Press, November 2022.
“Design for a Soft Error Resilient Dynamic Task-based Runtime,”
ICL Technical Report, no. ICL-UT-14-04: University of Tennessee, November 2014.
“Flexible Data Redistribution in a Task-Based Runtime System,”
IEEE International Conference on Cluster Computing (Cluster 2020), Kobe, Japan, IEEE, September 2020.
DOI: 10.1109/CLUSTER49012.2020.00032
“Reducing Data Motion and Energy Consumption of Geospatial Modeling Applications Using Automated Precision Conversion,”
2023 IEEE International Conference on Cluster Computing (CLUSTER), Santa Fe, NM, USA, IEEE, November 2023.
DOI: 10.1109/CLUSTER52292.2023.00035
“