Publications
“Batched sparse iterative solvers on GPU for the collision operator for fusion plasma simulations,”
2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Lyon, France, IEEE, July 2022.
DOI: 10.1109/IPDPS53621.2022.00024
“Power-aware Computing on GPGPUs,”
Gatlinburg, TN, Fall Creek Falls Conference, Poster, September 2011.
“Power Aware Computing on GPUs,”
SAAHPC '12 (Best Paper Award), Argonne, IL, July 2012.
“Computing Dense Tensor Decompositions with Optimal Dimension Trees,”
Algorithmica, vol. 81, issue 5, pp. 2092–2121, May 2019.
DOI: 10.1007/s00453-018-0525-3
“Efficient exascale discretizations: High-order finite element methods,”
The International Journal of High Performance Computing Applications, pp. 10943420211020803, 2021.
DOI: 10.1177/10943420211020803
“CEED ECP Milestone Report: Improve Performance and Capabilities of CEED-Enabled ECP Applications on Summit/Sierra,”
ECP Milestone Reports: Zenodo, May 2020.
DOI: 10.5281/zenodo.3860804
“Computation at the Cutting Edge of Science,”
Journal of Computational Science, June 2024.
DOI: 10.1016/j.jocs.2024.102379
“Computational science for a better future,”
Journal of Computational Science, vol. 62, pp. 101745, July 2022.
DOI: 10.1016/j.jocs.2022.101745
“20 years of computational science: Selected papers from 2020 International Conference on Computational Science,”
Journal of Computational Science, vol. 53, p. 101395, 2021.
DOI: 10.1016/j.jocs.2021.101395
“Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part II,”
Lecture Notes in Computer Science, 1, no. 12138: Springer International Publishing, pp. 697, June 2020.
DOI: 10.1007/978-3-030-50417-5
“Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part VI,”
Lecture Notes in Computer Science, 1, no. 12142: Springer International Publishing, pp. 667, June 2020.
DOI: 10.1007/978-3-030-50433-5
“Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part I,”
Lecture Notes in Computer Science, 1, no. 12137: Springer International Publishing, pp. 707, June 2020.
DOI: 10.1007/978-3-030-50371-0
“Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part V,”
Lecture Notes in Computer Science, 1, no. 12141: Springer International Publishing, pp. 618, June 2020.
DOI: 10.1007/978-3-030-50426-7
“Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part IV,”
Lecture Notes in Computer Science, 1, no. 12140: Springer International Publishing, pp. 668, June 2020.
DOI: 10.1007/978-3-030-50423-6
“Twenty Years of Computational Science,”
International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020.
“Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part III,”
Lecture Notes in Computer Science, 1, no. 12139: Springer International Publishing, pp. 648, June 2020.
DOI: 10.1007/978-3-030-50420-5
“Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part VII,”
Lecture Notes in Computer Science, 1, no. 12143: Springer International Publishing, pp. 775, June 2020.
DOI: 10.1007/978-3-030-50436-6
“QR Factorization for the CELL Processor,”
Scientific Programming, vol. 17, no. 1-2, pp. 31-42, 2010.
“LU Factorization with Partial Pivoting for a Multicore System with Accelerators,”
IEEE Transactions on Parallel and Distributed Systems, vol. 24, issue 8, pp. 1613-1621, August 2013.
DOI: 10.1109/TPDS.2012.242
“QR Factorization for the CELL Processor,”
University of Tennessee Computer Science Technical Report, UT-CS-08-616 (also LAPACK Working Note 201), May 2008.
“Linear Systems Performance Report,”
SLATE Working Notes, no. 08, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.
“Designing SLATE: Software for Linear Algebra Targeting Exascale,”
SLATE Working Notes, no. 03, ICL-UT-17-06: Innovative Computing Laboratory, University of Tennessee, October 2017.
“Massively Parallel Automated Software Tuning,”
48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan, ACM Press, August 2019.
DOI: 10.1145/3337821.3337908
“Multithreading in the PLASMA Library,”
Multi and Many-Core Processing: Architecture, Programming, Algorithms, & Applications: Taylor & Francis, 2013.
“Autotuning GEMM Kernels for the Fermi GPU,”
IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 11, November 2012.
DOI: 10.1109/TPDS.2011.311
“Design and Implementation of the PULSAR Programming System for Large Scale Computing,”
Supercomputing Frontiers and Innovations, vol. 4, issue 1, 2017.
DOI: 10.14529/jsfi170101
“Programming the LU Factorization for a Multicore System with Accelerators,”
Proceedings of VECPAR’12, Kobe, Japan, April 2012.
“SLATE Working Note 12: Implementing Matrix Inversions,”
SLATE Working Notes, no. 12, ICL-UT-19-04: Innovative Computing Laboratory, University of Tennessee, June 2019.
“