Publications
Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part I,”
Lecture Notes in Computer Science, 1, no. 12137: Springer International Publishing, pp. 707, June 2020.
DOI: 10.1007/978-3-030-50371-0
“Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-12: University of Tennessee, August 2020.
(476.36 KB)
“Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs,”
2020 IEEE High Performance Extreme Computing Virtual Conference: IEEE, September 2020.
(476.36 KB)
“DTE: PaRSEC Enabled Libraries and Applications (Poster)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(979.27 KB)
DTE: PaRSEC Systems and Interfaces (Poster)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(840.54 KB)
Evaluating the Performance of NVIDIA’s A100 Ampere GPU for Sparse and Batched Computations,”
2020 IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS): IEEE, November 2020.
(1.9 MB)
“Exa-PAPI: The Exascale Performance API with Modern C++
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(556.78 KB)
Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications,”
Platform for Advanced Scientific Computing Conference (PASC20), Geneva, Switzerland, ACM, June 2020.
DOI: 10.1145/3394277.3401846 (2.71 MB)
“FFT-ECP API and High-Performance Library Prototype for 2-D and 3-D FFTs on Large-Scale Heterogeneous Systems with GPUs,”
ECP Milestone Report, no. FFT-ECP STML13-27: Innovative Computing Laboratory, University of Tennessee, January 2020.
(9.71 MB)
“Flexible Data Redistribution in a Task-Based Runtime System,”
IEEE International Conference on Cluster Computing (Cluster 2020), Kobe, Japan, IEEE, September 2020.
DOI: 10.1109/CLUSTER49012.2020.00032 (354.8 KB)
“Formulation of Requirements for New PAPI++ Software Package: Part I: Survey Results,”
PAPI++ Working Notes, no. 1, ICL-UT-20-02: Innovative Computing Laboratory, University of Tennessee Knoxville, January 2020.
(1.49 MB)
“Ginkgo: A Node-Level Sparse Linear Algebra Library for HPC (Poster)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(699 KB)
HAN: A Hierarchical AutotuNed Collective Communication Framework,”
IEEE Cluster Conference, Kobe, Japan, Best Paper Award, IEEE Computer Society Press, September 2020.
(764.05 KB)
“Harnessing the Computing Continuum for Programming Our World,”
Fog Computing: Theory and Practice: John Wiley & Sons, Inc., 2020.
DOI: 10.1002/9781119551713.ch7 (1.4 MB)
“heFFTe: Highly Efficient FFT for Exascale,”
International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020.
DOI: 10.1007/978-3-030-50371-0_19 (2.62 MB)
“heFFTe: Highly Efficient FFT for Exascale (Poster)
: NVIDIA GPU Technology Conference (GTC2020), October 2020.
(866.88 KB)
heFFTe: Highly Efficient FFT for Exascale (Poster)
, Seattle, WA, SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20), February 2020.
(1.54 MB)
heFFTe: Highly Efficient FFT for Exascale (Poster)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(6.2 MB)
High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs,”
2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA): IEEE, November 2020.
(1.3 MB)
“hipMAGMA v1.0
: Zenodo, March 2020.
DOI: 10.5281/zenodo.3908549
hipMAGMA v2.0
: Zenodo, July 2020.
DOI: 10.5281/zenodo.3928667
Improving the Performance of the GMRES Method using Mixed-Precision Techniques,”
Smoky Mountains Computational Sciences & Engineering Conference (SMC2020), August 2020.
(600.33 KB)
“Integrating Deep Learning in Domain Science at Exascale (MagmaDNN)
, virtual, DOD HPCMP seminar, December 2020.
(11.12 MB)
Integrating Deep Learning in Domain Sciences at Exascale,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-10: University of Tennessee, August 2020.
(1.09 MB)
“Integrating Deep Learning in Domain Sciences at Exascale,”
2020 Smoky Mountains Computational Sciences and Engineering Conference (SMC 2020), August 2020.
“Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices using GPUs,”
International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, Springer, Cham, June 2020.
DOI: 10.1007/978-3-030-50417-5_18 (702.38 KB)
“Load-Balancing Sparse Matrix Vector Product Kernels on GPUs,”
ACM Transactions on Parallel Computing, vol. 7, issue 1, March 2020.
DOI: 10.1145/3380930 (5.67 MB)
“MAGMA Templates for Scalable Linear Algebra on Emerging Architectures,”
The International Journal of High Performance Computing Applications, vol. 34, issue 6, pp. 645-658, November 2020.
DOI: 10.1177/1094342020938421
“Matrix Multiplication on Batches of Small Matrices in Half and Half-Complex Precisions,”
Journal of Parallel and Distributed Computing, vol. 145, pp. 188-201, November 2020.
DOI: 10.1016/j.jpdc.2020.07.001 (1.3 MB)
“Mixed-Precision Iterative Refinement using Tensor Cores on GPUs to Accelerate Solution of Linear Systems,”
Proceedings of the Royal Society A, vol. 476, issue 2243, November 2020.
DOI: 10.1098/rspa.2020.0110 (2.24 MB)
“Mixed-Precision Solution of Linear Systems Using Accelerator-Based Computing,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-05: University of Tennessee, May 2020.
(1.03 MB)
“Numerical Algorithms for High-Performance Computational Science,”
Philosophical Transactions of the Royal Society A, vol. 378, issue 2166, 2020.
DOI: 10.1098/rsta.2019.0066 (724.37 KB)
“Parallel Processing and Applied Mathematics: 13th International Conference, PPAM 2019, Bialystok, Poland, September 8–11, 2019, Revised Selected Papers, Part II,”
Lecture Notes in Computer Science, no. 12044: Springer International Publishing, pp. 503, March 2020.
DOI: 10.1007/978-3-030-43222-5
“Parallel Processing and Applied Mathematics: 13th International Conference, PPAM 2019, Bialystok, Poland, September 8–11, 2019, Revised Selected Papers, Part I,”
Lecture Notes in Computer Science, 1, no. 12043: Springer International Publishing, pp. 581, March 2020.
DOI: 10.1007/978-3-030-43229-4
“Performance Application Programming Interface for Extreme-Scale Environments (PAPI-EX) (Poster)
, Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, 20 2020.
(2.53 MB)
Performance Tuning SLATE,”
SLATE Working Notes, no. 14, ICL-UT-20-01: Innovative Computing Laboratory, University of Tennessee, January 2020.
(1.29 MB)
“The PLASMA Library on CORAL Systems and Beyond (Poster)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(550.86 KB)
Project-Based Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning,”
The Journal of Computational Science Education, vol. 11, issue 1, pp. 36-44, January 2020.
DOI: 10.22369/issn.2153-4136/11/1/7 (4.4 MB)
“Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),”
LAPACK Working Notes, no. 297, ICL-UT-20-07: University of Tennessee.
(1.41 MB)
“Reducing the Amount of out-of-core Data Access for GPU-Accelerated Randomized SVD,”
Concurrency and Computation: Practice and Experience, April 2020.
DOI: 10.1002/cpe.5754 (1.43 MB)
“Replacing Pivoting in Distributed Gaussian Elimination with Randomized Techniques,”
2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA), Atlanta, GA, IEEE, November 2020.
(184.6 KB)
“Report on the Fujitsu Fugaku System,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-06: University of Tennessee, June 2020.
(3.3 MB)
“Scalable Data Generation for Evaluating Mixed-Precision Solvers,”
2020 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, IEEE, September 2020.
DOI: 10.1109/HPEC43674.2020.9286145 (1.3 MB)
“A Set of Batched Basic Linear Algebra Subprograms,”
ACM Transactions on Mathematical Software, October 2020.
“SLATE Performance Report: Updates to Cholesky and LU Factorizations,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-14: University of Tennessee, October 2020.
(1.64 MB)
“SLATE: Software for Linear Algebra Targeting Exascale (POSTER)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(546.56 KB)
SLATE Tutorial
, Houston, TX, 2020 ECP Annual Meeting, February 2020.
(12.14 MB)
SLATE Users' Guide,”
SLATE Working Notes, no. 10, ICL-UT-19-01: Innovative Computing Laboratory, University of Tennessee, July 2020.
(1.51 MB)
“A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic,”
SLATE Working Notes, no. 15, ICL-UT-20-08: University of Tennessee, July 2020.
(3.98 MB)
“Translational Process: Mathematical Software Perspective,”
Journal of Computational Science, September 2020.
DOI: 10.1016/j.jocs.2020.101216 (752.59 KB)
“