Publications
“Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,”
Parallel Computing, vol. 81, pp. 1–21, January 2019.
“Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-09: Innovative Computing Laboratory, University of Tennessee, September 2018.
“Accelerating Tensor Contractions in High-Order FEM with MAGMA Batched,”
Atlanta, GA, SIAM Conference on Computational Science and Engineering (SIAM CSE17), Presentation, March 2017.
“High-performance Matrix-matrix Multiplications of Very Small Matrices,”
22nd International European Conference on Parallel and Distributed Computing (Euro-Par'16), Grenoble, France, Springer International Publishing, August 2016.
“High-Performance Tensor Contractions for GPUs,”
International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016.
“High-Performance Tensor Contractions for GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-738: University of Tennessee, January 2016.
“Towards a High-Performance Tensor Algebra Package for Accelerators,”
Gatlinburg, TN, Smoky Mountains Computational Sciences and Engineering Conference (SMC15), September 2015.
“A Parallel Solver for Incompressible Fluid Flows,”
International Conference on Computational Science (ICCS 2013), Barcelona, Spain, Elsevier B.V., June 2013.