High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs,” 2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA): IEEE, November 2020.“
High-Performance Tensor Contractions for GPUs,” International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016.“
A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.“
libCEED: Fast algebra for high-order element-based discretizations,” Journal of Open Source Software, vol. 6, no. 63, pp. 2945, 2021.“
Accelerating Tensor Contractions for High-Order FEM on CPUs, GPUs, and KNLs , Gatlinburg, TN, moky Mountains Computational Sciences and Engineering Conference (SMC16), Poster, September 2016.
Acceleration of the BLAST Hydro Code on GPU,” Supercomputing '12 (poster), Salt Lake City, Utah, SC12, November 2012.“
Towards a High-Performance Tensor Algebra Package for Accelerators , Gatlinburg, TN, moky Mountains Computational Sciences and Engineering Conference (SMC15), September 2015.
Accelerating Tensor Contractions in High-Order FEM with MAGMA Batched , Atlanta, GA, SIAM Conference on Computer Science and Engineering (SIAM CSE17), Presentation, March 2017.
CEED ECP Milestone Report: Performance Tuning of CEED Software and 1st and 2nd Wave Apps : Zenodo, October 2019.
CEED ECP Milestone Report: Public release of CEED 2.0 : Zenodo, April 2019.
High-Performance Tensor Contractions for GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-16-738: University of Tennessee, January 2016.“
Hydrodynamic Computation with Hybrid Programming on CPU-GPU Clusters,” University of Tennessee Computer Science Technical Report, no. ut-cs-13-714, July 2013.“
Small Tensor Operations on Advanced Architectures for High-Order Applications,” University of Tennessee Computer Science Technical Report, no. UT-EECS-17-749: Innovative Computing Laboratory, University of Tennessee, April 2017.“