Publications
“P1673R3: A Free Function Linear Algebra Interface Based on the BLAS,”
ISO JTC1 SC22 WG21, no. P1673R3: ISO, April 2021.
“A survey of numerical linear algebra methods utilizing mixed-precision arithmetic,”
The International Journal of High Performance Computing Applications, vol. 35, no. 4, pp. 344–369, 2021.
DOI: 10.1177/10943420211003313
“Improving the Performance of the GMRES Method using Mixed-Precision Techniques,”
Smoky Mountains Computational Sciences & Engineering Conference (SMC2020), August 2020.
“Symmetric Indefinite Linear Solver using OpenMP Task on Multicore Architectures,”
IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 8, pp. 1879–1892, August 2018.
DOI: 10.1109/TPDS.2018.2808964
“Assessing the Cost of Redistribution followed by a Computational Kernel: Complexity and Performance Results,”
Parallel Computing, vol. 52, pp. 22–41, February 2016.
DOI: 10.1016/j.parco.2015.09.005
“Algorithm-based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures, and Accuracy,”
ACM Transactions on Parallel Computing, vol. 1, issue 2, no. 10, pp. 10:1–10:28, January 2015.
DOI: 10.1145/2686892
“Heterogeneous Acceleration for Linear Algebra in Multi-Coprocessor Environments,”
VECPAR 2014, Eugene, OR, June 2014.
“A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU,”
IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
“Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment,”
IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
“Standards for Graph Algorithm Primitives,”
17th IEEE High Performance Extreme Computing Conference (HPEC '13), Waltham, MA, IEEE, September 2013.
DOI: 10.1109/HPEC.2013.6670338
“Automatically Tuned Linear Algebra Software,”
1998 ACM/IEEE conference on Supercomputing (SC '98), Orlando, FL, IEEE Computer Society, November 1998.