Publications
Using Quantized Integer in LU Factorization with Partial Pivoting (Poster)
Seattle, WA, SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20), February 2020.
The PLASMA Library on CORAL Systems and Beyond (Poster)
Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects
Portland, OR, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC09), November 2009.
Clover: Computational Libraries Optimized via Exascale Research
Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
“With Extreme Computing, the Rules Have Changed,”
Computing in Science & Engineering, vol. 19, issue 3, pp. 52-62, May 2017.
DOI: 10.1109/MCSE.2017.48
“Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy,”
ACM Transactions on Mathematical Software, vol. 34, no. 4, pp. 17-22, 2008.
“Using MAGMA with PGI Fortran,”
PGI Insider, November 2010.
“Translational process: Mathematical software perspective,”
Journal of Computational Science, vol. 52, pp. 101216, 2021.
DOI: 10.1016/j.jocs.2020.101216
“Translational Process: Mathematical Software Perspective,”
Journal of Computational Science, September 2020.
DOI: 10.1016/j.jocs.2020.101216
“Task Based Cholesky Decomposition on Xeon Phi Architectures using OpenMP,”
International Journal of Computational Science and Engineering (IJCSE), vol. 17, no. 3, October 2018.
DOI: 10.1504/IJCSE.2018.095851
“A Survey of Recent Developments in Parallel Implementations of Gaussian Elimination,”
Concurrency and Computation: Practice and Experience, vol. 27, issue 5, pp. 1292-1309, April 2015.
DOI: 10.1002/cpe.3306
“The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale,”
SIAM Review, vol. 60, issue 4, pp. 808–865, November 2018.
DOI: 10.1137/17M1117732
“A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines,”
ACM Transactions on Mathematical Software (TOMS), vol. 47, no. 3, pp. 1–23, 2021.
DOI: 10.1145/3431921
“A Set of Batched Basic Linear Algebra Subprograms,”
ACM Transactions on Mathematical Software, October 2020.
“Power Aware Computing on GPUs,”
SAAHPC '12 (Best Paper Award), Argonne, IL, July 2012.
“PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,”
ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019.
DOI: 10.1145/3264491
“Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems,”
Supercomputing Frontiers and Innovations, vol. 2, no. 4, October 2015.
DOI: 10.14529/jsfi1504
“Parallel Programming in MATLAB,”
The International Journal of High Performance Computing Applications, vol. 23, no. 3, pp. 277-283, July 2009.
“A New Metric for Ranking High-Performance Computing Systems,”
National Science Review, vol. 3, issue 1, pp. 30-35, January 2016.
DOI: 10.1093/nsr/nwv084
“Multithreading in the PLASMA Library,”
Multi and Many-Core Processing: Architecture, Programming, Algorithms, & Applications: Taylor & Francis, 2013.
“Materials fingerprinting classification,”
Computer Physics Communications, pp. 108019.
DOI: 10.1016/j.cpc.2021.108019
“LU Factorization with Partial Pivoting for a Multicore System with Accelerators,”
IEEE Transactions on Parallel and Distributed Systems, vol. 24, issue 8, pp. 1613-1621, August 2013.
DOI: 10.1109/TPDS.2012.242
“Linear Algebra Software for Large-Scale Accelerated Multicore Computing,”
Acta Numerica, vol. 25, pp. 1-160, May 2016.
DOI: 10.1017/S0962492916000015
“The Impact of Multicore on Math Software,”
PARA 2006, Umea, Sweden, June 2006.
“