Publications
QR Factorization for the CELL Processor,”
Scientific Programming, vol. 17, no. 1-2, pp. 31-42, 00 2010.
(194.95 KB)
“
Race to Exascale,”
Computing in Science and Engineering, vol. 21, issue 1, pp. 4-5, March 2019.
DOI: 10.1109/MCSE.2018.2882574
(106.97 KB)
“
Reducing the Amount of out-of-core Data Access for GPU-Accelerated Randomized SVD,”
Concurrency and Computation: Practice and Experience, April 2020.
DOI: 10.1002/cpe.5754
(1.43 MB)
“
Resiliency in numerical algorithm design for extreme scale simulations,”
The International Journal of High Performance Computing Applications, vol. 36371337212766180823, issue 2, pp. 251 - 285, March 2022.
DOI: 10.1177/10943420211055188
“Resilient scheduling heuristics for rigid parallel jobs,”
Int. J. of Networking and Computing, vol. 11, no. 1, pp. 2-26, 2021.
(8.67 MB)
“
Review of Performance Analysis Tools for MPI Parallel Programs,”
European Parallel Virtual Machine / Message Passing Interface Users’ Group Meeting, Lecture Notes in Computer Science 2131, Greece, Springer Verlag, Berlin, pp. 241-248, September 2001.
DOI: 10.1007/3-540-45417-9_34
(39.61 KB)
“
A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators,”
Proc. of VECPAR'10 (to appear), Berkeley, CA, June 2010.
(870.46 KB)
“
Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems,”
SC'10, New Orleans, LA, ACM SIGARCH/ IEEE Computer Society, November 2010.
(3.42 MB)
“
ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance,”
Computer Physics Communications, vol. 97, issue 1-2, pp. 1-15, August 1996.
DOI: 10.1016/0010-4655(96)00017-3
“Scheduling Dense Linear Algebra Operations on Multicore Processors,”
Concurrency and Computation: Practice and Experience, vol. 22, no. 1, pp. 15-44, January 2010.
(1.23 MB)
“
Scheduling Independent Stochastic Tasks under Deadline and Budget Constraints,”
International Journal of High Performance Computing Applications, vol. 34, issue 2, pp. 246-264, June 2019.
DOI: 10.1177/1094342019852135
(427.92 KB)
“
Scheduling Two-sided Transformations using Tile Algorithms on Multicore Architectures,”
Journal of Scientific Computing, vol. 18, no. 1, pp. 33-50, 00 2010.
(334.5 KB)
“
A Set of Batched Basic Linear Algebra Subprograms,”
ACM Transactions on Mathematical Software, October 2020.
“