Publications
“A Parallel Implementation of the Nonsymmetric QR Algorithm for Distributed Memory Architectures,”
SIAM Journal on Scientific Computing, vol. 16, no. 2, pp. 284-311, October 2002.
(224.7 KB)
“Parallel IO Support for Meta-Computing Applications: MPI_Connect IO Applied to PACX-MPI,”
8th European PVM/MPI User's Group Meeting, Lecture Notes in Computer Science, vol. 2131, Greece, Springer Verlag, Berlin, September 2001.
(129.3 KB)
“Parallel Processing and Applied Mathematics, 9th International Conference, PPAM 2011,”
Lecture Notes in Computer Science, vol. 7203, Torun, Poland, 2012.
“Parallel Programming in MATLAB,”
The International Journal of High Performance Computing Applications, vol. 23, no. 3, pp. 277-283, July 2009.
(215.71 KB)
“Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems,”
Supercomputing Frontiers and Innovations, vol. 2, no. 4, October 2015.
DOI: 10.14529/jsfi1504 (3.68 MB)
“Parallel Selection on GPUs,”
Parallel Computing, vol. 91, March 2020.
DOI: 10.1016/j.parco.2019.102588 (1.43 MB)
“On the Parallel Solution of Large Industrial Wave Propagation Problems,”
Journal of Computational Acoustics (to appear), January 2005.
(1.08 MB)
“Parallel Tiled QR Factorization for Multicore Architectures,”
Concurrency and Computation: Practice and Experience, vol. 20, pp. 1573-1590, January 2008.
(277.92 KB)
“A Parallel Tiled Solver for Symmetric Indefinite Systems On Multicore Architectures,”
IPDPS 2012, Shanghai, China, May 2012.
(544.09 KB)
“Parallelizing the Divide and Conquer Algorithm for the Symmetric Tridiagonal Eigenvalue Problem on Distributed Memory Architectures,”
SIAM Journal on Scientific Computing, vol. 20, no. 6, pp. 2223-2236, October 2002.
(321.36 KB)
“Paravirtualization Effect on Single- and Multi-threaded Memory-Intensive Linear Algebra Software,”
Cluster Computing Journal: Special Issue on High Performance Distributed Computing, vol. 12, no. 2: Springer Netherlands, pp. 101-122, 2009.
(451.07 KB)
“ParILUT - A New Parallel Threshold ILU,”
SIAM Journal on Scientific Computing, vol. 40, issue 4: SIAM, pp. C503–C519, July 2018.
DOI: 10.1137/16M1079506 (19.26 MB)
“PaRSEC: Exploiting Heterogeneity to Enhance Scalability,”
IEEE Computing in Science and Engineering, vol. 15, issue 6, pp. 36-45, November 2013.
DOI: 10.1109/MCSE.2013.98 (2.16 MB)
“PaRSEC: Scalability, flexibility, and hybrid architecture support for task-based applications in ECP,”
The International Journal of High Performance Computing Applications, October 2024.
DOI: 10.1177/10943420241290520
“Performance Analysis of MPI Collective Operations,”
Cluster Computing, vol. 10, no. 2: Springer Netherlands, pp. 127-143, June 2007.
(1018.28 KB)
“Performance Analysis of MPI Collective Operations,”
Cluster Computing Journal (to appear), January 2005.
(1018.28 KB)
“On the performance and energy efficiency of sparse linear algebra on GPUs,”
International Journal of High Performance Computing Applications, October 2016.
DOI: 10.1177/1094342016672081 (1.19 MB)
“Performance and Reliability Trade-offs for the Double Checkpointing Algorithm,”
International Journal of Networking and Computing, vol. 4, no. 1, pp. 32-41.
(859.04 KB)
“Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications,”
Lecture Notes in Computer Science, OpenMP Shared Memory Parallel Programming, vol. 4315: Springer Berlin / Heidelberg, 2008.
(350.9 KB)
“Performance of Asynchronous Optimized Schwarz with One-sided Communication,”
Parallel Computing, vol. 86, pp. 66-81, August 2019.
DOI: 10.1016/j.parco.2019.05.004 (3.09 MB)
“Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs,”
Concurrency and Computation: Practice and Experience, vol. 28, issue 12, pp. 3447-3465, May 2016.
DOI: 10.1002/cpe.3874 (3.21 MB)
“Performance Portability of a GPU Enabled Factorization with the DAGuE Framework,”
IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 2011.
(290.98 KB)
“PERI Auto-tuning,”
Proc. SciDAC 2008, vol. 125, Seattle, Washington, Journal of Physics, January 2008.
(873.75 KB)
“PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,”
ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019.
DOI: 10.1145/3264491 (7.5 MB)
“The PlayStation 3 for High Performance Scientific Computing,”
Computing in Science and Engineering, pp. 80-83, January 2008.
(2.45 MB)
“PMIx: Process Management for Exascale Environments,”
Parallel Computing, vol. 79, pp. 9–29, January 2018.
DOI: 10.1016/j.parco.2018.08.002
“A Portable Programming Interface for Performance Evaluation on Modern Processors,”
The International Journal of High Performance Computing Applications, vol. 14, no. 3, pp. 189-204, September 2000.
DOI: 10.1177/109434200001400303 (655.17 KB)
“Porting the PLASMA Numerical Library to the OpenMP Standard,”
International Journal of Parallel Programming, June 2016.
DOI: 10.1007/s10766-016-0441-6 (1.66 MB)
“Post-failure recovery of MPI communication capability: Design and rationale,”
International Journal of High Performance Computing Applications, vol. 27, issue 3, pp. 244-254, January 2013.
DOI: 10.1177/1094342013488238 (285.77 KB)
“Power Aware Computing on GPUs,”
SAAHPC '12 (Best Paper Award), Argonne, IL, July 2012.
(658.06 KB)
“Preconditioned Krylov Solvers on GPUs,”
Parallel Computing, June 2017.
DOI: 10.1016/j.parco.2017.05.006 (1.19 MB)
“Predicting the electronic properties of 3D, million-atom semiconductor nanostructure architectures,”
J. Phys.: Conf. Ser., vol. 46, pp. 292-298, January 2006.
DOI: 10.1088/1742-6596/46/1/040 (644.1 KB)
“Preliminary Results of Autotuning GEMM Kernels for the NVIDIA Kepler Architecture,”
LAWN 267, 2012.
(1.14 MB)
“The Problem with the Linpack Benchmark Matrix Generator,”
International Journal of High Performance Computing Applications, vol. 23, no. 1, pp. 5-14, 2009.
(136.41 KB)
“Proceedings of the International Conference on Computational Science,”
ICCS 2010, Amsterdam, Elsevier, May 2010.
“Process Fault-Tolerance: Semantics, Design and Applications for High Performance Computing,”
International Journal for High Performance Applications and Supercomputing (to appear), April 2004.
(186.9 KB)
“Project-Based Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning,”
The Journal of Computational Science Education, vol. 11, issue 1, pp. 36-44, January 2020.
DOI: 10.22369/issn.2153-4136/11/1/7 (4.4 MB)
“Prospectus for the Next LAPACK and ScaLAPACK Libraries,”
PARA 2006, Umea, Sweden, June 2006.
(460.11 KB)
“Providing Infrastructure and Interface to High Performance Applications in a Distributed Setting,”
ASTC-HPC 2000, Washington, DC, April 2000.
(96.04 KB)
“Providing performance portable numerics for Intel GPUs,”
Concurrency and Computation: Practice and Experience, vol. 17, October 2022.
DOI: 10.1002/cpe.7400 (3.16 MB)
“QCG-OMPI: MPI Applications on Grids,”
Future Generation Computer Systems, vol. 27, no. 4, pp. 357-369, March 2010.
(1.48 MB)
“QCG-OMPI: MPI Applications on Grids,”
Future Generation Computer Systems, vol. 27, no. 4, pp. 357-369, January 2011.
(1.48 MB)
“QR Factorization for the CELL Processor,”
Scientific Programming (to appear), 2009.
(234.02 KB)
“QR Factorization for the CELL Processor,”
Scientific Programming, vol. 17, no. 1-2, pp. 31-42, 2010.
(194.95 KB)
“The Quest for Petascale Computing,”
Computing in Science and Engineering, vol. 3, no. 3, pp. 32-39, May 2001.
(178.3 KB)
“Race to Exascale,”
Computing in Science and Engineering, vol. 21, issue 1, pp. 4-5, March 2019.
DOI: 10.1109/MCSE.2018.2882574 (106.97 KB)
“Recent Advances in Parallel Virtual Machine and Message Passing Interface,”
Lecture Notes in Computer Science, vol. 2840: Springer-Verlag, Berlin, January 2003.
“Recent Advances in the Message Passing Interface: 19th European MPI Users' Group Meeting, EuroMPI 2012,”
Lecture Notes in Computer Science, vol. 7490, Vienna, Austria, 2012.
“Recent Developments in GridSolve,”
International Journal of High Performance Computing Applications (Special Issue: Scheduling for Large-Scale Heterogeneous Platforms), vol. 20, no. 1: Sage Science Press, 2006.
(496.69 KB)
“Recent Trends in High Performance Computing,”
in Birth of Numerical Analysis (to appear), 2009.