Publications
Implementation and Usage of the PERUSE-Interface in Open MPI,”
Euro PVM/MPI 2006, Bonn, Germany, September 2006.
(310.76 KB)
“The Impact of Multicore on Computational Science Software,”
CTWatch Quarterly, vol. 3, issue 1, February 2007.
“Hierarchical QR Factorization Algorithms for Multi-core Cluster Systems,”
Parallel Computing, vol. 39, issue 4-5, pp. 212-232, May 2013.
(1.43 MB)
“HARNESS Fault Tolerant MPI Design, Usage and Performance Issues,”
Future Generation Computer Systems, vol. 18, no. 8, pp. 1127-1142, January 2002.
(403.41 KB)
“HARNESS and Fault Tolerant MPI,”
Parallel Computing, vol. 27, no. 11, pp. 1479-1496, January 2001.
(164.2 KB)
“HARNESS: A Next Generation Distributed Virtual Machine,”
International Journal on Future Generation Computer Systems, vol. 15, no. 5-6, pp. 571-582, January 1999.
(183.78 KB)
“The GrADS Project: Software Support for High-Level Grid Application Development,”
International Journal of High Performance Applications and Supercomputing, vol. 15, no. 4, pp. 327-344, January 2001.
(271.52 KB)
“Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing,”
ACM Transactions on Mathematical Software, vol. 48, issue 12, pp. 1 - 33, March 2022.
(4.2 MB)
“FT-MPI, Fault-Tolerant Metacomputing and Generic Name Services: A Case Study,”
Lecture Notes in Computer Science, vol. 4192, no. ICL-UT-06-14: Springer Berlin / Heidelberg, pp. 133-140, 00 2006.
(362.44 KB)
“Flexible collective communication tuning architecture applied to Open MPI,”
2006 Euro PVM/MPI (submitted), Bonn, Germany, January 2006.
(206.58 KB)
“Experiences with Windows 95/NT as a Cluster Computing Platform for Parallel Computing,”
Parallel and Distributed Computing Practices, Special Issue: Cluster Computing, vol. 2, no. 2: Nova Science Publishers, USA, pp. 119-128, February 1999.
(164.04 KB)
“Evolution of the SLATE linear algebra library,”
The International Journal of High Performance Computing Applications, September 2024.
“Evaluation of Dataflow Programming Models for Electronic Structure Theory,”
Concurrency and Computation: Practice and Experience: Special Issue on Parallel and Distributed Algorithms, vol. 2018, issue e4490, pp. 1–20, May 2018.
(1.69 MB)
“Evaluating The Performance Of MPI-2 Dynamic Communicators And One-Sided Communication,”
Lecture Notes in Computer Science, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI User's Group Meeting, vol. 2840, Venice, Italy, Springer-Verlag, Berlin, pp. 88-97, September 2003.
(254.08 KB)
“Efficient exascale discretizations: High-order finite element methods,”
The International Journal of High Performance Computing Applications, pp. 10943420211020803, 2021.
“Decision Trees and MPI Collective Algorithm Selection Problem,”
Euro-Par 2007, Rennes, France, Springer, pp. 105–115, August 2007.
(552.94 KB)
“A Customized Precision Format Based on Mantissa Segmentation for Accelerating Sparse Linear Algebra,”
Concurrency and Computation: Practice and Experience, vol. 40319, issue 262, January 2019.
“Cray X1 Evaluation Status Report,”
Oak Ridge National Laboratory Report, vol. /-2004/13, January 2004.
(817.33 KB)
“Computational Benefit of GPU Optimization for Atmospheric Chemistry Modeling,”
Journal of Advances in Modeling Earth Systems, vol. 10, issue 8, pp. 1952–1969, August 2018.
(3.4 MB)
“The Component Structure of a Self-Adapting Numerical Software System,”
International Journal of Parallel Programming, vol. 33, no. 2, June 2005.
(64.88 KB)
“Checkpointing Strategies for Shared High-Performance Computing Platforms,”
International Journal of Networking and Computing, vol. 9, no. 1, pp. 28–52, 2019.
(490.5 KB)
“Capturing and Analyzing the Execution Control Flow of OpenMP Applications,”
International Journal of Parallel Programming, vol. 37, no. 3, pp. 266-276, 00 2009.
“Building and using a Fault Tolerant MPI implementation,”
International Journal of High Performance Applications and Supercomputing (to appear), 00 2004.
“Big Data and Extreme-Scale Computing: Pathways to Convergence - Toward a Shaping Strategy for a Future Software and Data Ecosystem for Scientific Inquiry,”
The International Journal of High Performance Computing Applications, vol. 32, issue 4, pp. 435–479, July 2018.
(1.29 MB)
“Big Data and Extreme-Scale Computing: Pathways to Convergence - Toward a Shaping Strategy for a Future Software and Data Ecosystem for Scientific Inquiry,”
The International Journal of High Performance Computing Applications, vol. 32, issue 4, pp. 435–479, July 2018.
(1.29 MB)
“Application of Machine Learning to the Selection of Sparse Linear Solvers,”
International Journal of High Performance Computing Applications (submitted), 00 2006.
(392.96 KB)
“Application of Machine Learning to the Selection of Sparse Linear Solvers,”
International Journal of High Performance Computing Applications (submitted), 00 2006.
(392.96 KB)
“Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,”
Parallel Computing, vol. 81, pp. 1–21, January 2019.
(3.27 MB)
“Adaptive Precision in Block-Jacobi Preconditioning for Iterative Sparse Linear System Solvers,”
Concurrency and Computation: Practice and Experience, vol. 31, no. 6, pp. e4460, March 2019.
(341.54 KB)
“Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting,”
Concurrency and Computation: Practice and Experience, vol. 26, issue 7, pp. 1408-1431, May 2014.
(1.96 MB)
“Accelerating Time-To-Solution for Computational Science and Engineering,”
SciDAC Review, 00 2009.
(739.11 KB)
“
“Visualizing the Program Execution Control Flow of OpenMP Applications,”
Proc. 4th International Workshop on OpenMP (IWOMP 2008), West Lafayette, Indiana, Lecture Notes in Computer Science 5004, pp. 181-190, January 2008.
(194.25 KB)
“Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning,”
International Conference on Computational Science (ICCS 2017), vol. 108, Zurich, Switzerland, Procedia Computer Science, pp. 1783-1792, June 2017.
(512.57 KB)
“On Using Incremental Profiling for the Performance Analysis of Shared Memory Parallel Applications,”
Proceedings of the 13th International Euro-Par Conference on Parallel Processing (Euro-Par '07), Rennes, France, Springer LNCS, January 2007.
“Usage of the Scalasca Toolset for Scalable Performance Analysis of Large-scale Parallel Applications,”
Proceedings of the 2nd International Workshop on Tools for High Performance Computing, Stuttgart, Germany, Springer, pp. 157-167, January 2008.
(229.2 KB)
“Usage of the Scalasca Toolset for Scalable Performance Analysis of Large-scale Parallel Applications,”
Proceedings of the 2nd International Workshop on Tools for High Performance Computing, Stuttgart, Germany, Springer, pp. 157-167, January 2008.
(229.2 KB)
“Toward a Framework for Preparing and Executing Adaptive Grid Programs,”
International Parallel and Distributed Processing Symposium: IPDPS 2002 Workshops, Fort Lauderdale, FL, pp. 0171, April 2002.
(64.5 KB)
“Self-Healing Network for Scalable Fault Tolerant Runtime Environments,”
DAPSYS 2006, 6th Austrian-Hungarian Workshop on Distributed and Parallel Systems, Innsbruck, Austria, January 2006.
(162.83 KB)
“Self Adapting Linear Algebra Algorithms and Software,”
IEEE Proceedings (to appear), 00 2004.
(587.67 KB)
“Self Adapting Application Level Fault Tolerance for Parallel and Distributed Computing,”
Proceedings of Workshop on Self Adapting Application Level Fault Tolerance for Parallel and Distributed Computing at IPDPS, pp. 1-8, March 2007.
(162.47 KB)
“Secure Remote Access to Numerical Software and Computational Hardware,”
Proceedings of the DoD HPC Users Group Conference (HPCUG) 2000, Albuquerque, NM, June 2000.
(172.6 KB)
“Scalable Fault Tolerant MPI: Extending the Recovery Algorithm,”
Proceedings of 12th European Parallel Virtual Machine and Message Passing Interface Conference - Euro PVM/MPI, vol. 3666, Sorrento (Naples) , Italy, Springer-Verlag Berlin, pp. 67, September 2005.
(144.86 KB)
“Scalability Analysis of the SPEC OpenMP Benchmarks on Large-Scale Shared Memory Multiprocessors,”
Proceedings of the 2007 International Conference on Computational Science (ICCS 2007), vol. 4487-4490, Beijing, China, Springer LNCS, pp. 815-822, 2007.
(145.84 KB)
“Reliability Analysis of Self-Healing Network using Discrete-Event Simulation,”
Proceedings of Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07): IEEE Computer Society, pp. 437-444, May 2007.
“QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators,”
Proceedings of IPDPS 2011, no. ICL-UT-10-04, Anchorage, AK, October 2010.
(468.17 KB)
“Proposal of MPI operation level Checkpoint/Rollback and one implementation,”
Proceedings of IEEE CCGrid 2006: IEEE Computer Society, January 2006.
(277.27 KB)
“Programming the LU Factorization for a Multicore System with Accelerators,”
Proceedings of VECPAR’12, Kobe, Japan, April 2012.
(414.33 KB)
“Prediction of Optimal Solvers for Sparse Linear Systems Using Deep Learning,”
2022 SIAM Conference on Parallel Processing for Scientific Computing (PP), Philadelphia, PA, Society for Industrial and Applied Mathematics, pp. 14 - 24.
“Performance Modeling for Self Adapting Collective Communications for MPI,”
LACSI Symposium 2001, Santa Fe, NM, October 2001.
(105.49 KB)
“