Publications

2024

Lin, P. T., P. Nayak, A. Kashi, D. Kulkarni, A. Scheinberg, and H. Anzt, “Accelerating Fusion Plasma Collision Operator Solves with Portable Batched Iterative Solvers on GPUs,” ISC High Performance 2024 International Workshops, vol. 15058, Hamburg, Germany, Springer, Cham, pp. 127 - 140, December 2024. DOI: 10.1007/978-3-031-73716-9

Jagode, H., A. Danalis, G. Congiu, D. Barry, A. Castaldo, and J. Dongarra, “Advancements of PAPI for the exascale generation,” The International Journal of High Performance Computing Applications, December 2024. DOI: 10.1177/10943420241303884

Whitlock, M., H. Kolla, A. Bouteiller, J. R. Mayo, N. M. Morales, K. Teranishi, and G. Bosilca, “Asynchrony and Failure Masking via Pseudo-Local Process Recovery in MPI Applications,” 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), San Francisco, CA, USA, IEEE, May 2024. DOI: 10.1109/IPDPSW63119.2024.00193

Barry, D., A. Danalis, and J. Dongarra, “Automated Data Analysis for Defining Performance Metrics from Raw Hardware Events,” 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), San Francisco, CA, USA, IEEE, May 2024. DOI: 10.1109/IPDPSW63119.2024.00134

Luszczek, P., A. Abdelfattah, H. Anzt, A. Suzuki, and S. Tomov, “Batched sparse and mixed-precision linear algebra interface for efficient use of GPU hardware accelerators in scientific applications,” Future Generation Computer Systems, vol. 160, pp. 359 - 374, November 2024. DOI: 10.1016/j.future.2024.06.004

Melnichenko, M., O. Balabanov, R. Murray, J. Demmel, M. W. Mahoney, and P. Luszczek, CholeskyQR with Randomization and Pivoting for Tall Matrices (CQRRPT): arXiv, February 2024.

Dongarra, J., and D. Keyes, “The co-evolution of computational physics and high-performance computing,” Nature Reviews Physics, August 2024. DOI: 10.1038/s42254-024-00750-z

Kovalchuk, S. V., C. de Mulatier, V. V. Krzhizhanovskaya, J. Mikyška, M. Paszyński, J. Dongarra, and P. M. A. Sloot, “Computation at the Cutting Edge of Science,” Journal of Computational Science, June 2024. DOI: 10.1016/j.jocs.2024.102379

Slattery, S. A., K. A. Surjuse, C. Peterson, D. A. Penchoff, and E. Valeev, “Economical Quasi-Newton Unitary Optimization of Electronic Orbitals,” Physical Chemistry Chemical Physics, December 2023, 2024. DOI: 10.1039/D3CP05557D

Cao, Q., T. Herault, A. Bouteiller, J. Schuchart, and G. Bosilca, “Evaluating PaRSEC Through Matrix Computations in Scientific Applications,” Asynchronous Many-Task Systems and Applications - Second International Workshop, WAMTA 2024, Knoxville, TN, USA, February 14-16, 2024, Proceedings, vol. 14626: Springer, pp. 22–33, 2024. DOI: 10.1007/978-3-031-61763-8_3

Gates, M., A. Abdelfattah, K. Akbudak, M. Al Farhan, R. Alomairy, D. Bielich, T. Burgess, S. Cayrols, N. Lindquist, D. Sukkari, et al., “Evolution of the SLATE linear algebra library,” The International Journal of High Performance Computing Applications, September 2024. DOI: 10.1177/10943420241286531

Cojean, T., P. Nayak, T. Ribizel, N. Beams, Y-H. Mike Tsai, M. Koch, F. Göbel, T. Grützmacher, and H. Anzt, “Ginkgo - A math library designed to accelerate Exascale Computing Project science applications,” The International Journal of High Performance Computing Applications, August 2024. DOI: 10.1177/10943420241268323

Abdelfattah, A., W. Ahrens, H. Anzt, C. Armstrong, B. Brock, A. Buluc, F. Busato, T. Cojean, T. Davis, J. Demmel, et al., Interface for Sparse Linear Algebra Operations, November 2024. DOI: 10.48550/arXiv.2411.13259

Abdelfattah, A., N. Beams, R. Carson, P. Ghysels, T. Kolev, T. Stitt, A. Vargas, S. Tomov, and J. Dongarra, “MAGMA: Enabling exascale performance with accelerated BLAS and LAPACK for diverse GPU architectures,” The International Journal of High Performance Computing Applications, June 2024. DOI: 10.1177/10943420241261960

John, J., J. Milthorpe, T. Herault, and G. Bosilca, “Multi-GPU work sharing in a task-based dataflow programming model,” Future Generation Computer Systems, vol. 156, pp. 313 - 324, July 2024. DOI: 10.1016/j.future.2024.03.017

Luszczek, P., A. Castaldo, Y. M. Tsai, D. Mishler, and J. Dongarra, “Numerical eigen-spectrum slicing, accurate orthogonal eigen-basis, and mixed-precision eigenvalue refinement using OpenMP data-dependent tasks and accelerator offload,” The International Journal of High Performance Computing Applications, vol. 303, issue 136, September 2024. DOI: 10.1177/10943420241281050

Bouteiller, A., T. Herault, Q. Cao, J. Schuchart, and G. Bosilca, “PaRSEC: Scalability, flexibility, and hybrid architecture support for task-based applications in ECP,” The International Journal of High Performance Computing Applications, October 2024. DOI: 10.1177/10943420241290520

Bautista-Gomez, L., A. Benoit, S. Di, T. Herault, Y. Robert, and H. Sun, “A survey on checkpointing strategies: Should we always checkpoint à la Young/Daly?,” Future Generation Computer Systems, July 2024. DOI: 10.1016/j.future.2024.07.022

Bernholdt, D. E., G. Bosilca, A. Bouteiller, R. Brightwell, J. Ciesko, M. G. F. Dosanjh, G. Georgakoudis, I. Laguna, S. Levy, T. Naughton, et al., “Taking the MPI standard and the open MPI library to exascale,” The International Journal of High Performance Computing Applications, July 2024. DOI: 10.1177/10943420241265936

Anzt, H., A. Huebl, and X. S. Li, “Then and Now: Improving Software Portability, Productivity, and 100× Performance,” Computing in Science & Engineering, pp. 1 - 10, April 2024. DOI: 10.1109/MCSE.2024.3387302

Tahmid, T., M. Gates, P. Luszczek, and C. Schuman, “Towards Scalable and Efficient Spiking Reinforcement Learning for Continuous Control Tasks,” 2024 International Conference on Neuromorphic Systems (ICONS), Arlington, VA, USA, IEEE, 2024. DOI: 10.1109/ICONS62911.2024.00057

Hoefler, T., M. Copik, P. Beckman, A. Jones, I. Foster, M. Parashar, D. Reed, M. Troyer, T. Schulthess, D. Ernst, et al., “XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing,” Computing in Science & Engineering, vol. 26, issue 3, pp. 40 - 51, July 2024. DOI: 10.1109/MCSE.2024.3382154

2023

Thiyagalingam, J., G. von Laszewski, J. Yin, M. Emani, J. Papay, G. Barrett, P. Luszczek, A. Tsaris, C. Kirkpatrick, F. Wang, et al., “AI Benchmarking for Science: Efforts from the MLCommons Science Working Group,” Lecture Notes in Computer Science, vol. 13387: Springer International Publishing, pp. 47 - 64, January 2023. DOI: 10.1007/978-3-031-23220-6_4

Deshmukh, S., R. Yokota, and G. Bosilca, “Cache Optimization and Performance Modeling of Batched, Small, and Rectangular Matrix Multiplication on Intel, AMD, and Fujitsu Processors,” ACM Transactions on Mathematical Software, vol. 49, issue 3, pp. 1 - 29, September 2023. DOI: 10.1145/3595178

Luszczek, P., W. M. Sid-Lakhdar, and J. Dongarra, “Combining multitask and transfer learning with deep Gaussian processes for autotuning-based performance engineering,” The International Journal of High Performance Computing Applications, March 2023. DOI: 10.1177/10943420231166365

Valeev, E. F., R. J. Harrison, A. A. Holmes, C. C. Peterson, and D. A. Penchoff, “Direct Determination of Optimal Real-Space Orbitals for Correlated Electronic Structure of Molecules,” Journal of Chemical Theory and Computation, vol. 19, issue 20, pp. 7230 - 7241, October 2023. DOI: 10.1021/acs.jctc.3c00732

Deshmukh, S., R. Yokota, G. Bosilca, and Q. Ma, “O(N) distributed direct factorization of structured dense matrices using runtime systems,” 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, ACM, August 2023. DOI: 10.1145/3605573.3605606

Hoefler, T., B. Stevens, A. F. Prein, J. Baehr, T. Schulthess, T. F. Stocker, J. Taylor, D. Klocke, P. Manninen, P. M. Forster, et al., Earth Virtualization Engines - A Technical Perspective, September 2023.

Li, J., G. Bosilca, A. Bouteiller, and B. Nicolae, “Elastic deep learning through resilient collective operations,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3626080

Lindquist, N., P. Luszczek, and J. Dongarra, Generalizing Random Butterfly Transforms to Arbitrary Matrix Sizes: arXiv, December 2023.

Abdelfattah, A., S. Tomov, P. Luszczek, H. Anzt, and J. Dongarra, “GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3624247

Reed, D., D. Gannon, and J. Dongarra, “HPC Forecast: Cloudy and Uncertain,” Communications of the ACM, vol. 66, issue 2, pp. 82 - 90, January 2023. DOI: 10.1145/3552309

Mor, O., G. Bosilca, and M. Snir, “Improving the Scaling of an Asynchronous Many-Task Runtime with a Lightweight Communication Engine,” 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, ACM, September 2023. DOI: 10.1145/3605573.3605642

Barry, D., H. Jagode, A. Danalis, and J. Dongarra, “Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,” 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, Florida, IEEE, August 2023. DOI: 10.1109/IPDPSW59300.2023.00070

Barry, D., H. Jagode, A. Danalis, and J. Dongarra, Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements, St. Petersburg, FL, 28th HIPS Workshop, May 2023.

Tsai, Y-H. Mike, N. Beams, and H. Anzt, “Mixed Precision Algebraic Multigrid on GPUs,” Parallel Processing and Applied Mathematics (PPAM 2022), vol. 13826, Cham, Springer International Publishing, April 2023. DOI: 10.1007/978-3-031-30442-2_9

Schuchart, J., and G. Bosilca, “MPI Continuations And How To Invoke Them,” Sustained Simulation Performance 2021, Cham, Springer International Publishing, pp. 67 - 83, February 2023. DOI: 10.1007/978-3-031-18046-0_5

Sid-Lakhdar, W., S. Cayrols, D. Bielich, A. Abdelfattah, P. Luszczek, M. Gates, S. Tomov, H. Johansen, D. Williams-Young, T. Davis, et al., “PAQR: Pivoting Avoiding QR factorization,” 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), St. Petersburg, FL, USA, IEEE, 2023. DOI: 10.1109/IPDPS54959.2023.00040

Ribizel, T., and H. Anzt, “Parallel Symbolic Cholesky Factorization,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3624253

Mishler, D., J. Ciesko, S. Olivier, and G. Bosilca, “Performance Insights into Device-initiated RMA Using Kokkos Remote Spaces,” 2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops), Santa Fe, NM, USA, IEEE, November 2023. DOI: 10.1109/CLUSTERWorkshops61457.2023.00028

Aggarwal, I., P. Nayak, A. Kashi, and H. Anzt, “Preconditioners for Batched Iterative Linear Solvers on GPUs,” Smoky Mountains Computational Sciences and Engineering Conference, vol. 169075: Springer Nature Switzerland, pp. 38 - 53, January 2023. DOI: 10.1007/978-3-031-23606-8_3

Cao, Q., S. Abdulah, H. Ltaief, M. G. Genton, D. Keyes, and G. Bosilca, “Reducing Data Motion and Energy Consumption of Geospatial Modeling Applications Using Automated Precision Conversion,” 2023 IEEE International Conference on Cluster Computing (CLUSTER), Santa Fe, NM, USA, IEEE, November 2023. DOI: 10.1109/CLUSTER52292.2023.00035

Benoit, A., T. Herault, L. Perotin, Y. Robert, and F. Vivien, “Revisiting I/O bandwidth-sharing strategies for HPC applications,” INRIA Research Report, no. RR-9502: INRIA, March 2023.

Aliaga, J. I., H. Anzt, E. S. Quintana-Orti, and A. E. Thomas, “Sparse matrix-vector and matrix-multivector products for the truncated SVD on graphics processors,” Concurrency and Computation: Practice and Experience, August 2023. DOI: 10.1002/cpe.7871

Schuchart, J., S. Hunold, and G. Bosilca, “Synchronizing MPI Processes in Space and Time,” EUROMPI '23: 30th European MPI Users' Group Meeting, Bristol, United Kingdom, ACM, September 2023. DOI: 10.1145/3615318.3615325

Sukkari, D., M. Gates, M. Al Farhan, H. Anzt, and J. Dongarra, “Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators,” SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3624248

Tsai, Y-H. Mike, N. Beams, and H. Anzt, “Three-precision algebraic multigrid on GPUs,” Future Generation Computer Systems, July 2023. DOI: 10.1016/j.future.2023.07.024

Lindquist, N., P. Luszczek, and J. Dongarra, “Using Additive Modifications in LU Factorization Instead of Pivoting,” 37th ACM International Conference on Supercomputing (ICS'23), Orlando, FL, ACM, June 2023. DOI: 10.1145/3577193.3593731
