Publications

2024

Luszczek, P., A. Abdelfattah, H. Anzt, A. Suzuki, and S. Tomov, “Batched sparse and mixed-precision linear algebra interface for efficient use of GPU hardware accelerators in scientific applications,” Future Generation Computer Systems, vol. 160, pp. 359 - 374, November 2024. DOI: 10.1016/j.future.2024.06.004

Melnichenko, M., O. Balabanov, R. Murray, J. Demmel, M. W. Mahoney, and P. Luszczek, CholeskyQR with Randomization and Pivoting for Tall Matrices (CQRRPT): arXiv, February 2024.

Dongarra, J., and D. Keyes, “The co-evolution of computational physics and high-performance computing,” Nature Reviews Physics, August 2024. DOI: 10.1038/s42254-024-00750-z

Kovalchuk, S. V., C. de Mulatier, V. V. Krzhizhanovskaya, J. Mikyška, M. Paszyński, J. Dongarra, and P. M. A. Sloot, “Computation at the Cutting Edge of Science,” Journal of Computational Science, June 2024. DOI: 10.1016/j.jocs.2024.102379

Slattery, S. A., K. A. Surjuse, C. Peterson, D. A. Penchoff, and E. Valeev, “Economical Quasi-Newton Unitary Optimization of Electronic Orbitals,” Physical Chemistry Chemical Physics, December 2023, 2024. DOI: 10.1039/D3CP05557D

Gates, M., A. Abdelfattah, K. Akbudak, M. Al Farhan, R. Alomairy, D. Bielich, T. Burgess, S. Cayrols, N. Lindquist, D. Sukkari, et al., “Evolution of the SLATE linear algebra library,” The International Journal of High Performance Computing Applications, September 2024. DOI: 10.1177/10943420241286531

Cojean, T., P. Nayak, T. Ribizel, N. Beams, Y-H. Mike Tsai, M. Koch, F. Göbel, T. Grützmacher, and H. Anzt, “Ginkgo - A math library designed to accelerate Exascale Computing Project science applications,” The International Journal of High Performance Computing Applications, August 2024. DOI: 10.1177/10943420241268323

Abdelfattah, A., N. Beams, R. Carson, P. Ghysels, T. Kolev, T. Stitt, A. Vargas, S. Tomov, and J. Dongarra, “MAGMA: Enabling exascale performance with accelerated BLAS and LAPACK for diverse GPU architectures,” The International Journal of High Performance Computing Applications, June 2024. DOI: 10.1177/10943420241261960

John, J., J. Milthorpe, T. Herault, and G. Bosilca, “Multi-GPU work sharing in a task-based dataflow programming model,” Future Generation Computer Systems, vol. 156, pp. 313 - 324, July 2024. DOI: 10.1016/j.future.2024.03.017

Luszczek, P., A. Castaldo, Y. M. Tsai, D. Mishler, and J. Dongarra, “Numerical eigen-spectrum slicing, accurate orthogonal eigen-basis, and mixed-precision eigenvalue refinement using OpenMP data-dependent tasks and accelerator offload,” The International Journal of High Performance Computing Applications, vol. 303, issue 136, September 2024. DOI: 10.1177/10943420241281050

Bouteiller, A., T. Herault, Q. Cao, J. Schuchart, and G. Bosilca, “PaRSEC: Scalability, flexibility, and hybrid architecture support for task-based applications in ECP,” The International Journal of High Performance Computing Applications, October 2024. DOI: 10.1177/10943420241290520

Bautista-Gomez, L., A. Benoit, S. Di, T. Herault, Y. Robert, and H. Sun, “A survey on checkpointing strategies: Should we always checkpoint à la Young/Daly?,” Future Generation Computer Systems, July 2024. DOI: 10.1016/j.future.2024.07.022

Bernholdt, D. E., G. Bosilca, A. Bouteiller, R. Brightwell, J. Ciesko, M. G. F. Dosanjh, G. Georgakoudis, I. Laguna, S. Levy, T. Naughton, et al., “Taking the MPI standard and the open MPI library to exascale,” The International Journal of High Performance Computing Applications, July 2024. DOI: 10.1177/10943420241265936

Anzt, H., A. Huebl, and X. S. Li, “Then and Now: Improving Software Portability, Productivity, and 100× Performance,” Computing in Science & Engineering, pp. 1 - 10, April 2024. DOI: 10.1109/MCSE.2024.3387302

Hoefler, T., M. Copik, P. Beckman, A. Jones, I. Foster, M. Parashar, D. Reed, M. Troyer, T. Schulthess, D. Ernst, et al., XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing: arXiv, January 2024.

2023

Thiyagalingam, J., G. von Laszewski, J. Yin, M. Emani, J. Papay, G. Barrett, P. Luszczek, A. Tsaris, C. Kirkpatrick, F. Wang, et al., “AI Benchmarking for Science: Efforts from the MLCommons Science Working Group,” Lecture Notes in Computer Science, vol. 13387: Springer International Publishing, pp. 47 - 64, January 2023. DOI: 10.1007/978-3-031-23220-6_4

Deshmukh, S., R. Yokota, and G. Bosilca, “Cache Optimization and Performance Modeling of Batched, Small, and Rectangular Matrix Multiplication on Intel, AMD, and Fujitsu Processors,” ACM Transactions on Mathematical Software, vol. 49, issue 3, pp. 1 - 29, September 2023. DOI: 10.1145/3595178

Luszczek, P., W. M. Sid-Lakhdar, and J. Dongarra, “Combining multitask and transfer learning with deep Gaussian processes for autotuning-based performance engineering,” The International Journal of High Performance Computing Applications, March 2023. DOI: 10.1177/10943420231166365

Valeev, E. F., R. J. Harrison, A. A. Holmes, C. C. Peterson, and D. A. Penchoff, “Direct Determination of Optimal Real-Space Orbitals for Correlated Electronic Structure of Molecules,” Journal of Chemical Theory and Computation, vol. 19, issue 20, pp. 7230 - 7241, October 2023. DOI: 10.1021/acs.jctc.3c00732

Deshmukh, S., R. Yokota, G. Bosilca, and Q. Ma, “O(N) distributed direct factorization of structured dense matrices using runtime systems,” 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, ACM, August 2023. DOI: 10.1145/3605573.3605606

Hoefler, T., B. Stevens, A. F. Prein, J. Baehr, T. Schulthess, T. F. Stocker, J. Taylor, D. Klocke, P. Manninen, P. M. Forster, et al., Earth Virtualization Engines - A Technical Perspective, September 2023.

Li, J., G. Bosilca, A. Bouteiller, and B. Nicolae, “Elastic deep learning through resilient collective operations,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3626080

Lindquist, N., P. Luszczek, and J. Dongarra, Generalizing Random Butterfly Transforms to Arbitrary Matrix Sizes: arXiv, December 2023.

Abdelfattah, A., S. Tomov, P. Luszczek, H. Anzt, and J. Dongarra, “GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3624247

Reed, D., D. Gannon, and J. Dongarra, “HPC Forecast: Cloudy and Uncertain,” Communications of the ACM, vol. 66, issue 2, pp. 82 - 90, January 2023. DOI: 10.1145/3552309

Mor, O., G. Bosilca, and M. Snir, “Improving the Scaling of an Asynchronous Many-Task Runtime with a Lightweight Communication Engine,” 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, ACM, September 2023. DOI: 10.1145/3605573.3605642

Barry, D., H. Jagode, A. Danalis, and J. Dongarra, Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements, St. Petersburg, FL, 28th HIPS Workshop, May 2023.

Barry, D., H. Jagode, A. Danalis, and J. Dongarra, “Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,” 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, Florida, IEEE, August 2023. DOI: 10.1109/IPDPSW59300.2023.00070

Tsai, Y-H. Mike, N. Beams, and H. Anzt, “Mixed Precision Algebraic Multigrid on GPUs,” Parallel Processing and Applied Mathematics (PPAM 2022), vol. 13826, Cham, Springer International Publishing, April 2023. DOI: 10.1007/978-3-031-30442-2_9

Schuchart, J., and G. Bosilca, “MPI Continuations And How To Invoke Them,” Sustained Simulation Performance 2021, Cham, Springer International Publishing, pp. 67 - 83, February 2023. DOI: 10.1007/978-3-031-18046-0_5

Sid-Lakhdar, W., S. Cayrols, D. Bielich, A. Abdelfattah, P. Luszczek, M. Gates, S. Tomov, H. Johansen, D. Williams-Young, T. Davis, et al., “PAQR: Pivoting Avoiding QR factorization,” 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), St. Petersburg, FL, USA, IEEE, 2023. DOI: 10.1109/IPDPS54959.2023.00040

Ribizel, T., and H. Anzt, “Parallel Symbolic Cholesky Factorization,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3624253

Mishler, D., J. Ciesko, S. Olivier, and G. Bosilca, “Performance Insights into Device-initiated RMA Using Kokkos Remote Spaces,” 2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops), Santa Fe, NM, USA, IEEE, November 2023. DOI: 10.1109/CLUSTERWorkshops61457.2023.00028

Aggarwal, I., P. Nayak, A. Kashi, and H. Anzt, “Preconditioners for Batched Iterative Linear Solvers on GPUs,” Smoky Mountains Computational Sciences and Engineering Conference, vol. 169075: Springer Nature Switzerland, pp. 38 - 53, January 2023. DOI: 10.1007/978-3-031-23606-8_3

Cao, Q., S. Abdulah, H. Ltaief, M. G. Genton, D. Keyes, and G. Bosilca, “Reducing Data Motion and Energy Consumption of Geospatial Modeling Applications Using Automated Precision Conversion,” 2023 IEEE International Conference on Cluster Computing (CLUSTER), Santa Fe, NM, USA, IEEE, November 2023. DOI: 10.1109/CLUSTER52292.2023.00035

Benoit, A., T. Herault, L. Perotin, Y. Robert, and F. Vivien, “Revisiting I/O bandwidth-sharing strategies for HPC applications,” INRIA Research Report, no. RR-9502: INRIA, March 2023.

Aliaga, J. I., H. Anzt, E. S. Quintana-Ortí, and A. E. Thomas, “Sparse matrix-vector and matrix-multivector products for the truncated SVD on graphics processors,” Concurrency and Computation: Practice and Experience, August 2023. DOI: 10.1002/cpe.7871

Schuchart, J., S. Hunold, and G. Bosilca, “Synchronizing MPI Processes in Space and Time,” EUROMPI '23: 30th European MPI Users' Group Meeting, Bristol, United Kingdom, ACM, September 2023. DOI: 10.1145/3615318.3615325

Sukkari, D., M. Gates, M. Al Farhan, H. Anzt, and J. Dongarra, “Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators,” SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3624248

Tsai, Y-H. Mike, N. Beams, and H. Anzt, “Three-precision algebraic multigrid on GPUs,” Future Generation Computer Systems, July 2023. DOI: 10.1016/j.future.2023.07.024

Lindquist, N., P. Luszczek, and J. Dongarra, “Using Additive Modifications in LU Factorization Instead of Pivoting,” 37th ACM International Conference on Supercomputing (ICS'23), Orlando, FL, ACM, June 2023. DOI: 10.1145/3577193.3593731

Grützmacher, T., H. Anzt, and E. S. Quintana-Ortí, “Using Ginkgo's memory accessor for improving the accuracy of memory-bound low precision BLAS,” Software: Practice and Experience, vol. 53, issue 1, pp. 81 - 98, January 2023. DOI: 10.1002/spe.3041

Barbut, Q., A. Benoit, T. Herault, Y. Robert, and F. Vivien, “When to checkpoint at the end of a fixed-length reservation?,” Fault Tolerance for HPC at eXtreme Scales (FTXS) Workshop, Denver, United States, August 2023.

2022

Abdulah, S., Q. Cao, Y. Pei, G. Bosilca, J. Dongarra, M. G. Genton, D. E. Keyes, H. Ltaief, and Y. Sun, “Accelerating Geostatistical Modeling and Prediction With Mixed-Precision Computations: A High-Productivity Approach With PaRSEC,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, issue 4, pp. 964 - 976, April 2022. DOI: 10.1109/TPDS.2021.3084071

Abdelfattah, A., P. Ghysels, W. Boukaram, S. Tomov, X. Sherry Li, and J. Dongarra, “Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers,” 2022 International Conference for High Performance Computing, Networking, Storage and Analysis (SC22), Dallas, TX, IEEE Computer Society, pp. 354-367, November 2022.

Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, “Analysis of the Communication and Computation Cost of FFT Libraries towards Exascale,” ICL Technical Report, no. ICL-UT-22-07: Innovative Computing Laboratory, July 2022.

Anzt, H., M. Casas, C. I. Malossi, E. S. Quintana-Ortí, F. Scheidegger, and S. Zhuang, “Approximate Computing for Scientific Applications,” Approximate Computing Techniques, 322: Springer International Publishing, pp. 415 - 465, January 2022. DOI: 10.1007/978-3-030-94705-7_14

Abdelfattah, A., S. Tomov, and J. Dongarra, “Batch QR Factorization on GPUs: Design, Optimization, and Tuning,” Lecture Notes in Computer Science, vol. 13350, Cham, Springer International Publishing, June 2022. DOI: 10.1007/978-3-031-08751-6_5

Kashi, A., P. Nayak, D. Kulkarni, A. Scheinberg, P. Lin, and H. Anzt, “Batched sparse iterative solvers on GPU for the collision operator for fusion plasma simulations,” 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Lyon, France, IEEE, July 2022. DOI: 10.1109/IPDPS53621.2022.00024

Benoit, A., Y. Du, T. Herault, L. Marchal, G. Pallez, L. Perotin, Y. Robert, H. Sun, and F. Vivien, “Checkpointing à la Young/Daly: An Overview,” IC3-2022: Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing, Noida, India, ACM Press, pp. 701-710, August 2022. DOI: 10.1145/3549206
