Publications

Conference Paper
Sukkari, D., M. Gates, M. Al Farhan, H. Anzt, and J. Dongarra, “Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators,” SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3624248
Boillot, L., G. Bosilca, E. Agullo, and H. Calandra, “Task-Based Programming for Seismic Imaging: Preliminary Results,” 2014 IEEE International Conference on High Performance Computing and Communications (HPCC), Paris, France, IEEE, August 2014.
Bosilca, G., R. Harrison, T. Herault, M. Mahdi Javanmard, P. Nookala, and E. Valeev, “The Template Task Graph (TTG) - An Emerging Practical Dataflow Programming Paradigm for Scientific Simulation at Extreme Scale,” 2020 IEEE/ACM 5th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2): IEEE, November 2020. DOI: 10.1109/ESPM251964.2020.00011
Lindquist, N., M. Gates, P. Luszczek, and J. Dongarra, “Threshold Pivoting for Dense LU Factorization,” ScalAH22: 13th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems, Dallas, Texas, IEEE, November 2022. DOI: 10.1109/ScalAH56622.2022.00010
Anzt, H., Y. Chen Chen, T. Cojean, J. Dongarra, G. Flegar, P. Nayak, E. S. Quintana-Orti, Y. M. Tsai, and W. Wang, “Towards Continuous Benchmarking,” Platform for Advanced Scientific Computing Conference (PASC 2019), Zurich, Switzerland, ACM Press, June 2019. DOI: 10.1145/3324989.3325719
Abdelfattah, A., S. Tomov, and J. Dongarra, “Towards Half-Precision Computation for Complex Matrices: A Case Study for Mixed Precision Solvers on GPUs,” ScalA19: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, IEEE, November 2019.
Luszczek, P., J. Kurzak, I. Yamazaki, and J. Dongarra, “Towards Numerical Benchmark for Half-Precision Floating Point Arithmetic,” 2017 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, IEEE, September 2017. DOI: 10.1109/HPEC.2017.8091031
Tseng, S-M., B. Nicolae, G. Bosilca, E. Jeannot, A. Chandramowlishwaran, and F. Cappello, “Towards Portable Online Prediction of Network Utilization Using MPI-Level Monitoring,” 2019 European Conference on Parallel Processing (Euro-Par 2019), Göttingen, Germany, Springer, August 2019. DOI: 10.1007/978-3-030-29400-7_4
Tahmid, T., M. Gates, P. Luszczek, and C. Schuman, “Towards Scalable and Efficient Spiking Reinforcement Learning for Continuous Control Tasks,” 2024 International Conference on Neuromorphic Systems (ICONS), Arlington, VA, USA, IEEE, 2024. DOI: 10.1109/ICONS62911.2024.00057
Yamazaki, I., T. Dong, S. Tomov, and J. Dongarra, “Tridiagonalization of a Symmetric Dense Matrix on a GPU Cluster,” The Third International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), May 2013.
Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, “Twenty Years of Computational Science,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020.
Li, J., B. Nicolae, J. M. Wozniak, and G. Bosilca, “Understanding Scalability and Fine-Grain Parallelism of Synchronous Data Parallel Training,” 2019 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC), Denver, CO, IEEE, November 2019. DOI: 10.1109/MLHPC49564.2019.00006
Lindquist, N., P. Luszczek, and J. Dongarra, “Using Additive Modifications in LU Factorization Instead of Pivoting,” 37th ACM International Conference on Supercomputing (ICS'23), Orlando, FL, ACM, June 2023. DOI: 10.1145/3577193.3593731
Zhong, D., Q. Cao, G. Bosilca, and J. Dongarra, “Using Advanced Vector Extensions AVX-512 for MPI Reduction,” EuroMPI/USA '20: 27th European MPI Users' Group Meeting, Austin, TX, September 2020. DOI: 10.1145/3416315.3416316
Zhong, D., P. Shamis, Q. Cao, G. Bosilca, and J. Dongarra, “Using Arm Scalable Vector Extension to Optimize Open MPI,” 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID 2020), Melbourne, Australia, IEEE/ACM, May 2020. DOI: 10.1109/CCGrid49817.2020.00-71
Haidar, A., S. Tomov, A. Abdelfattah, M. Zounon, and J. Dongarra, “Using GPU FP16 Tensor Cores Arithmetic to Accelerate Mixed-Precision Iterative Refinement Solvers and Reduce Energy Consumption,” ISC High Performance (ISC'18), Best Poster, Frankfurt, Germany, June 2018.
Dongarra, J., K. London, S. Moore, P. Mucci, and D. Terpstra, “Using PAPI for Hardware Performance Monitoring on Linux Systems,” Conference on Linux Clusters: The HPC Revolution, Urbana, Illinois, Linux Clusters Institute, June 2001.
Eberius, D., T. Patinyasakdikul, and G. Bosilca, “Using Software-Based Performance Counters to Expose Low-Level Open MPI Performance Information,” EuroMPI, Chicago, IL, ACM, September 2017. DOI: 10.1145/3127024.3127039
McCraw, H., A. Danalis, G. Bosilca, J. Dongarra, K. Kowalski, and T. Windus, “Utilizing Dataflow-based Execution for Coupled Cluster Methods,” 2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-02, Madrid, Spain, IEEE, September 2014.
Anzt, H., J. Dongarra, G. Flegar, and T. Gruetzmacher, “Variable-Size Batched Condition Number Calculation on GPUs,” SBAC-PAD, Lyon, France, September 2018.
Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Variable-Size Batched LU for Small Matrices and Its Integration into Block-Jacobi Preconditioning,” 46th International Conference on Parallel Processing (ICPP), Bristol, United Kingdom, IEEE, August 2017. DOI: 10.1109/ICPP.2017.18
Haugen, B., S. Richmond, J. Kurzak, C. A. Steed, and J. Dongarra, “Visualizing Execution Traces with Task Dependencies,” 2nd Workshop on Visual Performance Analysis (VPA '15), Austin, TX, ACM, November 2015.
Jagode, H., A. Danalis, and J. Dongarra, “What it Takes to keep PAPI Instrumental for the HPC Community,” 1st Workshop on Sustainable Scientific Software (CW3S19), Collegeville, Minnesota, July 2019.
Barbut, Q., A. Benoit, T. Herault, Y. Robert, and F. Vivien, “When to checkpoint at the end of a fixed-length reservation?,” Fault Tolerance for HPC at eXtreme Scales (FTXS) Workshop, Denver, United States, August 2023.
Conference Proceedings
Abdelfattah, A., P. Ghysels, W. Boukaram, S. Tomov, X. Sherry Li, and J. Dongarra, “Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers,” 2022 International Conference for High Performance Computing, Networking, Storage and Analysis (SC22), Dallas, TX, IEEE Computer Society, pp. 354-367, November 2022.
Andersson, U., and P. Mucci, “Analysis and Optimization of Yee_Bench using Hardware Performance Counters,” Proceedings of Parallel Computing 2005 (ParCo), Malaga, Spain, January 2005.
