Publications

Export 1290 results:
Conference Paper
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Performance, Design, and Autotuning of Batched GEMM for GPUs,” The International Supercomputing Conference (ISC High Performance 2016), Frankfurt, Germany, June 2016.  (1.27 MB)
Mishler, D., J. Ciesko, S. Olivier, and G. Bosilca, Performance Insights into Device-initiated RMA Using Kokkos Remote Spaces,” 2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops), Santa Fe, NM, USA, IEEE, November 2023. DOI: 10.1109/CLUSTERWorkshops61457.2023.00028
Dongarra, J., A. D. Malony, S. Moore, P. Mucci, and S. Shende, Performance Instrumentation and Measurement for Terascale Systems,” ICCS 2003 Terascale Workshop, Melbourne, Australia, Springer, Berlin, Heidelberg, June 2003. DOI: 10.1007/3-540-44864-0_6  (5.36 MB)
Benoit, A., S. Perarnau, L. Pottier, and Y. Robert, A Performance Model to Execute Workflows on High-Bandwidth Memory Architectures,” The 47th International Conference on Parallel Processing (ICPP 2018), Eugene, OR, IEEE Computer Society Press, August 2018.  (868.44 KB)
Moore, S., D. Cronk, F. Wolf, A. Purkayastha, P. J. Teller, R. Araiza, G. Aguilera, and J. Nava, Performance Profiling and Analysis of DoD Applications using PAPI and TAU,” Proceedings of DoD HPCMP UGC 2005, Nashville, TN, IEEE, June 2005.  (322.56 KB)
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Performance Tuning and Optimization Techniques of Fixed and Variable Size Batched Cholesky Factorization on GPUs,” International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016.  (626.21 KB)
McCraw, H., J. Ralph, A. Danalis, and J. Dongarra, Power Monitoring with PAPI for Extreme Scale Architectures and Dataflow-based Programming Models,” 2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-04, Madrid, Spain, IEEE, September 2014. DOI: 10.1109/CLUSTER.2014.6968672  (3.45 MB)
Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017. DOI: 10.1109/HPEC.2017.8091085  (908.84 KB)
Aggarwal, I., P. Nayak, A. Kashi, and H. Anzt, Preconditioners for Batched Iterative Linear Solvers on GPUs,” Smoky Mountains Computational Sciences and Engineering Conference, vol. 169075: Springer Nature Switzerland, pp. 38 - 53, January 2023. DOI: 10.1007/978-3-031-23606-810.1007/978-3-031-23606-8_3
Hunold, S., A. Bhatele, G. Bosilca, and P. Knees, Predicting MPI Collective Communication Performance Using Machine Learning,” 2020 IEEE International Conference on Cluster Computing (CLUSTER), Kobe, Japan, IEEE, September 2020. DOI: 10.1109/CLUSTER49012.2020.00036  (619.68 KB)
Abdelfattah, A., S. Tomov, and J. Dongarra, Progressive Optimization of Batched LU Factorization on GPUs,” IEEE High Performance Extreme Computing Conference (HPEC’19), Waltham, MA, IEEE, September 2019.  (299.38 KB)
Danalis, A., G. Bosilca, A. Bouteiller, T. Herault, and J. Dongarra, PTG: An Abstraction for Unhindered Parallelism,” International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), New Orleans, LA, IEEE Press, November 2014.  (480.05 KB)
Schuchart, J., P. Nookala, T. Herault, E. F. Valeev, and G. Bosilca, Pushing the Boundaries of Small Tasks: Scalable Low-Overhead Data-Flow Programming in TTG,” 2022 IEEE International Conference on Cluster Computing (CLUSTER), Heidelberg, Germany, IEEE, September 2022. DOI: 10.1109/CLUSTER51413.2022.00026
Nance, D., S. Tomov, and K. Wong, A Python Library for Matrix Algebra on GPU and Multicore Architectures,” 2022 IEEE 19th International Conference on Mobile Ad Hoc and Smart Systems (MASS), Denver, CO, IEEE, December 2022. DOI: 10.1109/MASS56207.2022.00121  (414.36 KB)
Schuchart, J., C. Niethammer, J. Gracia, and G. Bosilca, Quo Vadis MPI RMA? Towards a More Efficient Use of MPI One-Sided Communication,” EuroMPI'21, Garching, Munich Germany, 2021.  (835.27 KB)
Cao, Q., S. Abdulah, H. Ltaief, M. G. Genton, D. Keyes, and G. Bosilca, Reducing Data Motion and Energy Consumption of Geospatial Modeling Applications Using Automated Precision Conversion,” 2023 IEEE International Conference on Cluster Computing (CLUSTER), Santa Fe, NM, USA, IEEE, November 2023. DOI: 10.1109/CLUSTER52292.2023.00035
Lindquist, N., P. Luszczek, and J. Dongarra, Replacing Pivoting in Distributed Gaussian Elimination with Randomized Techniques,” 2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA), Atlanta, GA, IEEE, November 2020.  (184.6 KB)
Benoit, A., T. Herault, V. Le Fèvre, and Y. Robert, Replication is More Efficient Than You Think,” The IEEE/ACM Conference on High Performance Computing Networking, Storage and Analysis (SC19), Denver, CO, ACM Press, November 2019.  (975.69 KB)
Gainaru, A., B. Goglin, V. Honoré, P. Raghavan, G. Pallez, P. Raghavan, Y. Robert, and H. Sun, Reservation and Checkpointing Strategies for Stochastic Jobs,” 34th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2020), New Orleans, LA, IEEE Computer Society Press, May 2020.  (692.4 KB)
Aupy, G., A. Gainaru, V. Honoré, P. Raghavan, Y. Robert, and H. Sun, Reservation Strategies for Stochastic Jobs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2019), Rio de Janeiro, Brazil, IEEE Computer Society Press, May 2019.  (808.93 KB)
Bathie, G., L. Marchal, Y. Robert, and S. Thibault, Revisiting Dynamic DAG Scheduling under Memory Constraints for Shared-Memory Platforms,” 22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020), New Orleans, LA, IEEE Computer Society Press, May 2020.  (317.93 KB)
Du, Y., L. Marchal, G. Pallez, and Y. Robert, Robustness of the Young/Daly Formula for Stochastic Iterative Applications,” 49th International Conference on Parallel Processing (ICPP 2020), Edmonton, AB, Canada, ACM Press, August 2020.  (1.11 MB)
Zhong, D., A. Bouteiller, X. Luo, and G. Bosilca, Runtime Level Failure Detection and Propagation in HPC Systems,” European MPI Users' Group Meeting (EuroMPI '19), Zürich, Switzerland, ACM, September 2019. DOI: 10.1145/3343211.3343225  (1.11 MB)
Yamazaki, I., S. Tomov, and J. Dongarra, Sampling Algorithms to Update Truncated SVD,” IEEE International Conference on Big Data, Boston, MA, IEEE, December 2017.  (700.79 KB)
Luszczek, P., Y. Tsai, N. Lindquist, H. Anzt, and J. Dongarra, Scalable Data Generation for Evaluating Mixed-Precision Solvers,” 2020 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, IEEE, September 2020. DOI: 10.1109/HPEC43674.2020.9286145  (1.3 MB)
Luszczek, P., J. Kurzak, I. Yamazaki, D. Keffer, and J. Dongarra, Scaling Point Set Registration in 3D Across Thread Counts on Multicore and Hardware Accelerator Platforms through Autotuning for Large Scale Analysis of Scientific Point Clouds,” IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD 2017), Boston, MA, IEEE, December 2017. DOI: 10.1109/BigData.2017.8258258  (6.71 MB)
Gao, Y., L-C. Canon, Y. Robert, and F. Vivien, Scheduling Independent Stochastic Tasks on Heterogeneous Cloud Platforms,” IEEE Cluster 2019, Albuquerque, New Mexico, IEEE Computer Society Press, September 2019.  (651 KB)
Anzt, H., D. Lukarski, S. Tomov, and J. Dongarra, Self-Adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures,” VECPAR 2014, Eugene, OR, June 2014.  (430.56 KB)
Gates, M., J. Kurzak, A. Charara, A. YarKhan, and J. Dongarra, SLATE: Design of a Modern Distributed and Accelerated Linear Algebra Library,” International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Denver, CO, ACM, November 2019. DOI: 10.1145/3295500.3356223  (2.01 MB)
Danalis, A., H. Jagode, T. Herault, P. Luszczek, and J. Dongarra, Software-Defined Events through PAPI,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPSW.2019.00069  (446.41 KB)
Tsai, Y. M., T. Cojean, and H. Anzt, Sparse Linear Algebra on AMD and NVIDIA GPUs—The Race is On,” ISC High Performance: Springer, June 2020. DOI: 10.1007/978-3-030-50743-5_16  (5.63 MB)
Luszczek, P., and C. Brown, Surrogate ML/AI Model Benchmarking for FAIR Principles' Conformance,” 2022 IEEE High Performance Extreme Computing Conference (HPEC): IEEE, September 2022. DOI: 10.1109/HPEC55821.2022.9926401
Lacoste, X., M. Faverge, P. Ramet, S. Thibault, and G. Bosilca, Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes,” 23rd International Heterogeneity in Computing Workshop, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.  (807.33 KB)
Slaughter, E., W. Wu, Y. Fu, L. Brandenburg, N. Garcia, W. Kautz, E. Marx, K. S. Morris, Q. Cao, G. Bosilca, et al., Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance,” International Conference for High Performance Computing Networking, Storage, and Analysis (SC20): ACM, November 2020.  (644.92 KB)
Sukkari, D., M. Gates, M. Al Farhan, H. Anzt, and J. Dongarra, Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators,” SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023. DOI: 10.1145/3624062.3624248
Boillot, L., G. Bosilca, E. Agullo, and H. Calandra, Task-Based Programming for Seismic Imaging: Preliminary Results,” 2014 IEEE International Conference on High Performance Computing and Communications (HPCC), Paris, France, IEEE, August 2014.  (625.86 KB)
Bosilca, G., R. Harrison, T. Herault, M. Mahdi Javanmard, P. Nookala, and E. Valeev, The Template Task Graph (TTG) - An Emerging Practical Dataflow Programming Paradigm for Scientific Simulation at Extreme Scale,” 2020 IEEE/ACM 5th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2): IEEE, November 2020. DOI: 10.1109/ESPM251964.2020.00011  (139.6 KB)

Pages