Exa-PAPI

Overview

This project is developing PAPI, which will provide tool designers and application engineers with a consistent interface and methodology for the use of low-level performance counter hardware found across the entire compute system (e.g., CPUs, GPUs, on/off-chip memory, interconnects, I/O system, and energy/power). PAPI will enable users to see, in near real time, the relations between software performance and hardware events across the entire computer system.

Exa-PAPI builds on the latest PAPI project and will be extended with:

Performance counter monitoring capabilities for new and advanced ECP hardware and software technologies.
Fine-grained power management support.
Functionality for performance counter analysis at "task granularity" for task-based runtime systems.
"Software-defined Events" that originate from the ECP software stack and are currently treated as black boxes (e.g., communication libraries, math libraries, and task-based runtime systems).

The objective is to enable monitoring of both types of performance events—hardware- and software-related events—in a uniform way, through one consistent PAPI interface. Third-party tools and application developers will have to handle only a single hook to PAPI in order to access all hardware performance counters in a system, including the new software-defined events.
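The idea of reading hardware and software-defined events through one interface can be illustrated with a small toy model. This is a minimal sketch, not PAPI's actual API: the `EventRegistry` class, the `measure` helper, the `ToyLibrary` class, and the event names `hw::elapsed_ns` and `sw::toylib::messages_sent` are all hypothetical, and a wall-clock timer stands in for a real hardware counter.

```python
import time
from typing import Callable, Dict, List


class EventRegistry:
    """Toy registry (hypothetical, not PAPI): maps event names to reader callables."""

    def __init__(self) -> None:
        self._readers: Dict[str, Callable[[], int]] = {}

    def register(self, name: str, reader: Callable[[], int]) -> None:
        # Hardware-style and software-defined events are registered the same way.
        if name in self._readers:
            raise ValueError(f"event already registered: {name}")
        self._readers[name] = reader

    def read(self, names: List[str]) -> Dict[str, int]:
        # One call returns the current value of every requested event.
        return {name: self._readers[name]() for name in names}


class ToyLibrary:
    """Stand-in for a library that exports an internal counter as an event."""

    def __init__(self, registry: EventRegistry) -> None:
        self.messages_sent = 0
        registry.register("sw::toylib::messages_sent", lambda: self.messages_sent)

    def send(self, payload: bytes) -> None:
        # A real library would transmit the payload; here we only count the call.
        self.messages_sent += 1


def measure(registry: EventRegistry, names: List[str],
            work: Callable[[], None]) -> Dict[str, int]:
    """Read all events before and after `work` and return the differences."""
    before = registry.read(names)
    work()
    after = registry.read(names)
    return {name: after[name] - before[name] for name in names}


registry = EventRegistry()
# Assumption: a monotonic nanosecond timer plays the role of a hardware counter.
registry.register("hw::elapsed_ns", time.perf_counter_ns)
lib = ToyLibrary(registry)


def workload() -> None:
    for _ in range(3):
        lib.send(b"x")


deltas = measure(registry, ["hw::elapsed_ns", "sw::toylib::messages_sent"], workload)
print(deltas["sw::toylib::messages_sent"])  # prints 3
```

In this sketch the calling tool needs only the registry and a list of event names; it does not need to know whether a value comes from hardware or from a library's internal state, which is the property the paragraph above describes.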

PAPI Releases

2022 ECP Meeting Poster

NEWS

Announcing PAPI 7.1.0

PAPI 7.1.0 is now available. This release includes support for Intel Sapphire Rapids and AMD Zen4 preset events. The release also includes general improvements to the PAPI code in terms of design and functionality. Furthermore, the Counter Analysis Toolkit (CAT) and the Software-Defined Events (SDE) library have also been updated.

Major Changes:

Support for Intel Sapphire Rapids native and preset events
Support for AMD Zen4 native and preset events
Support for event qualifiers in the ROCm component
New 'template' component
Integration into Spack package manager
Integration into the Extreme-Scale Scientific Software Stack (E4S)
Refactored cuda component with multi-thread and multi-GPU support
Support for ARM Neoverse V1 and V2

Acknowledgements

This release is the result of efforts from many people. The PAPI team would like to express special thanks to Vince Weaver, Stephane Eranian (for libpfm4), William Cohen, Steve Kaufmann, Peinan Zhang, Rashawn Knapp, John Rodgers, John Linford, Bert Wesarg, Josh Minor, Kamil Iskra, Florian Weimer, Lukas Alt, William Y. Phan, Aurelian Melinte, and Phil Mucci.

The PAPI release can be downloaded from: https://github.com/icl-utk-edu/papi/wiki/PAPI-Releases.

Papers


Barry, D., Automated Classification and Verification of Performance Counters, December 2025. (3.62 MB)
Tahmid, T., and H. Jagode, “PAPI Support for Specialized AI Architectures,” SC25: 10th International Parallel Data Systems Workshop (PDSW 2025), St. Louis, MO, IEEE, November 2025. (360.05 KB) (1.97 MB)
Barry, D., H. Jagode, A. Danalis, and J. Dongarra, “Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,” 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, Florida, IEEE, August 2023. DOI: 10.1109/IPDPSW59300.2023.00070 (1.81 MB)
Barry, D., A. Danalis, and H. Jagode, “Effortless Monitoring of Arithmetic Intensity with PAPI's Counter Analysis Toolkit,” 13th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Springer International Publishing, September 2020. (738.47 KB)
Jagode, H., A. Danalis, and D. Genet, “Roadmap for Refactoring Classic PAPI to PAPI++: Part II: Formulation of Roadmap Based on Survey Results,” PAPI++ Working Notes, no. 2, ICL-UT-20-09: Innovative Computing Laboratory, University of Tennessee, July 2020. (763.75 KB)
Jagode, H., A. Danalis, and J. Dongarra, Exa-PAPI: The Exascale Performance API with Modern C++, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020. (556.78 KB)
Winkler, F., “Redesigning PAPI's High-Level API,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-03: University of Tennessee, February 2020. (356.41 KB)
Jagode, H., A. Danalis, and J. Dongarra, “Formulation of Requirements for New PAPI++ Software Package: Part I: Survey Results,” PAPI++ Working Notes, no. 1, ICL-UT-20-02: Innovative Computing Laboratory, University of Tennessee Knoxville, January 2020. (1.49 MB)
Jagode, H., A. Danalis, H. Anzt, and J. Dongarra, “PAPI Software-Defined Events for in-Depth Performance Analysis,” The International Journal of High Performance Computing Applications, vol. 33, issue 6, pp. 1113-1127, November 2019. (442.39 KB)
Jagode, H., A. Danalis, and J. Dongarra, “What it Takes to keep PAPI Instrumental for the HPC Community,” 1st Workshop on Sustainable Scientific Software (CW3S19), Collegeville, Minnesota, July 2019. (50.57 KB)
Danalis, A., H. Jagode, T. Herault, P. Luszczek, and J. Dongarra, “Software-Defined Events through PAPI,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPSW.2019.00069 (446.41 KB)
Danalis, A., H. Jagode, H. Hanumantharayappa, S. Ragate, and J. Dongarra, “Counter Inspection Toolkit: Making Sense out of Hardware Performance Events,” 11th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Cham, Switzerland: Springer, February 2019. DOI: 10.1007/978-3-030-11987-4_2 (216.39 KB)
Haidar, A., H. Jagode, P. Vaccaro, A. YarKhan, S. Tomov, and J. Dongarra, “Investigating Power Capping toward Energy-Efficient Scientific Applications,” Concurrency and Computation: Practice and Experience, vol. 2018, issue e4485, pp. 1-14, April 2018. DOI: 10.1002/cpe.4485 (1.2 MB)
Parker, S., J. Mellor-Crummey, D. H. Ahn, H. Jagode, H. Brunst, S. Shende, A. D. Malony, D. DelSignore, R. Tschuter, R. Castain, et al., “Performance Analysis and Debugging Tools at Scale,” Exascale Scientific Applications: Scalability and Performance Portability: Chapman & Hall / CRC Press, pp. 17-50, November 2017. DOI: 10.1201/b21930
Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, “Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017. DOI: 10.1109/HPEC.2017.8091085 (908.84 KB)

Presentations

ICL Team Members

Daniel Barry

Research Scientist

Giuseppe Congiu

Visiting Scholar

Heike Jagode

Interim Director of ICL / Research Associate Professor

Exa-PAPI is part of ICL's involvement in the Exascale Computing Project (ECP). The ECP was established with the goals of maximizing the benefits of high-performance computing (HPC) for the United States and accelerating the development of a capable exascale computing ecosystem. Exascale refers to computing systems at least 50 times faster than the nation’s most powerful supercomputers in use today.

The ECP is a collaborative effort of two U.S. Department of Energy organizations – the Office of Science (DOE-SC) and the National Nuclear Security Administration (NNSA).

Related Links

Software and Documentation

Google User Group

Sponsored By