Overview

This project is developing PAPI, which will provide tool designers and application engineers with a consistent interface and methodology for the use of low-level performance counter hardware found across the entire compute system (e.g., CPUs, GPUs, on/off-chip memory, interconnects, the I/O system, and energy/power). PAPI will enable users to see, in near real time, the relations between software performance and hardware events across the entire computer system.

Exa-PAPI builds on the latest PAPI project and will be extended with:

  1. Performance counter monitoring capabilities for new and advanced ECP hardware and software technologies.
  2. Fine-grained power management support.
  3. Functionality for performance counter analysis at "task granularity" for task-based runtime systems.
  4. "Software-defined Events" that originate from the ECP software stack and are currently treated as black boxes (e.g., communication libraries, math libraries, and task-based runtime systems).

The objective is to enable monitoring of both types of performance events—hardware- and software-related events—in a uniform way, through one consistent PAPI interface. Third-party tools and application developers will have to handle only a single hook to PAPI in order to access all hardware performance counters in a system, including the new software-defined events.
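
The idea of a single hook for both kinds of events can be illustrated with a toy model. The sketch below is not the PAPI API: the class names (EventRegistry, ToySolver) and the event names are hypothetical, chosen only to show how a library could expose an internal counter so that a tool reads it through the same call as a hardware-style counter.

```python
# Toy model only: these classes are hypothetical and are NOT the PAPI API.
from typing import Callable, Dict, List


class EventRegistry:
    """Maps event names to zero-argument readers that return the current count."""

    def __init__(self) -> None:
        self._readers: Dict[str, Callable[[], int]] = {}

    def register(self, name: str, reader: Callable[[], int]) -> None:
        self._readers[name] = reader

    def read(self, names: List[str]) -> Dict[str, int]:
        # One call path for every event, whatever its origin.
        return {name: self._readers[name]() for name in names}


class ToySolver:
    """Stands in for a library whose internals are normally a black box."""

    def __init__(self, registry: EventRegistry) -> None:
        self.iterations = 0
        # Expose the internal counter as a software-defined event.
        registry.register("sde:::toysolver::iterations", lambda: self.iterations)

    def solve(self, steps: int) -> None:
        for _ in range(steps):
            self.iterations += 1


registry = EventRegistry()
# Placeholder for a hardware counter; a real tool would read it from the CPU.
registry.register("hw:::fake_cycles", lambda: 1000)
solver = ToySolver(registry)
solver.solve(5)
values = registry.read(["hw:::fake_cycles", "sde:::toysolver::iterations"])
print(values)
```

The tool-side code calls only registry.read() and does not need to know which events come from hardware and which come from a software layer.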

NEWS

PAPI 6.0.0 Release

PAPI 6.0.0 was released March 4, 2020. This release includes a new API for SDEs (Software Defined Events), a major revision of the 'high-level API', and several new components, including ROCM and ROCM_SMI (for AMD GPUs), powercap_ppc and sensors_ppc (for IBM Power9 and later), SDE, and the IO component (exposes I/O statistics exported by the Linux kernel). Furthermore, PAPI 6.0 ships CAT, a new Counter Analysis Toolkit that assists with native performance counter disambiguation through micro-benchmarks.

For specific and detailed information on changes made for this release, see ChangeLogP600.txt for filenames or keywords of interest and change summaries, or go directly to the PAPI git repository.

Some Major Changes for PAPI 6.0 include:
  • Added the rocm component to support performance counters on AMD GPUs.
  • Added the rocm_smi component; SMI (System Management Interface) monitors power usage on AMD GPUs and is also writable by the user, e.g., to reduce power consumption during non-critical operations.
  • Added 'io' component to expose I/O statistics exported by the Linux kernel (/proc/self/io).
  • Added 'SDE' component, Software Defined Events, which allows HPC software layers to expose internal performance-critical behavior via Software Defined Events (SDEs) through the PAPI interface.
  • Added 'SDE API' to register performance-critical events that originate from HPC software layers, and which are recognized as 'PAPI counters' and, thus, can be monitored with the standard PAPI interface.
  • Added powercap_ppc component to support monitoring and capping of power usage on IBM PowerPC architectures (Power9 and later) using the powercap interface exposed through the Linux kernel.
  • Added 'sensors_ppc' component to support monitoring of system metrics on IBM PowerPC architectures (Power9 and later) using the opal/exports sysfs interface.
  • Retired the infiniband_umad component; it is superseded by infiniband.
  • Revived PAPI's 'high-level API' to make it more intuitive and effective for novice users and quick event reporting.
  • Added 'counter_analysis_toolkit' sub-directory (CAT): A tool to assist with native performance counter disambiguation through micro-benchmarks, which are used to probe different important aspects of modern CPUs, to aid the classification of native performance events.
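
As an illustration of the data source behind the 'io' component, the following sketch parses /proc/self/io directly. It does not use PAPI and says nothing about how the component is implemented; the function names parse_proc_io and read_self_io are hypothetical, and the sample text only mimics the kernel's "name: value" layout.

```python
from pathlib import Path
from typing import Dict


def parse_proc_io(text: str) -> Dict[str, int]:
    """Parse 'name: value' lines, the layout used by /proc/self/io."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[name.strip()] = int(value.strip())
    return stats


def read_self_io(path: str = "/proc/self/io") -> Dict[str, int]:
    """Return the calling process's I/O counters, or {} if the file is unavailable."""
    try:
        return parse_proc_io(Path(path).read_text())
    except OSError:
        # Non-Linux systems or restricted environments lack this file.
        return {}


# Made-up sample values, used so the parser can be tried on any platform.
sample = "rchar: 3210\nwchar: 0\nsyscr: 12\nsyscw: 0\nread_bytes: 4096\nwrite_bytes: 0\ncancelled_write_bytes: 0\n"
parsed = parse_proc_io(sample)
print(parsed["read_bytes"])
```

On a Linux system, calling read_self_io() before and after a region of interest and subtracting the two dictionaries gives per-region I/O deltas.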

Other Changes include:
  • Standardized our environment variables and implemented a simplified, unified approach for specifying libraries necessary for components, with overrides possible for special circumstances. Eliminated component level 'configure' requirements.
  • Corrected TLS issues (Thread Local Storage) and race conditions.
  • Several bug fixes, documentation fixes and enhancements, and improvements to code comments and to the README files that provide user instructions.

Acknowledgements

This release is the result of efforts from many people. The PAPI team would like to express special thanks to Vince Weaver, Stephane Eranian (for libpfm4), William Cohen, Steve Kaufmann, Phil Mucci, Kevin Huck, Yunqiang Su, Carl Love, Andreas Beckmann, Al Grant, and Evgeny Shcherbakov.

Download papi-6.0.0.tar.gz

To verify the integrity of the download, run 'md5sum papi-6.0.0.tar.gz' and compare the result with the following MD5 hash:

67d06f70fca62f4fcc95672f197638a2
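
Where md5sum is not available, the same check can be scripted. The following is a minimal sketch using Python's standard hashlib module; the helper names md5_of_file and verify_download are hypothetical, and the expected value is the hash published above.

```python
import hashlib

EXPECTED_MD5 = "67d06f70fca62f4fcc95672f197638a2"  # hash published for papi-6.0.0.tar.gz


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected: str = EXPECTED_MD5) -> bool:
    """True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, verify_download("papi-6.0.0.tar.gz") returns True only if the tarball in the current directory matches the published hash.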

Papers

Barry, D., A. Danalis, and H. Jagode, “Effortless Monitoring of Arithmetic Intensity with PAPI’s Counter Analysis Toolkit,” 13th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Springer International Publishing, September 2020.
Jagode, H., A. Danalis, and D. Genet, “Roadmap for Refactoring classic PAPI to PAPI++: Part II: Formulation of Roadmap based on Survey Results,” PAPI++ Working Notes, no. 2, ICL-UT-20-09: Innovative Computing Laboratory, University of Tennessee, July 2020.
Winkler, F., “Redesigning PAPI’s High-Level API,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-03: University of Tennessee, February 2020.
Jagode, H., A. Danalis, and J. Dongarra, “Formulation of Requirements for new PAPI++ Software Package: Part I: Survey Results,” PAPI++ Working Notes, no. 1, ICL-UT-20-02: Innovative Computing Laboratory, University of Tennessee Knoxville, January 2020.
Jagode, H., A. Danalis, H. Anzt, and J. Dongarra, “PAPI Software-Defined Events for in-Depth Performance Analysis,” The International Journal of High Performance Computing Applications, vol. 33, issue 6, pp. 1113-1127, November 2019.
Jagode, H., A. Danalis, and J. Dongarra, “What it Takes to keep PAPI Instrumental for the HPC Community,” 1st Workshop on Sustainable Scientific Software (CW3S19), Collegeville, Minnesota, July 2019.
Danalis, A., H. Jagode, T. Herault, P. Luszczek, and J. Dongarra, “Software-Defined Events through PAPI,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPSW.2019.00069
Danalis, A., H. Jagode, H. Hanumantharayappa, S. Ragate, and J. Dongarra, “Counter Inspection Toolkit: Making Sense out of Hardware Performance Events,” 11th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Cham, Switzerland: Springer, February 2019. DOI: 10.1007/978-3-030-11987-4_2
Haidar, A., H. Jagode, P. Vaccaro, A. YarKhan, S. Tomov, and J. Dongarra, “Investigating Power Capping toward Energy-Efficient Scientific Applications,” Concurrency and Computation: Practice and Experience, vol. 2018, issue e4485, pp. 1-14, April 2018. DOI: 10.1002/cpe.4485
Parker, S., J. Mellor-Crummey, D. H. Ahn, H. Jagode, H. Brunst, S. Shende, A. D. Malony, D. DelSignore, R. Tschuter, R. Castain, et al., “Performance Analysis and Debugging Tools at Scale,” Exascale Scientific Applications: Scalability and Performance Portability: Chapman & Hall / CRC Press, pp. 17-50, November 2017. DOI: 10.1201/b21930
Haidar, A., H. Jagode, A. YarKhan, P. Vaccaro, S. Tomov, and J. Dongarra, “Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017. DOI: 10.1109/HPEC.2017.8091085

Presentations

Danalis, A., H. Jagode, and J. Dongarra, “PAPI's new Software-Defined Events for in-depth Performance Analysis,” Dresden, Germany, 13th Parallel Tools Workshop, September 2019.
Danalis, A., H. Jagode, and J. Dongarra, “Does your tool support PAPI SDEs yet?” Tahoe City, CA, 13th Scalable Tools Workshop, July 2019.
Jagode, H., A. Danalis, and J. Dongarra, “What it Takes to keep PAPI Instrumental for the HPC Community,” Collegeville, MN, The 2019 Collegeville Workshop on Sustainable Scientific Software (CW3S19), July 2019.
Danalis, A., H. Jagode, and J. Dongarra, “Is your scheduling good? How would you know?” Bordeaux, France, 14th Scheduling for Large Scale Systems Workshop, June 2019.
Danalis, A., H. Jagode, D. Barry, and J. Dongarra, “Understanding Native Event Semantics,” Knoxville, TN, 9th JLESC Workshop, April 2019.
Jagode, H., A. Danalis, and J. Dongarra, “PAPI's New Software-Defined Events for In-Depth Performance Analysis,” Lyon, France, CCDSC 2018: Workshop on Clusters, Clouds, and Data for Scientific Computing, September 2018.
Danalis, A., H. Jagode, and J. Dongarra, “Software-Defined Events through PAPI for In-Depth Analysis of Application Performance,” Basel, Switzerland, 5th Platform for Advanced Scientific Computing Conference (PASC18), July 2018.
Danalis, A., H. Jagode, and J. Dongarra, “PAPI: Counting outside the Box,” Barcelona, Spain, 8th JLESC Meeting, April 2018.

ICL Team Members

Daniel Barry, Graduate Research Assistant
Tony Castaldo, Research Scientist II
Anthony Danalis, Research Assistant Professor
Jack Dongarra, University Distinguished Professor
Damien Genet, Research Scientist I
Heike Jagode, Research Assistant Professor
Frank Winkler, Consultant

Exascale Computing Project

Exa-PAPI is part of ICL's involvement in the Exascale Computing Project (ECP). The ECP was established with the goals of maximizing the benefits of high-performance computing (HPC) for the United States and accelerating the development of a capable exascale computing ecosystem. Exascale refers to computing systems at least 50 times faster than the nation’s most powerful supercomputers in use today.

The ECP is a collaborative effort of two U.S. Department of Energy organizations – the Office of Science (DOE-SC) and the National Nuclear Security Administration (NNSA).

Sponsored By
Exascale Computing Project
National Nuclear Security Administration
The United States Department of Energy