| Title | PAPI Support for Specialized AI Architectures |
| Publication Type | Conference Paper |
| Year of Publication | 2025 |
| Authors | Tahmid, T., and H. Jagode |
| Conference Name | SC25: 10th International Parallel Data Systems Workshop (PDSW 2025) |
| Date Published | 2025-11 |
| Publisher | IEEE |
| Conference Location | St. Louis, MO |
| Abstract | This work develops performance monitoring capabilities in PAPI that extend support to specialized AI chips, helping researchers identify hardware-specific bottlenecks and improve AI architectures. The lack of traditional hardware counters on these devices poses a challenge and makes alternative approaches to performance monitoring necessary. PAPI’s Software Defined Events (SDEs) are one such promising approach, since they provide a portable way to capture and expose software-level metrics from within applications. Our proof-of-concept implementation on traditional processors uses the vendor-agnostic HPL-MxP benchmark instrumented with PAPI SDE to register custom events, such as sde_io_read_bytes and sde_float16, that track memory and network I/O and floating-point precision usage throughout the workload. Results showed close agreement between SDE and hardware counts, with a mean ± standard deviation difference of 0.310% ± 5.315%, providing confidence in applying the SDE approach to systems without accessible hardware counters. |
| URL | https://www.pdsw.org/index.shtml |
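
The agreement figure quoted in the abstract (a mean ± standard deviation difference between SDE and hardware counts) can be illustrated with a minimal sketch. The abstract does not say how the difference was defined, so this sketch assumes a signed relative difference, `100 * (sde - hw) / hw`, and a sample standard deviation. The function names and the paired counts are hypothetical and are not taken from the paper or from PAPI.

```python
from statistics import mean, stdev


def percent_differences(sde_counts, hw_counts):
    """Signed difference of each SDE count from its hardware count, in percent.

    Assumption: the difference is relative to the hardware count, which must be nonzero.
    """
    if len(sde_counts) != len(hw_counts):
        raise ValueError("count lists must have the same length")
    return [100.0 * (s - h) / h for s, h in zip(sde_counts, hw_counts)]


def summarize(sde_counts, hw_counts):
    """Return (mean, sample standard deviation) of the percent differences."""
    diffs = percent_differences(sde_counts, hw_counts)
    return mean(diffs), stdev(diffs)


# Hypothetical paired measurements, for illustration only (not data from the paper).
sde = [1000, 2040, 2970, 4000]
hw = [1000, 2000, 3000, 4000]

mu, sigma = summarize(sde, hw)
print(f"{mu:.3f}% ± {sigma:.3f}%")
```

A signed difference lets over-counts and under-counts cancel in the mean, which is one plausible reading of a small mean paired with a larger spread; an absolute difference would be an equally reasonable choice and would change both numbers.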



