Accelerating Supercomputing: AI-Hardware-Driven Innovation for Speed and Efficiency

Title: Accelerating Supercomputing: AI-Hardware-Driven Innovation for Speed and Efficiency
Publication Type: Conference Paper
Year of Publication: 2025
Authors: Dongarra, J., J. Gunnels, H. Bayraktar, A. Haidar, and D. Ernst
Conference Name: 2025 IEEE High Performance Extreme Computing Conference (HPEC)
Date Published: 2025-10
Publisher: IEEE
Conference Location: Wakefield, MA, USA
Abstract

The evolution of GPUs has democratized access to increasingly powerful low-precision compute capabilities designed for artificial intelligence (AI), particularly large language models (LLMs) and generative AI. These workloads heavily utilize hardware units specialized for matrix multiplication, such as Tensor Cores, which have advanced since their introduction, offering improved functionality, throughput, and energy efficiency. Two key techniques that leverage these resources have emerged: mixed-precision algorithms and floating-point emulation. They enable scientific applications, many dependent upon high-precision linear algebra, to achieve dramatic gains in performance and power efficiency. Additionally, these methods facilitate innovation in areas such as fine-grained mixed-precision strategies and data compression, broadening their impact across diverse computing platforms. This paper explores the opportunities afforded by these developments. We highlight both evolutionary advances and revolutionary features, such as the enhanced scaling capabilities of the Tensor Cores in the latest NVIDIA Blackwell architecture, and present empirical results demonstrating their effectiveness on these GPUs.
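A classic instance of the mixed-precision approach the abstract refers to is iterative refinement: factor and solve in low precision, then recover high-precision accuracy through cheap residual corrections. The sketch below is illustrative only and is not code from the paper; it uses NumPy with float32 standing in for the low-precision (e.g. FP16/FP8 Tensor Core) compute and float64 as the working high precision.

```python
import numpy as np

def mixed_precision_solve(A, b, iters=5):
    """Solve Ax = b via mixed-precision iterative refinement (sketch)."""
    # Initial solve entirely in low precision (float32 stands in for
    # Tensor Core low-precision arithmetic in this illustration).
    A32 = A.astype(np.float32)
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        # Residual is computed in high precision (float64) -- this is
        # what recovers accuracy lost in the low-precision solves.
        r = b - A @ x
        # The correction solve reuses the cheap low-precision operator.
        d = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
        x += d
    return x

rng = np.random.default_rng(0)
n = 200
# A well-conditioned test matrix (diagonally dominated), so refinement converges.
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)
x = mixed_precision_solve(A, b)
rel_residual = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
```

For well-conditioned systems, a handful of refinement steps drives the residual to full float64 accuracy even though every O(n^3) solve ran in low precision, which is the source of the performance and energy gains discussed in the paper.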

URL: https://doi.org/10.1109/HPEC67600.2025.11196413
DOI: 10.1109/HPEC67600.2025.11196413