Accelerating Supercomputing: AI-Hardware-Driven Innovation for Speed and Efficiency

Title: Accelerating Supercomputing: AI-Hardware-Driven Innovation for Speed and Efficiency
Publication Type: Conference Paper
Year of Publication: 2025
Authors: Dongarra, J., J. Gunnels, H. Bayraktar, A. Haidar, and D. Ernst
Conference Name: 2025 IEEE High Performance Extreme Computing Conference (HPEC)
Date Published: 2025-10
Publisher: IEEE
Conference Location: Wakefield, MA, USA
Abstract

The evolution of GPUs has democratized access to increasingly powerful low-precision compute capabilities designed for artificial intelligence (AI), particularly large language models (LLMs) and generative AI. These workloads heavily utilize hardware units specialized for matrix multiplication, such as Tensor Cores, which have advanced since their introduction, offering improved functionality, throughput, and energy efficiency. Two key techniques that leverage these resources have emerged: mixed-precision algorithms and floating-point emulation. They enable scientific applications, many dependent upon high-precision linear algebra, to achieve dramatic gains in performance and power efficiency. Additionally, these methods facilitate innovation in areas such as fine-grained mixed-precision strategies and data compression, broadening their impact across diverse computing platforms. This paper explores the opportunities afforded by these developments. We highlight both evolutionary advances and revolutionary features, such as the enhanced scaling capabilities of the Tensor Cores in the latest NVIDIA Blackwell architecture, and present empirical results demonstrating their effectiveness on these GPUs.
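A classic instance of the mixed-precision approach the abstract refers to is iterative refinement: factor and solve in low precision, then recover high-precision accuracy through cheap residual corrections. The sketch below is illustrative only and is not code from the paper; it uses NumPy with float32 standing in for the low-precision (e.g. FP16/FP8 Tensor Core) compute and float64 as the working high precision.

```python
import numpy as np

def mixed_precision_solve(A, b, iters=5):
    """Solve Ax = b via mixed-precision iterative refinement (sketch)."""
    # Initial solve entirely in low precision (float32 stands in for
    # Tensor Core low-precision arithmetic in this illustration).
    A32 = A.astype(np.float32)
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        # Residual is computed in high precision (float64) -- this is
        # what recovers accuracy lost in the low-precision solves.
        r = b - A @ x
        # The correction solve reuses the cheap low-precision operator.
        d = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
        x += d
    return x

rng = np.random.default_rng(0)
n = 200
# A well-conditioned test matrix (diagonally dominated), so refinement converges.
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)
x = mixed_precision_solve(A, b)
rel_residual = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
```

For well-conditioned systems, a handful of refinement steps drives the residual to full float64 accuracy even though every O(n^3) solve ran in low precision, which is the source of the performance and energy gains discussed in the paper.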

URL: https://doi.org/10.1109/HPEC67600.2025.11196413
DOI: 10.1109/HPEC67600.2025.11196413