Efficient Embedding Initialization via Dominant Eigenvector Projections

Title: Efficient Embedding Initialization via Dominant Eigenvector Projections
Publication Type: Conference Paper
Year of Publication: 2025
Authors: Petit, Q., C. Li, N. Emad, and J. Dongarra
Conference Name: SC Workshops '25: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Date Published: 2025-11
Publisher: ACM
Conference Location: St. Louis, MO, USA
ISBN Number: 9798400718717
Abstract

The embedding layer is essential in deep learning, transforming high-dimensional data into compact representations. However, growing datasets and model sizes pose challenges in training time, memory, and generalization. We propose a scalable method for embedding initialization via spectral dimensionality reduction using dominant eigenvector projections.
The proposed approach leverages MIRAMns, the multiple implicitly restarted Arnoldi method with nested subspaces, to extract the most informative directions from large and potentially sparse data representations. Unlike traditional embeddings or autoencoders, it requires few tunable parameters and is inherently parallel. We apply MIRAMns to matrix representations such as covariance and co-occurrence matrices to compute low-dimensional embeddings that preserve data structure and variance. Experiments across diverse datasets show that the proposed method achieves comparable or better accuracy at significantly reduced dimensionality, enabling smaller, faster deep networks. Additionally, our parallel implementation scales efficiently on HPC platforms, making it well suited for large-scale scientific and AI workloads.
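The sketch below illustrates the general idea the abstract describes: compute the dominant eigenvectors of a (co-occurrence or covariance) matrix and use the projections as an embedding initialization. MIRAMns itself is the authors' solver and is not assumed to be available here; SciPy's ARPACK-based implicitly restarted Lanczos solver (`scipy.sparse.linalg.eigsh`) stands in for it, and all names (`spectral_embedding_init`, `embedding_dim`, the random co-occurrence matrix) are illustrative assumptions, not part of the paper.

```python
# Minimal sketch: spectral embedding initialization via dominant eigenvectors.
# eigsh (implicitly restarted Lanczos from ARPACK) is a stand-in for MIRAMns.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh


def spectral_embedding_init(cooc: sp.csr_matrix, embedding_dim: int) -> np.ndarray:
    """Project a co-occurrence (or covariance) matrix onto its dominant eigenvectors."""
    # Symmetrize so the Hermitian eigensolver is applicable.
    sym = (cooc + cooc.T) * 0.5
    # Dominant eigenpairs capture the directions with the most structure/variance.
    vals, vecs = eigsh(sym, k=embedding_dim, which="LM")
    # Sort by decreasing eigenvalue and scale each direction by its importance,
    # PCA-style, to form the low-dimensional embedding table.
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    return vecs * np.sqrt(np.abs(vals))


if __name__ == "__main__":
    # Example: initialize a 64-dimensional embedding for a 10k-item vocabulary
    # from a sparse, synthetic co-occurrence matrix.
    vocab = 10_000
    cooc = sp.random(vocab, vocab, density=1e-3, format="csr", random_state=0)
    init = spectral_embedding_init(cooc, embedding_dim=64)
    print(init.shape)  # (10000, 64)
```

In practice the resulting matrix would be copied into the embedding layer's weights before training; the point of the sketch is only that the initialization reduces to a dominant eigenpair computation on a sparse matrix, which is exactly the workload an implicitly restarted Arnoldi solver such as MIRAMns parallelizes well.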

URL: https://doi.org/10.1145/3731599.3767541
DOI: 10.1145/3731599.3767541