Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC

Title: Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC
Publication Type: Tech Report
Year of Publication: 2013
Authors: Aupy, G., M. Faverge, Y. Robert, J. Kurzak, P. Luszczek, and J. Dongarra
Technical Report Series Title: Lawn 277
Date Published: 2013-05

This article introduces a new systolic algorithm for QR factorization and its implementation on a supercomputing cluster of multicore nodes. The algorithm targets a virtual 3D array and requires only local communications. The implementation uses threads at the node level and MPI for inter-node communications. The complexity of the implementation is addressed with the PaRSEC software, which takes as input a parametrized dependence graph derived from the algorithm and only requires the user to decide, at a high level, the allocation of tasks to nodes. We show that the new algorithm performs competitively with state-of-the-art QR routines on the Kraken supercomputer. This demonstrates that high-level programming environments such as PaRSEC provide a viable alternative for producing quality software on complex, hierarchical architectures.
