Integrating process, control-flow, and data resiliency layers using a hybrid Fenix/Kokkos approach

Title: Integrating process, control-flow, and data resiliency layers using a hybrid Fenix/Kokkos approach
Publication Type: Conference Proceedings
Year of Publication: 2022
Authors: Whitlock, M., N. Morales, G. Bosilca, A. Bouteiller, B. Nicolae, K. Teranishi, E. Giem, and V. Sarkar
Conference Name: 2022 IEEE International Conference on Cluster Computing (CLUSTER 2022)
Date Published: 2022-09
Conference Location: Heidelberg, Germany
Keywords: checkpointing, Fault tolerance, Fenix, HPC, Kokkos, MPI-ULFM, resilience
URL: https://hal.archives-ouvertes.fr/hal-03772536