Publications
“A Failure Detector for HPC Platforms,”
The International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 139–158, January 2018.
“Failure Detection and Propagation in HPC Systems,”
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Salt Lake City, Utah, IEEE Press, pp. 27:1-27:11, November 2016.
“Multi-criteria checkpointing strategies: optimizing response-time versus resource utilization,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-13-01, February 2013.
“Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization,”
Euro-Par 2013, Aachen, Germany, Springer, August 2013.
“Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,”
Concurrency and Computation: Practice and Experience, November 2013.
“Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,”
University of Tennessee Computer Science Technical Report (also LAWN 269), no. UT-CS-12-697, June 2012.
“Algebraic Schwarz Preconditioning for the Schur Complement: Application to the Time-Harmonic Maxwell Equations Discretized by a Discontinuous Galerkin Method,”
The Twentieth International Conference on Domain Decomposition Methods, La Jolla, California, February 2011.
“Parallel algebraic domain decomposition solver for the solution of augmented systems,”
Parallel, Distributed, Grid and Cloud Computing for Engineering, Ajaccio, Corsica, France, 12-15 April 2011.
“MaPHyS or the Development of a Parallel Algebraic Domain Decomposition Solver in the Course of the Solstice Project,”
Sparse Days 2010 Meeting at CERFACS, Toulouse, France, June 2010.
“Towards a Complexity Analysis of Sparse Hybrid Linear Solvers,”
PARA 2010, Reykjavik, Iceland, June 2010.
“