Publications
Context Identifier Allocation in Open MPI,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-16-01: Innovative Computing Laboratory, University of Tennessee, January 2016.
(490.89 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Assessing the impact of ABFT and Checkpoint composite strategies,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-13-03, 2013.
(968.47 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Constructing resiliant communication infrastructure for runtime environments,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-09-02, July 2009.
(463.71 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
DAGuE: A generic distributed DAG Engine for High Performance Computing.,”
Parallel Computing, vol. 38, no. 1-2: Elsevier, pp. 27-51, 00 2012.
(830.85 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
DTE: PaRSEC Systems and Interfaces (Poster)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(840.54 KB)
![application/pdf](/modules/file/icons/application-pdf.png)
Self Adapting Numerical Software SANS Effort,”
IBM Journal of Research and Development, vol. 50, no. 2/3, pp. 223-238, January 2006.
(357.53 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
DTE: PaRSEC Enabled Libraries and Applications
: 2021 Exascale Computing Project Annual Meeting, April 2021.
(3.24 MB)
![application/pdf](/modules/file/icons/application-pdf.png)
Tensor Contraction on Distributed Hybrid Architectures using a Task-Based Runtime System,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-13: University of Tennessee, December 2018.
(326.11 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Algorithmic Based Fault Tolerance Applied to High Performance Computing,”
University of Tennessee Computer Science Technical Report, UT-CS-08-620 (also LAPACK Working Note 205), January 2008.
(313.55 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Scalable Runtime for MPI: Efficiently Building the Communication Infrastructure,”
Proceedings of Recent Advances in the Message Passing Interface - 18th European MPI Users' Group Meeting, EuroMPI 2011, vol. 6960, Santorini, Greece, Springer, pp. 342-344, September 2011.
(115.75 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
On Scalability for MPI Runtime Systems,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-11-05, Knoxville, TN, May 2011.
(898.76 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-10-02, 00 2010.
(400.75 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Composing Resilience Techniques: ABFT, Periodic, and Incremental Checkpointing,”
International Journal of Networking and Computing, vol. 5, no. 1, pp. 2-15, January 2015.
(755.54 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols,”
Proceedings of EuroMPI 2010, Stuttgart, Germany, Springer, September 2010.
(202.87 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
DTE: PaRSEC Enabled Libraries and Applications (Poster)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(979.27 KB)
![application/pdf](/modules/file/icons/application-pdf.png)
Power Profiling of Cholesky and QR Factorizations on Distributed Memory Systems,”
Third International Conference on Energy-Aware High Performance Computing, Hamburg, Germany, September 2012.
(290.27 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Revisiting Credit Distribution Algorithms for Distributed Termination Detection,”
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW): IEEE, pp. 611–620, 2021.
“Comparing Distributed Termination Detection Algorithms for Modern HPC Platforms,”
International Journal of Networking and Computing, vol. 12, issue 1, pp. 26 - 46, January 2022.
“The Template Task Graph (TTG) - An Emerging Practical Dataflow Programming Paradigm for Scientific Simulation at Extreme Scale,”
2020 IEEE/ACM 5th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2): IEEE, November 2020.
(139.6 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
DAGuE: A generic distributed DAG engine for high performance computing,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-10-01, April 2010.
(830.85 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Recovery Patterns for Iterative Methods in a Parallel Unstable Environment,”
ICL Technical Report, no. ICL-UT-04-04, January 2004.
(241.36 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Distributed Termination Detection for HPC Task-Based Environments,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-14: University of Tennessee, June 2018.
“Recovery Patterns for Iterative Methods in a Parallel Unstable Environment,”
University of Tennessee Computer Science Department Technical Report, UT-CS-04-538, 00 2005.
(241.36 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Performance Portability of a GPU Enabled Factorization with the DAGuE Framework,”
IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 2011.
(290.98 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Approximate Computing for Scientific Applications,”
Approximate Computing Techniques, 322: Springer International Publishing, pp. 415 - 465, January 2022.
“Static Tiling for Heterogeneous Computing Platforms,”
Parallel Computing, vol. 25, no. 5, pp. 547-568, January 1999.
(301.17 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Algorithmic Issues on Heterogeneous Computing Platforms,”
Parallel Processing Letters, vol. 9, no. 2, pp. 197-213, January 1999.
(301.17 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Fault Tolerance Management for a Hierarchical GridRPC Middleware,”
8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), Lyon, France, January 2008.
(319.79 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Data Movement Interfaces to Support Dataflow Runtimes,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-03: University of Tennessee, May 2018.
(210.94 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization,”
Euro-Par 2013, Aachen, Germany, Springer, August 2013.
(431.84 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
A Multithreaded Communication Substrate for OpenSHMEM,”
8th International Conference on Partitioned Global Address Space Programming Models (PGAS), Eugene, OR, October 2014.
(261.66 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Redesigning the Message Logging Model for High Performance,”
International Supercomputer Conference (ISC 2008), Dresden, Germany, January 2008.
(622.1 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Surviving Errors with OpenSHMEM,”
OpenSHMEM and Related Technologies. Enhancing OpenSHMEM for Hybrid Environments, Baltimore, MD, USA, Springer International Publishing, pp. 66–81, 2016.
“Reasons for a Pessimistic or Optimistic Message Logging Protocol in MPI Uncoordinated Failure Recovery,”
CLUSTER '09, New Orleans, IEEE, August 2009.
(191.36 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Retrospect: Deterministic Relay of MPI Applications for Interactive Distributed Debugging,”
Accepted for Euro PVM/MPI 2007: Springer, September 2007.
“Implicit Actions and Non-blocking Failure Recovery with MPI,”
2022 IEEE/ACM 12th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS), Dallas, TX, USA, IEEE, January 2023, 2022.
“Redesigning the Message Logging Model for High Performance,”
Concurrency and Computation: Practice and Experience (online version), June 2010.
(438.42 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Correlated Set Coordination in Fault Tolerant Message Logging Protocols,”
Proceedings of 17th International Conference, Euro-Par 2011, Part II, vol. 6853, Bordeaux, France, Springer, pp. 51-64, August 2011.
(486.68 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Multi-criteria checkpointing strategies: optimizing response-time versus resource utilization,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-13-01, February 2013.
(497.64 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Algorithm-based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures, and Accuracy,”
ACM Transactions on Parallel Computing, vol. 1, issue 2, no. 10, pp. 10:1-10:28, January 2015.
(1.14 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Correlated Set Coordination in Fault Tolerant Message Logging Protocols,”
Concurrency and Computation: Practice and Experience, vol. 25, issue 4, pp. 572-585, March 2013.
(636.68 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Evaluating Contexts in OpenSHMEM-X Reference Implementation,”
OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence, Cham, Springer International Publishing, pp. 50–62, 2018.
“Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery,”
22nd European MPI Users' Group Meeting, Bordeaux, France, ACM, September 2015.
(543.32 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
SmartGridRPC: The new RPC model for high performance Grid Computing and Its Implementation in SmartGridSolve,”
Concurrency and Computation: Practice and Experience (to appear), January 2010.
(1.08 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-12: University of Tennessee, August 2020.
(476.36 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
CEED ECP Milestone Report: Public release of CEED 2.0
: Zenodo, April 2019.
(4.98 MB)
![application/pdf](/modules/file/icons/application-pdf.png)
libCEED: Fast algebra for high-order element-based discretizations,”
Journal of Open Source Software, vol. 6, no. 63, pp. 2945, 2021.
“hipMAGMA v2.0
: Zenodo, July 2020.
Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs,”
2020 IEEE High Performance Extreme Computing Virtual Conference: IEEE, September 2020.
(476.36 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
hipMAGMA v1.0
: Zenodo, March 2020.