Publications
Export 87 results:
Filters: Author is Thomas Herault [Clear All Filters]
Argobots: A Lightweight Low-Level Threading and Tasking Framework,”
IEEE Transactions on Parallel and Distributed Systems, October 2017.
DOI: 10.1109/TPDS.2017.2766062
“Checkpointing à la Young/Daly: An Overview,”
IC3-2022: Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing, Noida, India, ACM Press, pp. 701-710, August 2022.
DOI: 10.1145/3549206
(639.77 KB)
“
Checkpointing Strategies for Shared High-Performance Computing Platforms,”
International Journal of Networking and Computing, vol. 9, no. 1, pp. 28–52, 2019.
(490.5 KB)
“
Comparing the Performance of Rigid, Moldable, and Grid-Shaped Applications on Failure-Prone HPC Platforms,”
Parallel Computing, vol. 85, pp. 1–12, July 2019.
DOI: 10.1016/j.parco.2019.02.002
(865.18 KB)
“
Context Identifier Allocation in Open MPI,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-16-01: Innovative Computing Laboratory, University of Tennessee, January 2016.
(490.89 KB)
“
DAGuE: A generic distributed DAG Engine for High Performance Computing.,”
Parallel Computing, vol. 38, no. 1-2: Elsevier, pp. 27-51, 00 2012.
(830.85 KB)
“
Data Movement Interfaces to Support Dataflow Runtimes,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-03: University of Tennessee, May 2018.
(210.94 KB)
“
Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach,”
Scalable Computing and Communications: Theory and Practice: John Wiley & Sons, pp. 699-735, March 2013.
(1.01 MB)
“
Design for a Soft Error Resilient Dynamic Task-based Runtime,”
29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, IEEE, May 2015.
(2.31 MB)
“
Design for a Soft Error Resilient Dynamic Task-based Runtime,”
ICL Technical Report, no. ICL-UT-14-04: University of Tennessee, November 2014.
(2.61 MB)
“
Distributed Dense Numerical Linear Algebra Algorithms on Massively Parallel Architectures: DPLASMA,”
University of Tennessee Computer Science Technical Report, UT-CS-10-660, September 2010.
(366.26 KB)
“
Distributed Termination Detection for HPC Task-Based Environments,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-14: University of Tennessee, June 2018.
“Distributed-Memory Multi-GPU Block-Sparse Tensor Contraction for Electronic Structure,”
35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021.
“Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-10-02, 00 2010.
(400.75 KB)
“
Do moldable applications perform better on failure-prone HPC platforms?,”
11th Workshop on Resiliency in High Performance Computing in Clusters, Clouds, and Grids, Turin, Italy, Springer Verlag, August 2018.
(360.72 KB)
“
DTE: PaRSEC Enabled Libraries and Applications
: 2021 Exascale Computing Project Annual Meeting, April 2021.
(3.24 MB)

DTE: PaRSEC Enabled Libraries and Applications (Poster)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(979.27 KB)

DTE: PaRSEC Systems and Interfaces (Poster)
, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(840.54 KB)

Dynamic Task Discovery in PaRSEC- A data-flow task-based Runtime,”
ScalA17, Denver, ACM, September 2017.
DOI: 10.1145/3148226.3148233
(1.15 MB)
“
Evaluating PaRSEC Through Matrix Computations in Scientific Applications,”
Asynchronous Many-Task Systems and Applications - Second International Workshop, WAMTA 2024, Knoxville, TN, USA, February 14-16, 2024, Proceedings, vol. 14626: Springer, pp. 22–33, 2024.
DOI: 10.1007/978-3-031-61763-8_3
(600.76 KB)
“
A Failure Detector for HPC Platforms,”
The International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 139–158, January 2018.
DOI: 10.1177/1094342017711505
(1.04 MB)
“
Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA,”
Proceedings of the Workshops of the 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2011 Workshops), Anchorage, Alaska, USA, IEEE, pp. 1432-1441, May 2011.
(1.26 MB)
“
From MPI to OpenSHMEM: Porting LAMMPS,”
OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies, Annapolis, MD, USA, Springer International Publishing, pp. 121–137, 2015.
DOI: 10.1007/978-3-319-26428-8_8
“From Serial Loops to Parallel Execution on Distributed Systems,”
International European Conference on Parallel and Distributed Computing (Euro-Par '12), Rhodes, Greece, August 2012.
(203.08 KB)
“
Generic Matrix Multiplication for Multi-GPU Accelerated Distributed-Memory Platforms over PaRSEC,”
ScalA'19: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, IEEE, November 2019.
(260.69 KB)
“
Hierarchical QR Factorization Algorithms for Multi-core Cluster Systems,”
Parallel Computing, vol. 39, issue 4-5, pp. 212-232, May 2013.
(1.43 MB)
“
Hierarchical QR Factorization Algorithms for Multi-Core Cluster Systems,”
IPDPS 2012, the 26th IEEE International Parallel and Distributed Processing Symposium, Shanghai, China, IEEE Computer Society Press, May 2012.
(405.71 KB)
“