Publications

Export 1273 results:
Filters: 10.1109 is TPDS.2021.3131657  [Clear All Filters]
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
B
Bosilca, G., C. Coti, T. Herault, P. Lemariner, and J. Dongarra, Constructing resiliant communication infrastructure for runtime environments,” Innovative Computing Laboratory Technical Report, no. ICL-UT-09-02, July 2009.  (463.71 KB)
Bosilca, G., A. Bouteiller, T. Herault, Y. Robert, and J. Dongarra, Assessing the impact of ABFT and Checkpoint composite strategies,” University of Tennessee Computer Science Technical Report, no. ICL-UT-13-03, 2013.  (968.47 KB)
Bosilca, G., A. Bouteiller, E. Brunet, F. Cappello, J. Dongarra, A. Guermouche, T. Herault, Y. Robert, F. Vivien, and D. Zaidouni, Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,” University of Tennessee Computer Science Technical Report (also LAWN 269), no. UT-CS-12-697, June 2012.  (2.76 MB)
Bosilca, G., Z. Chen, J. Dongarra, and J. Langou, Recovery Patterns for Iterative Methods in a Parallel Unstable Environment,” ICL Technical Report, no. ICL-UT-04-04, January 2004.  (241.36 KB)
Bosilca, G., A. Bouteiller, T. Herault, P. Lemariner, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, Performance Portability of a GPU Enabled Factorization with the DAGuE Framework,” IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 2011.  (290.98 KB)
Bosilca, G., A. Bouteiller, T. Herault, V. Le Fèvre, Y. Robert, and J. Dongarra, Comparing Distributed Termination Detection Algorithms for Modern HPC Platforms,” International Journal of Networking and Computing, vol. 12, issue 1, pp. 26 - 46, January 2022.
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, A Failure Detector for HPC Platforms,” The International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 139–158, January 2018.  (1.04 MB)
Bosilca, G., Z. Chen, J. Dongarra, V. Eijkhout, G. Fagg, E. Fuentes, J. Langou, P. Luszczek, J. Pjesivac–Grbovic, K. Seymour, et al., Self Adapting Numerical Software SANS Effort,” IBM Journal of Research and Development, vol. 50, no. 2/3, pp. 223-238, January 2006.  (357.53 KB)
Bosilca, G., Z. Chen, J. Dongarra, and J. Langou, Recovery Patterns for Iterative Methods in a Parallel Unstable Environment,” University of Tennessee Computer Science Department Technical Report, UT-CS-04-538, 00 2005.  (241.36 KB)
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemariner, and J. Dongarra, DAGuE: A generic distributed DAG Engine for High Performance Computing.,” Parallel Computing, vol. 38, no. 1-2: Elsevier, pp. 27-51, 00 2012.  (830.85 KB)
Bosilca, G., R. Delmas, J. Dongarra, and J. Langou, Algorithmic Based Fault Tolerance Applied to High Performance Computing,” University of Tennessee Computer Science Technical Report, UT-CS-08-620 (also LAPACK Working Note 205), January 2008.  (313.55 KB)
Bosilca, G., A. Bouteiller, T. Herault, P. Lemariner, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, A Unified HPC Environment for Hybrid Manycore/GPU Distributed Systems,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.
Bosilca, G., C. Coti, T. Herault, P. Lemariner, and J. Dongarra, Constructing Resiliant Communication Infrastructure for Runtime Environments in Advances in Parallel Computing,” Advances in Parallel Computing - Parallel Computing: From Multicores and GPU's to Petascale, vol. 19, pp. 441-451, 2010.
Bosilca, G., T. Herault, P. Lemariner, J. Dongarra, and A.. Rezmerita, Scalable Runtime for MPI: Efficiently Building the Communication Infrastructure,” Proceedings of Recent Advances in the Message Passing Interface - 18th European MPI Users' Group Meeting, EuroMPI 2011, vol. 6960, Santorini, Greece, Springer, pp. 342-344, September 2011.  (115.75 KB)
Bosilca, G., D. Genet, R. Harrison, T. Herault, M. Mahdi Javanmard, C. Peng, and E. Valeev, Tensor Contraction on Distributed Hybrid Architectures using a Task-Based Runtime System,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-13: University of Tennessee, December 2018.  (326.11 KB)
Bosilca, G., J. Dongarra, G. Fagg, and J. Langou, Hash Functions for Datatype Signatures in MPI,” Proceedings of 12th European Parallel Virtual Machine and Message Passing Interface Conference - Euro PVM/MPI, vol. 3666, Sorrento (Naples), Italy, Springer-Verlag Berlin, pp. 76-83, September 2005.  (304.2 KB)
Bosilca, G., J. Dongarra, and H. Ltaeif, Power Profiling of Cholesky and QR Factorizations on Distributed Memory Systems,” Third International Conference on Energy-Aware High Performance Computing, Hamburg, Germany, September 2012.  (290.27 KB)
Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, Failure Detection and Propagation in HPC Systems,” Proceedings of the The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Salt Lake City, Utah, IEEE Press, pp. 27:1-27:11, November 2016.
Bosilca, G., T. Herault, and J. Dongarra, DTE: PaRSEC Systems and Interfaces (Poster) , Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.  (840.54 KB)
Bosilca, G., T. Herault, A.. Rezmerita, and J. Dongarra, On Scalability for MPI Runtime Systems,” University of Tennessee Computer Science Technical Report, no. ICL-UT-11-05, Knoxville, TN, May 2011.  (898.76 KB)
Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, A. Haidar, T. Herault, J. Kurzak, J. Langou, P. Lemariner, H. Ltaeif, et al., Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project,” Innovative Computing Laboratory Technical Report, no. ICL-UT-10-02, 00 2010.  (400.75 KB)
Bosilca, G., T. Herault, and J. Dongarra, Context Identifier Allocation in Open MPI,” University of Tennessee Computer Science Technical Report, no. ICL-UT-16-01: Innovative Computing Laboratory, University of Tennessee, January 2016.  (490.89 KB)
Bosilca, G., A. Bouteiller, T. Herault, P. Lemariner, and J. Dongarra, Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols,” Proceedings of EuroMPI 2010, Stuttgart, Germany, Springer, September 2010.  (202.87 KB)
Anzt, H., M. Casas, C. I. Malossi, E. S. Quintana-Ortí, F. Scheidegger, and S. Zhuang, Approximate Computing for Scientific Applications,” Approximate Computing Techniques, 322: Springer International Publishing, pp. 415 - 465, January 2022.
Boulet, P., J. Dongarra, Y. Robert, and F. Vivien, Static Tiling for Heterogeneous Computing Platforms,” Parallel Computing, vol. 25, no. 5, pp. 547-568, January 1999.  (301.17 KB)
Boulet, P., J. Dongarra, F. Rastello, Y. Robert, and F. Vivien, Algorithmic Issues on Heterogeneous Computing Platforms,” Parallel Processing Letters, vol. 9, no. 2, pp. 197-213, January 1999.  (301.17 KB)
Bouteiller, A., G. Bosilca, T. Herault, and J. Dongarra, Data Movement Interfaces to Support Dataflow Runtimes,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-03: University of Tennessee, May 2018.  (210.94 KB)
Bouteiller, A., G. Bosilca, and J. Dongarra, Redesigning the Message Logging Model for High Performance,” Concurrency and Computation: Practice and Experience (online version), June 2010.  (438.42 KB)
Bouteiller, A., T. Ropars, G. Bosilca, C. Morin, and J. Dongarra, Reasons for a Pessimistic or Optimistic Message Logging Protocol in MPI Uncoordinated Failure Recovery,” CLUSTER '09, New Orleans, IEEE, August 2009.  (191.36 KB)
Bouteiller, A., G. Bosilca, and J. Dongarra, Redesigning the Message Logging Model for High Performance,” International Supercomputer Conference (ISC 2008), Dresden, Germany, January 2008.  (622.1 KB)
Bouteiller, A., and G. Bosilca, Implicit Actions and Non-blocking Failure Recovery with MPI,” 2022 IEEE/ACM 12th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS), Dallas, TX, USA, IEEE, January 2023, 2022.
Bouteiller, A., T. Herault, G. Bosilca, P. Du, and J. Dongarra, Algorithm-based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures, and Accuracy,” ACM Transactions on Parallel Computing, vol. 1, issue 2, no. 10, pp. 10:1-10:28, January 2015.  (1.14 MB)
Bouteiller, A., G. Bosilca, and J. Dongarra, Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery,” 22nd European MPI Users' Group Meeting, Bordeaux, France, ACM, September 2015.  (543.32 KB)
Bouteiller, A., G. Bosilca, and J. Dongarra, Retrospect: Deterministic Relay of MPI Applications for Interactive Distributed Debugging,” Accepted for Euro PVM/MPI 2007: Springer, September 2007.
Bouteiller, A., F. Cappello, J. Dongarra, A. Guermouche, T. Herault, and Y. Robert, Multi-criteria Checkpointing Strategies: Response-Time versus Resource Utilization,” Euro-Par 2013, Aachen, Germany, Springer, August 2013.  (431.84 KB)
Bouteiller, A., T. Herault, G. Bosilca, and J. Dongarra, Correlated Set Coordination in Fault Tolerant Message Logging Protocols,” Proceedings of 17th International Conference, Euro-Par 2011, Part II, vol. 6853, Bordeaux, France, Springer, pp. 51-64, August 2011.  (486.68 KB)
Bouteiller, A., F. Cappello, J. Dongarra, A. Guermouche, T. Herault, and Y. Robert, Multi-criteria checkpointing strategies: optimizing response-time versus resource utilization,” University of Tennessee Computer Science Technical Report, no. ICL-UT-13-01, February 2013.  (497.64 KB)
Bouteiller, A., T. Herault, G. Bosilca, and J. Dongarra, Correlated Set Coordination in Fault Tolerant Message Logging Protocols,” Concurrency and Computation: Practice and Experience, vol. 25, issue 4, pp. 572-585, March 2013.  (636.68 KB)
Bouteiller, A., G. Bosilca, and M G. Venkata, Surviving Errors with OpenSHMEM,” OpenSHMEM and Related Technologies. Enhancing OpenSHMEM for Hybrid Environments, Baltimore, MD, USA, Springer International Publishing, pp. 66–81, 2016.
Bouteiller, A., T. Herault, and G. Bosilca, A Multithreaded Communication Substrate for OpenSHMEM,” 8th International Conference on Partitioned Global Address Space Programming Models (PGAS), Eugene, OR, October 2014.  (261.66 KB)
Bouteiller, A., and F. Desprez, Fault Tolerance Management for a Hierarchical GridRPC Middleware,” 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), Lyon, France, January 2008.  (319.79 KB)
Bouteiller, A., S. Pophale, S. Boehm, M. B. Baker, and M G. Venkata, Evaluating Contexts in OpenSHMEM-X Reference Implementation,” OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence, Cham, Springer International Publishing, pp. 50–62, 2018.
Brady, T., A. Lastovetsky, K. Seymour, M. Guidolin, and J. Dongarra, SmartGridRPC: The new RPC model for high performance Grid Computing and Its Implementation in SmartGridSolve,” Concurrency and Computation: Practice and Experience (to appear), January 2010.  (1.08 MB)
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, hipMAGMA v2.0 : Zenodo, July 2020.
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs,” 2020 IEEE High Performance Extreme Computing Virtual Conference: IEEE, September 2020.  (476.36 KB)
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, hipMAGMA v1.0 : Zenodo, March 2020.
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-12: University of Tennessee, August 2020.  (476.36 KB)
Brown, J., A. Abdelfattah, V. Barra, V. Dobrev, Y. Dudouit, P. Fischer, T. Kolev, D. Medina, M. Min, T. Ratnayaka, et al., CEED ECP Milestone Report: Public release of CEED 2.0 : Zenodo, April 2019.  (4.98 MB)
Brown, J., A. Abdelfattah, V. Barra, N. Beams, J-S. Camier, V. Dobrev, Y. Dudouit, L. Ghaffari, T. Kolev, D. Medina, et al., libCEED: Fast algebra for high-order element-based discretizations,” Journal of Open Source Software, vol. 6, no. 63, pp. 2945, 2021.
Browne, S., J. Dongarra, and A. Trefethen, Numerical Libraries and Tools for Scalable Parallel Cluster Computing,” IEEE Cluster Computing BOF at SC99, Portland, Oregon, January 1999.  (37.38 KB)

Pages