Publications

A

Abdulah, S., Q. Cao, Y. Pei, G. Bosilca, J. Dongarra, M. G. Genton, D. E. Keyes, H. Ltaief, and Y. Sun, “Accelerating Geostatistical Modeling and Prediction With Mixed-Precision Computations: A High-Productivity Approach With PaRSEC,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, issue 4, pp. 964-976, April 2022.

Jagode, H., A. Danalis, and J. Dongarra, “Accelerating NWChem Coupled Cluster through dataflow-based Execution,” The International Journal of High Performance Computing Applications, vol. 32, issue 4, pp. 540–551, July 2018.

Jagode, H., A. Danalis, G. Bosilca, and J. Dongarra, “Accelerating NWChem Coupled Cluster through dataflow-based Execution,” 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015), Krakow, Poland, Springer International Publishing, September 2015.

Jagode, H., A. Danalis, and J. Dongarra, “Accelerating NWChem Coupled Cluster through Dataflow-Based Execution,” The International Journal of High Performance Computing Applications, pp. 1–13, January 2017.

Herrmann, J., G. Bosilca, T. Herault, L. Marchal, Y. Robert, and J. Dongarra, “Assessing the Cost of Redistribution followed by a Computational Kernel: Complexity and Performance Results,” Parallel Computing, vol. 52, pp. 22-41, February 2016.

C

Schuchart, J., P. Samfass, C. Niethammer, J. Gracia, and G. Bosilca, “Callback-based completion notification using MPI Continuations,” Parallel Computing, pp. 102793.

Pei, Y., Q. Cao, G. Bosilca, P. Luszczek, V. Eijkhout, and J. Dongarra, “Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime,” 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), New Orleans, LA, IEEE, May 2020.

Bosilca, G., A. Bouteiller, T. Herault, V. Le Fèvre, Y. Robert, and J. Dongarra, “Comparing Distributed Termination Detection Algorithms for Modern HPC Platforms,” International Journal of Networking and Computing, vol. 12, issue 1, pp. 26-46, January 2022.

Herault, T., J. Schuchart, E. F. Valeev, and G. Bosilca, “Composition of Algorithmic Building Blocks in Template Task Graphs,” 2022 IEEE/ACM Parallel Applications Workshop: Alternatives To MPI+X (PAW-ATM), Dallas, TX, USA, IEEE, January 2023.

D

Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier, and J. Dongarra, “DAGuE: A generic distributed DAG Engine for High Performance Computing,” Parallel Computing, vol. 38, no. 1-2: Elsevier, pp. 27-51, 2012.

Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier, and J. Dongarra, “DAGuE: A Generic Distributed DAG Engine for High Performance Computing,” Proceedings of the Workshops of the 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2011 Workshops), Anchorage, Alaska, USA, IEEE, pp. 1151-1158, 2011.

Bouteiller, A., G. Bosilca, T. Herault, and J. Dongarra, “Data Movement Interfaces to Support Dataflow Runtimes,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-03: University of Tennessee, May 2018.

Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, P. Luszczek, and J. Dongarra, “Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach,” Scalable Computing and Communications: Theory and Practice: John Wiley & Sons, pp. 699-735, March 2013.

Cao, C., T. Herault, G. Bosilca, and J. Dongarra, “Design for a Soft Error Resilient Dynamic Task-based Runtime,” ICL Technical Report, no. ICL-UT-14-04: University of Tennessee, November 2014.

Cao, C., G. Bosilca, T. Herault, and J. Dongarra, “Design for a Soft Error Resilient Dynamic Task-based Runtime,” 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, IEEE, May 2015.

Faverge, M., J. Herrmann, J. Langou, B. Lowery, Y. Robert, and J. Dongarra, “Designing LU-QR Hybrid Solvers for Performance and Stability,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, A. Haidar, T. Herault, J. Kurzak, J. Langou, P. Lemarinier, H. Ltaief, et al., “Distributed Dense Numerical Linear Algebra Algorithms on Massively Parallel Architectures: DPLASMA,” University of Tennessee Computer Science Technical Report, UT-CS-10-660, September 2010.

Bosilca, G., A. Bouteiller, T. Herault, V. Le Fèvre, Y. Robert, and J. Dongarra, “Distributed Termination Detection for HPC Task-Based Environments,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-14: University of Tennessee, June 2018.

Herault, T., Y. Robert, G. Bosilca, R. Harrison, C. Lewis, E. Valeev, and J. Dongarra, “Distributed-Memory Multi-GPU Block-Sparse Tensor Contraction for Electronic Structure,” 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021.

Bosilca, G., T. Herault, and J. Dongarra, DTE: PaRSEC Enabled Libraries and Applications, 2021 Exascale Computing Project Annual Meeting, April 2021.

Bosilca, G., T. Herault, and J. Dongarra, DTE: PaRSEC Enabled Libraries and Applications (Poster), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.

Bosilca, G., T. Herault, and J. Dongarra, DTE: PaRSEC Systems and Interfaces (Poster), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.

Hoque, R., T. Herault, G. Bosilca, and J. Dongarra, “Dynamic Task Discovery in PaRSEC: A data-flow task-based Runtime,” ScalA17, Denver, ACM, September 2017.

E

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An Efficient Distributed Randomized Algorithm for Solving Large Dense Symmetric Indefinite Linear Systems,” Parallel Computing, vol. 40, issue 7, pp. 213-223, July 2014.

Baboulin, M., D. Becker, G. Bosilca, A. Danalis, and J. Dongarra, “An efficient distributed randomized solver with application to large dense linear systems,” ICL Technical Report, no. ICL-UT-12-02, July 2012.

Cao, Q., G. Bosilca, N. Losada, W. Wu, D. Zhong, and J. Dongarra, “Evaluating Data Redistribution in PaRSEC,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 8, pp. 1856-1872, August 2022.

Jagode, H., A. Danalis, R. Hoque, M. Faverge, and J. Dongarra, “Evaluation of Dataflow Programming Models for Electronic Structure Theory,” Concurrency and Computation: Practice and Experience: Special Issue on Parallel and Distributed Algorithms, vol. 2018, issue e4490, pp. 1–20, May 2018.

Pei, Y., G. Bosilca, I. Yamazaki, A. Ida, and J. Dongarra, “Evaluation of Programming Models to Address Load Imbalance on Distributed Multi-Core CPUs: A Case Study with Block Low-Rank Factorization,” PAW-ATM Workshop at SC19, Denver, CO, ACM, November 2019.

Cao, Q., Y. Pei, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications,” Platform for Advanced Scientific Computing Conference (PASC20), Geneva, Switzerland, ACM, June 2020.

F

Bosilca, G., A. Bouteiller, A. Guermouche, T. Herault, Y. Robert, P. Sens, and J. Dongarra, “A Failure Detector for HPC Platforms,” The International Journal of High Performance Computing Applications, vol. 32, issue 1, pp. 139–158, January 2018.

Cao, Q., G. Bosilca, W. Wu, D. Zhong, A. Bouteiller, and J. Dongarra, “Flexible Data Redistribution in a Task-Based Runtime System,” IEEE International Conference on Cluster Computing (Cluster 2020), Kobe, Japan, IEEE, September 2020.

Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, A. Haidar, T. Herault, J. Kurzak, J. Langou, P. Lemarinier, H. Ltaief, et al., “Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA,” Proceedings of the Workshops of the 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2011 Workshops), Anchorage, Alaska, USA, IEEE, pp. 1432-1441, May 2011.

Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, and J. Dongarra, “From Serial Loops to Parallel Execution on Distributed Systems,” International European Conference on Parallel and Distributed Computing (Euro-Par '12), Rhodes, Greece, August 2012.

G

Schuchart, J., P. Nookala, M. Mahdi Javanmard, T. Herault, E. F. Valeev, G. Bosilca, and R. J. Harrison, “Generalized Flow-Graph Programming Using Template Task-Graphs: Initial Implementation and Assessment,” 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Lyon, France, IEEE, July 2022.

Herault, T., Y. Robert, G. Bosilca, and J. Dongarra, “Generic Matrix Multiplication for Multi-GPU Accelerated Distributed-Memory Platforms over PaRSEC,” ScalA'19: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, IEEE, November 2019.

H

Wu, W., A. Bouteiller, G. Bosilca, M. Faverge, and J. Dongarra, “Hierarchical DAG scheduling for Hybrid Distributed Systems,” 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, IEEE, May 2015.

I

Aupy, G., M. Faverge, Y. Robert, J. Kurzak, P. Luszczek, and J. Dongarra, “Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC,” LAWN 277, no. UT-CS-13-709, May 2013.

Mor, O., G. Bosilca, and M. Snir, “Improving the Scaling of an Asynchronous Many-Task Runtime with a Lightweight Communication Engine,” 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, ACM, September 2023.

L

Cao, Q., Y. Pei, K. Akbudak, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems,” 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021.

M

Faverge, M., J. Herrmann, J. Langou, B. Lowery, Y. Robert, and J. Dongarra, “Mixing LU-QR Factorization Algorithms to Design High-Performance Dense Linear Algebra Solvers,” Journal of Parallel and Distributed Computing, vol. 85, pp. 32-46, November 2015.

P

Danalis, A., H. Jagode, and J. Dongarra, PAPI's new Software-Defined Events for in-depth Performance Analysis, Dresden, Germany, 13th Parallel Tools Workshop, September 2019.

Jagode, H., A. Danalis, and J. Dongarra, PAPI's New Software-Defined Events for In-Depth Performance Analysis, Lyon, France, CCDSC 2018: Workshop on Clusters, Clouds, and Data for Scientific Computing, September 2018.

Bosilca, G., A. Bouteiller, A. Danalis, M. Faverge, T. Herault, and J. Dongarra, “PaRSEC: Exploiting Heterogeneity to Enhance Scalability,” IEEE Computing in Science and Engineering, vol. 15, issue 6, pp. 36-45, November 2013.

Danalis, A., H. Jagode, G. Bosilca, and J. Dongarra, “PaRSEC in Practice: Optimizing a Legacy Chemistry Application through Distributed Task-Based Execution,” 2015 IEEE International Conference on Cluster Computing, Chicago, IL, IEEE, September 2015.

Cao, Q., Y. Pei, T. Herault, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Performance Analysis of Tile Low-Rank Cholesky Factorization Using PaRSEC Instrumentation Tools,” Workshop on Programming and Performance Visualization Tools (ProTools 19) at SC19, Denver, CO, ACM, November 2019.

Bosilca, G., A. Bouteiller, T. Herault, P. Lemarinier, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, “Performance Portability of a GPU Enabled Factorization with the DAGuE Framework,” IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 2011.

McCraw, H., J. Ralph, A. Danalis, and J. Dongarra, “Power Monitoring with PAPI for Extreme Scale Architectures and Dataflow-based Programming Models,” 2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-04, Madrid, Spain, IEEE, September 2014.

Danalis, A., G. Bosilca, A. Bouteiller, T. Herault, and J. Dongarra, “PTG: An Abstraction for Unhindered Parallelism,” International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), New Orleans, LA, IEEE Press, November 2014.

Schuchart, J., P. Nookala, T. Herault, E. F. Valeev, and G. Bosilca, “Pushing the Boundaries of Small Tasks: Scalable Low-Overhead Data-Flow Programming in TTG,” 2022 IEEE International Conference on Cluster Computing (CLUSTER), Heidelberg, Germany, IEEE, September 2022.

R

Cao, Q., S. Abdulah, H. Ltaief, M. G. Genton, D. Keyes, and G. Bosilca, “Reducing Data Motion and Energy Consumption of Geospatial Modeling Applications Using Automated Precision Conversion,” 2023 IEEE International Conference on Cluster Computing (CLUSTER), Santa Fe, NM, USA, IEEE, November 2023.
