Publications

Export 192 results:
Filters: Author is Piotr Luszczek  [Clear All Filters]
2022
Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, Analysis of the Communication and Computation Cost of FFT Libraries towards Exascale,” ICL Technical Report, no. ICL-UT-22-07: Innovative Computing Laboratory, July 2022.  (5.91 MB)
Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, FFT Benchmark Performance Experiments on Systems Targeting Exascale,” ICL Technical Report, no. ICL-UT-22-02, March 2022.  (5.87 MB)
Sid-Lakhdar, W. M., S. Cayrols, D. Bielich, A. Abdelfattah, P. Luszczek, M. Gates, S. Tomov, H. Johansen, D. Williams-Young, T. A. Davis, et al., PAQR: Pivoting Avoiding QR factorization,” ICL Technical Report, no. ICL-UT-22-06, June 2022.  (364.85 KB)
Murray, R., J. Demmel, M. W. Mahoney, B. N. Erichson, M. Melnichenko, O. Asif Malik, L. Grigori, P. Luszczek, M. Dereziński, M. E. Lopes, et al., Randomized Numerical Linear Algebra: A Perspective on the Field with an Eye to Software,” University of California, Berkeley EECS Technical Report, no. UCB/EECS-2022-258: University of California, Berkeley, November 2022. DOI: 10.48550/arXiv.2302.11474  (1.05 MB) (1.54 MB)
Luszczek, P., and C. Brown, Surrogate ML/AI Model Benchmarking for FAIR Principles' Conformance,” 2022 IEEE High Performance Extreme Computing Conference (HPEC): IEEE, September 2022. DOI: 10.1109/HPEC55821.2022.9926401
Lindquist, N., M. Gates, P. Luszczek, and J. Dongarra, Threshold Pivoting for Dense LU Factorization,” ScalAH22: 13th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems , Dallas, Texas, IEEE, November 2022. DOI: 10.1109/ScalAH56622.2022.00010  (721.77 KB)
2021
Lindquist, N., P. Luszczek, and J. Dongarra, Accelerating Restarted GMRES with Mixed Precision Arithmetic,” IEEE Transactions on Parallel and Distributed Systems, June 2021. DOI: 10.1109/TPDS.2021.3090757  (572.4 KB)
Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, Interim Report on Benchmarking FFT Libraries on High Performance Systems,” Innovative Computing Laboratory Technical Report, no. ICL-UT-21-03: University of Tennessee, July 2021.  (2.68 MB)
Penchoff, D. A., E. Valeev, H. Jagode, P. Luszczek, A. Danalis, G. Bosilca, R. J. Harrison, J. Dongarra, and T. L. Windus, An Introduction to High Performance Computing and Its Intersection with Advances in Modeling Rare Earth Elements and Actinides,” Rare Earth Elements and Actinides: Progress in Computational Science Applications, vol. 1388, Washington, DC, American Chemical Society, pp. 3-53, October 2021. DOI: 10.1021/bk-2021-1388.ch001
Jagode, H., H. Anzt, H. Ltaief, and P. Luszczek, Lecture Notes in Computer Science: High Performance Computing , vol. 12761: Springer International Publishing, 2021. DOI: 10.1007/978-3-030-90539-2
Spannaus, A., K. J. H. Law, P. Luszczek, F. Nasrin, C. Putman Micucci, P. K. Liaw, L. J. Santodonato, D. J. Keffer, and V. Maroulas, Materials fingerprinting classification,” Computer Physics Communications, pp. 108019, May Jan. DOI: 10.1016/j.cpc.2021.108019  (3.8 MB)
Tsai, Y. M., P. Luszczek, and J. Dongarra, Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices,” ICL Technical Report, no. ICL-UT-21-05, August 2021.  (3.93 MB)
Hoemmen, M., D. Hollman, C. Trott, D. Sunderland, N. Liber, L-T. Lo, D. Lebrun-Grandie, G. Lopez, P. Caday, S. Knepper, et al., P1673R3: A Free Function Linear algebra Interface Based on the BLAS,” ISO JTC1 SC22 WG22, no. P1673R3: ISO, April 2021.  (858.89 KB)
Abdelfattah, A., T. Costa, J. Dongarra, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Kurzak, P. Luszczek, S. Tomov, et al., A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines,” ACM Transactions on Mathematical Software (TOMS), vol. 47, no. 3, pp. 1–23, 2021. DOI: 10.1145/3431921
Bak, S., O. Hernandez, M. Gates, P. Luszczek, and V. Sarkar, Task-graph scheduling extensions for efficient synchronization and communication,” Proceedings of the ACM International Conference on Supercomputing, pp. 88–101, 2021. DOI: 10.1145/3447818.3461616
Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, Translational process: Mathematical software perspective,” Journal of Computational Science, vol. 52, pp. 101216, 2021. DOI: 10.1016/j.jocs.2020.101216
2020
Gates, M., S. Tomov, H. Anzt, P. Luszczek, and J. Dongarra, Clover: Computational Libraries Optimized via Exascale Research , Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.  (872 KB)
Pei, Y., Q. Cao, G. Bosilca, P. Luszczek, V. Eijkhout, and J. Dongarra, Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime,” 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), New Orleans, LA, IEEE, May 2020. DOI: 10.1109/IPDPSW50202.2020.00127  (1.33 MB)
Zaitsev, D., and P. Luszczek, Docker Container based PaaS Cloud Computing Comprehensive Benchmarks using LAPACK,” Computer Modeling and Intelligent Systems CMIS-2020, Zaporizhzhoa, March 2020.  (451.33 KB)
Lindquist, N., P. Luszczek, and J. Dongarra, Improving the Performance of the GMRES Method using Mixed-Precision Techniques,” Smoky Mountains Computational Sciences & Engineering Conference (SMC2020), August 2020.  (600.33 KB)
Beck, M., T. Moore, P. Luszczek, and A. Danalis, Interoperable Convergence of Storage, Networking, and Computation,” Advances in Information and Communication: Proceedings of the 2019 Future of Information and Communication Conference (FICC), no. 2: Springer International Publishing, pp. 667-690, 2020.  (1.8 MB)
Luszczek, P., and J. Dongarra, The PLASMA Library on CORAL Systems and Beyond (Poster) , Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.  (550.86 KB)
Demmel, J., J. Dongarra, J. Langou, J. Langou, P. Luszczek, and M. Mahoney, Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),” LAPACK Working Notes, no. 297, ICL-UT-20-07: University of Tennessee.  (1.41 MB)
Lindquist, N., P. Luszczek, and J. Dongarra, Replacing Pivoting in Distributed Gaussian Elimination with Randomized Techniques,” 2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA), Atlanta, GA, IEEE, November 2020.  (184.6 KB)
Luszczek, P., Y. Tsai, N. Lindquist, H. Anzt, and J. Dongarra, Scalable Data Generation for Evaluating Mixed-Precision Solvers,” 2020 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, IEEE, September 2020. DOI: 10.1109/HPEC43674.2020.9286145  (1.3 MB)
Abdelfattah, A., T. Costa, J. Dongarra, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Kurzak, P. Luszczek, S. Tomov, et al., A Set of Batched Basic Linear Algebra Subprograms,” ACM Transactions on Mathematical Software, October 2020.
Abdelfattah, A., H. Anzt, E. Boman, E. Carson, T. Cojean, J. Dongarra, M. Gates, T. Gruetzmacher, N. J. Higham, S. Li, et al., A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic,” SLATE Working Notes, no. 15, ICL-UT-20-08: University of Tennessee, July 2020.  (3.98 MB)
Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, Translational Process: Mathematical Software Perspective,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-11, August 2020.  (752.59 KB)
Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, Translational Process: Mathematical Software Perspective,” Journal of Computational Science, September 2020. DOI: 10.1016/j.jocs.2020.101216  (752.59 KB)
Tsai, Y., P. Luszczek, and J. Dongarra, Using Quantized Integer in LU Factorization with Partial Pivoting (Poster) , Seattle, WA, SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20), February 2020.  (6.65 MB)
2019
Antoniu, G., A. Costan, O. Marcu, M. S. Pérez, N. Stojanovic, R. M. Badia, M. Vázquez, S. Girona, M. Beck, T. Moore, et al., A Collection of White Papers from the BDEC2 Workshop in Poznan, Poland,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-10: University of Tennessee, Knoxville, May 2019.  (5.82 MB)
Luszczek, P., I. Yamazaki, and J. Dongarra, Increasing Accuracy of Iterative Refinement in Limited Floating-Point Arithmetic on Half-Precision Accelerators,” IEEE High Performance Extreme Computing Conference (HPEC 2019), Best Paper Finalist, Waltham, MA, IEEE, September 2019.  (470.21 KB)
Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, P. Wu, I. Yamazaki, A. YarKhan, M. Abalenkovs, N. Bagherpour, et al., PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,” ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019. DOI: 10.1145/3264491  (7.5 MB)
Danalis, A., H. Jagode, T. Herault, P. Luszczek, and J. Dongarra, Software-Defined Events through PAPI,” 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019. DOI: 10.1109/IPDPSW.2019.00069  (446.41 KB)
2018
Dongarra, J., M. Gates, J. Kurzak, P. Luszczek, and Y. Tsai, Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators,” Proceedings of the IEEE, vol. 106, issue 11, pp. 2040–2055, November 2018. DOI: 10.1109/JPROC.2018.2868961  (2.53 MB)
Luszczek, P., J. Kurzak, I. Yamazaki, D. Keffer, V. Maroulas, and J. Dongarra, Autotuning Techniques for Performance-Portable Point Set Registration in 3D,” Supercomputing Frontiers and Innovations, vol. 5, no. 4, December 2018. DOI: 10.14529/jsfi180404  (720.15 KB)
Dongarra, J., I. Duff, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Hogg, P. Valero Lara, P. Luszczek, M. Zounon, et al., Batched BLAS (Basic Linear Algebra Subprograms) 2018 Specification , July 2018.  (483.05 KB)
Abdelfattah, A., M. Gates, J. Kurzak, P. Luszczek, and J. Dongarra, Implementation of the C++ API for Batch BLAS,” SLATE Working Notes, no. 07, ICL-UT-18-04: Innovative Computing Laboratory, University of Tennessee, June 2018.  (1.07 MB)
Kurzak, J., M. Gates, I. Yamazaki, A. Charara, A. YarKhan, J. Finney, G. Ragghianti, P. Luszczek, and J. Dongarra, Linear Systems Performance Report,” SLATE Working Notes, no. 08, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.  (1.64 MB)
Kurzak, J., M. Gates, A. YarKhan, I. Yamazaki, P. Wu, P. Luszczek, J. Finney, and J. Dongarra, Parallel BLAS Performance Report,” SLATE Working Notes, no. 05, ICL-UT-18-01: University of Tennessee, April 2018.  (4.39 MB)
Kurzak, J., M. Gates, A. YarKhan, I. Yamazaki, P. Luszczek, J. Finney, and J. Dongarra, Parallel Norms Performance Report,” SLATE Working Notes, no. 06, ICL-UT-18-06: Innovative Computing Laboratory, University of Tennessee, June 2018.  (1.13 MB)
Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale,” SIAM Review, vol. 60, issue 4, pp. 808–865, November 2018. DOI: 10.1137/17M1117732  (2.5 MB)
Dorris, J., A. YarKhan, J. Kurzak, P. Luszczek, and J. Dongarra, Task Based Cholesky Decomposition on Xeon Phi Architectures using OpenMP,” International Journal of Computational Science and Engineering (IJCSE), vol. 17, no. 3, October 2018. DOI: http://dx.doi.org/10.1504/IJCSE.2018.095851
2017
Gates, M., J. Kurzak, P. Luszczek, Y. Pei, and J. Dongarra, Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices,” Parallel and Distributed Processing Symposium Workshops (IPDPSW), Orlando, FL, IEEE, June 2017. DOI: 10.1109/IPDPSW.2017.18
Anzt, H., J. Dongarra, M. Gates, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, Bringing High Performance Computing to Big Data Algorithms,” Handbook of Big Data Technologies: Springer, 2017. DOI: 10.1007/978-3-319-49340-4  (1.22 MB)
Abdelfattah, A., K. Arturov, C. Cecka, J. Dongarra, C. Freitag, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, et al., C++ API for Batch BLAS,” SLATE Working Notes, no. 04, ICL-UT-17-12: University of Tennessee, December 2017.  (1.89 MB)
Gates, M., P. Luszczek, A. Abdelfattah, J. Kurzak, J. Dongarra, K. Arturov, C. Cecka, and C. Freitag, C++ API for BLAS and LAPACK,” SLATE Working Notes, no. 02, ICL-UT-17-03: Innovative Computing Laboratory, University of Tennessee, June 2017.  (1.12 MB)

Pages