Publications

Abdelfattah, A., H. Anzt, E. Boman, E. Carson, T. Cojean, J. Dongarra, M. Gates, T. Gruetzmacher, N. J. Higham, S. Li, et al., “A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic,” SLATE Working Notes, no. 15, ICL-UT-20-08: University of Tennessee, July 2020.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices using GPUs,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, Springer, Cham, June 2020.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Optimizing GPU Kernels for Irregular Batch Workloads: A Case Study for Cholesky Factorization,” IEEE High Performance Extreme Computing Conference (HPEC’18), Waltham, MA, IEEE, September 2018.

Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., “High-Performance Tensor Contractions for GPUs,” International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Optimizing Batch HGEMM on Small Sizes Using Tensor Cores,” San Jose, CA, GPU Technology Conference (GTC), March 2019.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Novel HPC Techniques to Batch Execution of Many Variable Size BLAS Computations on GPUs,” International Conference on Supercomputing (ICS '17), Chicago, Illinois, ACM, June 2017.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Towards Half-Precision Computation for Complex Matrices: A Case Study for Mixed Precision Solvers on GPUs,” ScalA19: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, IEEE, November 2019.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Fast Batched Matrix Multiplication for Small Sizes using Half Precision Arithmetic on GPUs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.

Abdelfattah, A., T. Costa, J. Dongarra, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Kurzak, P. Luszczek, S. Tomov, et al., “A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines,” ACM Transactions on Mathematical Software (TOMS), vol. 47, no. 3, pp. 1–23, 2021.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Batched One-Sided Factorizations of Tiny Matrices Using GPUs: Challenges and Countermeasures,” Journal of Computational Science, vol. 26, pp. 226–236, May 2018.

Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., “Accelerating Tensor Contractions in High-Order FEM with MAGMA Batched,” Atlanta, GA, SIAM Conference on Computer Science and Engineering (SIAM CSE17), Presentation, March 2017.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Performance, Design, and Autotuning of Batched GEMM for GPUs,” High Performance Computing: 31st International Conference, ISC High Performance 2016, Frankfurt, Germany, June 19-23, 2016, Proceedings, no. 9697: Springer International Publishing, pp. 21–38, 2016.

Abdelfattah, A., P. Ghysels, W. Boukaram, S. Tomov, X. Sherry Li, and J. Dongarra, “Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers,” 2022 International Conference for High Performance Computing, Networking, Storage and Analysis (SC22), Dallas, TX, IEEE Computer Society, pp. 354–367, November 2022.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs,” IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 12, pp. 2700–2712, December 2018.

Abdelfattah, A., H. Anzt, E. G. Boman, E. Carson, T. Cojean, J. Dongarra, A. Fox, M. Gates, N. J. Higham, X. S. Li, et al., “A survey of numerical linear algebra methods utilizing mixed-precision arithmetic,” The International Journal of High Performance Computing Applications, vol. 35, no. 4, pp. 344–369, 2021.

Abdelfattah, A., H. Ltaief, D. Keyes, and J. Dongarra, “Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs,” Concurrency and Computation: Practice and Experience, vol. 28, issue 12, pp. 3447–3465, May 2016.

Abdelfattah, A., S. Tomov, P. Luszczek, H. Anzt, and J. Dongarra, “GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures,” The 17th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2016), IPDPS 2016, Chicago, IL, IEEE, May 2016.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Matrix Multiplication on Batches of Small Matrices in Half and Half-Complex Precisions,” Journal of Parallel and Distributed Computing, vol. 145, pp. 188–201, November 2020.

Abdelfattah, A., K. Arturov, C. Cecka, J. Dongarra, C. Freitag, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, et al., “C++ API for Batch BLAS,” SLATE Working Notes, no. 04, ICL-UT-17-12: University of Tennessee, December 2017.

Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., “PLASMA 17 Performance Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-11: University of Tennessee, June 2017.

Abalenkovs, M., A. Abdelfattah, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, I. Yamazaki, and A. YarKhan, “Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems,” Supercomputing Frontiers and Innovations, vol. 2, no. 4, October 2015.

Abalenkovs, M., N. Bagherpour, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Relton, J. Sistek, D. Stevens, et al., “PLASMA 17.1 Functionality Report,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-10: University of Tennessee, June 2017.

“The Future of Supercomputing: An Interim Report,” National Research Council, Washington, D.C., The National Academies Press, January 2003.
