Publications

Filters: Author is Stan Tomov
2021
Ayala, A., S. Tomov, A. Haidar, M. Stoyanov, S. Cayrols, J. Li, G. Bosilca, and J. Dongarra, Accelerating FFT towards Exascale Computing: NVIDIA GPU Technology Conference (GTC2021), 2021.
Kolev, T., P. Fischer, M. Min, J. Dongarra, J. Brown, V. Dobrev, T. Warburton, S. Tomov, M. S. Shephard, A. Abdelfattah, et al., “Efficient exascale discretizations: High-order finite element methods,” The International Journal of High Performance Computing Applications, pp. 10943420211020803, 2021. DOI: 10.1177/10943420211020803
Iqbal, Z., S. Nooshabadi, I. Yamazaki, S. Tomov, and J. Dongarra, “Exploiting Block Structures of KKT Matrices for Efficient Solution of Convex Optimization Problems,” IEEE Access, 2021. DOI: 10.1109/ACCESS.2021.3106054
Ayala, A., S. Tomov, P. Luszczek, S. Cayrols, G. Ragghianti, and J. Dongarra, “Interim Report on Benchmarking FFT Libraries on High Performance Systems,” Innovative Computing Laboratory Technical Report, no. ICL-UT-21-03: University of Tennessee, July 2021.
Brown, J., A. Abdelfattah, V. Barra, N. Beams, J-S. Camier, V. Dobrev, Y. Dudouit, L. Ghaffari, T. Kolev, D. Medina, et al., “libCEED: Fast algebra for high-order element-based discretizations,” Journal of Open Source Software, vol. 6, no. 63, pp. 2945, 2021. DOI: 10.21105/joss.02945
Sharp, D., M. Stoyanov, S. Tomov, and J. Dongarra, “A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms,” ICL Technical Report, no. ICL-UT-21-04: University of Tennessee, August 2021.
Ayala, A., S. Tomov, M. Stoyanov, and J. Dongarra, “Scalability Issues in FFT Computation,” International Conference on Parallel Computing Technologies: Springer, pp. 279–287, 2021. DOI: 10.1007/978-3-030-86359-3_21
Abdelfattah, A., T. Costa, J. Dongarra, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Kurzak, P. Luszczek, S. Tomov, et al., “A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines,” ACM Transactions on Mathematical Software (TOMS), vol. 47, no. 3, pp. 1–23, 2021. DOI: 10.1145/3431921
Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, “Translational process: Mathematical software perspective,” Journal of Computational Science, vol. 52, pp. 101216, 2021. DOI: 10.1016/j.jocs.2020.101216
2020
Lopez, F., E. Chow, S. Tomov, and J. Dongarra, “Asynchronous SGD for DNN Training on Shared-Memory Parallel Architectures,” Workshop on Scalable Deep Learning over Parallel And Distributed Infrastructures (ScaDL 2020), May 2020.
Lopez, F., E. Chow, S. Tomov, and J. Dongarra, “Asynchronous SGD for DNN Training on Shared-Memory Parallel Architectures,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-04: University of Tennessee, Knoxville, March 2020.
Kolev, T., P. Fischer, A. Abdelfattah, S. Ananthan, V. Barra, N. Beams, R. Bleile, J. Brown, R. Carson, J-S. Camier, et al., “CEED ECP Milestone Report: Improve Performance and Capabilities of CEED-Enabled ECP Applications on Summit/Sierra,” ECP Milestone Reports: Zenodo, May 2020. DOI: 10.5281/zenodo.3860804
Gates, M., S. Tomov, H. Anzt, P. Luszczek, and J. Dongarra, Clover: Computational Libraries Optimized via Exascale Research, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, “Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs,” 2020 IEEE High Performance Extreme Computing Virtual Conference: IEEE, September 2020.
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, “Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-12: University of Tennessee, August 2020.
Tomov, S., A. Ayala, A. Haidar, and J. Dongarra, “FFT-ECP API and High-Performance Library Prototype for 2-D and 3-D FFTs on Large-Scale Heterogeneous Systems with GPUs,” ECP Milestone Report, no. FFT-ECP STML13-27: Innovative Computing Laboratory, University of Tennessee, January 2020.
Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, “heFFTe: Highly Efficient FFT for Exascale,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020. DOI: 10.1007/978-3-030-50371-0_19
Ayala, A., S. Tomov, J. Dongarra, and A. Haidar, heFFTe: Highly Efficient FFT for Exascale (Poster), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, heFFTe: Highly Efficient FFT for Exascale (Poster): NVIDIA GPU Technology Conference (GTC2020), October 2020.
Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, heFFTe: Highly Efficient FFT for Exascale (Poster), Seattle, WA, SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20), February 2020.
Beams, N., A. Abdelfattah, S. Tomov, J. Dongarra, T. Kolev, and Y. Dudouit, “High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs,” 2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA): IEEE, November 2020.
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, hipMAGMA v1.0: Zenodo, March 2020. DOI: 10.5281/zenodo.3908549
Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, hipMAGMA v2.0: Zenodo, July 2020. DOI: 10.5281/zenodo.3928667
Wong, K., S. Tomov, D. Nichols, R. Febbo, F. Lopez, J. Halloy, and X. Ma, How to Build Your Own Deep Neural Network: PEARC20, July 2020.
Tomov, S., K. Wong, J. Dongarra, R. Archibald, E. Chow, E. D'Azevedo, M. Eisenbach, R. Febbo, F. Lopez, D. Nichols, et al., Integrating Deep Learning in Domain Science at Exascale (MagmaDNN), virtual, DOD HPCMP seminar, December 2020.
Archibald, R., E. Chow, E. D'Azevedo, J. Dongarra, M. Eisenbach, R. Febbo, F. Lopez, D. Nichols, S. Tomov, K. Wong, et al., “Integrating Deep Learning in Domain Sciences at Exascale,” 2020 Smoky Mountains Computational Sciences and Engineering Conference (SMC 2020), August 2020.
Archibald, R., E. Chow, E. D'Azevedo, J. Dongarra, M. Eisenbach, R. Febbo, F. Lopez, D. Nichols, S. Tomov, K. Wong, et al., “Integrating Deep Learning in Domain Sciences at Exascale,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-10: University of Tennessee, August 2020.
Abdelfattah, A., S. Tomov, and J. Dongarra, “Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices using GPUs,” International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, Springer, Cham, June 2020. DOI: 10.1007/978-3-030-50417-5_18
Anzt, H., T. Cojean, C. Yen-Chen, J. Dongarra, G. Flegar, P. Nayak, S. Tomov, Y. M. Tsai, and W. Wang, “Load-Balancing Sparse Matrix Vector Product Kernels on GPUs,” ACM Transactions on Parallel Computing, vol. 7, issue 1, March 2020. DOI: 10.1145/3380930
Farhan, M. Al, A. Abdelfattah, S. Tomov, M. Gates, D. Sukkari, A. Haidar, R. Rosenberg, and J. Dongarra, “MAGMA Templates for Scalable Linear Algebra on Emerging Architectures,” The International Journal of High Performance Computing Applications, vol. 34, issue 6, pp. 645-658, November 2020. DOI: 10.1177/1094342020938421
Tomov, S., MATEDOR: MAtrix, TEnsor, and Deep-learning Optimized Routines, Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, February 2020.
Abdelfattah, A., S. Tomov, and J. Dongarra, “Matrix Multiplication on Batches of Small Matrices in Half and Half-Complex Precisions,” Journal of Parallel and Distributed Computing, vol. 145, pp. 188-201, November 2020. DOI: 10.1016/j.jpdc.2020.07.001
Haidar, A., H. Bayraktar, S. Tomov, J. Dongarra, and N. J. Higham, “Mixed-Precision Iterative Refinement using Tensor Cores on GPUs to Accelerate Solution of Linear Systems,” Proceedings of the Royal Society A, vol. 476, issue 2243, November 2020. DOI: 10.1098/rspa.2020.0110
Haidar, A., H. Bayraktar, S. Tomov, J. Dongarra, and N. J. Higham, “Mixed-Precision Solution of Linear Systems Using Accelerator-Based Computing,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-05: University of Tennessee, May 2020.
Wong, K., S. Tomov, and J. Dongarra, “Project-Based Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning,” The Journal of Computational Science Education, vol. 11, issue 1, pp. 36-44, January 2020. DOI: 10.22369/issn.2153-4136/11/1/7
Lu, Y., I. Yamazaki, F. Ino, Y. Matsushita, S. Tomov, and J. Dongarra, “Reducing the Amount of out-of-core Data Access for GPU-Accelerated Randomized SVD,” Concurrency and Computation: Practice and Experience, April 2020. DOI: 10.1002/cpe.5754
Abdelfattah, A., T. Costa, J. Dongarra, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Kurzak, P. Luszczek, S. Tomov, et al., “A Set of Batched Basic Linear Algebra Subprograms,” ACM Transactions on Mathematical Software, October 2020.
Abdelfattah, A., H. Anzt, E. Boman, E. Carson, T. Cojean, J. Dongarra, M. Gates, T. Gruetzmacher, N. J. Higham, S. Li, et al., “A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic,” SLATE Working Notes, no. 15, ICL-UT-20-08: University of Tennessee, July 2020.
Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, “Translational Process: Mathematical Software Perspective,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-11, August 2020.
Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, “Translational Process: Mathematical Software Perspective,” Journal of Computational Science, September 2020. DOI: 10.1016/j.jocs.2020.101216
2019
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, “Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Parallel Computing, vol. 81, pp. 1–21, January 2019. DOI: 10.1016/j.parco.2018.10.003
Tomov, S., A. Abdelfattah, V. Barra, N. Beams, J. Brown, J-S. Camier, V. Dobrev, J. Dongarra, Y. Dudouit, P. Fischer, et al., CEED ECP Milestone Report: Performance Tuning of CEED Software and 1st and 2nd Wave Apps: Zenodo, October 2019. DOI: 10.5281/zenodo.3477618
Brown, J., A. Abdelfattah, V. Barra, V. Dobrev, Y. Dudouit, P. Fischer, T. Kolev, D. Medina, M. Min, T. Ratnayaka, et al., CEED ECP Milestone Report: Public release of CEED 2.0: Zenodo, April 2019. DOI: 10.5281/zenodo.2641316
Tomov, S., A. Haidar, A. Ayala, D. Schultz, and J. Dongarra, “Design and Implementation for FFT-ECP on Distributed Accelerated Systems,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-05: University of Tennessee, April 2019.
Lopez, M. G., W. Joubert, V. Larrea, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, “Evaluation of Directive-Based Performance Portable Programming Models,” International Journal of High Performance Computing and Networking, vol. 14, issue 2, pp. 165-182. DOI: http://dx.doi.org/10.1504/IJHPCN.2017.10009064
Abdelfattah, A., S. Tomov, and J. Dongarra, “Fast Batched Matrix Multiplication for Small Sizes using Half Precision Arithmetic on GPUs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
Tomov, S., A. Haidar, A. Ayala, D. Schultz, and J. Dongarra, FFT-ECP Fast Fourier Transform, Houston, TX, 2019 ECP Annual Meeting (Research Poster), January 2019.
Tomov, S., A. Haidar, A. Ayala, H. Shaiek, and J. Dongarra, “FFT-ECP Implementation Optimizations and Features Phase,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-12: University of Tennessee, October 2019.
