Publications

Export 281 results:
Filters: Author is Stanimire Tomov  [Clear All Filters]
Journal Article
Iqbal, Z., S. Nooshabadi, I. Yamazaki, S. Tomov, and J. Dongarra, Exploiting Block Structures of KKT Matrices for Efficient Solution of Convex Optimization Problems,” IEEE Access, 2021. DOI: 10.1109/ACCESS.2021.3106054  (1.35 MB)
Lopez, M. G., W. Joubert, V. Larrea, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, Evaluation of Directive-Based Performance Portable Programming Models,” International Journal of High Performance Computing and Networking, vol. 14, issue 2, pp. 165-182. DOI: http://dx.doi.org/10.1504/IJHPCN.2017.10009064  (1.12 MB)
Kolev, T., P. Fischer, M. Min, J. Dongarra, J. Brown, V. Dobrev, T. Warburton, S. Tomov, M. S. Shephard, A. Abdelfattah, et al., Efficient exascale discretizations: High-order finite element methods,” The International Journal of High Performance Computing Applications, pp. 10943420211020803, 2021. DOI: 10.1177/10943420211020803
Voemel, C., S. Tomov, and J. Dongarra, Divide & Conquer on Hybrid GPU-Accelerated Multicore Systems,” SIAM Journal on Scientific Computing (submitted), August 2010.
Voemel, C., S. Tomov, and J. Dongarra, Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems,” SIAM Journal on Scientific Computing, vol. 34(2), pp. C70-C82, April 2012.
Dongarra, J., J. Kurzak, P. Luszczek, and S. Tomov, Dense Linear Algebra on Accelerated Multicore Hardware,” High Performance Scientific Computing: Algorithms and Applications, London, UK, Springer-Verlag, 00 2012.
Tomov, S., J. Langou, J. Dongarra, A. Canning, and L-W. Wang, Conjugate-Gradient Eigenvalue Solvers in Computing Electronic Properties of Nanostructure Architectures,” International Journal of Computational Science and Engineering, vol. 2, no. 3/4, pp. 205-212, 00 2006.  (428.21 KB)
Tomov, S., J. Langou, A. Canning, L-W. Wang, and J. Dongarra, Conjugate-Gradient Eigenvalue Solvers in Computing Electronic Properties of Nanostructure Architectures,” International Journal of Computational Science and Engineering (to appear), January 2005.  (428.21 KB)
Yamazaki, I., S. Tomov, and J. Dongarra, Computing Low-rank Approximation of a Dense Matrix on Multicore CPUs with a GPU and its Application to Solving a Hierarchically Semiseparable Linear System of Equations,” Scientific Programming, 2015.  (648.87 KB)
Sun, J., J. Fu, J. Drake, Q. Zhu, A. Haidar, M. Gates, S. Tomov, and J. Dongarra, Computational Benefit of GPU Optimization for Atmospheric Chemistry Modeling,” Journal of Advances in Modeling Earth Systems, vol. 10, issue 8, pp. 1952–1969, August 2018. DOI: 10.1029/2018MS001276  (3.4 MB)
Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, A Block-Asynchronous Relaxation Method for Graphics Processing Units,” Journal of Parallel and Distributed Computing, vol. 73, issue 12, pp. 1613–1626, December 2013. DOI: http://dx.doi.org/10.1016/j.jpdc.2013.05.008  (1.08 MB)
Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems , no. UT-CS-11-689, December 2011.  (608.95 KB)
Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems,” ICCS 2012, Omaha, NE, June 2012.  (608.95 KB)
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Batched One-Sided Factorizations of Tiny Matrices Using GPUs: Challenges and Countermeasures,” Journal of Computational Science, vol. 26, pp. 226–236, May 2018. DOI: 10.1016/j.jocs.2018.01.005  (3.73 MB)
Haidar, A., T. Dong, P. Luszczek, S. Tomov, and J. Dongarra, Batched matrix computations on hardware accelerators based on GPUs,” International Journal of High Performance Computing Applications, February 2015. DOI: 10.1177/1094342014567546  (2.16 MB)
Kurzak, J., S. Tomov, and J. Dongarra, Autotuning GEMM Kernels for the Fermi GPU,” IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 11, November 2012. DOI: 10.1109/TPDS.2011.311  (742.5 KB)
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs,” IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 12, pp. 2700–2712, December 2018. DOI: 10.1109/TPDS.2018.2842785  (2.53 MB)
Masliah, I., A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,” Parallel Computing, vol. 81, pp. 1–21, January 2019. DOI: 10.1016/j.parco.2018.10.003  (3.27 MB)
Anzt, H., W. Sawyer, S. Tomov, P. Luszczek, and J. Dongarra, Acceleration of GPU-based Krylov solvers via Data Transfer Reduction,” International Journal of High Performance Computing Applications, 2015.
Gates, M., S. Tomov, and J. Dongarra, Accelerating the SVD Two Stage Bidiagonal Reduction and Divide and Conquer Using GPUs,” Parallel Computing, vol. 74, pp. 3–18, May 2018. DOI: 10.1016/j.parco.2017.10.004  (1.34 MB)
Dong, T., A. Haidar, S. Tomov, and J. Dongarra, Accelerating the SVD Bi-Diagonalization of a Batch of Small Matrices using GPUs,” Journal of Computational Science, vol. 26, pp. 237–245, May 2018. DOI: 10.1016/j.jocs.2018.01.007  (2.18 MB)
Tomov, S., R. Nath, and J. Dongarra, Accelerating the Reduction to Upper Hessenberg, Tridiagonal, and Bidiagonal Forms through Hybrid GPU-Based Computing,” Parallel Computing, vol. 36, no. 12, pp. 645-654, 00 2010.  (1.39 MB)
Baboulin, M., A. Buttari, J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov, Accelerating Scientific Computations with Mixed Precision Algorithms,” Computer Physics Communications, vol. 180, issue 12, pp. 2526-2533, December 2009. DOI: 10.1016/j.cpc.2008.11.005  (402.69 KB)
Baboulin, M., J. Dongarra, J. Herrmann, and S. Tomov, Accelerating Linear System Solutions Using Randomization Techniques,” INRIA RR-7616 / LAWN #246 (presented at International AMMCS’11), Waterloo, Ontario, Canada, July 2011.  (358.79 KB)
Baboulin, M., J. Dongarra, J. Herrmann, and S. Tomov, Accelerating Linear System Solutions Using Randomization Techniques,” ACM Transactions on Mathematical Software (also LAWN 246), vol. 39, issue 2, February 2013. DOI: 10.1145/2427023.2427025  (358.79 KB)
Nath, R., S. Tomov, and J. Dongarra, Accelerating GPU Kernels for Dense Linear Algebra,” Proc. of VECPAR'10, Berkeley, CA, June 2010.  (615.07 KB)
Conference Proceedings
Haidar, A., Y. Jia, P. Luszczek, S. Tomov, A. YarKhan, and J. Dongarra, Weighted Dynamic Scheduling with Many Parallelism Grains for Offloading of Numerical Workloads to Multiple Varied Accelerators,” Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA'15), vol. No. 5, Austin, TX, ACM, November 2015.  (347.6 KB)
Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems,” Tenth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (Best Paper), Rhodes Island, Greece, August 2012.  (764.02 KB)
Bosilca, G., A. Bouteiller, T. Herault, P. Lemariner, N. Ohm Saengpatsa, S. Tomov, and J. Dongarra, A Unified HPC Environment for Hybrid Manycore/GPU Distributed Systems,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.
Canning, A., J. Dongarra, J. Langou, O. Marques, S. Tomov, C. Voemel, and L-W. Wang, Towards bulk based preconditioning for quantum dot computations,” IEEE/ACM Proceedings of HPCNano SC06 (to appear), January 2006.  (172.46 KB)
Baboulin, M., S. Tomov, and J. Dongarra, Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures,” PARA 2008, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, Trondheim Norway, May 2008.
Ayala, A., S. Tomov, M. Stoyanov, and J. Dongarra, Scalability Issues in FFT Computation,” International Conference on Parallel Computing Technologies: Springer, pp. 279–287, 2021. DOI: 10.1007/978-3-030-86359-3_21
Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, H. Ltaeif, S. Thibault, and S. Tomov, QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators,” Proceedings of IPDPS 2011, no. ICL-UT-10-04, Anchorage, AK, October 2010.  (468.17 KB)
Canning, A., J. Dongarra, J. Langou, O. Marques, S. Tomov, C. Voemel, and L-W. Wang, Performance evaluation of eigensolvers in nano-structure computations,” IEEE/ACM Proceedings of HPCNano SC06 (to appear), January 2006.  (120.61 KB)
Tomov, S., W. Lu, J. Bernholc, S. Moore, and J. Dongarra, Performance evaluation for petascale quantum simulation tools,” Proceedings of CUG09, Atlanta, GA, May 2009.  (1.09 MB)
Tomov, S., W. Lu, J. Bernholc, S. Moore, and J. Dongarra, Performance Evaluation for Petascale Quantum Simulation Tools,” Proceedings of the Cray Users' Group Meeting, Atlanta, GA, May 2010.
Nath, R., S. Tomov, T. Dong, and J. Dongarra, Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs,” ACM/IEEE Conference on Supercomputing (SC’11), Seattle, WA, November 2011.  (630.63 KB)
Yamazaki, I., S. Tomov, and J. Dongarra, One-Sided Dense Matrix Factorizations on a Multicore with Multiple GPU Accelerators,” The International Conference on Computational Science (ICCS), June 2012.
Agullo, E., J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaeif, P. Luszczek, and S. Tomov, Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects,” Journal of Physics: Conference Series, vol. 180, 00 2009.  (119.37 KB)
Li, Y., J. Dongarra, and S. Tomov, A Note on Auto-tuning GEMM for GPUs,” 9th International Conference on Computational Science (ICCS 2009), no. 5544-5545, Baton Rouge, LA, pp. 884-892, May 2009. DOI: 10.1007/978-3-642-01970-8_89  (236.02 KB)
Du, P., P. Luszczek, S. Tomov, and J. Dongarra, Mixed-Tool Performance Analysis on Hybrid Multicore Architectures,” First International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2010), San Diego, CA, September 2010.  (1.24 MB)
Haidar, A., S. Tomov, J. Dongarra, R. Solcà, and T. C. Schulthess, Leading Edge Hybrid Multi-GPU Algorithms for Generalized Eigenproblems in Electronic Structure Calculations,” International Supercomputing Conference (ISC), Lecture Notes in Computer Science, vol. 7905, Leipzig, Germany, Springer Berlin Heidelberg, pp. 67-80, June 2013. DOI: 10.1007/978-3-642-38750-0_6  (2.14 MB)
Canning, A., J. Dongarra, J. Langou, O. Marques, S. Tomov, C. Voemel, and L-W. Wang, Interior State Computation of Nano Structures,” PARA 2008, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, Trondheim, Norway, May 2008.  (137.12 KB)
Dongarra, J., S. Moore, G. D. Peterson, S. Tomov, J. Allred, V. Natoli, and D. Richie, Exploring New Architectures in Accelerating CFD for Air Force Applications,” Proceedings of the DoD HPCMP User Group Conference, Seattle, Washington, January 2008.  (492.86 KB)
Song, F., S. Tomov, and J. Dongarra, Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems,” 26th ACM International Conference on Supercomputing (ICS 2012), San Servolo Island, Venice, Italy, ACM, June 2012.  (5.88 MB)
Haidar, A., A. Abdelfattah, M. Zounon, P. Wu, S. Pranesh, S. Tomov, and J. Dongarra, The Design of Fast and Energy-Efficient Linear Solvers: On the Potential of Half-Precision Arithmetic and Iterative Refinement Techniques,” International Conference on Computational Science (ICCS 2018), vol. 10860, Wuxi, China, Springer, pp. 586–600, June 2018. DOI: 10.1007/978-3-319-93698-7_45  (487.88 KB)
Tomov, S., R. Nath, H. Ltaeif, and J. Dongarra, Dense Linear Algebra Solvers for Multicore with GPU Accelerators,” Parallel Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on, Atlanta, GA, pp. 1-8, 2010. DOI: 10.1109/IPDPSW.2010.5470941  (1 MB)
Tomov, S., J. Langou, A. Canning, L-W. Wang, and J. Dongarra, Comparison of Nonlinear Conjugate-Gradient methods for computing the Electronic Properties of Nanostructure Architectures,” Proceedings of 5th International Conference on Computational Science (ICCS), Atlanta, GA, USA, Springer's Lecture Notes in Computer Science, pp. 317-325, January 2005.  (172.86 KB)
Horton, M., S. Tomov, and J. Dongarra, A Class of Hybrid LAPACK Algorithms for Multicore and GPU Architectures,” Symposium for Application Accelerators in High Performance Computing (SAAHPC'11), Knoxville, TN, July 2011.  (329.68 KB)
Baboulin, M., S. Donfack, J. Dongarra, L. Grigori, A. Remi, and S. Tomov, A Class of Communication-Avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines,” Proc. of the International Conference on Computational Science (ICCS), vol. 9, pp. 17-26, June 2012.

Pages