Publications

A
Altintas, I., K. Marcus, V. Vural, S. Purawat, D. Crawl, G. Antoniu, A. Costan, O. Marcu, P. Balaprakash, R. Cao, et al., “A Collection of White Papers from the BDEC2 Workshop in San Diego, CA,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-13: University of Tennessee, October 2019.
Alomairy, R., M. Gates, S. Cayrols, D. Sukkari, K. Akbudak, A. YarKhan, P. Bagwell, and J. Dongarra, “Communication Avoiding LU with Tournament Pivoting in SLATE,” SLATE Working Notes, no. 18, ICL-UT-22-01, January 2022.
“Computational Science – ICCS 2009, Proceedings of the 9th International Conference,” Lecture Notes in Computer Science: Theoretical Computer Science and General Issues, no. 5544-5545, Baton Rouge, LA, May 2009.
Aliaga, J. I., H. Anzt, T. Grützmacher, E. S. Quintana-Ortí, and A. E. Thomas, “Compressed basis GMRES on high-performance graphics processing units,” The International Journal of High Performance Computing Applications, May 2022.
Aliaga, J. I., H. Anzt, T. Grützmacher, E. S. Quintana-Ortí, and A. E. Thomas, “Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units,” Concurrency and Computation: Practice and Experience, vol. 34, issue 14, June 2022.
Aliaga, J. I., H. Anzt, M. Castillo, J. C. Fernández, G. León, J. Pérez, and E. S. Quintana-Ortí, “Unveiling the Performance-energy Trade-off in Iterative Linear System Solvers for Multithreaded Processors,” Concurrency and Computation: Practice and Experience, vol. 27, issue 4, pp. 885-904, September 2014.
Aliaga, J. I., H. Anzt, E. S. Quintana-Ortí, and A. E. Thomas, “Sparse matrix-vector and matrix-multivector products for the truncated SVD on graphics processors,” Concurrency and Computation: Practice and Experience, August 2023.
Alam, S., R. F. Barrett, H. Jagode, J. A. Kuehn, S. W. Poole, and R. Sankaran, “Impact of Quad-core Cray XT4 System and Software Stack on Scientific Computation,” Euro-Par 2009, Lecture Notes in Computer Science, vol. 5704/2009, Delft, The Netherlands, Springer Berlin / Heidelberg, pp. 334-344, August 2009.
Akbudak, K., P. Bagwell, S. Cayrols, M. Gates, D. Sukkari, A. YarKhan, and J. Dongarra, “SLATE Performance Improvements: QR and Eigenvalues,” SLATE Working Notes, no. 17, ICL-UT-21-02, April 2021.
Ahrens, J., C. M. Biwer, A. Costan, G. Antoniu, M. S. Pérez, N. Stojanovic, R. Badia, O. Beckstein, G. Fox, S. Jha, et al., “A Collection of White Papers from the BDEC2 Workshop in Bloomington, IN,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-15: University of Tennessee, Knoxville, November 2018.
Agullo, E., L. Giraud, A. Guermouche, A. Haidar, J. Roman, and Y. Lee-Tin-Yien, “MaPHyS or the Development of a Parallel Algebraic Domain Decomposition Solver in the Course of the Solstice Project,” Sparse Days 2010 Meeting at CERFACS, Toulouse, France, June 2010.
Agullo, E., B. Hadri, H. Ltaief, and J. Dongarra, “Comparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware,” 2009 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '09) (to appear), 2009.
Agullo, E., C. Augonnet, J. Dongarra, H. Ltaief, R. Namyst, S. Thibault, and S. Tomov, “A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs,” in GPU Computing Gems, Jade Edition, vol. 2: Elsevier, pp. 473-484, 2011.
Agullo, E., L. Giraud, A. Guermouche, A. Haidar, and J. Roman, “Towards a Complexity Analysis of Sparse Hybrid Linear Solvers,” PARA 2010, Reykjavik, Iceland, June 2010.
Agullo, E., C. Coti, T. Herault, J. Langou, S. Peyronnet, A. Rezmerita, F. Cappello, and J. Dongarra, “QCG-OMPI: MPI Applications on Grids,” Future Generation Computer Systems, vol. 27, no. 4, pp. 357-369, March 2010.
Agullo, E., C. Coti, T. Herault, J. Langou, S. Peyronnet, A. Rezmerita, F. Cappello, and J. Dongarra, “QCG-OMPI: MPI Applications on Grids,” Future Generation Computer Systems, vol. 27, no. 4, pp. 357-369, January 2011.
Agullo, E., C. Augonnet, J. Dongarra, H. Ltaief, R. Namyst, S. Thibault, and S. Tomov, “Faster, Cheaper, Better - A Hybridization Methodology to Develop Linear Algebra Software for GPUs,” LAPACK Working Note, no. 230, 2010.
Agullo, E., L. Giraud, A. Guermouche, A. Haidar, and J. Roman, “Parallel algebraic domain decomposition solver for the solution of augmented systems,” Parallel, Distributed, Grid and Cloud Computing for Engineering, Ajaccio, Corsica, France, 12-15 April 2011.
Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, J. Langou, H. Ltaief, and S. Tomov, “LU Factorization for Accelerator-Based Systems,” IEEE/ACS AICCSA 2011, Sharm-El-Sheikh, Egypt, December 2011.
Agullo, E., C. Augonnet, J. Dongarra, H. Ltaief, R. Namyst, R. Nath, J. Roman, S. Thibault, and S. Tomov, Scheduling Cholesky Factorization on Multicore Architectures with GPU Accelerators, Knoxville, TN, 2010 Symposium on Application Accelerators in High-Performance Computing (SAAHPC'10), Poster, July 2010.
Agullo, E., L. Giraud, A. Guermouche, A. Haidar, S. Lanteri, and J. Roman, “Algebraic Schwarz Preconditioning for the Schur Complement: Application to the Time-Harmonic Maxwell Equations Discretized by a Discontinuous Galerkin Method,” The Twentieth International Conference on Domain Decomposition Methods, La Jolla, California, February 2011.
Agullo, E., C. Coti, J. Dongarra, T. Herault, and J. Langou, “QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment,” 24th IEEE International Parallel and Distributed Processing Symposium (also LAWN 224), Atlanta, GA, April 2010.
Agullo, E., J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, and S. Tomov, “Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects,” Journal of Physics: Conference Series, vol. 180, 2009.
Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, H. Ltaief, S. Thibault, and S. Tomov, “QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators,” Proceedings of IPDPS 2011, no. ICL-UT-10-04, Anchorage, AK, October 2010.
Agullo, E., J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, R. Nath, S. Tomov, et al., Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects, Portland, OR, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC09), November 2009.
Agullo, E., M. Altenbernd, H. Anzt, L. Bautista-Gomez, T. Benacchio, L. Bonaventura, H-J. Bungartz, S. Chatterjee, F. M. Ciorba, N. DeBardeleben, et al., “Resiliency in numerical algorithm design for extreme scale simulations,” The International Journal of High Performance Computing Applications, vol. 36, issue 2, pp. 251-285, March 2022.
Agullo, E., G. Bosilca, C. Castagnède, J. Dongarra, H. Ltaief, and S. Tomov, “Matrices Over Runtime Systems at Exascale,” Supercomputing '12 (poster), Salt Lake City, Utah, November 2012.
Aguilera, G., P. J. Teller, M. Taufer, and F. Wolf, “A Systematic Multi-step Methodology for Performance Analysis of Communication Traces of Distributed Applications based on Hierarchical Clustering,” Proc. of the 5th International Workshop on Performance Modeling, Evaluation, and Organization of Parallel and Distributed Systems (PMEO-PDS 2006), no. ICL-UT-05-06, Rhodes Island, Greece, IEEE Computer Society, April 2006.
Agrawal, S., J. Dongarra, K. Seymour, and S. Vadhiyar, “NetSolve: Past, Present, and Future - A Look at a Grid Enabled Server,” Making the Global Infrastructure a Reality: Wiley Publishing, 2003.
Agrawal, S., “Hardware Software Server in NetSolve,” ICL Technical Report, no. ICL-UT-02-02, January 2002.
Agrawal, S., D. Arnold, S. Blackford, J. Dongarra, M. Miller, K. Sagi, Z. Shi, K. Seymour, and S. Vadhiyar, “Users' Guide to NetSolve v1.4.1,” ICL Technical Report, no. ICL-UT-02-05, June 2002.
Agarwal, P., R. A. Alexander, E. Apra, S. Balay, A. S. Bland, J. Colgan, E. D'Azevedo, J. Dongarra, T. Dunigan, M. Fahey, et al., “Cray X1 Evaluation Status Report,” Oak Ridge National Laboratory Report, vol. /-2004/13, January 2004.
Abdulah, S., Q. Cao, Y. Pei, G. Bosilca, J. Dongarra, M. G. Genton, D. E. Keyes, H. Ltaief, and Y. Sun, “Accelerating Geostatistical Modeling and Prediction With Mixed-Precision Computations: A High-Productivity Approach With PaRSEC,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, issue 4, pp. 964-976, April 2022.
Abdelfattah, A., S. Tomov, and J. Dongarra, “Towards Half-Precision Computation for Complex Matrices: A Case Study for Mixed Precision Solvers on GPUs,” ScalA19: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Denver, CO, IEEE, November 2019.
Abdelfattah, A., H. Ltaief, D. Keyes, and J. Dongarra, “Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs,” Concurrency and Computation: Practice and Experience, vol. 28, issue 12, pp. 3447-3465, May 2016.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures,” The 17th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2016), IPDPS 2016, Chicago, IL, IEEE, May 2016.
Abdelfattah, A., T. Costa, J. Dongarra, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Kurzak, P. Luszczek, S. Tomov, et al., “A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines,” ACM Transactions on Mathematical Software (TOMS), vol. 47, no. 3, pp. 1–23, 2021.
Abdelfattah, A., S. Tomov, and J. Dongarra, “Fast Batched Matrix Multiplication for Small Sizes using Half Precision Arithmetic on GPUs,” 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Batched One-Sided Factorizations of Tiny Matrices Using GPUs: Challenges and Countermeasures,” Journal of Computational Science, vol. 26, pp. 226–236, May 2018.
Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., Accelerating Tensor Contractions in High-Order FEM with MAGMA Batched, Atlanta, GA, SIAM Conference on Computer Science and Engineering (SIAM CSE17), Presentation, March 2017.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Performance, Design, and Autotuning of Batched GEMM for GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-16-739: University of Tennessee, February 2016.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Performance, Design, and Autotuning of Batched GEMM for GPUs,” High Performance Computing: 31st International Conference, ISC High Performance 2016, Frankfurt, Germany, June 19-23, 2016, Proceedings, no. 9697: Springer International Publishing, pp. 21–38, 2016.
Abdelfattah, A., P. Ghysels, W. Boukaram, S. Tomov, X. Sherry Li, and J. Dongarra, “Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers,” 2022 International Conference for High Performance Computing, Networking, Storage and Analysis (SC22), Dallas, TX, IEEE Computer Society, pp. 354-367, November 2022.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs,” IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 12, pp. 2700–2712, December 2018.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Performance, Design, and Autotuning of Batched GEMM for GPUs,” The International Supercomputing Conference (ISC High Performance 2016), Frankfurt, Germany, June 2016.
Abdelfattah, A., H. Anzt, E. G. Boman, E. Carson, T. Cojean, J. Dongarra, A. Fox, M. Gates, N. J. Higham, X. S. Li, et al., “A survey of numerical linear algebra methods utilizing mixed-precision arithmetic,” The International Journal of High Performance Computing Applications, vol. 35, no. 4, pp. 344–369, 2021.
Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., “High-Performance Tensor Contractions for GPUs,” University of Tennessee Computer Science Technical Report, no. UT-EECS-16-738: University of Tennessee, January 2016.
Abdelfattah, A., S. Tomov, P. Luszczek, H. Anzt, and J. Dongarra, “GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure,” SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, CO, ACM, November 2023.
Abdelfattah, A., S. Tomov, and J. Dongarra, “Matrix Multiplication on Batches of Small Matrices in Half and Half-Complex Precisions,” Journal of Parallel and Distributed Computing, vol. 145, pp. 188-201, November 2020.
Abdelfattah, A., K. Arturov, C. Cecka, J. Dongarra, C. Freitag, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, et al., “C++ API for Batch BLAS,” SLATE Working Notes, no. 04, ICL-UT-17-12: University of Tennessee, December 2017.