Publications

A
Agullo, E., L. Giraud, A. Guermouche, A. Haidar, J. Roman, and Y. Lee-Tin-Yien, “MaPHyS or the Development of a Parallel Algebraic Domain Decomposition Solver in the Course of the Solstice Project,” Sparse Days 2010 Meeting at CERFACS, Toulouse, France, June 2010.
Agullo, E., J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaeif, P. Luszczek, and S. Tomov, “Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects,” Journal of Physics: Conference Series, vol. 180, 2009.
Agullo, E., C. Augonnet, J. Dongarra, H. Ltaeif, R. Namyst, S. Thibault, and S. Tomov, “A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs,” in GPU Computing Gems, Jade Edition, vol. 2: Elsevier, pp. 473-484, 2011.
Agullo, E., L. Giraud, A. Guermouche, A. Haidar, and J. Roman, “Towards a Complexity Analysis of Sparse Hybrid Linear Solvers,” PARA 2010, Reykjavik, Iceland, June 2010.
Agullo, E., C. Coti, T. Herault, J. Langou, S. Peyronnet, A. Rezmerita, F. Cappello, and J. Dongarra, “QCG-OMPI: MPI Applications on Grids,” Future Generation Computer Systems, vol. 27, no. 4, pp. 357-369, March 2010.
Agullo, E., C. Augonnet, J. Dongarra, H. Ltaeif, R. Namyst, S. Thibault, and S. Tomov, “Faster, Cheaper, Better - A Hybridization Methodology to Develop Linear Algebra Software for GPUs,” LAPACK Working Note, no. 230, 2010.
Agullo, E., C. Coti, T. Herault, J. Langou, S. Peyronnet, A. Rezmerita, F. Cappello, and J. Dongarra, “QCG-OMPI: MPI Applications on Grids,” Future Generation Computer Systems, vol. 27, no. 4, pp. 435-369, January 2011.
Agullo, E., L. Giraud, A. Guermouche, A. Haidar, and J. Roman, “Parallel algebraic domain decomposition solver for the solution of augmented systems,” Parallel, Distributed, Grid and Cloud Computing for Engineering, Ajaccio, Corsica, France, 12-15 April 2011.
Agullo, E., C. Augonnet, J. Dongarra, H. Ltaeif, R. Namyst, R. Nath, J. Roman, S. Thibault, and S. Tomov, Scheduling Cholesky Factorization on Multicore Architectures with GPU Accelerators, Knoxville, TN, 2010 Symposium on Application Accelerators in High-Performance Computing (SAAHPC'10), Poster, July 2010.
Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, J. Langou, H. Ltaeif, and S. Tomov, “LU Factorization for Accelerator-Based Systems,” IEEE/ACS AICCSA 2011, Sharm-El-Sheikh, Egypt, December 2011.
Agullo, E., J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaeif, P. Luszczek, R. Nath, S. Tomov, et al., Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects, Portland, OR, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC09), November 2009.
Agullo, E., L. Giraud, A. Guermouche, A. Haidar, S. Lanteri, and J. Roman, “Algebraic Schwarz Preconditioning for the Schur Complement: Application to the Time-Harmonic Maxwell Equations Discretized by a Discontinuous Galerkin Method,” The Twentieth International Conference on Domain Decomposition Methods, La Jolla, California, February 2011.
Agullo, E., C. Coti, J. Dongarra, T. Herault, and J. Langou, “QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment,” 24th IEEE International Parallel and Distributed Processing Symposium (also LAWN 224), Atlanta, GA, April 2010.
Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, H. Ltaeif, S. Thibault, and S. Tomov, “QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators,” Proceedings of IPDPS 2011, no. ICL-UT-10-04, Anchorage, AK, October 2010.
Agullo, E., M. Altenbernd, H. Anzt, L. Bautista-Gomez, T. Benacchio, L. Bonaventura, H-J. Bungartz, S. Chatterjee, F. M. Ciorba, N. DeBardeleben, et al., “Resiliency in numerical algorithm design for extreme scale simulations,” The International Journal of High Performance Computing Applications, vol. 36, issue 2, pp. 251-285, March 2022.
Agullo, E., B. Hadri, H. Ltaeif, and J. Dongarra, “Comparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware,” 2009 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '09) (to appear), 2009.
Ahrens, J., C. M. Biwer, A. Costan, G. Antoniu, M. S. Pérez, N. Stojanovic, R. Badia, O. Beckstein, G. Fox, S. Jha, et al., “A Collection of White Papers from the BDEC2 Workshop in Bloomington, IN,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-15: University of Tennessee, Knoxville, November 2018.
Akbudak, K., P. Bagwell, S. Cayrols, M. Gates, D. Sukkari, A. YarKhan, and J. Dongarra, “SLATE Performance Improvements: QR and Eigenvalues,” SLATE Working Notes, no. 17, ICL-UT-21-02, April 2021.
Alam, S., R. F. Barrett, H. Jagode, J. A. Kuehn, S. W. Poole, and R. Sankaran, “Impact of Quad-core Cray XT4 System and Software Stack on Scientific Computation,” Euro-Par 2009, Lecture Notes in Computer Science, vol. 5704/2009, Delft, The Netherlands, Springer Berlin / Heidelberg, pp. 334-344, August 2009.
Aliaga, J. I., H. Anzt, T. Grützmacher, E. S. Quintana-Ortí, and A. E. Thomas, “Compressed basis GMRES on high-performance graphics processing units,” The International Journal of High Performance Computing Applications, May 2022.
Aliaga, J. I., H. Anzt, T. Grützmacher, E. S. Quintana-Orti, and A. E. Thomas, “Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units,” Concurrency and Computation: Practice and Experience, vol. 34, issue 14, June 2022.
Aliaga, J. I., H. Anzt, E. S. Quintana-Orti, and A. E. Thomas, “Sparse matrix-vector and matrix-multivector products for the truncated SVD on graphics processors,” Concurrency and Computation: Practice and Experience, August 2023.
Aliaga, J. I., H. Anzt, M. Castillo, J. C. Fernández, G. León, J. Pérez, and E. S. Quintana-Orti, “Unveiling the Performance-energy Trade-off in Iterative Linear System Solvers for Multithreaded Processors,” Concurrency and Computation: Practice and Experience, vol. 27, issue 4, pp. 885-904, September 2014.
“Computational Science – ICCS 2009, Proceedings of the 9th International Conference,” Lecture Notes in Computer Science: Theoretical Computer Science and General Issues, no. 5544-5545, Baton Rouge, LA, May 2009.
Alomairy, R., M. Gates, S. Cayrols, D. Sukkari, K. Akbudak, A. YarKhan, P. Bagwell, and J. Dongarra, “Communication Avoiding LU with Tournament Pivoting in SLATE,” SLATE Working Notes, no. 18, ICL-UT-22-01, January 2022.
Altintas, I., K. Marcus, V. Vural, S. Purawat, D. Crawl, G. Antoniu, A. Costan, O. Marcu, P. Balaprakash, R. Cao, et al., “A Collection of White Papers from the BDEC2 Workshop in San Diego, CA,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-13: University of Tennessee, October 2019.
Alvaro, W., J. Kurzak, and J. Dongarra, “Fast and Small Short Vector SIMD Matrix Multiplication Kernels for the CELL Processor,” University of Tennessee Computer Science Technical Report, no. UT-CS-08-609, (also LAPACK Working Note 189), January 2008.
Alvaro, W., J. Kurzak, and J. Dongarra, “Optimizing Matrix Multiplication for a Short-Vector SIMD Architecture - CELL Processor,” Parallel Computing, vol. 35, pp. 138-150, 2009.
Anderson, E., Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, et al., “LAPACK Users' Guide, 3rd ed.,” Philadelphia: Society for Industrial and Applied Mathematics, January 1999.
Andersson, U., and P. Mucci, “Analysis and Optimization of Yee_Bench using Hardware Performance Counters,” Proceedings of Parallel Computing 2005 (ParCo), Malaga, Spain, January 2005.
Angskun, T., G. Bosilca, and J. Dongarra, “Binomial Graph: A Scalable and Fault-Tolerant Logical Network Topology,” Proceedings of The Fifth International Symposium on Parallel and Distributed Processing and Applications (ISPA07), Niagara Falls, Canada, Springer, August 2007.
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, “Scalable Fault Tolerant Protocol for Parallel Runtime Environments,” 2006 Euro PVM/MPI, no. ICL-UT-06-12, Bonn, Germany, 2006.
Angskun, T., G. Bosilca, G. Fagg, J. Pjesivac–Grbovic, and J. Dongarra, “Reliability Analysis of Self-Healing Network using Discrete-Event Simulation,” Proceedings of Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07): IEEE Computer Society, pp. 437-444, May 2007.
Angskun, T., G. Bosilca, B. Vander Zanden, and J. Dongarra, “Optimal Routing in Binomial Graph Networks,” The International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), Adelaide, Australia, IEEE Computer Society, December 2007.
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, “Self-Healing Network for Scalable Fault Tolerant Runtime Environments,” DAPSYS 2006, 6th Austrian-Hungarian Workshop on Distributed and Parallel Systems, Innsbruck, Austria, January 2006.
Angskun, T., G. Bosilca, and J. Dongarra, “Self-Healing in Binomial Graph Networks,” 2nd International Workshop On Reliability in Decentralized Distributed Systems (RDDS 2007), Vilamoura, Algarve, Portugal, November 2007.
Angskun, T., G. Fagg, G. Bosilca, J. Pjesivac–Grbovic, and J. Dongarra, “Self-Healing Network for Scalable Fault-Tolerant Runtime Environments,” Future Generation Computer Systems, vol. 26, no. 3, pp. 479-485, March 2010.
Antoniu, G., A. Costan, O. Marcu, M. S. Pérez, N. Stojanovic, R. M. Badia, M. Vázquez, S. Girona, M. Beck, T. Moore, et al., “A Collection of White Papers from the BDEC2 Workshop in Poznan, Poland,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-10: University of Tennessee, Knoxville, May 2019.
Anzt, H., M. Gates, J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Preconditioned Krylov Solvers on GPUs,” Parallel Computing, June 2017.
Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, “Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems,” ICCS 2012, Omaha, NE, June 2012.
Anzt, H., J. Dongarra, M. Gates, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, “Bringing High Performance Computing to Big Data Algorithms,” Handbook of Big Data Technologies: Springer, 2017.
Anzt, H., G. Collins, J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Flexible Batched Sparse Matrix-Vector Product on GPUs,” 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA '17), Denver, CO, ACM Press, November 2017.
Anzt, H., A. Huebl, and X. S. Li, “Then and Now: Improving Software Portability, Productivity, and 100× Performance,” Computing in Science & Engineering, pp. 1-10, April 2024.
Anzt, H., Y. Chen Chen, T. Cojean, J. Dongarra, G. Flegar, P. Nayak, E. S. Quintana-Orti, Y. M. Tsai, and W. Wang, “Towards Continuous Benchmarking,” Platform for Advanced Scientific Computing Conference (PASC 2019), Zurich, Switzerland, ACM Press, June 2019.
Anzt, H., E. Chow, T. Huckle, and J. Dongarra, “Batched Generation of Incomplete Sparse Approximate Inverses on GPUs,” Proceedings of the 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pp. 49–56, November 2016.
Anzt, H., I. Yamazaki, M. Hoemmen, E. Boman, and J. Dongarra, “Solver Interface & Performance on Cori,” Innovative Computing Laboratory Technical Report, no. ICL-UT-18-05: University of Tennessee, June 2018.
Anzt, H., and E. S. Quintana-Orti, “Improving the Energy Efficiency of Sparse Linear System Solvers on Multicore and Manycore Systems,” Philosophical Transactions of the Royal Society A -- Mathematical, Physical and Engineering Sciences, vol. 372, issue 2018, July 2014.
Anzt, H., J. Dongarra, and V. Heuveline, “Weighted Block-Asynchronous Relaxation for GPU-Accelerated Systems,” SIAM Journal on Computing (submitted), March 2012.
Anzt, H., J. Dongarra, and E. S. Quintana-Orti, “Tuning Stationary Iterative Solvers for Fault Resilience,” 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA15), Austin, TX, ACM, November 2015.
Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems, no. UT-CS-11-689, December 2011.