Publications

Journal Article
Grützmacher, T., H. Anzt, and E. S. Quintana‐Ortí, “Using Ginkgo's memory accessor for improving the accuracy of memory‐bound low precision BLAS,” Software: Practice and Experience, vol. 53, issue 1, pp. 81-98, January 2023. DOI: 10.1002/spe.3041
Chow, E., H. Anzt, J. Scott, and J. Dongarra, “Using Jacobi Iterations and Blocking for Solving Sparse Triangular Systems in Incomplete Factorization Preconditioning,” Journal of Parallel and Distributed Computing, vol. 119, pp. 219–230, November 2018. DOI: 10.1016/j.jpdc.2018.04.017
Zhong, D., Q. Cao, G. Bosilca, and J. Dongarra, “Using long vector extensions for MPI reductions,” Parallel Computing, vol. 109, pp. 102871, March 2022. DOI: 10.1016/j.parco.2021.102871
Tomov, S., M. Faverge, P. Luszczek, and J. Dongarra, “Using MAGMA with PGI Fortran,” PGI Insider, November 2010.
Buttari, A., J. Dongarra, J. Kurzak, P. Luszczek, and S. Tomov, “Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy,” ACM Transactions on Mathematical Software, vol. 34, no. 4, pp. 17-22, 2008.
Giraud, L., A. Haidar, and S. Pralet, “Using multiple levels of parallelism to enhance the performance of domain decomposition solvers,” Parallel Computing, vol. 36, no. 5-6: Elsevier journals, pp. 285-296, 2010.
Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Variable-Size Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioning on Graphics Processors,” Parallel Computing, vol. 81, pp. 131-146, January 2019. DOI: 10.1016/j.parco.2017.12.006
Casanova, H., T. Bartol, F. Berman, A. Birnbaum, J. Dongarra, M. Ellisman, M. Faerman, E. Gockay, M. Miller, G. Obertelli, et al., “The Virtual Instrument: Support for Grid-enabled Scientific Simulations,” International Journal of High Performance Computing Applications, vol. 18, no. 1, pp. 3-17, January 2004.
Casanova, H., T. Bartol, F. Berman, A. Birnbaum, J. Dongarra, M. Ellisman, M. Faerman, E. Gockay, M. Miller, G. Obertelli, et al., “The Virtual Instrument: Support for Grid-enabled Scientific Simulations,” Journal of Parallel and Distributed Computing (submitted), October 2002.
Lee, DW., and J. Dongarra, “VisPerf: Monitoring Tool for Grid Computing,” Lecture Notes in Computer Science, vol. 2659: Springer Verlag, Heidelberg, pp. 233-243, 2003.
Anzt, H., J. Dongarra, and V. Heuveline, “Weighted Block-Asynchronous Relaxation for GPU-Accelerated Systems,” SIAM Journal on Computing (submitted), March 2012.
Dongarra, J., S. Tomov, P. Luszczek, J. Kurzak, M. Gates, I. Yamazaki, H. Anzt, A. Haidar, and A. Abdelfattah, “With Extreme Computing, the Rules Have Changed,” Computing in Science & Engineering, vol. 19, issue 3, pp. 52-62, May 2017. DOI: 10.1109/MCSE.2017.48
Poster
Cheng, X., A. Soma, E. D'Azevedo, K. Wong, and S. Tomov, Accelerating 2D FFT: Exploit GPU Tensor Cores through Mixed-Precision, Dallas, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), ACM Student Research Poster, November 2018.
Ayala, A., S. Tomov, A. Haidar, M. Stoyanov, S. Cayrols, J. Li, G. Bosilca, and J. Dongarra, Accelerating FFT towards Exascale Computing, NVIDIA GPU Technology Conference (GTC2021), 2021.
Haidar, A., A. Abdelfattah, V. Dobrev, I. Karlin, T. Kolev, S. Tomov, and J. Dongarra, Accelerating Tensor Contractions for High-Order FEM on CPUs, GPUs, and KNLs, Gatlinburg, TN, Smoky Mountains Computational Sciences and Engineering Conference (SMC16), Poster, September 2016.
Dong, T., T. Kolev, R. Rieben, V. Dobrev, S. Tomov, and J. Dongarra, “Acceleration of the BLAST Hydro Code on GPU,” Supercomputing '12 (poster), Salt Lake City, Utah, SC12, November 2012.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Cholesky Factorization on Batches of Matrices with Fixed and Variable Sizes, San Jose, CA, GPU Technology Conference (GTC16), Poster, April 2016.
Gates, M., S. Tomov, H. Anzt, P. Luszczek, and J. Dongarra, Clover: Computational Libraries Optimized via Exascale Research, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Bosilca, G., T. Herault, and J. Dongarra, DTE: PaRSEC Enabled Libraries and Applications, 2021 Exascale Computing Project Annual Meeting, April 2021.
Bosilca, G., T. Herault, and J. Dongarra, DTE: PaRSEC Enabled Libraries and Applications (Poster), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Bosilca, G., T. Herault, and J. Dongarra, DTE: PaRSEC Systems and Interfaces (Poster), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Baboulin, M., J. Demmel, J. Dongarra, S. Tomov, and V. Volkov, Enhancing the Performance of Dense Linear Algebra Solvers on GPUs (in the MAGMA Project), Austin, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC08), November 2008.
Jagode, H., A. Danalis, and J. Dongarra, Exa-PAPI: The Exascale Performance API with Modern C++, Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Fortenberry, A., S. Tomov, and K. Wong, Extending MAGMA Portability with OneAPI, Dallas, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22), ACM Student Research Competition, November 2022.
Tomov, S., A. Haidar, A. Ayala, D. Schultz, and J. Dongarra, FFT-ECP Fast Fourier Transform, Houston, TX, 2019 ECP Annual Meeting (Research Poster), January 2019.
Anzt, H., N. Beams, T. Cojean, F. Göbel, T. Grützmacher, A. Kashi, P. Nayak, T. Ribizel, and Y. M. Tsai, Ginkgo: A Sparse Linear Algebra Library for HPC, 2021 ECP Annual Meeting, April 2021.
Anzt, H., T. Cojean, Y-C. Chen, F. Goebel, T. Gruetzmacher, P. Nayak, T. Ribizel, Y-H. Tsai, and J. Dongarra, Ginkgo: A Node-Level Sparse Linear Algebra Library for HPC (Poster), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Shaiek, H., S. Tomov, A. Ayala, A. Haidar, and J. Dongarra, “GPUDirect MPI Communications and Optimizations to Accelerate FFTs on Exascale Systems,” EuroMPI'19 Posters, Zurich, Switzerland, no. icl-ut-19-06: ICL, September 2019.
Haidar, A., A. Abdelfattah, S. Tomov, and J. Dongarra, Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers and Achieve 74 Gflops/Watt on Nvidia V100, San Jose, CA, GPU Technology Conference (GTC), Poster, March 2018.
Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, heFFTe: Highly Efficient FFT for Exascale (Poster), NVIDIA GPU Technology Conference (GTC2020), October 2020.
Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, heFFTe: Highly Efficient FFT for Exascale (Poster), Seattle, WA, SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20), February 2020.
Ayala, A., S. Tomov, J. Dongarra, and A. Haidar, heFFTe: Highly Efficient FFT for Exascale (Poster), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Tomov, S., MATEDOR: MAtrix, TEnsor, and Deep-learning Optimized Routines, Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, February 2020.
Abdelfattah, A., J. Dongarra, A. Haidar, S. Tomov, and I. Yamazaki, MATEDOR: MAtrix, TEnsor, and Deep-learning Optimized Routines, Dallas, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), Research Poster, November 2018.
Haidar, A., S. Tomov, A. Abdelfattah, I. Yamazaki, and J. Dongarra, MAtrix, TEnsor, and Deep-learning Optimized Routines (MATEDOR), Washington, DC, NSF PI Meeting, Poster, April 2018. DOI: 10.6084/m9.figshare.6174143.v3
Agullo, E., J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, R. Nath, S. Tomov, et al., Numerical Linear Algebra on Emerging Architectures: The PLASMA and MAGMA Projects, Portland, OR, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC09), November 2009.
Nath, R., J. Dongarra, S. Tomov, H. Ltaief, and P. Du, Numerical Linear Algebra on Hybrid Architectures: Recent Developments in the MAGMA Project, Portland, Oregon, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC09), November 2009.
Abdelfattah, A., S. Tomov, and J. Dongarra, Optimizing Batch HGEMM on Small Sizes Using Tensor Cores, San Jose, CA, GPU Technology Conference (GTC), March 2019.
Weaver, V., D. Terpstra, H. McCraw, M. Johnson, K. Kasichayanula, J. Ralph, J. Nelson, P. Mucci, T. Mohan, and S. Moore, PAPI 5: Measuring Power, Energy, and the Cloud, Austin, TX, 2013 IEEE International Symposium on Performance Analysis of Systems and Software, April 2013.
Dongarra, J., H. Jagode, A. Danalis, D. Barry, and V. Weaver, Performance Application Programming Interface for Extreme-Scale Environments (PAPI-EX) (Poster), Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, February 2020.
Luszczek, P., and J. Dongarra, The PLASMA Library on CORAL Systems and Beyond (Poster), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Kasichayanula, K., H. You, S. Moore, S. Tomov, H. Jagode, and M. Johnson, Power-aware Computing on GPGPUs, Gatlinburg, TN, Fall Creek Falls Conference, Poster, September 2011.
Jagode, H., and A. Danalis, PULSE: PAPI Unifying Layer for Software-Defined Events (Poster), Seattle, WA, 2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, February 2020.
Hori, A., T. Ogura, B. Gerofi, J. Yin, Y. Ishikawa, E. Jeannot, and G. Bosilca, A Report of the MPI International Survey (Poster), Austin, TX, EuroMPI/USA '20: 27th European MPI Users' Group Meeting, September 2020.
Agullo, E., C. Augonnet, J. Dongarra, H. Ltaief, R. Namyst, R. Nath, J. Roman, S. Thibault, and S. Tomov, Scheduling Cholesky Factorization on Multicore Architectures with GPU Accelerators, Knoxville, TN, 2010 Symposium on Application Accelerators in High-Performance Computing (SAAHPC'10), Poster, July 2010.
Gates, M., A. Charara, J. Kurzak, A. YarKhan, M. Al Farhan, D. Sukkari, and J. Dongarra, SLATE: Software for Linear Algebra Targeting Exascale (POSTER), Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
Valero-Lara, P., J. Dongarra, A. Haidar, S. D. Relton, S. Tomov, and M. Zounon, A Standard for Batched BLAS Routines, Paris, France, 17th SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP16), April 2016.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, Tensor Contractions using Optimized Batch GEMM Routines, San Jose, CA, GPU Technology Conference (GTC), Poster, March 2018.
Baboulin, M., V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, and S. Tomov, Towards a High-Performance Tensor Algebra Package for Accelerators, Gatlinburg, TN, Smoky Mountains Computational Sciences and Engineering Conference (SMC15), September 2015.
Zhong, D., G. Bosilca, Q. Cao, and J. Dongarra, Using Advanced Vector Extensions AVX-512 for MPI Reduction (Poster), Austin, TX, EuroMPI/USA '20: 27th European MPI Users' Group Meeting, September 2020.
