Publications

Filters: Author is Dongarra, Jack

Cayrols, S., J. Li, G. Bosilca, S. Tomov, A. Ayala, and J. Dongarra, “Mixed precision and approximate 3D FFTs: Speed for accuracy trade-off with GPU-aware MPI and run-time data compression,” ICL Technical Report, no. ICL-UT-22-04, May 2022.

Tsai, Y-H. Mike, N. Beams, and H. Anzt, “Mixed Precision Algebraic Multigrid on GPUs,” Parallel Processing and Applied Mathematics (PPAM 2022), vol. 13826, Cham, Springer International Publishing, April 2023.

Beck, M., D. Arnold, A. Bassi, F. Berman, H. Casanova, J. Dongarra, T. Moore, G. Obertelli, J. Plank, M. Swany, et al., “Middleware for the Use of Storage in Communication,” Parallel Computing, vol. 28, no. 12, pp. 1773-1788, August 2002.

Marin, G., J. Dongarra, and D. Terpstra, “MIAMI: A Framework for Application Performance Diagnosis,” ISPASS-2014, Monterey, CA, IEEE, March 2014.

Vadhiyar, S., and J. Dongarra, “A Metascheduler For The Grid,” Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC 2002), Edinburgh, Scotland, IEEE Computer Society, pp. 343-351, July 2002.

Dongarra, J., G. Fagg, R. Hempel, and D. W. Walker, “Message Passing Software Systems,” Encyclopedia of Electrical and Electronics Engineering, Supplement 1: John Wiley & Sons, Inc., 2000.

Barry, D., H. Jagode, A. Danalis, and J. Dongarra, “Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,” 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, Florida, IEEE, August 2023.

Barry, D., H. Jagode, A. Danalis, and J. Dongarra, Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements, St. Petersburg, FL, 28th HIPS Workshop, May 2023.

Dongarra, J., “Measuring Computer Performance: A Practitioner's Guide,” SIAM Review (book review), vol. 43, no. 2, pp. 383-384, 2001.

Haidar, A., S. Tomov, A. Abdelfattah, I. Yamazaki, and J. Dongarra, MAtrix, TEnsor, and Deep-learning Optimized Routines (MATEDOR), Washington, DC, NSF PI Meeting, Poster, April 2018.

Dongarra, J., J-F. Pineau, Y. Robert, and F. Vivien, “Matrix Product on Heterogeneous Master Worker Platforms,” 2008 PPoPP Conference, Salt Lake City, Utah, January 2008.

Bai, Z., J. Dongarra, D. Lu, and I. Yamazaki, “Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,” International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Matrix Multiplication on Batches of Small Matrices in Half and Half-Complex Precisions,” Journal of Parallel and Distributed Computing, vol. 145, pp. 188-201, November 2020.

Agullo, E., G. Bosilca, C. Castagnède, J. Dongarra, H. Ltaief, and S. Tomov, “Matrices Over Runtime Systems at Exascale,” Supercomputing '12 (poster), Salt Lake City, Utah, November 2012.

Abdelfattah, A., J. Dongarra, A. Haidar, S. Tomov, and I. Yamazaki, MATEDOR: MAtrix, TEnsor, and Deep-learning Optimized Routines, Dallas, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), Research Poster, November 2018.

Kurzak, J., Y. Tsai, M. Gates, A. Abdelfattah, and J. Dongarra, “Massively Parallel Automated Software Tuning,” 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan, ACM Press, August 2019.

Strohmaier, E., J. Dongarra, H. Meuer, and H. D. Simon, “The Marketplace for High-Performance Computers,” Parallel Computing, vol. 25, no. 13-14, pp. 1517-1545, October 2002.

Anzt, H., E. Boman, J. Dongarra, G. Flegar, M. Gates, M. Heroux, M. Hoemmen, J. Kurzak, P. Luszczek, S. Rajamanickam, et al., “MAGMA-sparse Interface Design Whitepaper,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-05, September 2017.

Nichols, D., N-S. Tomov, F. Betancourt, S. Tomov, K. Wong, and J. Dongarra, “MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing,” ISC High Performance, Frankfurt, Germany, Springer International Publishing, June 2019.

Ng, L., K. Wong, A. Haidar, S. Tomov, and J. Dongarra, MagmaDNN – High-Performance Data Analytics for Manycore GPUs and CPUs, Knoxville, TN, 2017 Summer Research Experiences for Undergraduate (REU), Presentation, December 2017.

Ng, L., S. Chen, A. Gessinger, D. Nichols, S. Cheng, A. Meenasorna, K. Wong, S. Tomov, A. Haidar, E. D'Azevedo, et al., MagmaDNN 0.2 High-Performance Data Analytics for Manycore GPUs and CPUs: University of Tennessee, January 2019.

Farhan, M. Al, A. Abdelfattah, S. Tomov, M. Gates, D. Sukkari, A. Haidar, R. Rosenberg, and J. Dongarra, “MAGMA Templates for Scalable Linear Algebra on Emerging Architectures,” The International Journal of High Performance Computing Applications, vol. 34, issue 6, pp. 645-658, November 2020.

Anzt, H., J. Dongarra, M. Gates, A. Haidar, K. Kabir, P. Luszczek, S. Tomov, and I. Yamazaki, MAGMA MIC: Optimizing Linear Algebra for Intel Xeon Phi, Frankfurt, Germany, ISC High Performance (ISC15), Intel Booth Presentation, June 2015.

Dongarra, J., M. Gates, Y. Jia, K. Kabir, P. Luszczek, and S. Tomov, MAGMA MIC: Linear Algebra Library for Intel Xeon Phi Coprocessors, Salt Lake City, UT, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC12), November 2012.

Tomov, S., and J. Dongarra, MAGMA - LAPACK for HPC on Heterogeneous Architectures, Oak Ridge, TN, Titan Summit at Oak Ridge National Laboratory, Presentation, August 2011.

Haidar, A., S. Tomov, P. Luszczek, and J. Dongarra, “MAGMA Embedded: Towards a Dense Linear Algebra Library for Energy Efficient Extreme Computing,” 2015 IEEE High Performance Extreme Computing Conference (HPEC ’15), (Best Paper Award), Waltham, MA, IEEE, September 2015.

Dong, T., A. Haidar, P. Luszczek, S. Tomov, A. Abdelfattah, and J. Dongarra, “MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs,” Innovative Computing Laboratory Technical Report, no. ICL-UT-16-02: University of Tennessee, August 2016.

Dongarra, J., T. Dong, M. Gates, A. Haidar, S. Tomov, and I. Yamazaki, MAGMA: A New Generation of Linear Algebra Library for GPU and Multicore Architectures, Salt Lake City, UT, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC12), Presentation, November 2012.

Tomov, S., J. Dongarra, A. Haidar, I. Yamazaki, T. Dong, T. Schulthess, and R. Solcà, MAGMA: A Breakthrough in Solvers for Eigenvalue Problems, San Jose, CA, GPU Technology Conference (GTC12), Presentation, May 2012.

Haidar, A., S. Tomov, K. Arturov, M. Guney, S. Story, and J. Dongarra, “LU, QR, and Cholesky Factorizations: Programming Model, Performance Analysis and Optimization Techniques for the Intel Knights Landing Xeon Phi,” IEEE High Performance Extreme Computing Conference (HPEC'16), Waltham, MA, IEEE, September 2016.

Kurzak, J., P. Luszczek, and J. Dongarra, “LU Factorization with Partial Pivoting for a Multicore System with Accelerators,” IEEE Transactions on Parallel and Distributed Systems, vol. 24, issue 8, pp. 1613-1621, August 2013.

Dong, T., A. Haidar, P. Luszczek, J. Harris, S. Tomov, and J. Dongarra, “LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU,” 16th IEEE International Conference on High Performance Computing and Communications (HPCC), Paris, France, IEEE, August 2014.

Agullo, E., C. Augonnet, J. Dongarra, M. Faverge, J. Langou, H. Ltaief, and S. Tomov, “LU Factorization for Accelerator-Based Systems,” IEEE/ACS AICCSA 2011, Sharm-El-Sheikh, Egypt, December 2011.

Cayrols, S., J. Li, G. Bosilca, S. Tomov, A. Ayala, and J. Dongarra, “Lossy all-to-all exchange for accelerating parallel 3-D FFTs on hybrid architectures with GPUs,” 2022 IEEE International Conference on Cluster Computing (CLUSTER), pp. 152-160, September 2022.

Luszczek, P., J. Kurzak, and J. Dongarra, “Looking Back at Dense Linear Algebra Software,” Journal of Parallel and Distributed Computing, vol. 74, issue 7, pp. 2548–2560, July 2014.

Luszczek, P., J. Kurzak, and J. Dongarra, “Looking Back at Dense Linear Algebra Software,” Perspectives on Parallel and Distributed Processing: Looking Back and What's Ahead (to appear), 2012.

Bell, G., D. Bailey, A. H. Karp, J. Dongarra, and K. Walsh, “A Look Back on 30 Years of the Gordon Bell Prize,” International Journal of High Performance Computing and Networking, vol. 31, issue 6, pp. 469–484, 2017.

Beck, M., H. Casanova, J. Dongarra, T. Moore, J. Plank, F. Berman, and R. Wolski, “Logistical Quality of Service in NetSolve,” Computer Communications, vol. 22, no. 11, pp. 1034-1044, January 1999.

Beck, M., D. Arnold, A. Bassi, F. Berman, H. Casanova, J. Dongarra, T. Moore, G. Obertelli, J. Plank, M. Swany, et al., “Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication,” submitted to SC2001, Denver, Colorado, November 2001.

Ma, T., A. Bouteiller, G. Bosilca, and J. Dongarra, “Locality and Topology aware Intra-node Communication Among Multicore CPUs,” Proceedings of the 17th EuroMPI conference, Stuttgart, Germany, LNCS, September 2010.

Anzt, H., T. Cojean, C. Yen-Chen, J. Dongarra, G. Flegar, P. Nayak, S. Tomov, Y. M. Tsai, and W. Wang, “Load-Balancing Sparse Matrix Vector Product Kernels on GPUs,” ACM Transactions on Parallel Computing, vol. 7, issue 1, March 2020.

Dongarra, J., “LINPACK on Future Manycore and GPU Based Systems,” PARA 2010, Reykjavik, Iceland, June 2010.

Dongarra, J., P. Luszczek, and A. Petitet, “The LINPACK Benchmark: Past, Present, and Future,” Concurrency: Practice and Experience, vol. 15, pp. 803-820, 2008.

Kurzak, J., M. Gates, A. Charara, A. YarKhan, I. Yamazaki, and J. Dongarra, “Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators,” Euro-Par 2019: Parallel Processing, vol. 11725: Springer, pp. 495–506, August 2019.

Kurzak, J., M. Gates, I. Yamazaki, A. Charara, A. YarKhan, J. Finney, G. Ragghianti, P. Luszczek, and J. Dongarra, “Linear Systems Performance Report,” SLATE Working Notes, no. 08, ICL-UT-18-08: Innovative Computing Laboratory, University of Tennessee, September 2018.

Abdelfattah, A., H. Anzt, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and A. YarKhan, “Linear Algebra Software for Large-Scale Accelerated Multicore Computing,” Acta Numerica, vol. 25, pp. 1-160, May 2016.

Buttari, A., J. Dongarra, and J. Kurzak, “Limitations of the Playstation 3 for High Performance Cluster Computing,” University of Tennessee Computer Science Technical Report, UT-CS-07-597 (Also LAPACK Working Note 185), 2007.

Cao, Q., Y. Pei, K. Akbudak, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems,” 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021.

Gustavson, F. G., J. Wasniewski, and J. Dongarra, “Level-3 Cholesky Kernel Subroutine of a Fully Portable High Performance Minimal Storage Hybrid Format Cholesky Algorithm,” ACM TOMS (submitted), also LAPACK Working Note (LAWN) 211, 2010.

Gustavson, F. G., J. Wasniewski, J. Dongarra, J. Herrero, and J. Langou, “Level-3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms,” ACM Transactions on Mathematical Software (TOMS), vol. 39, issue 2, February 2013.
