Publications

Danalis, A., H. Jagode, and J. Dongarra, PAPI: Counting outside the Box, Barcelona, Spain, 8th JLESC Meeting, April 2018.

Danalis, A., H. Jagode, H. Hanumantharayappa, S. Ragate, and J. Dongarra, “Counter Inspection Toolkit: Making Sense out of Hardware Performance Events,” 11th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Cham, Switzerland: Springer, February 2019.

Davis, J., T. Gao, S. Chandrasekaran, H. Jagode, A. Danalis, P. Balaji, J. Dongarra, and M. Taufer, “Characterization of Power Usage and Performance in Data-Intensive Applications using MapReduce over MPI,” 2019 International Conference on Parallel Computing (ParCo2019), Prague, Czech Republic, September 2019.

Demmel, J., J. Dongarra, J. Langou, J. Langou, P. Luszczek, and M. Mahoney, “Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),” LAPACK Working Notes, no. 297, ICL-UT-20-07: University of Tennessee.

Demmel, J., and J. Dongarra, LAPACK 2005 Prospectus: Reliable and Scalable Software for Linear Algebra Computations on High End Computers: LAPACK Working Note 164, January 2005.

Demmel, J., J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R. Vuduc, C. Whaley, and K. Yelick, “Self Adapting Linear Algebra Algorithms and Software,” IEEE Proceedings (to appear), 2004.

Demmel, J., J. Dongarra, A. Fox, S. Williams, V. Volkov, and K. Yelick, “Accelerating Time-To-Solution for Computational Science and Engineering,” SciDAC Review, 2009.

Demmel, J., J. Dongarra, B. Parlett, W. Kahan, M. Gu, D. Bindel, Y. Hida, X. Li, O. Marques, J. E. Riedy, et al., “Prospectus for the Next LAPACK and ScaLAPACK Libraries,” PARA 2006, Umea, Sweden, June 2006.

Dempsey, B., and D. Weiss, “Towards An Efficient, Scalable Replication Mechanism for the I2-DSI Project,” University of North Carolina School of Library and Information Science Technical Report, no. TR-1999-01, January 1999.

Deshmukh, S., R. Yokota, and G. Bosilca, “Cache Optimization and Performance Modeling of Batched, Small, and Rectangular Matrix Multiplication on Intel, AMD, and Fujitsu Processors,” ACM Transactions on Mathematical Software, vol. 49, issue 3, pp. 1-29, September 2023.

Deshmukh, S., R. Yokota, G. Bosilca, and Q. Ma, “O(N) distributed direct factorization of structured dense matrices using runtime systems,” 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, ACM, August 2023.

Dewolfs, D., J. Broeckhove, V. Sunderam, and G. Fagg, “FT-MPI, Fault-Tolerant Metacomputing and Generic Name Services: A Case Study,” Lecture Notes in Computer Science, vol. 4192, no. ICL-UT-06-14: Springer Berlin / Heidelberg, pp. 133-140, 2006.

Donfack, S., S. Tomov, and J. Dongarra, “Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs,” University of Tennessee Computer Science Technical Report, no. ut-cs-13-713, July 2013.

Donfack, S., S. Tomov, and J. Dongarra, “Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs,” Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, May 2014.

Donfack, S., J. Dongarra, M. Faverge, M. Gates, J. Kurzak, P. Luszczek, and I. Yamazaki, “On Algorithmic Variants of Parallel Gaussian Elimination: Comparison of Implementations in Terms of Performance and Numerical Properties,” University of Tennessee Computer Science Technical Report, no. UT-CS-13-715, July 2013, 2012.

Donfack, S., S. Tomov, and J. Dongarra, “Performance evaluation of LU factorization through hardware counter measurements,” University of Tennessee Computer Science Technical Report, no. ut-cs-12-700, October 2012.

Donfack, S., J. Dongarra, M. Faverge, M. Gates, J. Kurzak, P. Luszczek, and I. Yamazaki, “A Survey of Recent Developments in Parallel Implementations of Gaussian Elimination,” Concurrency and Computation: Practice and Experience, vol. 27, issue 5, pp. 1292-1309, April 2015.

Dong, T., A. Haidar, P. Luszczek, J. Harris, S. Tomov, and J. Dongarra, “LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU,” 16th IEEE International Conference on High Performance Computing and Communications (HPCC), Paris, France, IEEE, August 2014.

Dong, T., A. Haidar, P. Luszczek, S. Tomov, A. Abdelfattah, and J. Dongarra, “MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs,” Innovative Computing Laboratory Technical Report, no. ICL-UT-16-02: University of Tennessee, August 2016.

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices,” International Conference on Computational Science (ICCS 2017), Zurich, Switzerland, Procedia Computer Science, June 2017.

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Accelerating the SVD Bi-Diagonalization of a Batch of Small Matrices using GPUs,” Journal of Computational Science, vol. 26, pp. 237–245, May 2018.

Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, “A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “A Fast Batched Cholesky Factorization on a GPU,” International Conference on Parallel Processing (ICPP-2014), Minneapolis, MN, September 2014.

Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, “Hydrodynamic Computation with Hybrid Programming on CPU-GPU Clusters,” University of Tennessee Computer Science Technical Report, no. ut-cs-13-714, July 2013.

Dong, T., T. Kolev, R. Rieben, V. Dobrev, S. Tomov, and J. Dongarra, “Acceleration of the BLAST Hydro Code on GPU,” Supercomputing '12 (poster), Salt Lake City, Utah, SC12, November 2012.

Dongarra, J., V. Eijkhout, and P. Luszczek, “Recursive Approach in Sparse Matrix LU Factorization,” Scientific Programming, vol. 9, no. 1, pp. 51-60, 2001.

Dongarra, J., and P. Luszczek, “Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modelling,” University of Tennessee Computer Science Technical Report, no. UT-CS-10-661, October 2010.

Dongarra, J., “Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),” University of Tennessee Computer Science Technical Report, CS-89-85, January 2008.

Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, “Translational process: Mathematical software perspective,” Journal of Computational Science, vol. 52, pp. 101216, 2021.

Dongarra, J., “Network-Enabled Solvers: A Step Toward Grid-Based Computing,” SIAM News, vol. 34, no. 10, December 2001.

Dongarra, J., K. London, S. Moore, P. Mucci, D. Terpstra, H. You, and M. Zhou, “Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters,” PADTAD Workshop, IPDPS 2003, Nice, France, IEEE, April 2003.

Dongarra, J., and P. Luszczek, “How Elegant Code Evolves With Hardware: The Case Of Gaussian Elimination,” in Beautiful Code: Leading Programmers Explain How They Think (Chapter 14), pp. 243-282, January 2008.

Dongarra, J., “The evolution of mathematical software,” Communications of the ACM, vol. 65, issue 12, pp. 66-72, December 2022.

Dongarra, J., H. Meuer, and E. Strohmaier, “Top500 Supercomputer Sites (13th edition),” University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-425, June 1999.

Dongarra, J., “High Performance Computing Trends and Self Adapting Numerical Software,” Lecture Notes in Computer Science, High Performance Computing, 5th International Symposium ISHPC, vol. 2858, Tokyo-Odaiba, Japan, Springer-Verlag, Heidelberg, pp. 1-9, January 2003.

Dongarra, J., “Performance of Various Computers Using Standard Linear Equations Software, (Linpack Benchmark Report),” University of Tennessee Computer Science Technical Report, no. CS-89-85: University of Tennessee, June 2014.

Dongarra, J., and S. Tomov, An Introduction to the MAGMA project - Acceleration of Dense Linear Algebra: NVIDIA Webinar, June 2010.

Dongarra, J., M. Faverge, T. Herault, M. Jacquelin, J. Langou, and Y. Robert, “Hierarchical QR Factorization Algorithms for Multi-core Cluster Systems,” Parallel Computing, vol. 39, issue 4-5, pp. 212-232, May 2013.

Dongarra, J., H. Ltaief, P. Luszczek, and V. M. Weaver, “Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture,” The 2nd International Conference on Cloud and Green Computing (submitted), Xiangtan, Hunan, China, November 2012.

Dongarra, J., and D. W. Walker, “The Quest for Petascale Computing,” Computing in Science and Engineering, vol. 3, no. 3, pp. 32-39, May 2001.

Dongarra, J., and V. Eijkhout, “Self-Adapting Numerical Software and Automatic Tuning of Heuristics,” Lecture Notes in Computer Science, vol. 2660, Melbourne, Australia, Springer Verlag, pp. 759-770, June 2003.

Dongarra, J., M. Faverge, H. Ltaief, and P. Luszczek, “Exploiting Fine-Grain Parallelism in Recursive LU Factorization,” Proceedings of PARCO'11, no. ICL-UT-11-04, Gent, Belgium, April 2011.

Dongarra, J., and P. Luszczek, “Introduction to the HPCChallenge Benchmark Suite,” ICL Technical Report, no. ICL-UT-05-01, January 2005.

Dongarra, J., M. Faverge, H. Ltaief, and P. Luszczek, “Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting,” Concurrency and Computation: Practice and Experience, vol. 26, issue 7, pp. 1408-1431, May 2014.

Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, P. Wu, I. Yamazaki, A. YarKhan, M. Abalenkovs, N. Bagherpour, et al., “PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,” ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019.

Dongarra, J., A. Haidar, O. Hernandez, S. Tomov, and M G. Venkata, “POMPEI: Programming with OpenMP4 for Exascale Investigations,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-09: University of Tennessee, December 2017.

Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, “Translational Process: Mathematical Software Perspective,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-11, August 2020.

Dongarra, J., and P. Luszczek, “HPC Challenge: Design, History, and Implementation Highlights,” On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing (to appear): Chapman & Hall/CRC Press, 2012.

Dongarra, J., M. A. Heroux, and P. Luszczek, “HPCG Benchmark: a New Metric for Ranking High Performance Computing Systems,” University of Tennessee Computer Science Technical Report, no. ut-eecs-15-736: University of Tennessee, January 2015.

Dongarra, J., H. Meuer, and E. Strohmaier, “Top500 Supercomputer Sites (15th edition),” University of Tennessee Computer Science Department Technical Report, no. UT-CS-00-442, June 2000.
