Publications

Danalis, A., P. Luszczek, G. Marin, J. Vetter, and J. Dongarra, “BlackjackBench: Hardware Characterization with Portable Micro-Benchmarks and Automatic Statistical Analysis of Results,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.

Danalis, A., H. Jagode, G. Bosilca, and J. Dongarra, “PaRSEC in Practice: Optimizing a Legacy Chemistry Application through Distributed Task-Based Execution,” 2015 IEEE International Conference on Cluster Computing, Chicago, IL, IEEE, September 2015.

(1.77 MB)

Davis, J., T. Gao, S. Chandrasekaran, H. Jagode, A. Danalis, P. Balaji, J. Dongarra, and M. Taufer, “Characterization of Power Usage and Performance in Data-Intensive Applications using MapReduce over MPI,” 2019 International Conference on Parallel Computing (ParCo2019), Prague, Czech Republic, September 2019.

Demmel, J., J. Dongarra, B.. Parlett, W. Kahan, M. Gu, D. Bindel, Y. Hida, X. Li, O. Marques, J. E. Riedy, et al., “Prospectus for the Next LAPACK and ScaLAPACK Libraries,” PARA 2006, Umea, Sweden, June 2006.

(460.11 KB)

Demmel, J., and J. Dongarra, LAPACK 2005 Prospectus: Reliable and Scalable Software for Linear Algebra Computations on High End Computers : LAPACK Working Note 164, January 2005.

(172.59 KB)

Demmel, J., J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R. Vuduc, C. Whaley, and K. Yelick, “Self Adapting Linear Algebra Algorithms and Software,” IEEE Proceedings (to appear), 00 2004.

(587.67 KB)

Demmel, J., J. Dongarra, A. Fox, S. Williams, V. Volkov, and K. Yelick, “Accelerating Time-To-Solution for Computational Science and Engineering,” SciDAC Review, 00 2009.

(739.11 KB)

Demmel, J., J. Dongarra, J. Langou, J. Langou, P. Luszczek, and M. Mahoney, “Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),” LAPACK Working Notes, no. 297, ICL-UT-20-07: University of Tennessee.

(1.41 MB)

Dempsey, B., and D. Weiss, “Towards An Efficient, Scalable Replication Mechanism for the I2-DSI Project,” University of North Carolina School of Library and Information Science Technical Report, no. TR-1999-01, January 1999.

Deshmukh, S., R. Yokota, and G. Bosilca, “Cache Optimization and Performance Modeling of Batched, Small, and Rectangular Matrix Multiplication on Intel, AMD, and Fujitsu Processors,” ACM Transactions on Mathematical Software, vol. 49, issue 3, pp. 1 - 29, September 2023.

Deshmukh, S., R. Yokota, G. Bosilca, and Q. Ma, “O(N) distributed direct factorization of structured dense matrices using runtime systems,” 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, ACM, August 2023.

Dewolfs, D., J. Broeckhove, V. Sunderam, and G. Fagg, “FT-MPI, Fault-Tolerant Metacomputing and Generic Name Services: A Case Study,” Lecture Notes in Computer Science, vol. 4192, no. ICL-UT-06-14: Springer Berlin / Heidelberg, pp. 133-140, 00 2006.

(362.44 KB)

Donfack, S., S. Tomov, and J. Dongarra, “Performance evaluation of LU factorization through hardware counter measurements,” University of Tennessee Computer Science Technical Report, no. ut-cs-12-700, October 2012.

(794.82 KB)

Donfack, S., J. Dongarra, M. Faverge, M. Gates, J. Kurzak, P. Luszczek, and I. Yamazaki, “A Survey of Recent Developments in Parallel Implementations of Gaussian Elimination,” Concurrency and Computation: Practice and Experience, vol. 27, issue 5, pp. 1292-1309, April 2015.

(783.45 KB)

Donfack, S., S. Tomov, and J. Dongarra, “Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs,” University of Tennessee Computer Science Technical Report, no. ut-cs-13-713, July 2013.

(659.77 KB)

Donfack, S., S. Tomov, and J. Dongarra, “Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs,” Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, May 2014.

(490.08 KB)

Donfack, S., J. Dongarra, M. Faverge, M. Gates, J. Kurzak, P. Luszczek, and I. Yamazaki, “On Algorithmic Variants of Parallel Gaussian Elimination: Comparison of Implementations in Terms of Performance and Numerical Properties,” University of Tennessee Computer Science Technical Report, no. UT-CS-13-715, July 2013, 2012.

(358.98 KB)

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices,” International Conference on Computational Science (ICCS 2017), Zurich, Switzerland, Procedia Computer Science, June 2017.

(364.95 KB)

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Accelerating the SVD Bi-Diagonalization of a Batch of Small Matrices using GPUs,” Journal of Computational Science, vol. 26, pp. 237–245, May 2018.

(2.18 MB)

Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, “A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU,” IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

(1.01 MB)

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “A Fast Batched Cholesky Factorization on a GPU,” International Conference on Parallel Processing (ICPP-2014), Minneapolis, MN, September 2014.

(1.37 MB)

Dong, T., V. Dobrev, T. Kolev, R. Rieben, S. Tomov, and J. Dongarra, “Hydrodynamic Computation with Hybrid Programming on CPU-GPU Clusters,” University of Tennessee Computer Science Technical Report, no. ut-cs-13-714, July 2013.

(866.68 KB)

Dong, T., T. Kolev, R. Rieben, V. Dobrev, S. Tomov, and J. Dongarra, “Acceleration of the BLAST Hydro Code on GPU,” Supercomputing '12 (poster), Salt Lake City, Utah, SC12, November 2012.

Dong, T., A. Haidar, P. Luszczek, J. Harris, S. Tomov, and J. Dongarra, “LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU,” 16th IEEE International Conference on High Performance Computing and Communications (HPCC), Paris, France, IEEE, August 2014.

(684.73 KB)

Dong, T., A. Haidar, P. Luszczek, S. Tomov, A. Abdelfattah, and J. Dongarra, “MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs,” Innovative Computing Laboratory Technical Report, no. ICL-UT-16-02: University of Tennessee, August 2016.

(929.79 KB)

Dongarra, J., S. Hammarling, N. J. Higham, S. Relton, and M. Zounon, “Optimized Batched Linear Algebra for Modern Architectures,” Euro-Par 2017, Santiago de Compostela, Spain, Springer, August 2017.

(618.33 KB)

Dongarra, J., “Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),” University of Tennessee Computer Science Technical Report, no. CS-89-85, 00 2011.

(6.42 MB)

Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, “Accelerating Numerical Dense Linear Algebra Calculations with GPUs,” Numerical Computations with GPUs: Springer International Publishing, pp. 3-28, 2014.

(1.06 MB)

Dongarra, J., and V. Eijkhout, “Self-adapting Numerical Software for Next Generation Applications (LAPACK Working Note 157),” ICL Technical Report, no. ICL-UT-02-07, 00 2002.

(475.94 KB)

Dongarra, J., “Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),” University of Tennessee Computer Science Department Technical Report, no. CS-89-85, January 2000.

(354.1 KB)

Dongarra, J., T. Herault, and Y. Robert, “Revisiting the Double Checkpointing Algorithm,” 15th Workshop on Advances in Parallel and Distributed Computational Models, at the IEEE International Parallel & Distributed Processing Symposium, Boston, MA, May 2013.

(591.1 KB)

Dongarra, J., “Network-Enabled Solvers: A Step Toward Grid-Based Computing,” SIAM News, vol. 34, no. 10, December 2001.

Dongarra, J., H. Meuer, and E. Strohmaier, “Top500 Supercomputer Sites (14th edition),” University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-434, November 1999.

(281.81 KB)

Dongarra, J., “Report on the Fujitsu Fugaku System,” Innovative Computing Laboratory Technical Report, no. ICL-UT-20-06: University of Tennessee, June 2020.

(3.3 MB)

Dongarra, J., T. Herault, and Y. Robert, “Fault Tolerance Techniques for High-performance Computing,” University of Tennessee Computer Science Technical Report (also LAWN 289), no. UT-EECS-15-734: University of Tennessee, May 2015.

Dongarra, J., “High Performance Computing Trends and Self Adapting Numerial Software,” Lecture Notes in Computer Science, High Performance Computing, 5th International Symposium ISHPC, vol. 2858, Tokyo-Odaiba, Japan, Springer-Verlag, Heidelberg, pp. 1-9, January 2003.

Dongarra, J., R. Graybill, W. Harrod, R. Lucas, E. Lusk, P. Luszczek, J. McMahon, A. Snavely, J. Vetter, K. Yelick, et al., “DARPA's HPCS Program: History, Models, Tools, Languages,” in Advances in Computers, vol. 72: Elsevier, January 2008.

(3.61 MB)

Dongarra, J., M. A. Heroux, and P. Luszczek, “High Performance Conjugate Gradient Benchmark: A new Metric for Ranking High Performance Computing Systems,” International Journal of High Performance Computing Applications, vol. 30, issue 1, pp. 3 - 10, February 2016.

(277.51 KB)

Dongarra, J., M. Faverge, Y. Ishikawa, R. Namyst, F. Rue, and F. Trahay, “EZTrace: a generic framework for performance analysis,” ICL Technical Report, no. ICL-UT-11-01, December 2010.

Dongarra, J., L. Grigori, and N. J. Higham, “Numerical Algorithms for High-Performance Computational Science,” Philosophical Transactions of the Royal Society A, vol. 378, issue 2166, 2020.

(724.37 KB)

Dongarra, J., M. Faverge, H. Ltaeif, and P. Luszczek, “High Performance Matrix Inversion Based on LU Factorization for Multicore Architectures,” Proceedings of MTAGS11, Seattle, WA, November 2011.

(879.49 KB)

Dongarra, J., J. Demmel, J. Langou, and J. Langou, “2016 Dense Linear Algebra Software Packages Survey,” University of Tennessee Computer Science Technical Report, no. UT-EECS-16-744 / LAWN 290: University of Tennessee, September 2016.

(366.43 KB)

Dongarra, J., I. Duff, D. Sorensen, and H. van der Vorst, “Numerical Linear Algebra for High-Performance Computers,” Software, Environments and Tools: SIAM, 1998.

Dongarra, J., and P. Luszczek, “Introduction to the HPCChallenge Benchmark Suite,” ICL Technical Report, no. ICL-UT-05-01, January 2005.

(124.86 KB)

Dongarra, J., H. Meuer, H. D. Simon, and E. Strohmaier, “High Performance Computing Today,” FOMMS 2000: Foundations of Molecular Modeling and Simulation Conference (to appear), January 2000.

(66 KB)

Dongarra, J., G. Bosilca, R. Delmas, and J. Langou, “Algorithmic Based Fault Tolerance Applied to High Performance Computing,” Journal of Parallel and Distributed Computing, vol. 69, pp. 410-416, 00 2009.

(313.55 KB)

Dongarra, J., M. Faverge, H. Ltaeif, and P. Luszczek, “Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization,” University of Tennessee Computer Science Technical Report (also as a LAWN), no. ICL-UT-11-08, September 2011.

(618.53 KB)

Dongarra, J., J-F. Pineau, Y. Robert, Z. Shi, and F. Vivien, “Revisiting Matrix Product on Master-Worker Platforms,” International Journal of Foundations of Computer Science (IJFCS), vol. 19, no. 6, pp. 1317-1336, December 2008.

(248.66 KB)

Dongarra, J., H. Meuer, H. D. Simon, and E. Strohmaier, “Recent Trends in High Performance Computing,” in Birth of Numerical Analysis (to appear), 00 2009.

Dongarra, J., “Sunway TaihuLight Supercomputer Makes Its Appearance,” National Science Review, vol. 3, issue 3, pp. 256-266, September 2016.

(292.11 KB)

Main menu

Pages