Publications
Is your scheduling good? How would you know?
Bordeaux, France, 14th Scheduling for Large Scale Systems Workshop, June 2019.
XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing
arXiv, January 2024.
“With Extreme Computing, the Rules Have Changed,”
Computing in Science & Engineering, vol. 19, issue 3, pp. 52-62, May 2017.
What it Takes to keep PAPI Instrumental for the HPC Community
Collegeville, MN, The 2019 Collegeville Workshop on Sustainable Scientific Software (CW3S19), July 2019.
“What it Takes to keep PAPI Instrumental for the HPC Community,”
1st Workshop on Sustainable Scientific Software (CW3S19), Collegeville, Minnesota, July 2019.
“Weighted Dynamic Scheduling with Many Parallelism Grains for Offloading of Numerical Workloads to Multiple Varied Accelerators,”
Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA'15), no. 5, Austin, TX, ACM, November 2015.
“Weighted Block-Asynchronous Relaxation for GPU-Accelerated Systems,”
SIAM Journal on Computing (submitted), March 2012.
“Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems,”
Tenth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (Best Paper), Rhodes Island, Greece, August 2012.
“Visualizing Execution Traces with Task Dependencies,”
2nd Workshop on Visual Performance Analysis (VPA '15), Austin, TX, ACM, November 2015.
“VisPerf: Monitoring Tool for Grid Computing,”
Lecture Notes in Computer Science, vol. 2659: Springer Verlag, Heidelberg, pp. 233-243, 2003.
“Virtual Systolic Array for QR Decomposition,”
15th Workshop on Advances in Parallel and Distributed Computational Models, IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013), Boston, MA, IEEE, May 2013.
“The Virtual Instrument: Support for Grid-enabled Scientific Simulations,”
Journal of Parallel and Distributed Computing (submitted), October 2002.
“The Virtual Instrument: Support for Grid-enabled Scientific Simulations,”
International Journal of High Performance Computing Applications, vol. 18, no. 1, pp. 3-17, January 2004.
“Variable-Size Batched LU for Small Matrices and Its Integration into Block-Jacobi Preconditioning,”
46th International Conference on Parallel Processing (ICPP), Bristol, United Kingdom, IEEE, August 2017.
“Variable-Size Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioning on Graphics Processors,”
Parallel Computing, vol. 81, pp. 131-146, January 2019.
“Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning,”
International Conference on Computational Science (ICCS 2017), vol. 108, Zurich, Switzerland, Procedia Computer Science, pp. 1783-1792, June 2017.
“Variable-Size Batched Condition Number Calculation on GPUs,”
SBAC-PAD, Lyon, France, September 2018.
“Utilizing Dataflow-based Execution for Coupled Cluster Methods,”
2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-02, Madrid, Spain, IEEE, September 2014.
Using Quantized Integer in LU Factorization with Partial Pivoting (Poster)
Seattle, WA, SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20), February 2020.
“Using PAPI for Hardware Performance Monitoring on Linux Systems,”
Conference on Linux Clusters: The HPC Revolution, Urbana, Illinois, Linux Clusters Institute, June 2001.
“Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy,”
ACM Transactions on Mathematical Software, vol. 34, no. 4, pp. 17-22, 2008.
“Using MAGMA with PGI Fortran,”
PGI Insider, November 2010.
“Using Jacobi Iterations and Blocking for Solving Sparse Triangular Systems in Incomplete Factorization Preconditioning,”
Journal of Parallel and Distributed Computing, vol. 119, pp. 219–230, November 2018.
“On Using Incremental Profiling for the Performance Analysis of Shared Memory Parallel Applications,”
Proceedings of the 13th International Euro-Par Conference on Parallel Processing (Euro-Par '07), Rennes, France, Springer LNCS, January 2007.
Using GPU FP16 Tensor Cores Arithmetic to Accelerate Mixed-Precision Iterative Refinement Solvers and Reduce Energy Consumption
Frankfurt, Germany, ISC High Performance (ISC18), Best Poster Award, June 2018.
“Using GPU FP16 Tensor Cores Arithmetic to Accelerate Mixed-Precision Iterative Refinement Solvers and Reduce Energy Consumption,”
ISC High Performance (ISC'18), Best Poster, Frankfurt, Germany, June 2018.
“Using Arm Scalable Vector Extension to Optimize Open MPI,”
20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID 2020), Melbourne, Australia, IEEE/ACM, May 2020.
Using Advanced Vector Extensions AVX-512 for MPI Reduction (Poster)
Austin, TX, EuroMPI/USA '20: 27th European MPI Users' Group Meeting, September 2020.
“Using Advanced Vector Extensions AVX-512 for MPI Reduction,”
EuroMPI/USA '20: 27th European MPI Users' Group Meeting, Austin, TX, September 2020.
“Using Additive Modifications in LU Factorization Instead of Pivoting,”
37th ACM International Conference on Supercomputing (ICS'23), Orlando, FL, ACM, June 2023.
“Users' Guide to NetSolve v1.4.1,”
ICL Technical Report, no. ICL-UT-02-05, June 2002.
“The Use of Bulk States to Accelerate the Band Edge State Calculation of a Semiconductor Quantum Dot,”
Journal of Computational Physics, vol. 223, pp. 774-782, 2007.
“The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot,”
Journal of Computational Physics (submitted), January 2006.
“Updating Incomplete Factorization Preconditioners for Model Order Reduction,”
Numerical Algorithms, vol. 73, no. 3, pp. 611–630, February 2016.
“An Updated Set of Basic Linear Algebra Subprograms (BLAS),”
ACM Transactions on Mathematical Software, vol. 28, no. 2, pp. 135-151, December 2002.
“Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,”
Concurrency and Computation: Practice and Experience, November 2013.
“Unified Model for Assessing Checkpointing Protocols at Extreme-Scale,”
University of Tennessee Computer Science Technical Report (also LAWN 269), no. UT-CS-12-697, June 2012.
“A Unified HPC Environment for Hybrid Manycore/GPU Distributed Systems,”
IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.
“Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment,”
IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
Understanding Native Event Semantics
Knoxville, TN, 9th JLESC Workshop, April 2019.
“Two-stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures,”
IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.
“Twenty-Plus Years of Netlib and NA-Net,”
University of Tennessee Computer Science Department Technical Report, UT-CS-04-526, 2006.
“Twenty Years of Computational Science,”
International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020.
“Tuning Stationary Iterative Solvers for Fault Resilience,”
6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA15), Austin, TX, ACM, November 2015.
“Tuning Principal Component Analysis for GRASS GIS on Multi-core and GPU Architectures,”
FOSS4G 2010, Barcelona, Spain, September 2010.
“Truss Structural Optimization Using NetSolve System,”
Meeting of the Japan Society of Mechanical Engineers, Kyoto University, Kyoto, Japan, October 2002.
“Tridiagonalization of a Symmetric Dense Matrix on a GPU Cluster,”
The Third International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), May 2013.
“Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems,”
Concurrency and Computation: Practice and Experience, October 2013.
“A Tribute to Gene Golub,”
Computing in Science and Engineering: IEEE, p. 5, January 2008.
“Trends in High Performance Computing,”
The Computer Journal, vol. 47, no. 4: The British Computer Society, pp. 399-403, 2004.