Publications
Author: Dongarra, Jack
“PLASMA 17.1 Functionality Report,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-10: University of Tennessee, June 2017.
The PLASMA Library on CORAL Systems and Beyond (Poster),
Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
“PLASMA: Parallel Linear Algebra Software for Multicore Using OpenMP,”
ACM Transactions on Mathematical Software, vol. 45, issue 2, June 2019.
“The PlayStation 3 for High Performance Scientific Computing,”
Computing in Science and Engineering, pp. 80-83, January 2008.
“The PlayStation 3 for High Performance Scientific Computing,”
University of Tennessee Computer Science Technical Report, no. UT-CS-08-608, January 2008.
“POMPEI: Programming with OpenMP4 for Exascale Investigations,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-09: University of Tennessee, December 2017.
“Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,”
PPAM 2013, Warsaw, Poland, September 2013.
“A Portable Programming Interface for Performance Evaluation on Modern Processors,”
The International Journal of High Performance Computing Applications, vol. 14, no. 3, pp. 189-204, September 2000.
“A Portable Programming Interface for Performance Evaluation on Modern Processors,”
University of Tennessee Computer Science Technical Report, UT-CS-00-444, July 2000.
“Porting the PLASMA Numerical Library to the OpenMP Standard,”
International Journal of Parallel Programming, June 2016.
“Post-failure recovery of MPI communication capability: Design and rationale,”
International Journal of High Performance Computing Applications, vol. 27, issue 3, pp. 244-254, January 2013.
“Power Management and Event Verification in PAPI,”
Tools for High Performance Computing 2015: Proceedings of the 9th International Workshop on Parallel Tools for High Performance Computing, September 2015, Dresden, Germany, Springer International Publishing, pp. 41-51, 2016.
“Power Monitoring with PAPI for Extreme Scale Architectures and Dataflow-based Programming Models,”
2014 IEEE International Conference on Cluster Computing, no. ICL-UT-14-04, Madrid, Spain, IEEE, September 2014.
“Power Profiling of Cholesky and QR Factorizations on Distributed Memory Systems,”
Third International Conference on Energy-Aware High Performance Computing, Hamburg, Germany, September 2012.
“Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi,”
2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, Waltham, MA, IEEE, September 2017.
Power-Aware HPC on Intel Xeon Phi KNL Processors,
Frankfurt, Germany, ISC High Performance (ISC17), Intel Booth Presentation, June 2017.
“Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems: Formal Proof,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-15-01, April 2015.
“Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems,”
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
“Preconditioned Krylov Solvers on GPUs,”
Parallel Computing, June 2017.
“Predicting the electronic properties of 3D, million-atom semiconductor nanostructure architectures,”
J. Phys.: Conf. Ser., vol. 46, doi:10.1088/1742-6596/46/1/040, pp. 292-298, January 2006.
“Preliminary Results of Autotuning GEMM Kernels for the NVIDIA Kepler Architecture,”
LAWN 267, 2012.
“The Problem with the Linpack Benchmark Matrix Generator,”
University of Tennessee Computer Science Technical Report, UT-CS-08-621 (also LAPACK Working Note 206), June 2008.
“The Problem with the Linpack Benchmark Matrix Generator,”
International Journal of High Performance Computing Applications, vol. 23, no. 1, pp. 5-14, 2009.
“Proceedings of the International Conference on Computational Science,”
ICCS 2010, Amsterdam, Elsevier, May 2010.
“Process Distance-aware Adaptive MPI Collective Communications,”
IEEE Int'l Conference on Cluster Computing (Cluster 2011), Austin, Texas, 2011.
“Process Fault-Tolerance: Semantics, Design and Applications for High Performance Computing,”
International Journal for High Performance Applications and Supercomputing (to appear), April 2004.
“Profiling High Performance Dense Linear Algebra Algorithms on Multicore Architectures for Power and Energy Efficiency,”
International Conference on Energy-Aware High Performance Computing (EnA-HPC 2011), Hamburg, Germany, September 2011.
“Programming the LU Factorization for a Multicore System with Accelerators,”
Proceedings of VECPAR’12, Kobe, Japan, April 2012.
“Progressive Optimization of Batched LU Factorization on GPUs,”
IEEE High Performance Extreme Computing Conference (HPEC’19), Waltham, MA, IEEE, September 2019.
“Project-Based Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning,”
The Journal of Computational Science Education, vol. 11, issue 1, pp. 36-44, January 2020.
“A Proposal for User-Level Failure Mitigation in the MPI-3 Standard,”
University of Tennessee Electrical Engineering and Computer Science Technical Report, no. UT-CS-12-693: University of Tennessee, February 2012.
“Proposal of MPI operation level Checkpoint/Rollback and one implementation,”
Proceedings of IEEE CCGrid 2006: IEEE Computer Society, January 2006.
“Prospectus for the Next LAPACK and ScaLAPACK Libraries,”
PARA 2006, Umea, Sweden, June 2006.
“Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC),”
LAPACK Working Notes, no. 297, ICL-UT-20-07: University of Tennessee.
“Providing GPU Capability to LU and QR within the ScaLAPACK Framework,”
University of Tennessee Computer Science Technical Report (also LAWN 272), no. UT-CS-12-699, September 2012.
“Providing Infrastructure and Interface to High Performance Applications in a Distributed Setting,”
ASTC-HPC 2000, Washington, DC, April 2000.
“PTG: An Abstraction for Unhindered Parallelism,”
International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), New Orleans, LA, IEEE Press, November 2014.
“PULSAR Users’ Guide, Parallel Ultra-Light Systolic Array Runtime,”
University of Tennessee EECS Technical Report, no. UT-EECS-14-733: University of Tennessee, November 2014.
“QCG-OMPI: MPI Applications on Grids,”
Future Generation Computer Systems, vol. 27, no. 4, pp. 357-369, January 2011.
“QCG-OMPI: MPI Applications on Grids,”
Future Generation Computer Systems, vol. 27, no. 4, pp. 357-369, March 2010.
“QR Factorization for the CELL Processor,”
University of Tennessee Computer Science Technical Report, UT-CS-08-616 (also LAPACK Working Note 201), May 2008.
“QR Factorization for the CELL Processor,”
Scientific Programming (to appear), 2009.
“QR Factorization for the CELL Processor,”
Scientific Programming, vol. 17, no. 1-2, pp. 31-42, 2010.
“QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment,”
24th IEEE International Parallel and Distributed Processing Symposium (also LAWN 224), Atlanta, GA, April 2010.
“QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators,”
Proceedings of IPDPS 2011, no. ICL-UT-10-04, Anchorage, AK, October 2010.
“QUARK Users' Guide: QUeueing And Runtime for Kernels,”
University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-02, 2011.
“The Quest for Petascale Computing,”
Computing in Science and Engineering, vol. 3, no. 3, pp. 32-39, May 2001.
“Race to Exascale,”
Computing in Science and Engineering, vol. 21, issue 1, pp. 4-5, March 2019.
“Randomized Algorithms to Update Partial Singular Value Decomposition on a Hybrid CPU/GPU Cluster,”
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
“Randomized Numerical Linear Algebra: A Perspective on the Field with an Eye to Software,”
University of California, Berkeley EECS Technical Report, no. UCB/EECS-2022-258: University of California, Berkeley, November 2022.