Publications
Report on the Fujitsu Fugaku System,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-20-06: University of Tennessee, June 2020.
(3.3 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
How Elegant Code Evolves With Hardware: The Case Of Gaussian Elimination,”
in Beautiful Code Leading Programmers Explain How They Think (Chapter 14), pp. 243-282, January 2008.
(257 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Accurate Cache and TLB Characterization Using Hardware Counters,”
International Conference on Computational Science (ICCS 2004), Krakow, Poland, Springer, June 2004.
(167.1 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Recursive Approach in Sparse Matrix LU Factorization,”
Scientific Programming, vol. 9, no. 1, pp. 51-60, 00 2001.
(217.16 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
The 30th Anniversary of the Supercomputing Conference: Bringing the Future Closer—Supercomputing History and the Immortality of Now,”
Computer, vol. 51, issue 10, pp. 74–85, November 2018.
(1.73 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Revisiting the Double Checkpointing Algorithm,”
University of Tennessee Computer Science Technical Report (LAWN 274), no. ut-cs-13-705, January 2013.
(682.22 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Numerical Algorithms for High-Performance Computational Science,”
Philosophical Transactions of the Royal Society A, vol. 378, issue 2166, 2020.
(724.37 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters,”
PADTAD Workshop, IPDPS 2003, Nice, France, IEEE, April 2003.
(432.57 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
PULSAR Users’ Guide, Parallel Ultra-Light Systolic Array Runtime,”
University of Tennessee EECS Technical Report, no. UT-EECS-14-733: University of Tennessee, November 2014.
(561.56 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Exploiting Fine-Grain Parallelism in Recursive LU Factorization,”
Proceedings of PARCO'11, no. ICL-UT-11-04, Gent, Belgium, April 2011.
“Top500 Supercomputer Sites (13th edition),”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-425, June 1999.
(278.51 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
High-Performance Conjugate-Gradient Benchmark: A New Metric for Ranking High-Performance Computing Systems,”
The International Journal of High Performance Computing Applications, 2015.
(336.19 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Report on the TianHe-2A System,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-17-04: University of Tennessee, September 2017.
(7.15 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Self-Adapting Numerical Software and Automatic Tuning of Heuristics,”
Lecture Notes in Computer Science, vol. 2660, Melbourne, Australia, Springer Verlag, pp. 759-770, June 2003.
(45.95 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
The Quest for Petascale Computing,”
Computing in Science and Engineering, vol. 3, no. 3, pp. 32-39, May 2001.
(178.3 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems,”
International Conference on Computational Science (ICCS 2017), Zürich, Switzerland, Elsevier, June 2017.
(446.14 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
An Overview of Heterogeneous High Performance and Grid Computing,”
Engineering the Grid (to appear): Nova Science Publishers, Inc., 00 2004.
(199.93 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Sunway TaihuLight Supercomputer Makes Its Appearance,”
National Science Review, vol. 3, issue 3, pp. 256-266, September 2016.
(292.11 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
High Performance Computing Systems: Status and Outlook,”
Acta Numerica, vol. 21, Cambridge, UK, Cambridge University Press, pp. 379-474, May 2012.
(1.48 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Dense Linear Algebra on Accelerated Multicore Hardware,”
High Performance Scientific Computing: Algorithms and Applications, London, UK, Springer-Verlag, 00 2012.
“The HPL Benchmark: Past, Present & Future
, ISC High Performance, Frankfurt, Germany, July 2016.
(3.41 MB)
![application/pdf](/modules/file/icons/application-pdf.png)
High Performance Computing Trends,”
HERMIS, vol. 2, pp. 155-163, November 2001.
“Model-Driven One-Sided Factorizations on Multicore, Accelerated Systems,”
Supercomputing Frontiers and Innovations, vol. 1, issue 1, 2014.
(1.86 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
A Not So Simple Matter of Software,”
NCSA Access Online: NCSA, 00 2005.
(457.69 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Exploring New Architectures in Accelerating CFD for Air Force Applications,”
Proceedings of the DoD HPCMP User Group Conference, Seattle, Washington, January 2008.
(492.86 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Bi-objective Scheduling Algorithms for Optimizing Makespan and Reliability on Heterogeneous Systems,”
19th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA) (submitted), San Diego, CA, June 2007.
(223.82 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
International Exascale Software Project Roadmap v1.0,”
University of Tennessee Computer Science Technical Report, UT-CS-10-654, May 2010.
(719.74 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,”
PPAM 2013, Warsaw, Poland, September 2013.
(284.97 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Top500 Supercomputer Sites (15th edition),”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-00-442, June 2000.
(278.88 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
With Extreme Computing, the Rules Have Changed,”
Computing in Science & Engineering, vol. 19, issue 3, pp. 52-62, May 2017.
(485.34 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
High-Performance Computing,”
The Princeton Companion to Applied Mathematics, Princeton, New Jersey, Princeton University Press, pp. 839-842, 2015.
“High Performance Computing Trends, Supercomputers, Clusters, and Grids,”
Information Processing Society of Japan Symposium Series, vol. 2003, no. 14, pp. 55-58, January 2003.
“Race to Exascale,”
Computing in Science and Engineering, vol. 21, issue 1, pp. 4-5, March 2019.
(106.97 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Performance and Reliability Trade-offs for the Double Checkpointing Algorithm,”
International Journal of Networking and Computing, vol. 4, no. 1, pp. 32-41.
(859.04 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators,”
Proceedings of the IEEE, vol. 106, issue 11, pp. 2040–2055, November 2018.
(2.53 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Finite-choice Algorithm Optimization in Conjugate Gradients (LAPACK Working Note 159),”
University of Tennessee Computer Science Technical Report, UT-CS-03-502, January 2003.
(64.52 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Netlib and NA-Net: building a scientific computing community,”
In IEEE Annals of the History of Computing (to appear), August 2007.
(352.71 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
The LINPACK Benchmark: Past, Present, and Future,”
Concurrency: Practice and Experience, vol. 15, pp. 803-820, 00 2008.
(94.86 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
The Impact of Multicore on Computational Science Software,”
CTWatch Quarterly, vol. 3, issue 1, February 2007.
“Batched BLAS (Basic Linear Algebra Subprograms) 2018 Specification
, July 2018.
(483.05 KB)
![application/pdf](/modules/file/icons/application-pdf.png)
Numerical Linear Algebra,”
Encyclopedia of Computer Science and Technology, eds. Kent, A., Williams, J., vol. 41, pp. 207-233, August 1999.
(262 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Using PAPI for Hardware Performance Monitoring on Linux Systems,”
Conference on Linux Clusters: The HPC Revolution, Urbana, Illinois, Linux Clusters Institute, June 2001.
(422.35 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, CS-89-85, January 2004.
(6.42 MB)
“![application/pdf](/modules/file/icons/application-pdf.png)
The Problem with the Linpack Benchmark Matrix Generator,”
University of Tennessee Computer Science Technical Report, UT-CS-08-621 (also LAPACK Working Note 206), June 2008.
(136.41 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Translational Process: Mathematical Software Perspective,”
Journal of Computational Science, September 2020.
(752.59 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Measuring Computer Performance: A Practioner's Guide,”
SIAM Review (book review), vol. 43, no. 2, pp. 383-384, 00 2001.
(558.9 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
The International Exascale Software Project: A Call to Cooperative Action by the Global High Performance Community,”
International Journal of High Performance Computing Applications (to appear), July 2009.
(203.04 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
High Performance Development for High End Computing with Python Language Wrapper (PLW),”
International Journal for High Performance Computer Applications, vol. 21, no. 3, pp. 360-369, 00 2007.
(179.32 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Performance of Various Computers Using Standard Linear Equations Software, (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85: University of Tennessee, June 2014.
(514.64 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)
Disaster Survival Guide in Petascale Computing: An Algorithmic Approach,”
in Petascale Computing: Algorithms and Applications (to appear): Chapman & Hall - CRC Press, 00 2007.
(260.18 KB)
“![application/pdf](/modules/file/icons/application-pdf.png)