Publications
“Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures,”
PARA 2008, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, Trondheim, Norway, May 2008.
“Towards a High-Performance Tensor Algebra Package for Accelerators,”
Gatlinburg, TN, Smoky Mountains Computational Sciences and Engineering Conference (SMC15), September 2015.
(1.76 MB)
“A parallel tiled solver for dense symmetric indefinite systems on multicore architectures,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-11-07, October 2011.
(544.2 KB)
“Accelerating Linear System Solutions Using Randomization Techniques,”
ACM Transactions on Mathematical Software (also LAWN 246), vol. 39, issue 2, February 2013.
DOI: 10.1145/2427023.2427025 (358.79 KB)
“Computing the Conditioning of the Components of a Linear Least Squares Solution,”
VECPAR '08, High Performance Computing for Computational Science, Toulouse, France, January 2008.
(374.97 KB)
“Solving Dense Symmetric Indefinite Systems using GPUs,”
Concurrency and Computation: Practice and Experience, vol. 29, issue 9, March 2017.
DOI: 10.1002/cpe.4055 (1.94 MB)
“An efficient distributed randomized solver with application to large dense linear systems,”
ICL Technical Report, no. ICL-UT-12-02, July 2012.
(626.26 KB)
“Accelerating Scientific Computations with Mixed Precision Algorithms,”
Computer Physics Communications, vol. 180, issue 12, pp. 2526-2533, December 2009.
DOI: 10.1016/j.cpc.2008.11.005 (402.69 KB)
“A Collection of Presentations from the BDEC2 Workshop in Kobe, Japan,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-19-09: University of Tennessee, Knoxville, February 2019.
(58.85 MB)
“Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,”
International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
(480.73 KB)
“LAPACK,”
Handbook of Linear Algebra, Second, Boca Raton, FL, CRC Press, 2013.
(223.21 KB)
“PERI Auto-tuning,”
Proc. SciDAC 2008, vol. 125, Seattle, Washington, Journal of Physics, January 2008.
(873.75 KB)
“Task-graph scheduling extensions for efficient synchronization and communication,”
Proceedings of the ACM International Conference on Supercomputing, pp. 88–101, 2021.
DOI: 10.1145/3447818.3461616
“OpenMP application experiences: Porting to accelerated nodes,”
Parallel Computing, vol. 109, March 2022.
DOI: 10.1016/j.parco.2021.102856
“Autotuning in High-Performance Computing Applications,”
Proceedings of the IEEE, vol. 106, issue 11, pp. 2068–2083, November 2018.
DOI: 10.1109/JPROC.2018.2841200 (2.5 MB)
“Communication-Avoiding Symmetric-Indefinite Factorization,”
SIAM Journal on Matrix Analysis and Applications, vol. 35, issue 4, pp. 1364-1406, July 2014.
(593.18 KB)
“When to checkpoint at the end of a fixed-length reservation?,”
Fault Tolerance for HPC at eXtreme Scales (FTXS) Workshop, Denver, United States, August 2023.
“Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,”
2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, Florida, IEEE, August 2023.
DOI: 10.1109/IPDPSW59300.2023.00070 (1.81 MB)
“Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,”
St. Petersburg, FL, 28th HIPS Workshop, May 2023.
(3.99 MB)
“Effortless Monitoring of Arithmetic Intensity with PAPI’s Counter Analysis Toolkit,”
13th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Springer International Publishing, September 2020.
(738.47 KB)
“Effortless Monitoring of Arithmetic Intensity with PAPI’s Counter Analysis Toolkit,”
Tools for High Performance Computing 2018/2019: Springer, pp. 195–218, 2021.
DOI: 10.1007/978-3-030-66057-4_11
“xSDK4ECP: Extreme-scale Scientific Software Development Kit for ECP (Poster),”
Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
(1.54 MB)
“The Internet BackPlane Protocol: A Study in Resource Sharing,”
Proceedings of the second IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID 2002), Berlin, Germany, October 2002.
“Internet Backplane Protocol - Test Language v. 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
(22.43 KB)
“Internet Backplane Protocol: API 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
(55.33 KB)
“Revisiting Dynamic DAG Scheduling under Memory Constraints for Shared-Memory Platforms,”
22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020), New Orleans, LA, IEEE Computer Society Press, May 2020.
(317.93 KB)
“Dynamic DAG scheduling under memory constraints for shared-memory platforms,”
Int. J. of Networking and Computing, vol. 11, no. 1, pp. 27-49, 2021.
(574.64 KB)
“A survey on checkpointing strategies: Should we always checkpoint à la Young/Daly?,”
Future Generation Computer Systems, July 2024.
DOI: 10.1016/j.future.2024.07.022
“High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs,”
2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA): IEEE, November 2020.
(1.3 MB)
“Logistical Networking: Sharing More Than the Wires,”
In Active Middleware Services, Ed. Salim Hariri, Craig A. Lee, Cauligi S. Raghavendra (2000), Kluwer Academic, Norwell, MA, January 2000.
(84.69 KB)
“Scalable, Trustworthy Network Computing Using Untrusted Intermediaries: A Position Paper,”
DOE/NSF Workshop on New Directions in Cyber-Security in Large-Scale Networks: Development Obstacles, National Conference Center, Lansdowne, Virginia, March 2003.
(54.62 KB)
“Enabling Full Service Surrogates Using the Portable Channel Representation,”
Tenth International World Wide Web Conference Proceedings (to appear), Hong Kong, May 2001.
(267.23 KB)
“Active Logistical State Management in the GridSolve/L,”
4th International Symposium on Cluster Computing and the Grid (CCGrid 2004) (submitted), Chicago, Illinois, January 2004.
(123.69 KB)
“HARNESS: A Next Generation Distributed Virtual Machine,”
International Journal on Future Generation Computer Systems, vol. 15, no. 5-6, pp. 571-582, January 1999.
(183.78 KB)
“Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication,”
submitted to SC2001, Denver, Colorado, November 2001.
(41.79 KB)
“Portable Representation of Internet Content Channels in I2-DSI,”
4th Intl. Web Caching Workshop, San Diego, CA, March 1999.
“Middleware for the Use of Storage in Communication,”
Parallel Computing, vol. 28, no. 12, pp. 1773-1788, August 2002.
(87.97 KB)
“Data Logistics: Toolkit and Applications,”
5th EAI International Conference on Smart Objects and Technologies for Social Good, Valencia, Spain, September 2019.
(6.71 MB)
“Logistical Quality of Service in NetSolve,”
Computer Communications, vol. 22, no. 11, pp. 1034-1044, January 1999.
(168.39 KB)
“Interoperable Convergence of Storage, Networking, and Computation,”
Advances in Information and Communication: Proceedings of the 2019 Future of Information and Communication Conference (FICC), no. 2: Springer International Publishing, pp. 667-690, 2020.
(1.8 MB)
“Reducing the Amount of Pivoting in Symmetric Indefinite Systems,”
Parallel Processing and Applied Mathematics, Lecture Notes in Computer Science (PPAM 2011), vol. 7203: Springer-Verlag Berlin Heidelberg, pp. 133-142, 2012.
(145.76 KB)
“Towards a Parallel Tile LDL Factorization for Multicore Architectures,”
ICL Technical Report, no. ICL-UT-11-03, Seattle, WA, April 2011.
(425.45 KB)
“Reducing the Amount of Pivoting in Symmetric Indefinite Systems,”
University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-06, Knoxville, TN, Submitted to PPAM 2011, May 2011.
(145.76 KB)
“Harnessing the Computing Continuum for Programming Our World,”
Fog Computing: Theory and Practice: John Wiley & Sons, Inc., 2020.
DOI: 10.1002/9781119551713.ch7 (1.4 MB)
“A Look Back on 30 Years of the Gordon Bell Prize,”
International Journal of High Performance Computing Applications, vol. 31, issue 6, pp. 469–484, 2017.
“Design and Comparison of Resilient Scheduling Heuristics for Parallel Jobs,”
22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020), New Orleans, LA, IEEE Computer Society Press, May 2020.
(696.21 KB)
“Combining Checkpointing and Replication for Reliable Execution of Linear Workflows with Fail-Stop and Silent Errors,”
International Journal of Networking and Computing, vol. 9, no. 1, pp. 2-27.
(754.6 KB)
“Replication is More Efficient Than You Think,”
The IEEE/ACM Conference on High Performance Computing Networking, Storage and Analysis (SC19), Denver, CO, ACM Press, November 2019.
(975.69 KB)
“Max-Stretch Minimization on an Edge-Cloud Platform,”
IPDPS'2021, the 34th IEEE International Parallel and Distributed Processing Symposium: IEEE Computer Society Press, 2021.
(4.94 MB)
“A Performance Model to Execute Workflows on High-Bandwidth Memory Architectures,”
The 47th International Conference on Parallel Processing (ICPP 2018), Eugene, OR, IEEE Computer Society Press, August 2018.
(868.44 KB)