Publications
“Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures,”
Lecture Notes in Computer Science, vol. 9573: Springer International Publishing, pp. 86-95, September 2015, 2016.
“Accelerating Linear System Solutions Using Randomization Techniques,”
INRIA RR-7616 / LAWN #246 (presented at International AMMCS’11), Waterloo, Ontario, Canada, July 2011.
“An Efficient Distributed Randomized Algorithm for Solving Large Dense Symmetric Indefinite Linear Systems,”
Parallel Computing, vol. 40, issue 7, pp. 213-223, July 2014.
“Computing Least Squares Condition Numbers on Hybrid Multicore/GPU Systems,”
International Interdisciplinary Conference on Applied Mathematics, Modeling and Computational Science (AMMCS), Waterloo, Ontario, CA, August 2014.
“Accelerating Scientific Computations with Mixed Precision Algorithms,”
Computer Physics Communications, vol. 180, issue 12, pp. 2526-2533, December 2009.
“Computing the Conditioning of the Components of a Linear Least Squares Solution,”
VECPAR '08, High Performance Computing for Computational Science, Toulouse, France, January 2008.
“A Class of Communication-Avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines,”
Proc. of the International Conference on Computational Science (ICCS), vol. 9, pp. 17-26, June 2012.
“A Collection of Presentations from the BDEC2 Workshop in Kobe, Japan,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-19-09: University of Tennessee, Knoxville, February 2019.
“LAPACK,”
Handbook of Linear Algebra, Second Edition, Boca Raton, FL, CRC Press, 2013.
“Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,”
International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
“PERI Auto-tuning,”
Proc. SciDAC 2008, vol. 125, Seattle, Washington, Journal of Physics, January 2008.
“Task-graph scheduling extensions for efficient synchronization and communication,”
Proceedings of the ACM International Conference on Supercomputing, pp. 88–101, 2021.
“OpenMP application experiences: Porting to accelerated nodes,”
Parallel Computing, vol. 109, March 2022.
“Autotuning in High-Performance Computing Applications,”
Proceedings of the IEEE, vol. 106, issue 11, pp. 2068–2083, November 2018.
“Communication-Avoiding Symmetric-Indefinite Factorization,”
SIAM Journal on Matrix Analysis and Applications, vol. 35, issue 4, pp. 1364-1406, July 2014.
“When to checkpoint at the end of a fixed-length reservation?,”
Fault Tolerance for HPC at eXtreme Scales (FTXS) Workshop, Denver, United States, August 2023.
“Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,”
2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, Florida, IEEE, August 2023.
Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,
St. Petersburg, FL, 28th HIPS Workshop, May 2023.
“Effortless Monitoring of Arithmetic Intensity with PAPI's Counter Analysis Toolkit,”
13th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Springer International Publishing, September 2020.
“Effortless Monitoring of Arithmetic Intensity with PAPI’s Counter Analysis Toolkit,”
Tools for High Performance Computing 2018/2019: Springer, pp. 195–218, 2021.
xSDK4ECP: Extreme-scale Scientific Software Development Kit for ECP (Poster),
Houston, TX, 2020 Exascale Computing Project Annual Meeting, February 2020.
“Internet Backplane Protocol: API 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
“The Internet BackPlane Protocol: A Study in Resource Sharing,”
Proceedings of the second IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID 2002), Berlin, Germany, October 2002.
“Internet Backplane Protocol - Test Language v. 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
“Revisiting Dynamic DAG Scheduling under Memory Constraints for Shared-Memory Platforms,”
22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020), New Orleans, LA, IEEE Computer Society Press, May 2020.
“Dynamic DAG scheduling under memory constraints for shared-memory platforms,”
Int. J. of Networking and Computing, vol. 11, no. 1, pp. 27-49, 2021.
“High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs,”
2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA): IEEE, November 2020.
“Scalable, Trustworthy Network Computing Using Untrusted Intermediaries: A Position Paper,”
DOE/NSF Workshop on New Directions in Cyber-Security in Large-Scale Networks: Development Obstacles, National Conference Center - Lansdowne, Virginia, March 2003.
“Active Logistical State Management in the GridSolve/L,”
4th International Symposium on Cluster Computing and the Grid (CCGrid 2004) (submitted), Chicago, Illinois, January 2004.
“Logistical Networking: Sharing More Than the Wires,”
In Active Middleware Services, Ed. Salim Hariri, Craig A. Lee, Cauligi S. Raghavendra (2000), Kluwer Academic, Norwell, MA, January 2000.
“Interoperable Convergence of Storage, Networking, and Computation,”
Advances in Information and Communication: Proceedings of the 2019 Future of Information and Communication Conference (FICC), no. 2: Springer International Publishing, pp. 667-690, 2020.
“Enabling Full Service Surrogates Using the Portable Channel Representation,”
Tenth International World Wide Web Conference Proceedings (to appear), Hong Kong, May 2001.
“HARNESS: A Next Generation Distributed Virtual Machine,”
International Journal on Future Generation Computer Systems, vol. 15, no. 5-6, pp. 571-582, January 1999.
“Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication,”
submitted to SC2001, Denver, Colorado, November 2001.
“Data Logistics: Toolkit and Applications,”
5th EAI International Conference on Smart Objects and Technologies for Social Good, Valencia, Spain, September 2019.
“Portable Representation of Internet Content Channels in I2-DSI,”
4th Intl. Web Caching Workshop, San Diego, CA, March 1999.
“Middleware for the Use of Storage in Communication,”
Parallel Computing, vol. 28, no. 12, pp. 1773-1788, August 2002.
“Logistical Quality of Service in NetSolve,”
Computer Communications, vol. 22, no. 11, pp. 1034-1044, January 1999.
“Reducing the Amount of Pivoting in Symmetric Indefinite Systems,”
Parallel Processing and Applied Mathematics, Lecture Notes in Computer Science (PPAM 2011), vol. 7203: Springer-Verlag Berlin Heidelberg, pp. 133-142, 2012.
“Towards a Parallel Tile LDL Factorization for Multicore Architectures,”
ICL Technical Report, no. ICL-UT-11-03, Seattle, WA, April 2011.
“Reducing the Amount of Pivoting in Symmetric Indefinite Systems,”
University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-06, Knoxville, TN, Submitted to PPAM 2011, May 2011.
“Harnessing the Computing Continuum for Programming Our World,”
Fog Computing: Theory and Practice: John Wiley & Sons, Inc., 2020.
“A Look Back on 30 Years of the Gordon Bell Prize,”
International Journal of High Performance Computing and Networking, vol. 31, issue 6, pp. 469–484, 2017.
“Towards Optimal Multi-Level Checkpointing,”
IEEE Transactions on Computers, vol. 66, issue 7, pp. 1212–1226, July 2017.
“Combining Checkpointing and Replication for Reliable Execution of Linear Workflows with Fail-Stop and Silent Errors,”
International Journal of Networking and Computing, vol. 9, no. 1, pp. 2-27.
“Efficient checkpoint/verification patterns for silent error detection,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-14-03: University of Tennessee, May 2014.
“Replication is More Efficient Than You Think,”
The IEEE/ACM Conference on High Performance Computing Networking, Storage and Analysis (SC19), Denver, CO, ACM Press, November 2019.
“Max-Stretch Minimization on an Edge-Cloud Platform,”
IPDPS'2021, the 34th IEEE International Parallel and Distributed Processing Symposium: IEEE Computer Society Press, 2021.
“Optimal Checkpointing Period with replicated execution on heterogeneous platforms,”
2017 Workshop on Fault-Tolerance for HPC at Extreme Scale, Washington, DC, IEEE Computer Society Press, June 2017.
“Efficient Checkpoint/Verification Patterns,”
International Journal on High Performance Computing Applications, July 2015.