Publications
“An efficient distributed randomized solver with application to large dense linear systems,”
ICL Technical Report, no. ICL-UT-12-02, July 2012.
(626.26 KB)
“Solving Dense Symmetric Indefinite Systems using GPUs,”
Concurrency and Computation: Practice and Experience, vol. 29, issue 9, March 2017.
DOI: 10.1002/cpe.4055 (1.94 MB)
“Accelerating Scientific Computations with Mixed Precision Algorithms,”
Computer Physics Communications, vol. 180, issue 12, pp. 2526-2533, December 2009.
DOI: 10.1016/j.cpc.2008.11.005 (402.69 KB)
“A Parallel Tiled Solver for Symmetric Indefinite Systems On Multicore Architectures,”
IPDPS 2012, Shanghai, China, May 2012.
(544.09 KB)
“Enhancing the Performance of Dense Linear Algebra Solvers on GPUs (in the MAGMA Project),”
The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC08), Austin, TX, November 2008.
(5.28 MB)
“Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures,”
Lecture Notes in Computer Science, vol. 9573: Springer International Publishing, pp. 86-95, September 2015, 2016.
DOI: 10.1007/978-3-319-32149-3_9 (327.14 KB)
“Accelerating Linear System Solutions Using Randomization Techniques,”
INRIA RR-7616 / LAWN #246 (presented at International AMMCS’11), Waterloo, Ontario, Canada, July 2011.
(358.79 KB)
“Using dual techniques to derive componentwise and mixed condition numbers for a linear functional of a linear least squares solution,”
University of Tennessee Computer Science Technical Report, UT-CS-08-622 (also LAPACK Working Note 207), January 2008.
(159.65 KB)
“A Collection of Presentations from the BDEC2 Workshop in Kobe, Japan,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-19-09: University of Tennessee, Knoxville, February 2019.
(58.85 MB)
“LAPACK,”
Handbook of Linear Algebra, Second Edition, Boca Raton, FL, CRC Press, 2013.
(223.21 KB)
“Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation,”
International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
(480.73 KB)
“PERI Auto-tuning,”
Proc. SciDAC 2008, vol. 125, Seattle, Washington, Journal of Physics, January 2008.
(873.75 KB)
“Task-graph scheduling extensions for efficient synchronization and communication,”
Proceedings of the ACM International Conference on Supercomputing, pp. 88–101, 2021.
DOI: 10.1145/3447818.3461616
“OpenMP application experiences: Porting to accelerated nodes,”
Parallel Computing, vol. 109, March 2022.
DOI: 10.1016/j.parco.2021.102856
“Autotuning in High-Performance Computing Applications,”
Proceedings of the IEEE, vol. 106, issue 11, pp. 2068–2083, November 2018.
DOI: 10.1109/JPROC.2018.2841200 (2.5 MB)
“Communication-Avoiding Symmetric-Indefinite Factorization,”
SIAM Journal on Matrix Analysis and Applications, vol. 35, issue 4, pp. 1364-1406, July 2014.
(593.18 KB)
“When to checkpoint at the end of a fixed-length reservation?,”
Fault Tolerance for HPC at eXtreme Scales (FTXS) Workshop, Denver, United States, August 2023.
“Effortless Monitoring of Arithmetic Intensity with PAPI's Counter Analysis Toolkit,”
13th International Workshop on Parallel Tools for High Performance Computing, Dresden, Germany, Springer International Publishing, September 2020.
(738.47 KB)
“Effortless Monitoring of Arithmetic Intensity with PAPI’s Counter Analysis Toolkit,”
Tools for High Performance Computing 2018/2019: Springer, pp. 195–218, 2021.
DOI: 10.1007/978-3-030-66057-4_11
“Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,”
2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, Florida, IEEE, August 2023.
DOI: 10.1109/IPDPSW59300.2023.00070 (1.81 MB)
“Memory Traffic and Complete Application Profiling with PAPI Multi-Component Measurements,”
28th HIPS Workshop, St. Petersburg, FL, May 2023.
(3.99 MB)
“xSDK4ECP: Extreme-scale Scientific Software Development Kit for ECP (Poster),”
2020 Exascale Computing Project Annual Meeting, Houston, TX, February 2020.
(1.54 MB)
“The Internet BackPlane Protocol: A Study in Resource Sharing,”
Proceedings of the second IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID 2002), Berlin, Germany, October 2002.
“Internet Backplane Protocol - Test Language v. 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
(22.43 KB)
“Internet Backplane Protocol: API 1.0,”
University of Tennessee Computer Science Technical Report, no. UT-CS-01-464, January 2001.
(55.33 KB)
“Dynamic DAG scheduling under memory constraints for shared-memory platforms,”
Int. J. of Networking and Computing, vol. 11, no. 1, pp. 27-49, 2021.
(574.64 KB)
“Revisiting Dynamic DAG Scheduling under Memory Constraints for Shared-Memory Platforms,”
22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020), New Orleans, LA, IEEE Computer Society Press, May 2020.
(317.93 KB)
“A survey on checkpointing strategies: Should we always checkpoint à la Young/Daly?,”
Future Generation Computer Systems, July 2024.
DOI: 10.1016/j.future.2024.07.022
“High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs,”
2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA): IEEE, November 2020.
(1.3 MB)
“Portable Representation of Internet Content Channels in I2-DSI,”
4th Intl. Web Caching Workshop, San Diego, CA, March 1999.
“Middleware for the Use of Storage in Communication,”
Parallel Computing, vol. 28, no. 12, pp. 1773-1788, August 2002.
(87.97 KB)
“Logistical Quality of Service in NetSolve,”
Computer Communications, vol. 22, no. 11, pp. 1034-1044, January 1999.
(168.39 KB)
“Logistical Networking: Sharing More Than the Wires,”
In Active Middleware Services, Ed. Salim Hariri, Craig A. Lee, Cauligi S. Raghavendra (2000), Kluwer Academic, Norwell, MA, January 2000.
(84.69 KB)
“Data Logistics: Toolkit and Applications,”
5th EAI International Conference on Smart Objects and Technologies for Social Good, Valencia, Spain, September 2019.
(6.71 MB)
“Interoperable Convergence of Storage, Networking, and Computation,”
Advances in Information and Communication: Proceedings of the 2019 Future of Information and Communication Conference (FICC), no. 2: Springer International Publishing, pp. 667-690, 2020.
(1.8 MB)
“Scalable, Trustworthy Network Computing Using Untrusted Intermediaries: A Position Paper,”
DOE/NSF Workshop on New Directions in Cyber-Security in Large-Scale Networks: Development Obstacles, National Conference Center - Lansdowne, Virginia, March 2003.
(54.62 KB)
“Enabling Full Service Surrogates Using the Portable Channel Representation,”
Tenth International World Wide Web Conference Proceedings (to appear), Hong Kong, May 2001.
(267.23 KB)
“Active Logistical State Management in the GridSolve/L,”
4th International Symposium on Cluster Computing and the Grid (CCGrid 2004) (submitted), Chicago, Illinois, January 2004.
(123.69 KB)
“HARNESS: A Next Generation Distributed Virtual Machine,”
International Journal on Future Generation Computer Systems, vol. 15, no. 5-6, pp. 571-582, January 1999.
(183.78 KB)
“Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication,”
submitted to SC2001, Denver, Colorado, November 2001.
(41.79 KB)
“Reducing the Amount of Pivoting in Symmetric Indefinite Systems,”
Parallel Processing and Applied Mathematics, Lecture Notes in Computer Science (PPAM 2011), vol. 7203: Springer-Verlag Berlin Heidelberg, pp. 133-142, 2012.
(145.76 KB)
“Towards a Parallel Tile LDL Factorization for Multicore Architectures,”
ICL Technical Report, no. ICL-UT-11-03, Seattle, WA, April 2011.
(425.45 KB)
“Reducing the Amount of Pivoting in Symmetric Indefinite Systems,”
University of Tennessee Innovative Computing Laboratory Technical Report, no. ICL-UT-11-06, Knoxville, TN, Submitted to PPAM 2011, May 2011.
(145.76 KB)
“Harnessing the Computing Continuum for Programming Our World,”
Fog Computing: Theory and Practice: John Wiley & Sons, Inc., 2020.
DOI: 10.1002/9781119551713.ch7 (1.4 MB)
“A Look Back on 30 Years of the Gordon Bell Prize,”
International Journal of High Performance Computing and Networking, vol. 31, issue 6, pp. 469–484, 2017.
“Assessing General-purpose Algorithms to Cope with Fail-stop and Silent Errors,”
ACM Transactions on Parallel Computing, August 2016.
DOI: 10.1145/2897189 (573.71 KB)
“Resilient Co-Scheduling of Malleable Applications,”
International Journal of High Performance Computing Applications (IJHPCA), May 2017.
DOI: 10.1177/1094342017704979 (1.62 MB)
“Identifying the Right Replication Level to Detect and Correct Silent Errors at Scale,”
2017 Workshop on Fault-Tolerance for HPC at Extreme Scale, Washington, DC, ACM, June 2017.
DOI: 10.1145/3086157.3086162 (865.68 KB)
“Multi-Level Checkpointing and Silent Error Detection for Linear Workflows,”
Journal of Computational Science, vol. 28, pp. 398–415, September 2018.
“Towards Optimal Multi-Level Checkpointing,”
IEEE Transactions on Computers, vol. 66, issue 7, pp. 1212–1226, July 2017.
DOI: 10.1109/TC.2016.2643660 (1.39 MB)
“