Publications
“Assembly Operations for Multicore Architectures using Task-Based Runtime Systems,”
Euro-Par 2014, Porto, Portugal, Springer International Publishing, August 2014.
“ASCR@40: Highlights and Impacts of ASCR’s Programs,”
US Department of Energy’s Office of Advanced Scientific Computing Research, June 2020.
“ASCR@40: Four Decades of Department of Energy Leadership in Advanced Scientific Computing Research,”
Advanced Scientific Computing Advisory Committee (ASCAC), US Department of Energy, August 2020.
“Argobots: A Lightweight Low-Level Threading and Tasking Framework,”
IEEE Transactions on Parallel and Distributed Systems, October 2017.
“Are we Doing the Right Thing? – A Critical Analysis of the Academic HPC Community,”
2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, IEEE, May 2019.
“Approximate Computing for Scientific Applications,”
Approximate Computing Techniques, 322: Springer International Publishing, pp. 415-465, January 2022.
“Approximate and Exact Selection on GPUs,”
2019 IEEE International Parallel and Distributed Processing Symposium Workshops, Rio de Janeiro, Brazil, IEEE, May 2019.
“Applying Aspect-Oriented Programming Concepts to a Component-based Programming Model,”
IPDPS 2003, Workshop on NSF-Next Generation Software, Nice, France, March 2003.
“Application of Machine Learning to the Selection of Sparse Linear Solvers,”
International Journal of High Performance Computing Applications (submitted), 2006.
“Anatomy of a Globally Recursive Embedded LINPACK Benchmark,”
2012 IEEE High Performance Extreme Computing Conference, Waltham, MA, pp. 1-6, September 2012.
“Analyzing Performance of BiCGStab with Hierarchical Matrix on GPU Clusters,”
IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, IEEE, May 2018.
“Analyzing PAPI Performance on Virtual Machines,”
VMware Technical Journal, Winter 2013, January 2014.
“Analyzing PAPI Performance on Virtual Machines,”
ICL Technical Report, no. ICL-UT-13-02, August 2013.
“Analytical Modeling for Affinity-Based Thread Scheduling on Multicore Platforms,”
University of Tennessee Computer Science Technical Report, UT-CS-08-626, January 2008.
“Analytical Modeling and Optimization for Affinity Based Thread Scheduling on Multicore Systems,”
IEEE Cluster 2009, New Orleans, August 2009.
“Analysis of Various Scalar, Vector, and Parallel Implementations of RandomAccess,”
Innovative Computing Laboratory (ICL) Technical Report, no. ICL-UT-10-03, June 2010.
“Analysis of the Communication and Computation Cost of FFT Libraries towards Exascale,”
ICL Technical Report, no. ICL-UT-22-07: Innovative Computing Laboratory, July 2022.
“Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures,”
Submitted to Concurrency and Computation: Practice and Experience, November 2010.
“Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures,”
University of Tennessee Computer Science Technical Report, UT-CS-11-666 (also LAWN 243), March 2011.
“Analysis and Optimization of Yee_Bench using Hardware Performance Counters,”
Proceedings of Parallel Computing 2005 (ParCo), Malaga, Spain, January 2005.
“Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs,”
IEEE Transactions on Parallel and Distributed Systems, vol. 29, issue 12, pp. 2700–2712, December 2018.
“Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-18-09: Innovative Computing Laboratory, University of Tennessee, September 2018.
“Algorithms and Optimization Techniques for High-Performance Matrix-Matrix Multiplications of Very Small Matrices,”
Parallel Computing, vol. 81, pp. 1–21, January 2019.
“On Algorithmic Variants of Parallel Gaussian Elimination: Comparison of Implementations in Terms of Performance and Numerical Properties,”
University of Tennessee Computer Science Technical Report, no. UT-CS-13-715, July 2013.
“Algorithmic Redistribution Methods for Block Cyclic Decompositions,”
IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 12, pp. 201-220, October 2002.
“Algorithmic Issues on Heterogeneous Computing Platforms,”
Parallel Processing Letters, vol. 9, no. 2, pp. 197-213, January 1999.
“Algorithmic Based Fault Tolerance Applied to High Performance Computing,”
Journal of Parallel and Distributed Computing, vol. 69, pp. 410-416, 2009.
“Algorithmic Based Fault Tolerance Applied to High Performance Computing,”
University of Tennessee Computer Science Technical Report, UT-CS-08-620 (also LAPACK Working Note 205), January 2008.
“Algorithm-Based Fault Tolerance for Fail-Stop Failures,”
IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 12, January 2008.
“Algorithm-based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures, and Accuracy,”
ACM Transactions on Parallel Computing, vol. 1, issue 2, no. 10, pp. 10:1-10:28, January 2015.
“Algorithm-based Fault Tolerance for Dense Matrix Factorizations,”
University of Tennessee Computer Science Technical Report, no. UT-CS-11-676, Knoxville, TN, August 2011.
“Algorithm-Based Fault Tolerance for Dense Matrix Factorization,”
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2012, New Orleans, LA, USA, ACM, pp. 225-234, February 2012.
“Algorithm-Based Checkpoint-Free Fault Tolerance for Parallel Matrix Computations on Volatile Resources,”
University of Tennessee Computer Science Department Technical Report, UT-CS-05-561, November 2005.
“Algorithm-Based Checkpoint-Free Fault Tolerance for Parallel Matrix Computations on Volatile Resources,”
IPDPS 2006, 20th IEEE International Parallel and Distributed Processing Symposium, Rhodes Island, Greece, January 2006.
“Algebraic Schwarz Preconditioning for the Schur Complement: Application to the Time-Harmonic Maxwell Equations Discretized by a Discontinuous Galerkin Method,”
The Twentieth International Conference on Domain Decomposition Methods, La Jolla, California, February 2011.
“An Algebra for Cross-Experiment Performance Analysis,”
2004 International Conference on Parallel Processing (ICPP-04), Montreal, Quebec, Canada, August 2004.
“AI Benchmarking for Science: Efforts from the MLCommons Science Working Group,”
Lecture Notes in Computer Science, vol. 13387: Springer International Publishing, pp. 47-64, January 2023.
“Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers,”
2022 International Conference for High Performance Computing, Networking, Storage and Analysis (SC22), Dallas, TX, IEEE Computer Society, pp. 354-367, November 2022.
“Adaptive Scheduling for Task Farming with Grid Middleware,”
International Journal of Supercomputer Applications and High-Performance Computing, vol. 13, no. 3, pp. 231-240, October 2002.
“Adaptive Precision Solvers for Sparse Linear Systems,”
3rd International Workshop on Energy Efficient Supercomputing (E2SC '15), Austin, TX, ACM, November 2015.
“Adaptive Precision in Block-Jacobi Preconditioning for Iterative Sparse Linear System Solvers,”
Concurrency and Computation: Practice and Experience, vol. 31, no. 6, pp. e4460, March 2019.
“ADAPT: An Event-Based Adaptive Collective Communication Framework,”
The 27th International Symposium on High-Performance Parallel and Distributed Computing (HPDC '18), Tempe, Arizona, ACM Press, June 2018.
“Active Netlib: An Active Mathematical Software Collection for Inquiry-based Computational Science and Engineering Education,”
Journal of Digital Information, special issue on Interactivity in Digital Libraries, vol. 2, no. 4, 2002.
“Active Logistical State Management in the GridSolve/L,”
4th International Symposium on Cluster Computing and the Grid (CCGrid 2004) (submitted), Chicago, Illinois, January 2004.
“Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization,”
University of Tennessee Computer Science Technical Report (also as a LAWN), no. ICL-UT-11-08, September 2011.
“Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting,”
Concurrency and Computation: Practice and Experience, vol. 26, issue 7, pp. 1408-1431, May 2014.
“Accurate Cache and TLB Characterization Using Hardware Counters,”
International Conference on Computational Science (ICCS 2004), Krakow, Poland, Springer, June 2004.
“Access-averse Framework for Computing Low-rank Matrix Approximations,”
First International Workshop on High Performance Big Graph Data Management, Analysis, and Mining, Washington, DC, October 2014.
“Acceleration of the BLAST Hydro Code on GPU,”
Supercomputing '12 (poster), Salt Lake City, Utah, SC12, November 2012.
“Acceleration of GPU-based Krylov solvers via Data Transfer Reduction,”
International Journal of High Performance Computing Applications, 2015.