Publications
“Paravirtualization Effect on Single- and Multi-threaded Memory-Intensive Linear Algebra Software,”
Cluster Computing Journal: Special Issue on High Performance Distributed Computing, vol. 12, no. 2: Springer Netherlands, pp. 101-122, 2009.
(451.07 KB)
“ParILUT - A New Parallel Threshold ILU,”
SIAM Journal on Scientific Computing, vol. 40, issue 4: SIAM, pp. C503–C519, July 2018.
(19.26 MB)
“ParILUT – A Parallel Threshold ILU for GPUs,”
IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, IEEE, May 2019.
(505.95 KB)
“PaRSEC: Exploiting Heterogeneity to Enhance Scalability,”
IEEE Computing in Science and Engineering, vol. 15, issue 6, pp. 36-45, November 2013.
(2.16 MB)
“PaRSEC in Practice: Optimizing a Legacy Chemistry Application through Distributed Task-Based Execution,”
2015 IEEE International Conference on Cluster Computing, Chicago, IL, IEEE, September 2015.
(1.77 MB)
“A Pattern-Based Approach to Automated Application Performance Analysis,”
Workshop on Patterns in High Performance Computing, University of Illinois at Urbana-Champaign, May 2005.
(3.47 MB)
“PerfMiner: Cluster-Wide Collection, Storage and Presentation of Application Level Hardware Performance Data,”
European Conference on Parallel Processing (Euro-Par 2005), Monte de Caparica, Portugal, Springer, September 2005.
(205.45 KB)
“Performance Analysis and Acceleration of Explicit Integration for Large Kinetic Networks using Batched GPU Computations,”
2016 IEEE High Performance Extreme Computing Conference (HPEC '16), Waltham, MA, IEEE, September 2016.
(480.29 KB)
“Performance Analysis and Debugging Tools at Scale,”
Exascale Scientific Applications: Scalability and Performance Portability: Chapman & Hall / CRC Press, pp. 17-50, November 2017.
“Performance Analysis and Design of a Hessenberg Reduction using Stabilized Blocked Elementary Transformations for New Architectures,”
The Spring Simulation Multi-Conference 2015 (SpringSim'15), Best Paper Award, Alexandria, VA, April 2015.
(608.44 KB)
“Performance Analysis and Modeling of Task-Based Runtimes,”
PhD dissertation, Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, May 2016.
(5.14 MB)
“Performance Analysis and Optimization of Two-Sided Factorization Algorithms for Heterogeneous Platform,”
International Conference on Computational Science (ICCS 2015), Reykjavík, Iceland, June 2015.
(1.12 MB)
“Performance Analysis of GYRO: A Tool Evaluation,”
In Proceedings of the 2005 SciDAC Conference, San Francisco, CA, June 2005.
(172.07 KB)
“Performance Analysis of MPI Collective Operations,”
Cluster Computing, vol. 10, no. 2: Springer Netherlands, pp. 127-143, June 2007.
(1018.28 KB)
“Performance Analysis of MPI Collective Operations,”
4th International Workshop on Performance Modeling, Evaluation, and Optimization of Parallel and Distributed Systems (PMEO-PDS '05), Denver, Colorado, April 2005.
(1018.28 KB)
“Performance Analysis of MPI Collective Operations,”
Cluster Computing Journal (to appear), January 2005.
(1018.28 KB)
“Performance Analysis of One-sided Communication Mechanisms,”
Mini-Symposium "Tools Support for Parallel Programming", Proceedings of Parallel Computing (ParCo), no. ICL-UT-06-07, Malaga, Spain, September 2005.
(121.49 KB)
“Performance Analysis of Parallel FFT on Large Multi-GPU Systems,”
2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Lyon, France, IEEE, August 2022.
“Performance Analysis of the MPAS-Ocean Code using HPCToolkit and MIAMI,”
ICL Technical Report, no. ICL-UT-14-01: University of Tennessee, February 2014.
(894.39 KB)
“Performance Analysis of Tile Low-Rank Cholesky Factorization Using PaRSEC Instrumentation Tools,”
Workshop on Programming and Performance Visualization Tools (ProTools 19) at SC19, Denver, CO, ACM, November 2019.
(429.55 KB)
“On the performance and energy efficiency of sparse linear algebra on GPUs,”
International Journal of High Performance Computing Applications, October 2016.
(1.19 MB)
“Performance and Portability with OpenCL for Throughput-Oriented HPC Workloads Across Accelerators, Coprocessors, and Multicore Processors,”
5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA '14), New Orleans, LA, IEEE, November 2014.
(407.5 KB)
“Performance and Reliability Trade-offs for the Double Checkpointing Algorithm,”
International Journal of Networking and Computing, vol. 4, no. 1, pp. 32-41.
(859.04 KB)
“Performance Application Programming Interface,”
Accelerated Computing with HIP: Sun, Baruah and Kaeli, December 2022.
“Performance Application Programming Interface for Extreme-Scale Environments (PAPI-EX) (Poster),”
2020 NSF Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Principal Investigator Meeting, Seattle, WA, 2020.
(2.53 MB)
“Performance Counter Monitoring for the Blue Gene/Q Architecture,”
University of Tennessee Computer Science Technical Report, no. ICL-UT-12-01, 2012.
(92.5 KB)
“Performance, Design, and Autotuning of Batched GEMM for GPUs,”
University of Tennessee Computer Science Technical Report, no. UT-EECS-16-739: University of Tennessee, February 2016.
(1.27 MB)
“Performance, Design, and Autotuning of Batched GEMM for GPUs,”
The International Supercomputing Conference (ISC High Performance 2016), Frankfurt, Germany, June 2016.
(1.27 MB)
“Performance, Design, and Autotuning of Batched GEMM for GPUs,”
High Performance Computing: 31st International Conference, ISC High Performance 2016, Frankfurt, Germany, June 19-23, 2016, Proceedings, no. 9697: Springer International Publishing, pp. 21–38, 2016.
(1.98 MB)
“Performance Evaluation for Petascale Quantum Simulation Tools,”
Proceedings of the Cray Users' Group Meeting, Atlanta, GA, May 2010.
“Performance evaluation for petascale quantum simulation tools,”
Proceedings of CUG09, Atlanta, GA, May 2009.
(1.09 MB)
“Performance evaluation of eigensolvers in nano-structure computations,”
IEEE/ACM Proceedings of HPCNano SC06 (to appear), January 2006.
(120.61 KB)
“Performance evaluation of LU factorization through hardware counter measurements,”
University of Tennessee Computer Science Technical Report, no. ut-cs-12-700, October 2012.
(794.82 KB)
“Performance Insights into Device-initiated RMA Using Kokkos Remote Spaces,”
2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops), Santa Fe, NM, USA, IEEE, November 2023.
“Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications,”
Second International Workshop on OpenMP, Reims, France, January 2006.
(350.9 KB)
“Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications,”
Lecture Notes in Computer Science, OpenMP Shared Memory Parallel Programming, vol. 4315: Springer Berlin / Heidelberg, 2008.
(350.9 KB)
“Performance Instrumentation and Measurement for Terascale Systems,”
ICCS 2003 Terascale Workshop, Melbourne, Australia, Springer, Berlin, Heidelberg, June 2003.
(5.36 MB)
“A Performance Model to Execute Workflows on High-Bandwidth Memory Architectures,”
The 47th International Conference on Parallel Processing (ICPP 2018), Eugene, OR, IEEE Computer Society Press, August 2018.
(868.44 KB)
“Performance Modeling for Self Adapting Collective Communications for MPI,”
LACSI Symposium 2001, Santa Fe, NM, October 2001.
(105.49 KB)
“Performance of Asynchronous Optimized Schwarz with One-sided Communication,”
Parallel Computing, vol. 86, pp. 66-81, August 2019.
(3.09 MB)
“Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs,”
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85: University of Tennessee, June 2014.
(514.64 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, UT-CS-04-526, vol. –89-95, January 2006.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Department Technical Report, CS-89-85, January 2004.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85, 2011.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software,”
University of Tennessee Computer Science Technical Report, no. cs-89-85, February 2013.
(539.24 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, no. CS-89-85, January 2001.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, CS-89-85, January 2008.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Technical Report, UT-CS-89-85, 2010.
(6.42 MB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Dept. Technical Report CS-89-85, 2007.
(6.42 MB)