Publications
“High Performance Computing Systems: Status and Outlook,”
Acta Numerica, vol. 21, Cambridge, UK, Cambridge University Press, pp. 379-474, May 2012.
(1.48 MB)
“High Performance Conjugate Gradient Benchmark: A new Metric for Ranking High Performance Computing Systems,”
International Journal of High Performance Computing Applications, vol. 30, issue 1, pp. 3-10, February 2016.
DOI: 10.1177/1094342015593158 (277.51 KB)
“High Performance Computing Trends,”
HERMIS, vol. 2, pp. 155-163, November 2001.
“Numerical Algorithms for High-Performance Computational Science,”
Philosophical Transactions of the Royal Society A, vol. 378, issue 2166, 2020.
DOI: 10.1098/rsta.2019.0066 (724.37 KB)
“A Not So Simple Matter of Software,”
NCSA Access Online: NCSA, 2005.
(457.69 KB)
“Dense Linear Algebra on Accelerated Multicore Hardware,”
High Performance Scientific Computing: Algorithms and Applications, London, UK, Springer-Verlag, 2012.
“JLAPACK - Compiling LAPACK Fortran to Java,”
Scientific Programming, vol. 7, no. 2, pp. 111-138, October 2002.
(307.46 KB)
“Task Based Cholesky Decomposition on Xeon Phi Architectures using OpenMP,”
International Journal of Computational Science and Engineering (IJCSE), vol. 17, no. 3, October 2018.
DOI: 10.1504/IJCSE.2018.095851
“Preconditioners for Batched Iterative Linear Solvers on GPUs,”
Smoky Mountains Computational Sciences and Engineering Conference, vol. 169075: Springer Nature Switzerland, pp. 38 - 53, January 2023.
DOI: 10.1007/978-3-031-23606-8_3
“Soft Error Resilient QR Factorization for Hybrid System with GPGPU,”
Journal of Computational Science, Seattle, WA, Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems at SC11, November 2011.
(965.88 KB)
“Soft Error Resilient QR Factorization for Hybrid System,”
University of Tennessee Computer Science Technical Report, no. UT-CS-11-675, Knoxville, TN, July 2011.
(1.39 MB)
“Soft Error Resilient QR Factorization for Hybrid System,”
UT-CS-11-675 (also LAPACK Working Note #252), no. ICL-CS-11-675, July 2011.
(1.39 MB)
“Tuning Principal Component Analysis for GRASS GIS on Multi-core and GPU Architectures,”
FOSS4G 2010, Barcelona, Spain, September 2010.
(1.57 MB)
“Providing GPU Capability to LU and QR within the ScaLAPACK Framework,”
University of Tennessee Computer Science Technical Report (also LAWN 272), no. UT-CS-12-699, September 2012.
(7.48 MB)
“From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming,”
Parallel Computing, vol. 38, no. 8, pp. 391-407, August 2012.
(1.64 MB)
“Soft Error Resilient QR Factorization for Hybrid System with GPGPU,”
Journal of Computational Science, vol. 4, issue 6, pp. 457–464, November 2013.
DOI: 10.1016/j.jocs.2013.01.004 (995.45 KB)
“Robustness of the Young/Daly Formula for Stochastic Iterative Applications,”
49th International Conference on Parallel Processing (ICPP 2020), Edmonton, AB, Canada, ACM Press, August 2020.
(1.11 MB)
“Optimal Checkpointing Strategies for Iterative Applications,”
IEEE Transactions on Parallel and Distributed Systems, vol. 33, issue 3, pp. 507-522, March 2022.
DOI: 10.1109/TPDS.2021.3099440 (1.47 MB)
“High Performance Dense Linear System Solver with Soft Error Resilience,”
IEEE Cluster 2011, Austin, TX, September 2011.
(1.27 MB)
“Algorithm-Based Fault Tolerance for Dense Matrix Factorization,”
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2012, New Orleans, LA, USA, ACM, pp. 225-234, February 2012.
DOI: 10.1145/2145816.2145845 (865.79 KB)
“Algorithm-based Fault Tolerance for Dense Matrix Factorizations,”
University of Tennessee Computer Science Technical Report, no. UT-CS-11-676, Knoxville, TN, August 2011.
(865.79 KB)
“High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors,”
ICCS 2012, Omaha, NE, June 2012.
(1.27 MB)
“Mixed-Tool Performance Analysis on Hybrid Multicore Architectures,”
First International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2010), San Diego, CA, September 2010.
(1.24 MB)
“OpenCL Evaluation for Numerical Linear Algebra Library Development,”
Symposium on Application Accelerators in High-Performance Computing (SAAHPC '10), Knoxville, TN, July 2010.
(2.69 MB)
“Using Software-Based Performance Counters to Expose Low-Level Open MPI Performance Information,”
EuroMPI, Chicago, IL, ACM, September 2017.
DOI: 10.1145/3127024.3127039 (745.58 KB)
“Applying Aspect-Oriented Programming Concepts to a Component-based Programming Model,”
IPDPS 2003, Workshop on NSF-Next Generation Software, Nice, France, March 2003.
(66.99 KB)
“Improvements in the Efficient Composition of Applications,”
IPDPS 2004, NGS Workshop (to appear), Santa Fe, 2004.
(42.85 KB)
“Least Squares Solvers for Distributed-Memory Machines with GPU Accelerators,”
ACM International Conference on Supercomputing (ICS '19), Phoenix, Arizona, ACM, pp. 117–126, June 2019.
DOI: 10.1145/3330345.3330356 (1.63 MB)
“The Component Structure of a Self-Adapting Numerical Software System,”
International Journal of Parallel Programming, vol. 33, no. 2, June 2005.
(64.88 KB)
“Automatic Determination of Matrix-Blocks,”
LAPACK Working Note 151, University of Tennessee Computer Science Technical Report, no. UT-CS-01-458, January 2001.
(1.15 MB)
“A Proposed Standard for Matrix Metadata,”
Innovative Computing Laboratory Technical Report, no. ICL-UT-03-02, Submitted to ACM TOMS, November 2003.
(13.39 KB)
“The 'Weighted Modification' Incomplete Factorisation Method,”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-436, December 1999.
(198.71 KB)
“Polynomial Acceleration of Optimised Multi-grid Smoothers; Basic Theory,”
ICL Technical Report, vol. 156, no. ICL-UT-02-03, January 2002.
(100.66 KB)
“On the Existence Problem of Incomplete Factorisation Methods,”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-435, December 1999.
(222.2 KB)
“Numerical Metadata API Reference,”
Innovative Computing Laboratory Technical Report, February 2007.
(454.79 KB)
“IBP - Internet Backplane Protocol: Infrastructure for Distributed Storage (V 0.2),”
University of Tennessee Computer Science Department Technical Report, no. UT-CS-99-430, February 1999.
(37.72 KB)
“An Asynchronous Algorithm on NetSolve Global Computing System,”
PRiSM - Laboratoire de recherche en informatique, Université de Versailles St-Quentin Technical Report, March 2004.
(377.33 KB)
“Fault Tolerant Communication Library and Applications for High Performance Computing,”
Los Alamos Computer Science Institute (LACSI) Symposium 2003 (presented), Santa Fe, NM, October 2003.
(146.05 KB)
“Parallel IO Support for Meta-Computing Applications: MPI_Connect IO Applied to PACX-MPI,”
8th European PVM/MPI User's Group Meeting, Lecture Notes in Computer Science, vol. 2131, Greece, Springer Verlag, Berlin, September 2001.
(129.3 KB)
“Building and using a Fault Tolerant MPI implementation,”
International Journal of High Performance Applications and Supercomputing (to appear), 2004.
“Scalable Networked Information Processing Environment (SNIPE),”
Journal on Future Generation Computer Systems, vol. 15, no. 5/6, pp. 595-605, January 1999.
(189.21 KB)
“Process Fault-Tolerance: Semantics, Design and Applications for High Performance Computing,”
International Journal for High Performance Applications and Supercomputing (to appear), April 2004.
(186.9 KB)
“HARNESS and Fault Tolerant MPI,”
Parallel Computing, vol. 27, no. 11, pp. 1479-1496, January 2001.
(164.2 KB)
“HARNESS Fault Tolerant MPI Design, Usage and Performance Issues,”
Future Generation Computer Systems, vol. 18, no. 8, pp. 1127-1142, January 2002.
(403.41 KB)
“Extending the MPI Specification for Process Fault Tolerance on High Performance Computing Systems,”
Proceedings of ISC2004 (to appear), Heidelberg, Germany, June 2004.
(548.38 KB)
“Fault Tolerant MPI for the HARNESS Meta-Computing System,”
Proceedings of International Conference of Computational Science - ICCS 2001, Lecture Notes in Computer Science, vol. 2073, Berlin, Springer Verlag, pp. 355-366, 2001.
DOI: 10.1007/3-540-45545-0_44
“Scalable Fault Tolerant MPI: Extending the Recovery Algorithm,”
Proceedings of 12th European Parallel Virtual Machine and Message Passing Interface Conference - Euro PVM/MPI, vol. 3666, Sorrento (Naples), Italy, Springer-Verlag Berlin, pp. 67, September 2005.
(144.86 KB)
“Flexible collective communication tuning architecture applied to Open MPI,”
2006 Euro PVM/MPI (submitted), Bonn, Germany, January 2006.
(206.58 KB)
“FT-MPI: Fault Tolerant MPI, Supporting Dynamic Applications in a Dynamic World,”
Lecture Notes in Computer Science: Proceedings of EuroPVM-MPI 2000, vol. 1908, Hungary, Springer Verlag, pp. 346-353, January 2000.
(51.95 KB)
“Resilience for Stencil Computations with Latent Errors,”
International Conference on Parallel Processing (ICPP), Bristol, UK, IEEE Computer Society Press, August 2017.
(1.19 MB)