Publications
Limitations of the Playstation 3 for High Performance Cluster Computing,”
University of Tennessee Computer Science Technical Report, UT-CS-07-597 (Also LAPACK Working Note 185), 00 2007.
(171.01 KB)
“Memory Leak Detection in Fortran Applications using TAU,”
Proc. DoD HPCMP Users Group Conference (HPCMP-UGC'07), Pittsburgh, PA, IEEE Computer Society, January 2007.
“Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems,”
International Journal of High Performance Computer Applications (to appear), August 2007.
(157.4 KB)
“MPI Collective Algorithm Selection and Quadtree Encoding,”
Parallel Computing (Special Edition: EuroPVM/MPI 2006): Elsevier, 00 2007.
(308.39 KB)
“Multithreading for synchronization tolerance in matrix factorization,”
Journal of Physics: Conference Series, SciDAC 2007, vol. 78, no. 2007, January 2007.
(577.73 KB)
“Netlib and NA-Net: building a scientific computing community,”
In IEEE Annals of the History of Computing (to appear), August 2007.
(352.71 KB)
“A New Approach to MPI Collective Communication Implementations,”
Distributed and Parallel Systems: Springer US, pp. 45-54, 2007.
DOI: 10.1007/978-0-387-69858-8_5 (140.2 KB)
“Numerical Metadata API Reference,”
Innovative Computing Laboratory Technical Report, February 2007.
(454.79 KB)
“Optimal Routing in Binomial Graph Networks,”
The International Conference on Parallel and Distributed Computing, applications and Technologies (PDCAT), Adelaide, Australia, IEEE Computer Society, December 2007.
“Parallel Tiled QR Factorization for Multicore Architectures,”
University of Tennessee Computer Science Dept. Technical Report, UT-CS-07-598 (also LAPACK Working Note 190), 00 2007.
(277.92 KB)
“Performance Analysis of MPI Collective Operations,”
Cluster computing, vol. 10, no. 2: Springer Netherlands, pp. 127-143, June 2007.
(1018.28 KB)
“Performance of Various Computers Using Standard Linear Equations Software (Linpack Benchmark Report),”
University of Tennessee Computer Science Dept. Technical Report CS-89-85, 00 2007.
(6.42 MB)
“Recovery Patterns for Iterative Methods in a Parallel Unstable Environment,”
SIAM SISC (to appear), May 2007.
(241.36 KB)
“Reliability Analysis of Self-Healing Network using Discrete-Event Simulation,”
Proceedings of Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07): IEEE Computer Society, pp. 437-444, May 2007.
“Remembering Ken Kennedy,”
SciDAC Review, vol. 5, no. 2007, 00 2007.
(519.68 KB)
“Results of the PERI survey of SciDAC applications,”
Journal of Physics: Conference Series, SciDAC 2007, vol. 78, no. 2007, January 2007.
(692.83 KB)
“Retrospect: Deterministic Relay of MPI Applications for Interactive Distributed Debugging,”
Accepted for Euro PVM/MPI 2007: Springer, September 2007.
“Revisiting Matrix Product on Master-Worker Platforms,”
International Journal of Foundations of Computer Science (IJFCS) (accepted), 00 2007.
(248.66 KB)
“Scalability Analysis of the SPEC OpenMP Benchmarks on Large-Scale Shared Memory Multiprocessors,”
Proceedings of the 2007 International Conference on Computational Science (ICCS 2007), vol. 4487-4490, Beijing, China, Springer LNCS, pp. 815-822, 2007.
DOI: 10.1007/978-3-540-72586-2_115 (145.84 KB)
“SCOP3: A Rough Guide to Scientific Computing On the PlayStation 3,”
University of Tennessee Computer Science Dept. Technical Report, UT-CS-07-595, 00 2007.
(1.74 MB)
“Self Adapting Application Level Fault Tolerance for Parallel and Distributed Computing,”
Proceedings of Workshop on Self Adapting Application Level Fault Tolerance for Parallel and Distributed Computing at IPDPS, pp. 1-8, March 2007.
(162.47 KB)
“Self-Healing in Binomial Graph Networks,”
2nd International Workshop On Reliability in Decentralized Distributed Systems (RDDS 2007), Vilamoura, Algarve, Portugal, November 2007.
(322.39 KB)
“Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization,”
UT Computer Science Technical Report (Also LAPACK Working Note 184), no. UT-CS-07-596, January 2007.
(751.57 KB)
“Specification and detection of performance problems with ASL,”
Concurrency and Computation: Practice and Experience, vol. 19, no. 11: John Wiley and Sons Ltd., pp. 1451-1464, January 2007.
“The Use of Bulk States to Accelerate the Band Edge State Calculation of a Semiconductor Quantum Dot,”
Journal of Computational Physics, vol. 223, pp. 774-782, 00 2007.
(452.6 KB)
“On Using Incremental Profiling for the Performance Analysis of Shared Memory Parallel Applications,”
Proceedings of the 13th International Euro-Par Conference on Parallel Processing (Euro-Par '07), Rennes, France, Springer LNCS, January 2007.
““,”
8th International Conference on Computational Science (ICCS), Proceedings Parts I, II, and III, Lecture Notes in Computer Science, vol. 5101, Krakow, Poland, Springer Berlin, January 2008.
“,”
15th European PVM/MPI Users' Group Meeting, Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, vol. 5205, Dublin Ireland, Springer Berlin, January 2008.
“,”
7th International parallel Processing and Applied Mathematics Conference, Lecture Notes in Comptuer Science, vol. 4967, Gdansk, Poland, Springer Berlin, January 2008.
Algorithm-Based Fault Tolerance for Fail-Stop Failures,”
IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 12, January 2008.
(340.49 KB)
“Algorithmic Based Fault Tolerance Applied to High Performance Computing,”
University of Tennessee Computer Science Technical Report, UT-CS-08-620 (also LAPACK Working Note 205), January 2008.
(313.55 KB)
“Analytical Modeling for Affinity-Based Thread Scheduling on Multicore Platforms,”
University of Tennessee Computer Science Technical Report, UT-CS-08-626, January 2008.
(650.75 KB)
“A Comparison of Search Heuristics for Empirical Code Optimization,”
The 3rd international Workshop on Automatic Performance Tuning, Tsukuba, Japan, October 2008.
(772.48 KB)
“Computing the Conditioning of the Components of a Linear Least Squares Solution,”
VECPAR '08, High Performance Computing for Computational Science, Toulouse, France, January 2008.
(374.97 KB)
“Custom assignment of MPI ranks for parallel multi-dimensional FFTs: Evaluation of BG/P versus BG/L,”
Proceedings of the 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA-08), Sydney, Australia, IEEE Computer Society, pp. 271-283, January 2008.
(2.6 MB)
“DARPA's HPCS Program: History, Models, Tools, Languages,”
in Advances in Computers, vol. 72: Elsevier, January 2008.
(3.61 MB)
“Detection and Analysis of Iterative Behavior in Parallel Applications,”
Proceedings of the 2008 International Conference on Computational Science (ICCS 2008), vol. 5103, Krakow, Poland, pp. 261-267, January 2008.
(141.02 KB)
“Enhancing the Performance of Dense Linear Algebra Solvers on GPUs (in the MAGMA Project)
, Austin, TX, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC08), November 2008.
(5.28 MB)
Exploiting Mixed Precision Floating Point Hardware in Scientific Computations,”
in High Performance Computing and Grids in Action, Amsterdam, IOS Press, January 2008.
(92.95 KB)
“Exploring New Architectures in Accelerating CFD for Air Force Applications,”
Proceedings of the DoD HPCMP User Group Conference, Seattle, Washington, January 2008.
(492.86 KB)
“Fast and Small Short Vector SIMD Matrix Multiplication Kernels for the CELL Processor,”
University of Tennessee Computer Science Technical Report, no. UT-CS-08-609, (also LAPACK Working Note 189), January 2008.
(500.99 KB)
“Fault Tolerance Management for a Hierarchical GridRPC Middleware,”
8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), Lyon, France, January 2008.
(319.79 KB)
“High Performance GridRPC Middleware,”
Recent developments in Grid Technology and Applications: Nova Science Publishers, 00 2008.
(923.06 KB)
“How Elegant Code Evolves With Hardware: The Case Of Gaussian Elimination,”
in Beautiful Code Leading Programmers Explain How They Think (Chapter 14), pp. 243-282, January 2008.
(257 KB)
“HPCS Library Study Effort,”
University of Tennessee Computer Science Technical Report, UT-CS-08-617, January 2008.
(73.22 KB)
“The Impact of Paravirtualized Memory Hierarchy on Linear Algebra Computational Kernels and Software,”
ACM/IEEE International Symposium on High Performance Distributed Computing, Boston, MA., June 2008.
(403.89 KB)
“Interactive Grid-Access Using Gridsolve and Giggle,”
Computing and Informatics, vol. 27, no. 2, pp. 233-248,ISSN1335-9150, 00 2008.
(533.4 KB)
“Interior State Computation of Nano Structures,”
PARA 2008, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, Trondheim, Norway, May 2008.
(137.12 KB)
“The LINPACK Benchmark: Past, Present, and Future,”
Concurrency: Practice and Experience, vol. 15, pp. 803-820, 00 2008.
(94.86 KB)
“Matrix Product on Heterogeneous Master Worker Platforms,”
2008 PPoPP Conference, Salt Lake City, Utah, January 2008.
“