Publications
Accurate Cache and TLB Characterization Using Hardware Counters,”
International Conference on Computational Science (ICCS 2004), Krakow, Poland, Springer, June 2004.
(167.1 KB)
“Automating the Large-Scale Collection and Analysis of Performance,”
5th LCI International Conference on Linux Clusters: The HPC Revolution, Austin, Texas, May 2004.
(511.6 KB)
“A Comparison of Counting and Sampling Modes of Using Performance Monitoring Hardware,”
International Conference on Computational Science (ICCS 2002), Amsterdam, Netherlands, Springer, April 2002.
(122 KB)
“End-user Tools for Application Performance Analysis, Using Hardware Counters,”
International Conference on Parallel and Distributed Computing Systems, Dallas, TX, August 2001.
(306.54 KB)
“Experiences and Lessons Learned with a Portable Interface to Hardware Performance Counters,”
PADTAD Workshop, IPDPS 2003, Nice, France, IEEE, April 2003.
(432.57 KB)
“Non-Determinism and Overcount on Modern Hardware Performance Counter Implementations,”
2013 IEEE International Symposium on Performance Analysis of Systems and Software, Austin, TX, IEEE, April 2013.
(307.24 KB)
“The PAPI Cross-Platform Interface to Hardware Performance Counters,”
Department of Defense Users' Group Conference Proceedings, Biloxi, Mississippi, June 2001.
(328.56 KB)
“Performance Instrumentation and Measurement for Terascale Systems,”
ICCS 2003 Terascale Workshop, Melbourne, Australia, Springer, Berlin, Heidelberg, June 2003.
(5.36 MB)
“Performance Profiling and Analysis of DoD Applications using PAPI and TAU,”
Proceedings of DoD HPCMP UGC 2005, Nashville, TN, IEEE, June 2005.
(322.56 KB)
“Using PAPI for Hardware Performance Monitoring on Linux Systems,”
Conference on Linux Clusters: The HPC Revolution, Urbana, Illinois, Linux Clusters Institute, June 2001.
(422.35 KB)
“An Algebra for Cross-Experiment Performance Analysis,”
2004 International Conference on Parallel Processing (ICCP-04), Montreal, Quebec, Canada, August 2004.
(166.12 KB)
“Automatic Experimental Analysis of Communication Patterns in Virtual Topologies,”
In Proceedings of the International Conference on Parallel Processing, Oslo, Norway, IEEE Computer Society, June 2005.
(227.13 KB)
“Continuous Runtime Profiling of OpenMP Applications,”
Proceedings of the 2007 Conference on Parallel Computing (PARCO 2007), Juelich and Aachen, Germany, January 2007.
(408.01 KB)
“The Design of an Auto-tuning I/O Framework on Cray XT5 System,”
Cray Users Group Conference (CUG'11) (Best Paper Finalist), Fairbanks, Alaska, May 2011.
(459.57 KB)
“Detection and Analysis of Iterative Behavior in Parallel Applications,”
Proceedings of the 2008 International Conference on Computational Science (ICCS 2008), vol. 5103, Krakow, Poland, pp. 261-267, January 2008.
(141.02 KB)
“Efficient Pattern Search in Large Traces through Successive Refinement,”
Proceedings of Euro-Par 2004, Pisa, Italy, Springer-Verlag, August 2004.
(177.46 KB)
“Evaluation of the HPC Challenge Benchmarks in Virtualized Environments,”
6th Workshop on Virtualization in High-Performance Cloud Computing, Bordeaux, France, August 2011.
(114.73 KB)
“Experiments with Strassen's Algorithm: From Sequential to Parallel,”
18th IASTED International Conference on Parallel and Distributed Computing and Systems PDCS 2006 (submitted), Dallas, Texas, January 2006.
(514.33 KB)
“Exploring New Architectures in Accelerating CFD for Air Force Applications,”
Proceedings of the DoD HPCMP User Group Conference, Seattle, Washington, January 2008.
(492.86 KB)
“Feedback-Directed Thread Scheduling with Memory Considerations,”
IEEE International Symposium on High Performance Distributed Computing, Monterey Bay, CA, June 2007.
(297.24 KB)
“Improving Time to Solution with Automated Performance Analysis,”
Second Workshop on Productivity and Performance in High-End Computing (P-PHEC) at 11th International Symposium on High Performance Computer Architecture (HPCA-2005), San Francisco, February 2005.
(112.63 KB)
“L2 Cache Modeling for Scientific Applications on Chip Multi-Processors,”
Proceedings of the 2007 International Conference on Parallel Processing, Xi'an, China, IEEE Computer Society, January 2007.
(654.11 KB)
“Large Event Traces in Parallel Performance Analysis,”
8th Workshop 'Parallel Systems and Algorithms' (PASA), Lecture Notes in Informatics, no. ICL-UT-06-08, Frankfurt/Main, Germany, Gesellschaft für Informatik, March 2006.
(92.47 KB)
“Making Performance Analysis and Tuning Part of the Software Development Cycle,”
Proceedings of DoD HPCMP UGC 2009, San Diego, CA, IEEE, June 2009.
“Measuring Energy and Power with PAPI,”
International Workshop on Power-Aware Systems and Architectures, Pittsburgh, PA, September 2012.
(146.79 KB)
“Memory Leak Detection in Fortran Applications using TAU,”
Proc. DoD HPCMP Users Group Conference (HPCMP-UGC'07), Pittsburgh, PA, IEEE Computer Society, January 2007.
“Metacomputing Support for the SARA3D Structural Acoustics Application,”
Department of Defense Users' Group Conference (to appear), Biloxi, Mississippi, June 2001.
(64.58 KB)
“Modeling the Office of Science Ten Year Facilities Plan: The PERI Architecture Tiger Team,”
SciDAC 2009, Journal of Physics: Conference Series, vol. 180(2009)012039, San Diego, California, IOP Publishing, July 2009.
(906.39 KB)
“OpenMP-centric Performance Analysis of Hybrid Applications,”
Proc. 2008 IEEE International Conference on Cluster Computing (CLUSTER 2008), Tsukuba, Japan, January 2008.
(218.63 KB)
“PAPI: A Portable Interface to Hardware Performance Counters,”
Proceedings of Department of Defense HPCMP Users Group Conference, June 1999.
(57.77 KB)
“Parallel I/O for EQM Applications,”
Department of Defense Users' Group Conference Proceedings (to appear),, Biloxi, Mississippi, June 2001.
(81.41 KB)
“A Pattern-Based Approach to Automated Application Performance Analysis,”
Workshop on Patterns in High Performance Computing, University of Illinois at Urbana-Champaign, May 2005.
(3.47 MB)
“Performance Analysis of GYRO: A Tool Evaluation,”
In Proceedings of the 2005 SciDAC Conference, San Francisco, CA, June 2005.
(172.07 KB)
“Performance Evaluation for Petascale Quantum Simulation Tools,”
Proceedings of the Cray Users' Group Meeting, Atlanta, GA, May 2010.
“Performance evaluation for petascale quantum simulation tools,”
Proceedings of CUG09, Atlanta, GA, May 2009.
(1.09 MB)
“Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications,”
Second International Workshop on OpenMP, Reims, France, January 2006.
(350.9 KB)
“Power-Aware Prediction Models of Hybrid (MPI/OpenMP) Scientific Applications,”
International Conference on Energy-Aware High Performance Computing (EnA-HPC 2011), Hamburg, Germany, September 2011.
(479.49 KB)
“Results of the PERI survey of SciDAC applications,”
Journal of Physics: Conference Series, SciDAC 2007, vol. 78, no. 2007, January 2007.
(692.83 KB)
“A Scalable Approach to MPI Application Performance Analysis,”
In Proc. of the 12th European Parallel Virtual Machine and Message Passing Interface Conference: Springer LNCS, September 2005.
(988.58 KB)
“A Scalable Cross-Platform Infrastructure for Application Performance Tuning Using Hardware Counters,”
Proceedings of SuperComputing 2000 (SC'00), Dallas, TX, November 2000.
(178.15 KB)
“A Scalable Non-blocking Multicast Scheme for Distributed DAG Scheduling,”
The International Conference on Computational Science 2009 (ICCS 2009), vol. 5544, Baton Rouge, LA, pp. 195-204, May 2009.
(228.45 KB)
“Secure Remote Access to Numerical Software and Computational Hardware,”
Proceedings of the DoD HPC Users Group Conference (HPCUG) 2000, Albuquerque, NM, June 2000.
(172.6 KB)
“Usage of the Scalasca Toolset for Scalable Performance Analysis of Large-scale Parallel Applications,”
Proceedings of the 2nd International Workshop on Tools for High Performance Computing, Stuttgart, Germany, Springer, pp. 157-167, January 2008.
(229.2 KB)
“Visualizing the Program Execution Control Flow of OpenMP Applications,”
Proc. 4th International Workshop on OpenMP (IWOMP 2008), West Lafayette, Indiana, Lecture Notes in Computer Science 5004, pp. 181-190, January 2008.
(194.25 KB)
“Active Netlib: An Active Mathematical Software Collection for Inquiry-based Computational Science and Engineering Education,”
Journal of Digital Information special issue on Interactivity in Digital Libraries, vol. 2, no. 4, 00 2002.
(182.59 KB)
“Analytical Modeling and Optimization for Affinity Based Thread Scheduling on Multicore Systems,”
IEEE Cluster 2009, New Orleans, August 2009.
(395.53 KB)
“Automatic Analysis of Inefficiency Patterns in Parallel Applications,”
Concurrency and Computation: Practice and Experience, vol. 19, no. 11, pp. 1481-1496, August 2007.
(233.31 KB)
“Automatic analysis of inefficiency patterns in parallel applications,”
Concurrency and Computation: Practice and Experience, Special issue "Automatic Performance Analysis" (submitted), 00 2005.
(233.31 KB)
“Autotuned Parallel I/O for Highly Scalable Biosequence Analysis,”
TeraGrid'11, Salt Lake City, Utah, July 2011.
(275.34 KB)
“Capturing and Analyzing the Execution Control Flow of OpenMP Applications,”
International Journal of Parallel Programming, vol. 37, no. 3, pp. 266-276, 00 2009.
“