INTRODUCTION
CONE is a call-graph profiler for MPI applications on AIX Power4 platforms
which maps hardware-counter data onto the full call graph including line
numbers.
CONE is based on a run-time call-graph tracking technique
developed by IBM, which is described in the CATCH paper by
De Rose and Wolf (see REFERENCES).
CONE uses deterministic rather than statistical profiling:
measurements are taken whenever the current call path changes,
not at periodic sampling intervals.
CONE uses DPCL to traverse and instrument the executable in binary
form, so that the target application calls a probe module
responsible for performance monitoring. The performance data collected
include wall-clock time, user time, and system time as well as
various hardware counters accessible via the PAPI library. The data
are written to a file and can be displayed using CUBE, a generic
presentation component capable of displaying a wide variety of
hierarchical performance data.
CONE was developed as a part of the
PAPI project.
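The deterministic approach described above can be illustrated with a small sketch. This is not CONE's implementation (CONE instruments binaries through DPCL and reads counters through PAPI); the CallPathProfiler class, its enter/leave hooks, and the fake counter are hypothetical names chosen for illustration. The sketch only shows the principle: a counter is read each time the call path changes, and the difference is charged to the path that was current until then.

```python
from collections import defaultdict


class CallPathProfiler:
    """Toy deterministic profiler: one counter reading per call-path change."""

    def __init__(self, read_counter):
        self.read_counter = read_counter   # callable returning an increasing value
        self.path = ()                     # tuple of (function, call-site line) pairs
        self.exclusive = defaultdict(int)  # call path -> value accumulated on that path
        self.last = read_counter()

    def _charge(self):
        # Charge everything counted since the last change to the current path.
        now = self.read_counter()
        self.exclusive[self.path] += now - self.last
        self.last = now

    def enter(self, function, line):
        self._charge()
        self.path = self.path + ((function, line),)

    def leave(self):
        self._charge()
        self.path = self.path[:-1]


# Fake readings stand in for a hardware counter (assumption for the example).
readings = iter([0, 10, 25, 40, 45, 60, 70])
profiler = CallPathProfiler(lambda: next(readings))

profiler.enter("main", 0)
profiler.enter("solve", 12)   # solve called from line 12 of main
profiler.leave()
profiler.enter("solve", 20)   # same function, different call site
profiler.leave()
profiler.leave()

for path, value in profiler.exclusive.items():
    print(path, value)
```

Because the call-site line is part of each path element, the two calls to solve end up as two separate call paths, which mirrors the idea of mapping data onto the full call graph including line numbers.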
DOWNLOAD
This software is free, but it is copyright © 2003 by the
Innovative Computing Laboratory,
University of Tennessee, Knoxville. By downloading and using this
software you automatically agree to comply with the regulations
described in the license agreement.
Source in gzipped tar format

Version | Date        | Description
1.0     | 25-DEC-2003 | Initial version
INSTALLATION
CONE installation consists of the following steps:
- Install CUBE. Be sure that the location of cube-config is in your search path.
- Specify the CONE build path by modifying the $(HOME)
and $(PREFIX) variables in Makefile.defs in the cone directory. The default prefix is set to
$(HOME)/cone-1.0.
In Makefile.defs
PREFIX=CONE_BUILD_PATH
- Specify the DPCL include path and library path in the $(DPCLINC) and $(DPCLLIB)
variables, respectively. The default DPCL path is set to /usr/lpp/ppe.dpcl.
In Makefile.defs
DPCLINC = -I/(DPCL path)/include
DPCLLIB = -L/(DPCL path)/lib -ldpcl
- Specify the PAPI include path and library paths in the $(PAPIINC), $(PAPILIB) and
$(PAPISLIB) variables, respectively. CONE requires PAPI-3.0-beta, so make sure that this version of PAPI is installed.
In Makefile.defs
PAPIINC = -I$(HOME)/(PAPI path)/include
PAPILIB = -L$(HOME)/(PAPI path)/libpapi.a -lpapi
PAPISLIB = $(HOME)/(PAPI path)/libpapi.so
- Specify the PMAPI library path in the $(PMAPILIB) variable. The default
PMAPI path is set to /usr/pmapi/.
In Makefile.defs
PMAPILIB = -L/(PMAPI path)/lib -lpmapi
- Build the CONE source in the src/.. directory.
% gmake
- Install CONE.
% gmake install
- Modify your $(PATH) environment variable to include $(PREFIX)/bin, where $(PREFIX) is the CONE build path set in Makefile.defs.
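Before running gmake, the prerequisites named in the steps above can be checked with a short script. The following is a minimal sketch, not part of the CONE distribution: the function name missing_prerequisites and its parameters are hypothetical, and the directories checked are only the defaults quoted above (/usr/lpp/ppe.dpcl and /usr/pmapi). PAPI is not checked because no default PAPI path is given.

```python
import os
import shutil

# Defaults quoted in the installation steps; adjust if Makefile.defs differs.
DEFAULT_DPCL_PATH = "/usr/lpp/ppe.dpcl"
DEFAULT_PMAPI_PATH = "/usr/pmapi"


def missing_prerequisites(which=shutil.which, isdir=os.path.isdir,
                          dpcl_path=DEFAULT_DPCL_PATH,
                          pmapi_path=DEFAULT_PMAPI_PATH):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []

    # CUBE must be installed and cube-config must be in the search path.
    if which("cube-config") is None:
        problems.append("cube-config is not in the search path (install CUBE first)")

    # Directories referenced by DPCLINC, DPCLLIB and PMAPILIB.
    required_dirs = [
        os.path.join(dpcl_path, "include"),
        os.path.join(dpcl_path, "lib"),
        os.path.join(pmapi_path, "lib"),
    ]
    for directory in required_dirs:
        if not isdir(directory):
            problems.append("directory not found: " + directory)

    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("WARNING:", problem)
```

The lookup functions are passed in as parameters so that the check can be exercised on a machine without DPCL or PMAPI installed.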
USAGE
A typical CONE session has the following syntax:
The options and parameters have the following meaning:
[-poe | -nopoe]
    [OPTIONAL] Specifies whether the target application is a parallel
    (-poe) or a serial (-nopoe) application. The default is -poe.

[-main mainfun]
    [OPTIONAL] Names the main function of the target application if it
    is not called "main" (as in many Fortran programs). If this option
    is omitted, CONE assumes that the main function is called "main".

[-evs n]
    [OPTIONAL] Selects event set n for measuring hardware performance
    counters. The default is event set 0. CONE currently supports the
    following four event sets.

    Description of Event Sets

    EVS Option | Events Measured using PAPI
    -evs 0     | Cycles, L1 Data Cache Misses, L1 Data Cache Accesses, TLB Misses, Total Instructions
    -evs 1     | Cycles, Total Instructions, L1 Data Cache Reads, L1 Data Cache Writes
    -evs 2     | Cycles, Total Instructions, Total Integer Instructions
    -evs 3     | Cycles, Total Instructions, Total FP Instructions, FP SQRT Instructions, FP DIV Instructions

[-host hostname]
    [OPTIONAL] Sets the name of the host machine running the target
    application. The default is "localhost".

[executable]
    Executable of the target application.

[poe options]
    The usual parameters passed to poe, for example -procs n.

CONE produces an output file named $(executable).cube,
which can be viewed using CUBE.
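The four event sets selected with -evs can also be written down as data, which makes it easy to see which set to choose for a given metric. The sketch below is an illustration only. The names are standard PAPI preset events, but which presets CONE actually programs for each set is not stated here, so the mapping is an assumption (in particular PAPI_TLB_TL for "TLB Misses"). The helper functions are hypothetical.

```python
# Assumed mapping from "-evs n" to PAPI preset names (not taken from the CONE sources).
EVENT_SETS = {
    0: ["PAPI_TOT_CYC", "PAPI_L1_DCM", "PAPI_L1_DCA", "PAPI_TLB_TL", "PAPI_TOT_INS"],
    1: ["PAPI_TOT_CYC", "PAPI_TOT_INS", "PAPI_L1_DCR", "PAPI_L1_DCW"],
    2: ["PAPI_TOT_CYC", "PAPI_TOT_INS", "PAPI_INT_INS"],
    3: ["PAPI_TOT_CYC", "PAPI_TOT_INS", "PAPI_FP_INS", "PAPI_FSQ_INS", "PAPI_FDV_INS"],
}


def event_sets_measuring(event):
    """Return the event-set numbers that include the given event."""
    return sorted(n for n, events in EVENT_SETS.items() if event in events)


def instructions_per_cycle(counts):
    """Derive instructions per cycle; all four sets contain both inputs."""
    return counts["PAPI_TOT_INS"] / counts["PAPI_TOT_CYC"]


print(event_sets_measuring("PAPI_FP_INS"))
print(instructions_per_cycle({"PAPI_TOT_INS": 300, "PAPI_TOT_CYC": 200}))
```

Cycles and total instructions appear in every set, so a ratio such as instructions per cycle can be derived from any run, whereas cache and floating-point events sit in different sets and therefore need separate runs with different -evs values.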
LOAD-LEVELER
Many AIX platforms require jobs exceeding a certain number of
processes to be launched using
LoadLeveler. To perform measurements for an application started in this way, invoke CONE from your LoadLeveler (LL) script.
An example of using CONE with LoadLeveler can be found here. It is an LL script for submitting jobs of
the target application mpi_hello. All job parameters, such as the desired number of processes, can be modified as usual.
LIMITATIONS
One of the main limitations of CONE stems from the underlying instrumentation library, DPCL. Because DPCL identifies the function called at a call site only by name, CONE cannot cope with applications that define the same function name more than once.
CONE cannot statically identify indirect calls, which are made via function pointers and resolved only at runtime. This limits the usability of CONE for C++ applications.
CONE does not support multi-threaded applications.
CONE does not support recursive applications.
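The first limitation above can be made concrete with a toy model. The sketch below does not use DPCL, and the symbol table and function names are hypothetical. It only shows why a lookup keyed on the bare function name cannot decide which definition a call site refers to once a name is defined twice.

```python
from collections import defaultdict

# Hypothetical symbol table: (source file, function name, first line).
DEFINITIONS = [
    ("solver.c", "init", 10),
    ("io.c", "init", 42),      # a second, unrelated function reusing the name
    ("solver.c", "solve", 80),
]


def index_by_name(definitions):
    """Group definitions by bare function name, as a name-only lookup sees them."""
    by_name = defaultdict(list)
    for source_file, name, line in definitions:
        by_name[name].append((source_file, line))
    return by_name


def resolve_callee(by_name, name):
    """Return the unique definition for a name, or None if ambiguous or unknown."""
    candidates = by_name.get(name, [])
    return candidates[0] if len(candidates) == 1 else None


by_name = index_by_name(DEFINITIONS)
print(resolve_callee(by_name, "solve"))  # unique name: resolvable
print(resolve_callee(by_name, "init"))   # defined twice: cannot be resolved
```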
REFERENCES
- Luiz De Rose, Felix Wolf: CATCH - A Call-Graph Based Automatic
  Tool for Capture of Hardware Performance Metrics for MPI and OpenMP
  Applications. Proc. of the 8th International Euro-Par Conference,
  Paderborn, Germany, LNCS 2400, pp. 167-176, Springer, 2002.