COntrol flow event Notification Engine

Version 1.0, December 2003

© Innovative Computing Laboratory, University of Tennessee, Knoxville


INTRODUCTION

CONE is a call-graph profiler for MPI applications on AIX Power4 platforms. It maps hardware-counter data onto the full call graph, including line numbers. CONE is based on a run-time call-graph tracking technique developed by IBM, which is described in the paper by De Rose and Wolf listed under REFERENCES. Unlike statistical profilers, CONE profiles deterministically: measurements are made whenever the current call path changes.

CONE uses DPCL to traverse and instrument the executable in binary form and causes the target application to make calls to a probe module responsible for performance monitoring. The performance data collected include wall-clock time, user time, and system time as well as different hardware counters accessible via the PAPI library. The data are written to a file and can be displayed using CUBE, a generic presentation component capable of displaying a wide variety of hierarchical performance data.

CONE was developed as a part of the PAPI project.


DOWNLOAD

This software is free but copyright © 2003 by Innovative Computing Laboratory, University of Tennessee, Knoxville. By downloading and using this software you automatically agree to comply with the regulations described in the license agreement.

Source in gzipped tar format

Version   Date          Description
1.0       25-DEC-2003   Initial version


INSTALLATION

CONE installation consists of the following steps:
  1. Install CUBE. Be sure that the location of cube-config is in your search path.
    
    
    
  2. Specify the CONE build path by modifying the $(HOME) and $(PREFIX) variables in Makefile.defs in the cone directory. The default prefix is set to $(HOME)/cone-1.0.
    
        In Makefile.defs
    
        PREFIX=CONE_BUILD_PATH
        
  3. Specify the DPCL include path and library path in the $(DPCLINC) and $(DPCLLIB) variables, respectively. The default DPCL path is set to /usr/lpp/ppe.dpcl.
    
        In Makefile.defs
    
        DPCLINC = -I/(DPCL path)/include
        DPCLLIB = -L/(DPCL path)/lib -ldpcl
        
  4. Specify the PAPI include path and library paths in the $(PAPIINC), $(PAPILIB), and $(PAPISLIB) variables, respectively. CONE requires the PAPI-3.0-beta version of PAPI, so make sure that this version is installed.
    
        In Makefile.defs
    
        PAPIINC  = -I$(HOME)/(PAPI path)/include 
        PAPILIB  = -L$(HOME)/(PAPI path)/libpapi.a -lpapi
        PAPISLIB =  $(HOME)/(PAPI path)/libpapi.so
        
  5. Specify the PMAPI library path in the $(PMAPILIB) variable. The default PMAPI path is set to /usr/pmapi/.
    
        In Makefile.defs
    
        PMAPILIB = -L/(PMAPI path)/lib -lpmapi
        
  6. Build CONE by running gmake in the src/.. directory.
    
        % gmake
        
  7. Install CONE.
    
        % gmake install
        
  8. Add $(PREFIX)/bin to your PATH environment variable, where $(PREFIX) is the CONE prefix chosen in step 2.
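
Steps 1 and 8 and the default paths named above can be checked from the shell before and after building. The following is only a minimal sketch: the helper function add_cone_to_path is hypothetical and not part of CONE, and the prefix defaults to $HOME/cone-1.0 as in Makefile.defs.

```shell
# Assumption: PREFIX matches the value set in Makefile.defs (default shown here).
PREFIX="${PREFIX:-$HOME/cone-1.0}"

# Hypothetical helper: prepend <prefix>/bin to PATH unless it is already there.
add_cone_to_path() {
    case ":$PATH:" in
        *":$1/bin:"*) ;;                  # already present, nothing to do
        *) PATH="$1/bin:$PATH" ;;
    esac
    export PATH
}
add_cone_to_path "$PREFIX"

# Step 1: cube-config must be reachable through the search path.
if command -v cube-config >/dev/null 2>&1; then
    echo "cube-config found"
else
    echo "cube-config not found in PATH" >&2
fi

# Steps 3 and 5: warn if the default DPCL or PMAPI locations are missing.
for dir in /usr/lpp/ppe.dpcl /usr/pmapi; do
    [ -d "$dir" ] || echo "warning: $dir not found; adjust Makefile.defs" >&2
done
```

The script only reports problems; it changes nothing apart from PATH in the current shell.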


USAGE

A typical CONE session has the following syntax:

     cone  [-poe | -nopoe]  [-main  mainfun] [-evs n]
           [-host hostname] [executable] [poe options]
The options and parameters have the following meaning:

[-poe | -nopoe] [OPTIONAL]
    Specifies whether the target application is a parallel (-poe) or a serial (-nopoe) application. The default is -poe.

[-main mainfun] [OPTIONAL]
    Specifies the name of the main function of the target application if it is not called "main" (as in many Fortran programs). If this option is omitted, CONE assumes that the main function is called "main".

[-evs n] [OPTIONAL]
    Selects event set n for measuring hardware performance counters. The default is event set 0. CONE currently supports the four event sets described below.

Description of Event Sets

EVS Option   Events Measured using PAPI
-evs 0       Cycles, L1 Data Cache Misses, L1 Data Cache Accesses, TLB Misses, Total Instructions
-evs 1       Cycles, Total Instructions, L1 Data Cache Reads, L1 Data Cache Writes
-evs 2       Cycles, Total Instructions, Total Integer Instructions
-evs 3       Cycles, Total Instructions, Total FP Instructions, FP SQRT Instructions, FP DIV Instructions

[-host hostname] [OPTIONAL]
    Sets the name of the host machine running the target application. The default is "localhost".

[executable]
    Executable of the target application.

[poe options]
    The usual parameters specified with poe, for example -procs n.

CONE writes its output to a file named $(executable).cube, which can be viewed using CUBE.
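
To illustrate how the options combine, the sketch below assembles two command lines from the syntax above and prints them instead of running them. The function cone_cmd and the executable name ./serial_app are hypothetical and serve only as illustration. The range check on the event set reflects the four sets listed in the table.

```shell
# Hypothetical dry-run helper: prints a cone command line, does not execute it.
# Usage: cone_cmd <poe|nopoe> <event set> <executable> [poe options...]
cone_cmd() {
    mode=$1; evs=$2; exe=$3
    shift 3
    case "$mode" in
        poe|nopoe) ;;
        *) echo "mode must be poe or nopoe" >&2; return 1 ;;
    esac
    case "$evs" in
        0|1|2|3) ;;                       # the four supported event sets
        *) echo "event set must be 0, 1, 2, or 3" >&2; return 1 ;;
    esac
    set -- cone "-$mode" -evs "$evs" "$exe" "$@"
    echo "$*"
}

# Parallel run on 4 processes, measuring floating-point events (event set 3).
cone_cmd poe 3 ./mpi_hello -procs 4
# Serial run with the default event set.
cone_cmd nopoe 0 ./serial_app
```

Following the naming rule above, the first command would write its results to mpi_hello.cube.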


LOADLEVELER

Many AIX platforms require jobs exceeding a certain number of processes to be launched using LoadLeveler (LL). To perform measurements for an application started in this way, you can invoke CONE from your LL script.

An example of using CONE with LoadLeveler can be found here. This is a LL script for submitting jobs of the target application mpi_hello. All job parameters, such as the desired number of processes, can be modified as usual.
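
For orientation, an LL script of this kind might look roughly as follows. This is only a sketch under stated assumptions and not the script referred to above: the node and task counts and the output file names are placeholders, and the check for cone in the search path is added here for illustration.

```shell
#!/bin/sh
# LoadLeveler job directives (values are assumptions; adjust to your site).
# @ job_type       = parallel
# @ node           = 2
# @ tasks_per_node = 4
# @ output         = mpi_hello.$(jobid).out
# @ error          = mpi_hello.$(jobid).err
# @ queue

# Target application to be profiled.
TARGET=./mpi_hello

# Run the target under CONE; -poe is the default and is spelled out for clarity.
if command -v cone >/dev/null 2>&1; then
    cone -poe "$TARGET"
else
    echo "cone not found in PATH" >&2
fi
```

A script like this would be submitted with llsubmit in the usual way.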


LIMITATIONS

  • One of the main limitations of CONE stems from the underlying instrumentation library DPCL. Because DPCL identifies the function called at a call site only by name, CONE cannot cope with applications that define the same function name more than once.
    
    
  • CONE cannot statically identify the targets of indirect calls made via function pointers at run time. This limits the usability of CONE for C++ applications.
    
    
  • CONE is unable to support multi-threaded applications.
    
    
  • CONE does not support recursive applications.
    
     
    
    
    


REFERENCES

  • Luiz De Rose, Felix Wolf: CATCH - A Call-Graph Based Automatic Tool for Capture of Hardware Performance Metrics for MPI and OpenMP Applications. Proc. of the 8th International Euro-Par Conference, Paderborn, Germany, LNCS 2400, pp. 167-176, Springer, 2002.