Core wise performance counters

Posted: Sun Apr 19, 2015 5:55 pm
by jppatel3
Hi all,

I am using PAPI for a course project to profile applications on multicore processors (i5 and i7).
I have some basic questions about the library.
1) I read that PAPI is thread-specific rather than core-specific and that it takes care of context switching. Does that mean I will get the counts for the specified events over the lifetime of a thread, regardless of whether that thread migrates between cores?
2) When I run papi_avail, it reports the number of hardware counters as 11. Is 11 the total number of hardware counters available to each thread, or to all threads together? In one presentation I read that PAPI supports 8 programmable counters per core, or 4 if hyper-threading is enabled. I want to read 2-3 events (more if supported) for four different threads mapped to different cores (with hyper-threading enabled) running simultaneously.

Thanks in advance!

Re: Core wise performance counters

Posted: Thu Apr 23, 2015 8:10 pm
by s_ragate
Hi,

1) Yes, by default you will get counts specific to the thread, regardless of whether it migrates between cores. However, PAPI also has some advanced features that let you get per-core counts for all the threads/processes running on that core combined.
2) I believe papi_avail gives the number of possible preset events for the particular architecture. For more details on events in PAPI, please see the following link:
http://icl.cs.utk.edu/projects/papi/wiki/Events

Yes, you can count many events simultaneously, and if you are counting events on 4 threads mapped to 4 different cores, that is even better, because you will have enough counters for most events. However, the maximum number of simultaneous events is limited by the processor's hardware support; refer to your processor manual for those details. If the hardware cannot count all of your events at once, you can use the "multiplex" support provided by PAPI. Also note that some preset events are themselves built from a combination of native events, so starting one such preset event implies starting 2 or 3 native events simultaneously.
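To make the multiplexing point a bit more concrete, here is a small sketch in plain C of the arithmetic behind time-sliced counting in general. Both helpers are hypothetical and are not part of PAPI; the names multiplex_groups and scale_multiplexed_count, and the nanosecond "enabled/running" timings, are assumptions borrowed from the usual way multiplexed counters are extrapolated. In PAPI itself you would normally just enable multiplexing on an event set (PAPI_multiplex_init and PAPI_set_multiplex) and let the library do the scaling for you.

```c
#include <stdint.h>

/* Hypothetical helper (not a PAPI call): number of time-sliced groups needed
 * to cover num_events when only num_counters events can be counted at once. */
unsigned multiplex_groups(unsigned num_events, unsigned num_counters)
{
    if (num_counters == 0)
        return 0;   /* no counters, so nothing can be scheduled */
    /* Ceiling division: a partially filled last group still needs a time slice. */
    return (num_events + num_counters - 1) / num_counters;
}

/* Hypothetical helper (not a PAPI call): extrapolate a raw count that was only
 * collected while the event sat on a counter (running_ns) to the whole
 * measured interval (enabled_ns). The result is an estimate, rounded to the
 * nearest integer. */
uint64_t scale_multiplexed_count(uint64_t raw,
                                 uint64_t enabled_ns,
                                 uint64_t running_ns)
{
    if (running_ns == 0)
        return 0;   /* the event never ran, so there is nothing to scale */
    return (uint64_t)((double)raw * (double)enabled_ns / (double)running_ns + 0.5);
}
```

For example, 10 events on 4 counters need multiplex_groups(10, 4) = 3 groups, and a raw count of 1000 collected while the event was running for 100 ns out of 300 ns enabled is estimated as scale_multiplexed_count(1000, 300, 100) = 3000. Keep in mind that multiplexed values are statistical estimates, so they tend to be less reliable for short runs.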

Hope that answers your query.

Re: Core wise performance counters

Posted: Mon Jun 08, 2015 7:52 pm
by pulsamurah
Can I just cross-compile PAPI and offload papi.h to the MIC?

Re: Core wise performance counters

Posted: Fri Jun 26, 2015 1:12 pm
by s_ragate
pulsamurah,

When you build PAPI with the MIC component on the host system, that is basically what happens: you are cross-compiling PAPI for the MIC. Later, when you use PAPI to instrument your code and build it together with papi.h and the libpapi.a static library, the generated executable can be offloaded to the MIC for performance measurement.