Can PAPI counters measure a child process?
I am trying to use PAPI to measure cache hit rates for child processes. My C program initializes the PAPI library and starts my counters, then forks a child process that executes a benchmark. The parent waits for the child to finish, then stops the counters and reports their values. I run the program under 'taskset...' to pin the parent and child to a single processor core. The counters appear to count only the parent process's misses, not all misses on the core. Am I approaching this problem the wrong way? Should this setup give me measurements for the whole core, or only for the process that makes the PAPI calls? Any help is greatly appreciated. Thanks.
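For reference, the flow described above is roughly the following sketch. The helper name `run_benchmark` and the choice of `PAPI_L1_DCM` as the event are illustrative assumptions, not the actual program. The PAPI calls appear only as comments, at the points where the parent would make them, so that the skeleton compiles without the library installed.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec the benchmark in the child, and wait
 * for it in the parent. Returns the child's exit status, or -1 if the
 * fork or wait fails or the child did not exit normally. */
int run_benchmark(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    /* Parent, before the fork: counter setup would go here, e.g.
     *   PAPI_library_init(PAPI_VER_CURRENT);
     *   PAPI_create_eventset(&event_set);
     *   PAPI_add_event(event_set, PAPI_L1_DCM);   (assumed event)
     *   PAPI_start(event_set);
     */

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: replace this process image with the benchmark. */
        execvp(argv[0], argv);
        perror("execvp");   /* reached only if exec failed */
        _exit(127);
    }

    /* Parent: block until the benchmark finishes. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }

    /* Parent, after the wait: counters would be read here, e.g.
     *   PAPI_stop(event_set, values);
     * and the values reported. */

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Every PAPI call in this layout is made by the parent, and the child only runs the benchmark, which is the situation the question is about.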