by Dan Terpstra » Wed Oct 07, 2009 1:36 pm
In general, PAPI measures and reports event counts on a per-process or per-thread basis. The counter state is saved and restored at each context switch, so you only see events related to your process. However, multicore Opterons are an exception to this rule. Some events count chip-wide activity, such as cache events in the shared L3 cache on Shanghai and Istanbul. These events are counted on a set of shared shadow counters that don't get properly saved and restored at context switch, so their values can include activity from other processes. I don't recall whether the event you're measuring, CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE, is one of these maverick events or not. I suspect it is not, but I haven't confirmed that.
Caveat emptor!
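
To make the difference concrete, here is a toy model of the two behaviours described above. It is only a sketch: it is not PAPI or kernel code, the names (`Counter`, `run_schedule`, `virtualized`) are made up for illustration, and it assumes a single counter and a simple list of time slices.

```python
class Counter:
    """Hypothetical hardware counter that simply accumulates events."""

    def __init__(self):
        self.value = 0


def run_schedule(schedule, virtualized):
    """Replay (process, events) time slices on one counter.

    virtualized=True models a per-process counter whose state is saved
    and restored at every context switch. virtualized=False models a
    shared counter that keeps running across switches.
    Returns the last value each process would read.
    """
    hw = Counter()
    saved = {}  # counter state saved per process at switch-out
    seen = {}   # what each process reads at the end of its latest slice
    for name, events in schedule:
        if virtualized:
            hw.value = saved.get(name, 0)  # restore at switch-in
        hw.value += events                 # events during this time slice
        if virtualized:
            saved[name] = hw.value         # save at switch-out
        seen[name] = hw.value
    return seen


# Process A runs, is preempted by B, then runs again.
schedule = [("A", 100), ("B", 40), ("A", 50)]
per_process = run_schedule(schedule, virtualized=True)
shared = run_schedule(schedule, virtualized=False)
print(per_process)
print(shared)
```

In the saved-and-restored case each process reads only its own events. In the shared case both readings are inflated by whatever the other process did in between, which is the kind of contamination to watch for with chip-wide events.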