Hello everyone,
I have a question about using PAPI: can it count how many times a specific instruction is executed?
For example, the assembly code for a matrix multiplication (generated with the gcc flag -S) may contain both vectorized instructions (mulps) and scalar instructions (mulss). Can PAPI count how many times each of these instructions is executed?
Thanks in advance.
