Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications