Instructions: For this exercise, use the files provided in the directory matmul2-f. You will need to edit the file matmul.f. One possible solution has been provided in the file matmul.f.ANS.
- Compile by typing make matmul, then execute matmul and record the Mflops values returned for kji, dgemm and mydgemm. Some values will be reported as “0.000”. These come from the routines you are expected to edit, which do not do anything yet.
- Note: In order to speed up your execution, you can comment out each routine after you have finished recording its execution rates. For example, you could comment out the kji, dgemm and mydgemm routines now and you would not have to wait for them to execute in future runs.
- Edit matmul.f and uncomment and correct the routine kjib, which should be a blocked version of kji (use blocks of size 32). Compile and execute the code, recording the Mflops values.
- Edit matmul.f and uncomment and correct the routine kjiu, which should be an unrolled version of kji. Compile and execute the code, recording the Mflops values.
- Which optimizations achieved the best performance?
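For orientation, the two transformations asked for above can be sketched in a small stand-alone program. This is only an illustration under stated assumptions, not the contents of matmul.f or matmul.f.ANS: the argument lists, the test matrices, the size n = 96, and the choice to unroll the k loop by 4 are all hypothetical, and the routines in the lab files may be structured differently. The program computes the same product three ways and checks that the blocked and unrolled results agree with plain kji to within rounding.

```fortran
      program blkunr
      implicit none
      integer n, nb
      parameter (n = 96, nb = 32)
      double precision a(n,n), b(n,n)
      double precision c1(n,n), c2(n,n), c3(n,n)
      double precision errb, erru
      integer i, j
! Fill a and b with simple, reproducible values (hypothetical data).
      do j = 1, n
         do i = 1, n
            a(i,j) = dble(i + j) / dble(n)
            b(i,j) = dble(i - j) / dble(n)
         end do
      end do
      call kji(n, a, b, c1)
      call kjib(n, nb, a, b, c2)
      call kjiu(n, a, b, c3)
      errb = maxval(abs(c2 - c1))
      erru = maxval(abs(c3 - c1))
      print *, 'max |kjib - kji| =', errb
      print *, 'max |kjiu - kji| =', erru
! Exit with a nonzero status if either variant disagrees with kji.
      if (errb .gt. 1.0d-9) stop 1
      if (erru .gt. 1.0d-9) stop 1
      end program blkunr

! Reference version: loops ordered k, j, i (i innermost).
      subroutine kji(n, a, b, c)
      implicit none
      integer n, i, j, k
      double precision a(n,n), b(n,n), c(n,n)
      c = 0.0d0
      do k = 1, n
         do j = 1, n
            do i = 1, n
               c(i,j) = c(i,j) + a(i,k) * b(k,j)
            end do
         end do
      end do
      end subroutine kji

! Blocked version: the outer three loops step over nb-by-nb tiles,
! the inner three loops run kji inside one tile. The min() calls
! handle the case where n is not a multiple of nb.
      subroutine kjib(n, nb, a, b, c)
      implicit none
      integer n, nb, i, j, k, ii, jj, kk
      double precision a(n,n), b(n,n), c(n,n)
      c = 0.0d0
      do kk = 1, n, nb
         do jj = 1, n, nb
            do ii = 1, n, nb
               do k = kk, min(kk + nb - 1, n)
                  do j = jj, min(jj + nb - 1, n)
                     do i = ii, min(ii + nb - 1, n)
                        c(i,j) = c(i,j) + a(i,k) * b(k,j)
                     end do
                  end do
               end do
            end do
         end do
      end do
      end subroutine kjib

! Unrolled version (assumption: unroll the k loop by 4). Each pass
! over c applies four consecutive k updates; a cleanup loop handles
! the last mod(n,4) values of k.
      subroutine kjiu(n, a, b, c)
      implicit none
      integer n, i, j, k, k4
      double precision a(n,n), b(n,n), c(n,n)
      c = 0.0d0
      k4 = n - mod(n, 4)
      do k = 1, k4, 4
         do j = 1, n
            do i = 1, n
               c(i,j) = c(i,j) + a(i,k) * b(k,j)
               c(i,j) = c(i,j) + a(i,k+1) * b(k+1,j)
               c(i,j) = c(i,j) + a(i,k+2) * b(k+2,j)
               c(i,j) = c(i,j) + a(i,k+3) * b(k+3,j)
            end do
         end do
      end do
      do k = k4 + 1, n
         do j = 1, n
            do i = 1, n
               c(i,j) = c(i,j) + a(i,k) * b(k,j)
            end do
         end do
      end do
      end subroutine kjiu
```

The sketch starts every statement in column 7 and uses no continuation lines, so it should be accepted as either fixed-form or free-form source by a Fortran 90 or later compiler. It checks correctness only. To compare performance, time each routine in the same way matmul.f does for kji.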