Diagnosis and Optimization of Application Prefetching Performance

Title: Diagnosis and Optimization of Application Prefetching Performance
Publication Type: Conference Paper
Year of Publication: 2013
Authors: Marin, G., C. McCurdy, and J. Vetter
Editors: Malony, A. D., M. Nemirovsky, and S. Midkiff
Conference Name: Proceedings of the 27th ACM International Conference on Supercomputing (ICS '13)
Date Published: 2013-06
Publisher: ACM Press
Conference Location: Eugene, Oregon, USA
ISBN Number: 9781450321303

Hardware prefetchers are effective at recognizing streaming memory access patterns and at moving data closer to the processing units to hide memory latency. However, because hardware resources are finite, they can track only a limited number of data streams. In this paper, we introduce the term streaming concurrency to characterize the number of parallel, logical data streams in an application. We present a simulation algorithm for determining the streaming concurrency at any point in an application, and we show that this metric is a good predictor of the number of memory requests initiated by streaming prefetchers. Next, we investigate the causes of poor prefetching performance. We identify four prefetch-unfriendly conditions and show how to classify an application’s memory references based on these conditions. We evaluate our analysis using the SPEC CPU2006 benchmark suite. We select two benchmarks with unfavorable access patterns and transform them to improve their prefetching effectiveness. Results show that making applications more prefetcher-friendly can yield meaningful performance gains.
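To make the notion of streaming concurrency concrete, the following is a minimal illustrative sketch and not the simulation algorithm from the paper, whose details the abstract does not give. It assumes a trace of byte addresses, a 64-byte cache line, ascending unit-stride streams only, and a hypothetical expiry window after which an idle stream is dropped. The names `Stream`, `streaming_concurrency`, `interleaved_trace`, `LINE_SIZE`, and `WINDOW` are all invented for illustration.

```python
from dataclasses import dataclass

LINE_SIZE = 64  # assumed cache-line size in bytes
WINDOW = 16     # assumed: a stream expires after this many accesses without use


@dataclass
class Stream:
    last_line: int  # most recent cache line touched by this stream
    length: int     # number of consecutive lines seen so far
    last_used: int  # index of the access that last touched this stream


def streaming_concurrency(addresses, line_size=LINE_SIZE, window=WINDOW):
    """For each access, count the live streams of at least two lines."""
    streams = []
    counts = []
    for i, addr in enumerate(addresses):
        line = addr // line_size
        # Drop streams that have been idle for longer than the window.
        streams = [s for s in streams if i - s.last_used <= window]
        for s in streams:
            if line == s.last_line:          # re-access of the current line
                s.last_used = i
                break
            if line == s.last_line + 1:      # next sequential line: extend
                s.last_line = line
                s.length += 1
                s.last_used = i
                break
        else:
            # No stream matched, so start a new candidate stream.
            streams.append(Stream(last_line=line, length=1, last_used=i))
        # Only streams confirmed by a second sequential line are counted.
        counts.append(sum(1 for s in streams if s.length >= 2))
    return counts


def interleaved_trace(bases, lines_per_array, line_size=LINE_SIZE):
    """Hypothetical helper: walk several arrays in lockstep, one line per step."""
    return [base + k * line_size
            for k in range(lines_per_array)
            for base in bases]
```

Under these assumptions, a loop that walks two arrays in lockstep reaches a concurrency of two once both streams are confirmed, while a trace of scattered addresses stays at zero. A real prefetcher model would also need to handle descending and non-unit strides, page boundaries, and the hardware limit on tracked streams, all of which this sketch omits.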