MAPredict: Static Analysis Driven Memory Access Prediction Framework for Modern CPUs

Monil, M. A. H.; Lee, Seyong; Vetter, Jeffrey; Malony, Allen

doi:10.1007/978-3-031-07312-0_12


Conference · May 1, 2022

DOI: https://doi.org/10.1007/978-3-031-07312-0_12 · OSTI ID: 1887689

Monil, M. A. H. [1]; Lee, Seyong [1]; Vetter, Jeffrey [1]; Malony, Allen [2]

[1] ORNL
[2] University of Oregon

Application memory access patterns are crucial in deciding how much traffic is served by the cache and forwarded to the dynamic random-access memory (DRAM). However, predicting such memory traffic is difficult because of the interplay of prefetchers, compilers, parallel execution, and innovations in manufacturer-specific micro-architectures. This research introduced MAPredict, a static analysis-driven framework that addresses these challenges to predict last-level cache (LLC)-DRAM traffic. By exploring and analyzing the behavior of modern Intel processors, MAPredict formulates cache-aware analytical models. MAPredict invokes these models to predict LLC-DRAM traffic by combining the application model, machine model, and user-provided hints to capture dynamic information. MAPredict successfully predicts LLC-DRAM traffic for different regular access patterns and provides the means to combine static and empirical observations for irregular access patterns. Evaluating 130 workloads from six applications on recent Intel micro-architectures, MAPredict yielded an average accuracy of 99% for streaming, 91% for strided, and 92% for stencil patterns. By coupling static and empirical methods, up to 97% average accuracy was obtained for random access patterns on different micro-architectures.
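To give a sense of why access patterns determine LLC-DRAM traffic, the following is a minimal sketch of a cache-line-granularity estimate for streaming and strided reads. It is not MAPredict's model: the function names, the 64-byte line size, the line-aligned start address, and the assumption that every touched line is fetched from DRAM exactly once are all illustrative assumptions, and the sketch ignores prefetchers, cache reuse, write traffic, and parallel execution, which the abstract names as the hard parts.

```python
def lines_touched(n_elements: int, elem_bytes: int,
                  stride_elems: int = 1, line_bytes: int = 64) -> int:
    """Count distinct cache lines read by a fixed-stride traversal.

    Assumptions (illustrative, not from the paper): the array starts on a
    cache-line boundary and elem_bytes divides line_bytes, so no element
    straddles two lines.
    """
    if n_elements <= 0:
        return 0
    stride_bytes = stride_elems * elem_bytes
    if stride_bytes >= line_bytes:
        # Consecutive accesses are at least one line apart,
        # so every access lands in a different line.
        return n_elements
    # Stride is shorter than a line, so no line in the span is skipped:
    # count the lines from the first access to the last one.
    last_start = (n_elements - 1) * stride_bytes
    return last_start // line_bytes + 1


def estimated_read_bytes(n_elements: int, elem_bytes: int,
                         stride_elems: int = 1, line_bytes: int = 64) -> int:
    """Naive DRAM read traffic: each touched line is fetched exactly once."""
    return lines_touched(n_elements, elem_bytes, stride_elems, line_bytes) * line_bytes


if __name__ == "__main__":
    # Reading 1024 doubles (8 bytes each) with different strides.
    for stride in (1, 2, 8):
        print(stride, estimated_read_bytes(1024, 8, stride))
```

Under these assumptions a unit-stride read of 1024 doubles moves only the 8 KiB it uses, while a stride of eight doubles fetches a full 64-byte line for every 8-byte element, an eightfold increase in traffic for the same number of accesses.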


Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: USDOE; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)

DOE Contract Number: AC05-00OR22725

OSTI ID: 1887689

Country of Publication: United States

Language: English


