U.S. Department of Energy
Office of Scientific and Technical Information

MAPredict: Static Analysis Driven Memory Access Prediction Framework for Modern CPUs

Conference

Application memory access patterns largely determine how much traffic is served by the cache and how much is forwarded to the dynamic random-access memory (DRAM). However, predicting such memory traffic is difficult because of the interplay of prefetchers, compilers, parallel execution, and innovations in manufacturer-specific micro-architectures. This research introduces MAPredict, a static analysis-driven framework that addresses these challenges to predict last-level cache (LLC)-DRAM traffic. By exploring and analyzing the behavior of modern Intel processors, MAPredict formulates cache-aware analytical models. MAPredict invokes these models to predict LLC-DRAM traffic by combining the application model, the machine model, and user-provided hints that capture dynamic information. MAPredict successfully predicts LLC-DRAM traffic for different regular access patterns and provides the means to combine static and empirical observations for irregular access patterns. In an evaluation of 130 workloads from six applications on recent Intel micro-architectures, MAPredict yielded an average accuracy of 99% for streaming, 91% for strided, and 92% for stencil patterns. By coupling static and empirical methods, up to 97% average accuracy was obtained for random access patterns on different micro-architectures.
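To make the idea of a cache-aware analytical model concrete, the following is a minimal sketch of a textbook cache-line count for streaming and constant-stride reads. It is not MAPredict's model: the function names, the 64-byte line size, and the simplifications (line-aligned start, element size dividing the line size, no reuse from cache, no prefetcher or write-allocate effects) are all assumptions made for illustration.

```python
import math

CACHE_LINE_BYTES = 64  # assumed line size; typical of recent Intel CPUs


def lines_touched(num_accesses: int, element_bytes: int,
                  stride_elements: int = 1,
                  line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Distinct cache lines touched by a constant-stride sweep.

    Assumes the first access is line-aligned and that no element
    straddles two lines (element_bytes divides line_bytes).
    """
    if num_accesses <= 0:
        return 0
    stride_bytes = stride_elements * element_bytes
    if stride_bytes >= line_bytes:
        # Consecutive accesses are at least one line apart,
        # so every access lands on its own line.
        return num_accesses
    # Accesses are closer than one line apart, so every line in the
    # covered byte span is touched at least once.
    span_bytes = (num_accesses - 1) * stride_bytes + element_bytes
    return math.ceil(span_bytes / line_bytes)


def read_traffic_bytes(num_accesses: int, element_bytes: int,
                       stride_elements: int = 1,
                       line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Idealized read traffic: one full line fetched per distinct line."""
    return lines_touched(num_accesses, element_bytes,
                         stride_elements, line_bytes) * line_bytes


if __name__ == "__main__":
    n = 1024  # number of 8-byte (double) elements read
    for stride in (1, 4, 8):
        print(stride, read_traffic_bytes(n, 8, stride))
```

Under these assumptions a unit-stride sweep moves only the bytes it uses, while a stride of one full line or more moves a whole line per element. Effects such as prefetching and parallel execution, which the abstract names as sources of difficulty, are deliberately left out of this sketch.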

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1887689
Country of Publication:
United States
Language:
English

