MAPredict: Static Analysis Driven Memory Access Prediction Framework for Modern CPUs
- ORNL
- University of Oregon
Application memory access patterns determine how much traffic is served by the cache and how much is forwarded to dynamic random-access memory (DRAM). However, predicting such memory traffic is difficult because of the interplay of prefetchers, compilers, parallel execution, and innovations in manufacturer-specific micro-architectures. This research introduces MAPredict, a static-analysis-driven framework that addresses these challenges to predict last-level cache (LLC)-DRAM traffic. By exploring and analyzing the behavior of modern Intel processors, MAPredict formulates cache-aware analytical models. It invokes these models to predict LLC-DRAM traffic by combining the application model, the machine model, and user-provided hints that capture dynamic information. MAPredict successfully predicts LLC-DRAM traffic for different regular access patterns and provides the means to combine static and empirical observations for irregular access patterns. In an evaluation of 130 workloads from six applications on recent Intel micro-architectures, MAPredict yielded an average accuracy of 99% for streaming, 91% for strided, and 92% for stencil patterns. By coupling static and empirical methods, it obtained up to 97% average accuracy for random access patterns on different micro-architectures.
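This record does not give MAPredict's analytical models, so the following is only an illustrative sketch of what a cache-line-based estimate for streaming and strided reads can look like. It is not the authors' method. The 64-byte line size, the function names `lines_touched` and `dram_read_bytes`, and the simplifying assumptions (line-aligned array, elements that do not straddle lines, no reuse from cache, no prefetcher effects) are all assumptions made for illustration.

```python
import math

CACHE_LINE_BYTES = 64  # assumed line size, not taken from the record


def lines_touched(n_elements, elem_bytes, stride_elems=1,
                  line_bytes=CACHE_LINE_BYTES):
    """Count distinct cache lines read by a[0], a[stride], a[2*stride], ...

    Hypothetical helper. Assumes the array starts on a line boundary,
    elem_bytes divides line_bytes (so no element straddles two lines),
    and nothing is already resident in cache.
    """
    if n_elements <= 0:
        return 0
    n_accesses = math.ceil(n_elements / stride_elems)
    stride_bytes = stride_elems * elem_bytes
    if stride_bytes >= line_bytes:
        # Consecutive accesses are at least one line apart,
        # so every access lands in its own line.
        return n_accesses
    # Stride is shorter than a line: every line between the first and
    # last accessed byte contains at least one accessed element.
    span_bytes = (n_accesses - 1) * stride_bytes + elem_bytes
    return math.ceil(span_bytes / line_bytes)


def dram_read_bytes(n_elements, elem_bytes, stride_elems=1,
                    line_bytes=CACHE_LINE_BYTES):
    """Estimated bytes moved from DRAM: whole lines are always transferred."""
    return lines_touched(n_elements, elem_bytes, stride_elems,
                         line_bytes) * line_bytes


if __name__ == "__main__":
    for stride in (1, 4, 8, 16):
        print(stride, dram_read_bytes(1024, 8, stride))
```

Under these assumptions, a stride of 8 doubles (64 bytes) moves as many bytes as a unit-stride sweep even though only one eighth of the data is used. A naive count like this ignores prefetchers, compiler transformations, and parallel execution, which are the effects the abstract names as making prediction difficult.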
- Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization: USDOE; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
- DOE Contract Number: AC05-00OR22725
- OSTI ID: 1887689
- Country of Publication: United States
- Language: English