Development and Validation of a Hierarchical Memory Model Incorporating CPU- and Memory-Operation Overlap
Distributed shared memory (DSM) architectures such as the Origin 2000 extend the concept of single-processor cache hierarchies across an entire physically distributed multiprocessor machine. The scalability of a DSM machine is inherently tied to memory hierarchy performance, including such issues as latency-hiding techniques in the architecture, global cache-coherence protocols, memory consistency models and, of course, the inherent locality of reference in algorithms of interest. In this paper, we characterize application performance with a "memory-centric" view. Using a simple mean value analysis (MVA) strategy and empirical performance data, we infer the contribution of each level in the memory system to the application's overall cycles per instruction (CPI). We account for the overlap of processor execution with memory accesses, a key parameter which is not directly measurable on the Origin systems. We infer the separate contributions of three major architecture features in the memory subsystem of the Origin 2000: cache size, outstanding loads-under-miss, and memory latency.
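The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of attributing CPI to memory levels while discounting for overlap. It assumes a simple additive form: an ideal (infinite-cache) CPI plus, for each memory level, accesses per instruction times latency, scaled by the fraction of stall time not hidden behind processor execution. The names `MemoryLevel`, `cpi_breakdown`, and `infer_overlap`, and all numbers in the usage note, are hypothetical illustrations, not the authors' formulation or Origin 2000 measurements.

```python
from dataclasses import dataclass


@dataclass
class MemoryLevel:
    name: str
    accesses_per_instruction: float  # references satisfied at this level, per instruction
    latency_cycles: float            # assumed access latency in processor cycles


def cpi_breakdown(cpi_ideal, levels, overlap):
    """Return (total CPI, per-level CPI contributions) for an assumed additive model.

    overlap is the assumed fraction (0..1) of memory time hidden behind
    processor execution; it scales every level's contribution equally.
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be in [0, 1]")
    contributions = {
        lvl.name: (1.0 - overlap) * lvl.accesses_per_instruction * lvl.latency_cycles
        for lvl in levels
    }
    return cpi_ideal + sum(contributions.values()), contributions


def infer_overlap(cpi_measured, cpi_ideal, levels):
    """Invert the same assumed model: solve for overlap given a measured CPI."""
    raw = sum(lvl.accesses_per_instruction * lvl.latency_cycles for lvl in levels)
    if raw == 0:
        raise ValueError("no memory time to attribute")
    return 1.0 - (cpi_measured - cpi_ideal) / raw
```

As a usage illustration with made-up inputs, two levels `MemoryLevel("L2", 0.02, 10)` and `MemoryLevel("memory", 0.005, 80)` give 0.6 cycles per instruction of raw memory time; with an ideal CPI of 1.0 and an overlap of 0.5, `cpi_breakdown` attributes 0.1 to L2 and 0.2 to memory, and `infer_overlap` recovers the overlap from the resulting total. Inferring overlap this way mirrors the abstract's point that overlap is not directly measurable and must be deduced from other observed quantities.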
- Research Organization:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE, Washington, DC (United States)
- DOE Contract Number:
- W-7405-ENG-36
- OSTI ID:
- 621718
- Report Number(s):
- LA-UR-97-3462; CONF-980214-; ON: DE98000349; TRN: AD-a340 077
- Resource Relation:
- Conference: 4th International Symposium on High-Performance Computing Architecture, Las Vegas, NV (United States), 1-4 Feb 1998; Other Information: PBD: 1997
- Country of Publication:
- United States
- Language:
- English
Similar Records
An empirical hierarchical memory model based on hardware performance counters
A mean value analysis multiprocessor model incorporating superscalar processors and latency tolerating techniques