OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Development and Validation of a Hierarchical Memory Model Incorporating CPU- and Memory-Operation Overlap

Abstract

Distributed shared memory architectures (DSMs) such as the Origin 2000 are being implemented which extend the concept of single-processor cache hierarchies across an entire physically-distributed multiprocessor machine. The scalability of a DSM machine is inherently tied to memory hierarchy performance, including such issues as latency hiding techniques in the architecture, global cache-coherence protocols, memory consistency models and, of course, the inherent locality of reference in algorithms of interest. In this paper, we characterize application performance with a "memory-centric" view. Using a simple mean value analysis (MVA) strategy and empirical performance data, we infer the contribution of each level in the memory system to the application's overall cycles per instruction (cpi). We account for the overlap of processor execution with memory accesses - a key parameter which is not directly measurable on the Origin systems. We infer the separate contributions of three major architecture features in the memory subsystem of the Origin 2000: cache size, outstanding loads-under-miss, and memory latency.
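
To make the "memory-centric" decomposition described in the abstract concrete, here is a minimal sketch of how a cpi figure might be split into per-level memory contributions discounted by an overlap factor. It is not the model from the paper: the function name, the parameter names, the single scalar overlap factor, and the additive form cpi = cpi0 + (1 - overlap) * sum(h_i * t_i) are all assumptions chosen for illustration, and the numbers in the usage example are made up.

```python
def memory_centric_cpi(cpi0, refs_per_instr, latency_cycles, overlap):
    """Illustrative cpi decomposition (an assumed form, not the paper's model).

    cpi0            -- hypothetical cpi with an ideal (zero-latency) memory system
    refs_per_instr  -- dict: memory level -> references per instruction served there
    latency_cycles  -- dict: memory level -> access latency in cycles
    overlap         -- assumed fraction (0..1) of memory time hidden by CPU execution
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must lie between 0 and 1")
    if refs_per_instr.keys() != latency_cycles.keys():
        raise ValueError("both dicts must describe the same memory levels")

    # Exposed (non-overlapped) stall cycles per instruction for each level.
    contributions = {
        level: refs_per_instr[level] * latency_cycles[level] * (1.0 - overlap)
        for level in refs_per_instr
    }
    total_cpi = cpi0 + sum(contributions.values())
    return total_cpi, contributions


# Usage with invented numbers (not measurements from any machine).
total, parts = memory_centric_cpi(
    cpi0=1.0,
    refs_per_instr={"L2": 0.02, "memory": 0.005},
    latency_cycles={"L2": 10, "memory": 100},
    overlap=0.5,
)
print(total, parts)
```

Setting overlap to 0 gives the fully exposed memory cost, and setting it to 1 reduces the result to cpi0, which is why the overlap term matters when attributing cycles to individual levels of the hierarchy.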

Authors:
Lubeck, Olaf M; Luo, Yong; Wasserman, Harvey J; Bassetti, Federico
Publication Date:
Research Org.:
Los Alamos National Lab., NM (United States)
Sponsoring Org.:
USDOE, Washington, DC (United States)
OSTI Identifier:
621718
Report Number(s):
LA-UR-97-3462; CONF-980214-
ON: DE98000349; TRN: AD-a340 077
DOE Contract Number:  
W-7405-ENG-36
Resource Type:
Conference
Resource Relation:
Conference: 4. international symposium on high-performance computing architecture, Las Vegas, NV (United States), 1-4 Feb 1998; Other Information: PBD: 1997
Country of Publication:
United States
Language:
English
Subject:
99 MATHEMATICS, COMPUTERS, INFORMATION SCIENCE, MANAGEMENT, LAW, MISCELLANEOUS; ALGORITHMS; VALIDATION; MATHEMATICAL MODELS; MICROPROCESSORS; PERFORMANCE; MEMORY DEVICES; COMPUTER ARCHITECTURE; PARALLEL PROCESSING; HYDRODYNAMICS; TRANSPORT THEORY; H CODES; N CODES; L CODES; S CODES

Citation Formats

Lubeck, Olaf M, Luo, Yong, Wasserman, Harvey J, and Bassetti, Federico. Development and Validation of a Hierarchical Memory Model Incorporating CPU- and Memory-Operation Overlap. United States: N. p., 1997. Web.
Lubeck, Olaf M, Luo, Yong, Wasserman, Harvey J, & Bassetti, Federico. Development and Validation of a Hierarchical Memory Model Incorporating CPU- and Memory-Operation Overlap. United States.
Lubeck, Olaf M, Luo, Yong, Wasserman, Harvey J, and Bassetti, Federico. 1997. "Development and Validation of a Hierarchical Memory Model Incorporating CPU- and Memory-Operation Overlap". United States. https://www.osti.gov/servlets/purl/621718.
@article{osti_621718,
title = {Development and Validation of a Hierarchical Memory Model Incorporating CPU- and Memory-Operation Overlap},
author = {Lubeck, Olaf M and Luo, Yong and Wasserman, Harvey J and Bassetti, Federico},
abstractNote = {Distributed shared memory architectures (DSMs) such as the Origin 2000 are being implemented which extend the concept of single-processor cache hierarchies across an entire physically-distributed multiprocessor machine. The scalability of a DSM machine is inherently tied to memory hierarchy performance, including such issues as latency hiding techniques in the architecture, global cache-coherence protocols, memory consistency models and, of course, the inherent locality of reference in algorithms of interest. In this paper, we characterize application performance with a "memory-centric" view. Using a simple mean value analysis (MVA) strategy and empirical performance data, we infer the contribution of each level in the memory system to the application's overall cycles per instruction (cpi). We account for the overlap of processor execution with memory accesses - a key parameter which is not directly measurable on the Origin systems. We infer the separate contributions of three major architecture features in the memory subsystem of the Origin 2000: cache size, outstanding loads-under-miss, and memory latency.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1997},
month = {12}
}
