The structured memory access architecture: An implementation and performance-evaluation
Abstract
The Structured Memory Access (SMA) architecture implementation presented in this thesis is formulated with the intention of alleviating two well-known inefficiencies that exist in current scalar computer architectures: address generation overhead and poor memory bandwidth utilization. Furthermore, the SMA architecture introduces an additional level of parallelism which is not present in current pipelined supercomputers, namely, overlapped execution of the access process and the execute process on two distinct special-purpose, asynchronously coupled processors. Each processor executes a separate instruction stream to perform its specific task; together, the two streams are functionally equivalent to a conventional program. Our simulation results show that, for typical numerical programs, the access processor (MAP) is capable of achieving slip, i.e., running sufficiently ahead of the execute processor (CP) that operand fetch requests for data items required by the CP are issued early enough and rapidly enough for the CP rarely to experience any memory access wait time. In this manner the SMA tolerates paths to memory that have long access times, provided they offer high bandwidth, without sacrificing performance. Speedups relative to the Cray-1 in scalar mode often exceed two, due to dual processing and reductions in memory wait time. 17 refs., 11 figs., 3 tabs.
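The notion of slip described in the abstract can be illustrated with a toy cycle-count model. This is only a sketch under stated assumptions, not the simulator used in the thesis: the function names, the single operand queue, the one-fetch-per-cycle memory, and all parameter values are hypothetical and chosen for illustration.

```python
from collections import deque


def coupled_cycles(n_operands, latency, work):
    """Baseline (assumed): one processor fetches, waits, then computes, per operand."""
    return n_operands * (latency + work)


def decoupled_cycles(n_operands, latency, work, queue_depth):
    """Toy model of an access side running ahead of an execute side.

    Returns (total_cycles, stall_cycles) for the execute side.
    Assumptions: memory is fully pipelined (one new fetch per cycle),
    and at most queue_depth operands may be in flight or buffered.
    """
    arrivals = deque()   # arrival cycle of each in-flight or buffered operand
    issued = consumed = 0
    cycle = 0
    busy_until = 0       # cycle at which the execute side finishes its current work
    stall = 0
    while consumed < n_operands:
        # Access side: issue at most one fetch per cycle while the queue has room.
        if issued < n_operands and len(arrivals) < queue_depth:
            arrivals.append(cycle + latency)
            issued += 1
        # Execute side: when idle, consume the head operand if it has arrived.
        if cycle >= busy_until:
            if arrivals and arrivals[0] <= cycle:
                arrivals.popleft()
                consumed += 1
                busy_until = cycle + work
            else:
                stall += 1
        cycle += 1
    return busy_until, stall


if __name__ == "__main__":
    n, latency, work = 100, 10, 2   # illustrative values, not from the thesis
    total, stall = decoupled_cycles(n, latency, work, queue_depth=16)
    print("coupled cycles:  ", coupled_cycles(n, latency, work))
    print("decoupled cycles:", total, "with", stall, "stall cycles")
```

In this model, once the queue is deep enough to cover the memory latency, the execute side waits only for the first operand and then runs without stalls, which is the behavior the abstract calls slip. With a queue depth of 1 the access side cannot run ahead and the execute side stalls on nearly every operand.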
- Authors:
- Cyr, J. B.
- Publication Date:
- 1986-08-01
- Research Org.:
- Illinois Univ., Urbana (USA). Center for Supercomputing Research and Development
- OSTI Identifier:
- 5546068
- Report Number(s):
- DOE/ER/25001-86; UILU-ENG-86-8008
ON: DE88003527
- DOE Contract Number:
- FG02-85ER25001
- Resource Type:
- Technical Report
- Resource Relation:
- Other Information: Thesis (M.S.). Portions of this document are illegible in microfiche products
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; MEMORY MANAGEMENT; COMPUTER ARCHITECTURE; PERFORMANCE TESTING; COMPUTERIZED SIMULATION; CRAY COMPUTERS; IMPLEMENTATION; COMPUTERS; SIMULATION; TESTING; 990210* - Supercomputers- (1987-1989)
Citation Formats
Cyr, J. B. The structured memory access architecture: An implementation and performance-evaluation. United States: N. p., 1986. Web. doi:10.2172/5546068.
Cyr, J. B. The structured memory access architecture: An implementation and performance-evaluation. United States. https://doi.org/10.2172/5546068
Cyr, J. B. 1986. "The structured memory access architecture: An implementation and performance-evaluation". United States. https://doi.org/10.2172/5546068. https://www.osti.gov/servlets/purl/5546068.
@article{osti_5546068,
title = {The structured memory access architecture: An implementation and performance-evaluation},
author = {Cyr, J. B.},
abstractNote = {The Structured Memory Access (SMA) architecture implementation presented in this thesis is formulated with the intention of alleviating two well-known inefficiencies that exist in current scalar computer architectures: address generation overhead and poor memory bandwidth utilization. Furthermore, the SMA architecture introduces an additional level of parallelism which is not present in current pipelined supercomputers, namely, overlapped execution of the access process and the execute process on two distinct special-purpose, asynchronously coupled processors. Each processor executes a separate instruction stream to perform its specific task; together, the two streams are functionally equivalent to a conventional program. Our simulation results show that, for typical numerical programs, the access processor (MAP) is capable of achieving slip, i.e., running sufficiently ahead of the execute processor (CP) that operand fetch requests for data items required by the CP are issued early enough and rapidly enough for the CP rarely to experience any memory access wait time. In this manner the SMA tolerates paths to memory that have long access times, provided they offer high bandwidth, without sacrificing performance. Speedups relative to the Cray-1 in scalar mode often exceed two, due to dual processing and reductions in memory wait time. 17 refs., 11 figs., 3 tabs.},
doi = {10.2172/5546068},
url = {https://www.osti.gov/biblio/5546068},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1986},
month = {8}
}