OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A mean-value performance analysis of a new multiprocessor architecture

Abstract

This paper presents a preliminary performance analysis of a new large-scale multiprocessor: the Wisconsin Multicube. A key characteristic of the machine is that it is based on shared buses and a snooping cache coherence protocol. The organization of the shared buses and shared memory is unique and non-hierarchical. The two-dimensional version of the architecture is envisioned as scaling to 1024 processors. The authors develop an approximate mean-value analysis of bus interference for the proposed cache coherence protocol. The model includes FCFS scheduling at the bus queues with deterministic bus access times, as well as asynchronous memory write-backs and invalidation requests. They use the model to investigate the feasibility of the multiprocessor and to study some initial system design issues. The authors' results indicate that a 1024-processor system can operate at 75-95% of its peak processing power if the mean time between cache misses is larger than 1000 bus cycles (i.e., 50 microseconds for 20 MHz buses; 25 microseconds for 40 MHz buses). This miss rate is not unreasonable for the cache sizes specified in the design, which are comparable to main memory sizes in existing multiprocessors. The authors also present results that address the optimal cache block size, the optimal size of the two-dimensional Multicube, the effect of the broadcast invalidations on system performance, and the viability of several hardware techniques for reducing the latency of remote memory requests.
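
The cycle-to-time conversion quoted in the abstract (1000 bus cycles at 20 MHz and 40 MHz) can be checked with a short sketch. The function name `cycles_to_microseconds` is an illustrative assumption and does not come from the paper; the sketch covers only this unit conversion, not the mean-value model itself.

```python
def cycles_to_microseconds(cycles: int, bus_mhz: float) -> float:
    """Convert a bus-cycle count to microseconds for a bus clocked at bus_mhz MHz.

    Hypothetical helper for illustration only.
    """
    # One cycle at f MHz lasts 1/f microseconds, so N cycles last N/f microseconds.
    return cycles / bus_mhz


if __name__ == "__main__":
    # The abstract's threshold: a mean time between cache misses of 1000 bus cycles.
    for mhz in (20, 40):
        print(f"{mhz} MHz bus: {cycles_to_microseconds(1000, mhz)} microseconds")
```

Under these assumptions the helper gives 50 microseconds for a 20 MHz bus and 25 microseconds for a 40 MHz bus, matching the figures in the abstract.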

Authors:
Leutenegger, S T; Vernon, M K
Publication Date:
OSTI Identifier:
6888772
Resource Type:
Book
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; ARRAY PROCESSORS; COMPUTER ARCHITECTURE; DATA TRANSMISSION; INTEGRATED CIRCUITS; MEMORY DEVICES; PERFORMANCE TESTING; COMMUNICATIONS; ELECTRONIC CIRCUITS; MICROELECTRONIC CIRCUITS; TESTING; 990210* - Supercomputers- (1987-1989)

Citation Formats

Leutenegger, S T, and Vernon, M K. A mean-value performance analysis of a new multiprocessor architecture. United States: N. p., 1988. Web.
Leutenegger, S T, & Vernon, M K. A mean-value performance analysis of a new multiprocessor architecture. United States.
Leutenegger, S T, and Vernon, M K. Fri . "A mean-value performance analysis of a new multiprocessor architecture". United States.
@article{osti_6888772,
title = {A mean-value performance analysis of a new multiprocessor architecture},
author = {Leutenegger, S T and Vernon, M K},
abstractNote = {This paper presents a preliminary performance analysis of a new large-scale multiprocessor: the Wisconsin Multicube. A key characteristic of the machine is that it is based on shared buses and a snooping cache coherence protocol. The organization of the shared buses and shared memory is unique and non-hierarchical. The two-dimensional version of the architecture is envisioned as scaling to 1024 processors. The authors develop an approximate mean-value analysis of bus interference for the proposed cache coherence protocol. The model includes FCFS scheduling at the bus queues with deterministic bus access times, as well as asynchronous memory write-backs and invalidation requests. They use the model to investigate the feasibility of the multiprocessor and to study some initial system design issues. The authors' results indicate that a 1024-processor system can operate at 75-95% of its peak processing power if the mean time between cache misses is larger than 1000 bus cycles (i.e., 50 microseconds for 20 MHz buses; 25 microseconds for 40 MHz buses). This miss rate is not unreasonable for the cache sizes specified in the design, which are comparable to main memory sizes in existing multiprocessors. The authors also present results that address the optimal cache block size, the optimal size of the two-dimensional Multicube, the effect of the broadcast invalidations on system performance, and the viability of several hardware techniques for reducing the latency of remote memory requests.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1988},
month = {1}
}

