Summary: Latency Hiding on COMA Multiprocessors
Tarek S. Abdelrahman
Department of Electrical and Computer Engineering
The University of Toronto
Toronto, Ontario, Canada M5S 1A4
Abstract
Cache Only Memory Access (COMA) multiprocessors support scalable coherent
shared memory with a uniform memory access programming model. The
cache-based organization of memory results in long memory access latencies.
Latency hiding mechanisms can reduce effective memory latency by making
data present in a processor's local memory by the time the data is needed. In
this paper, we study the effectiveness of latency hiding mechanisms on the
KSR2 multiprocessor in improving the performance of three programs. The
communication patterns of each program are analyzed and mechanisms for
latency hiding are applied. Results from a 52-processor system indicate that
the use of these mechanisms hides the latency of a significant portion of
remote memory accesses and that application performance benefits. The overhead
associated
with the use of these mechanisms can limit the extent of this benefit.
1 Introduction
Cache Only Memory Access (COMA) multiprocessors scale to large numbers of processors