Informatica 17 page xxx-yyy
Overlap of Computation and Communication on Shared-Memory

Summary:
Tarek S. Abdelrahman and Gary Liu
Department of Electrical and Computer Engineering
The University of Toronto
Toronto, Ontario, Canada M5S 3G4
Keywords: networks-of-workstations, distributed-shared memory, compiler optimizations, locality
enhancement, latency hiding, loop transformations.
This paper describes and evaluates a compiler transformation that improves the
performance of parallel programs on Network-of-Workstation (NOW) shared-memory
multiprocessors. The transformation overlaps the communication time resulting from
non-local memory accesses with the computation time in parallel loops to effectively
hide the latency of the remote accesses. It peels from a parallel loop the iterations
that access remote data and reschedules them after the execution of iterations that
access only local data (local-only iterations). Asynchronous prefetching of remote data
is used to overlap non-local access latency with the execution of local-only iterations.
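
The reordering described in the summary can be illustrated with a small sketch. It is
not the authors' compiler output. It is a hand-written Python approximation that
assumes the loop iterations are independent (so reordering is legal) and that uses a
background thread to stand in for asynchronous prefetching. The names run_peeled_loop,
is_local, fetch_remote, body, and demo are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_peeled_loop(iterations, is_local, fetch_remote, body):
    """Run independent loop iterations, remote-data iterations last.

    Hypothetical helper: is_local(i) says whether iteration i touches only
    local data, fetch_remote(i) stands in for an asynchronous prefetch, and
    body(i, fetched) performs the work of iteration i.
    Returns the order in which the iterations were executed.
    """
    iterations = list(iterations)
    local_iters = [i for i in iterations if is_local(i)]
    remote_iters = [i for i in iterations if not is_local(i)]  # peeled
    order = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Issue prefetches for the peeled iterations before any computation.
        pending = {i: pool.submit(fetch_remote, i) for i in remote_iters}
        # Local-only iterations execute while the prefetches are in flight.
        for i in local_iters:
            body(i, None)
            order.append(i)
        # Peeled iterations execute once their remote data has arrived.
        for i in remote_iters:
            body(i, pending[i].result())
            order.append(i)
    return order


def demo():
    """Assumed setup: this worker owns data[3..6]; iteration i reads
    data[i-1] and data[i+1], so iterations 3 and 6 need a remote element."""
    data = list(range(10))
    lo, hi = 3, 7
    out = {}

    def is_local(i):
        return lo <= i - 1 and i + 1 < hi

    def fetch_remote(i):
        # Simulated remote read of both neighbours.
        return data[i - 1], data[i + 1]

    def body(i, fetched):
        left, right = fetched if fetched is not None else (data[i - 1], data[i + 1])
        out[i] = left + right

    order = run_peeled_loop(range(lo, hi), is_local, fetch_remote, body)
    return order, out
```

In this sketch, demo() executes the interior iterations 4 and 5 first and the boundary
iterations 3 and 6 last, so the simulated fetches for the boundary elements can proceed
while the interior work runs.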


Source: Abdelrahman, Tarek S. - Department of Electrical and Computer Engineering, University of Toronto


Collections: Computer Technologies and Information Sciences