Systolic-array optimizing compiler
The Warp machine is a linear array of ten programmable processors capable of executing 100 million floating-point operations per second (100 MFLOPS). The individual processors, or cells, derive their performance from a wide instruction word and a high degree of internal pipelining and parallelism. Can an array of high-performance cells be programmed to cooperate at a fine grain of parallelism? The author's thesis is that systolic arrays of high-performance cells can be programmed effectively using a high-level language. The solution has two components: a machine abstraction and compiler optimizations for systolic arrays, and code-scheduling techniques for horizontally microcoded or VLIW processors. In the proposed machine abstraction, individual cells are programmed in a high-level language, and inter-cell communication is specified explicitly through asynchronous receive and send primitives. This machine abstraction offers both efficiency and generality. It is shown that software pipelining is a practical and efficient code-scheduling technique for highly parallel and pipelined processors. The ideas and techniques of the thesis were validated by the implementation of an optimizing compiler for Warp.
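To make the programming model concrete, the sketch below simulates a small linear array on a single host. It is illustrative only: the queue type, the `send`/`receive` helpers, the cell count, and the toy Horner-rule computation are assumptions for the example, not the Warp hardware queues or the thesis compiler's runtime interface. Each cell is written as an ordinary sequential program, and all inter-cell communication goes through explicit, asynchronous receive and send operations, as described above.

```c
#include <stdio.h>

#define N_POINTS 4        /* length of the input stream (illustrative) */
#define N_CELLS  3        /* one polynomial coefficient per cell       */

/* A bounded FIFO standing in for the inter-cell hardware queues. */
typedef struct { double buf[N_POINTS]; int head, tail; } queue;

static void   send(queue *q, double v) { q->buf[q->tail++] = v; }
static double receive(queue *q)        { return q->buf[q->head++]; }

/* The program run by one cell: receive an x value and a partial result,
 * apply one Horner step with this cell's coefficient, and pass both on. */
static void cell(double coeff, queue *x_in, queue *acc_in,
                 queue *x_out, queue *acc_out)
{
    for (int i = 0; i < N_POINTS; i++) {
        double x   = receive(x_in);
        double acc = receive(acc_in);
        send(x_out,   x);
        send(acc_out, acc * x + coeff);
    }
}

int main(void)
{
    /* Evaluate p(x) = 2x^2 + 3x + 1 at a stream of points. */
    const double coeff[N_CELLS]   = { 2.0, 3.0, 1.0 };
    const double points[N_POINTS] = { 0.0, 1.0, 2.0, 3.0 };

    /* One pair of queues between each neighbouring pair of cells. */
    queue xq[N_CELLS + 1]   = { 0 };
    queue accq[N_CELLS + 1] = { 0 };

    for (int i = 0; i < N_POINTS; i++) {    /* feed the leftmost cell */
        send(&xq[0], points[i]);
        send(&accq[0], 0.0);
    }
    for (int c = 0; c < N_CELLS; c++)       /* cells run in sequence here;   */
        cell(coeff[c], &xq[c], &accq[c],    /* on Warp they run concurrently */
             &xq[c + 1], &accq[c + 1]);

    for (int i = 0; i < N_POINTS; i++)
        printf("p(%g) = %g\n", points[i], receive(&accq[N_CELLS]));
    return 0;
}
```

Likewise, the following sketch illustrates what software pipelining does to a loop. It is a hand-transformed toy example, not output of the Warp compiler: after the transformation, the loop has a prologue, a steady-state kernel in which the store, multiply, and load of three consecutive iterations are mutually independent and could be packed into one wide instruction, and an epilogue.

```c
#include <stdio.h>

/* Naive loop: each iteration performs its load, multiply, and store in
 * sequence, leaving a pipelined or VLIW cell idle between dependent steps. */
static void scale(double *y, const double *x, double a, int n)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i];
}

/* The same loop after software pipelining by hand (assumes n >= 2).  In the
 * steady-state kernel, the store of iteration i, the multiply of iteration
 * i + 1, and the load of iteration i + 2 are mutually independent, so a
 * compiler for a wide-instruction machine could issue them together. */
static void scale_pipelined(double *y, const double *x, double a, int n)
{
    double p = a * x[0];               /* prologue: start iterations 0 and 1 */
    double t = x[1];
    for (int i = 0; i + 2 < n; i++) {  /* steady-state kernel */
        y[i] = p;                      /* store,    iteration i     */
        p    = a * t;                  /* multiply, iteration i + 1 */
        t    = x[i + 2];               /* load,     iteration i + 2 */
    }
    y[n - 2] = p;                      /* epilogue: drain the last two */
    y[n - 1] = a * t;
}

int main(void)
{
    double x[5] = { 1, 2, 3, 4, 5 }, y1[5], y2[5];
    scale(y1, x, 10.0, 5);
    scale_pipelined(y2, x, 10.0, 5);
    for (int i = 0; i < 5; i++)
        printf("%g %g\n", y1[i], y2[i]);   /* both columns: 10 20 30 40 50 */
    return 0;
}
```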
- Research Organization: Carnegie-Mellon Univ., Pittsburgh, PA (USA)
- OSTI ID: 5634401
- Country of Publication: United States
- Language: English