Compute intensity and the FFT
Conference · OSTI ID:46262
- Cray Research Superservers, Inc., Beaverton, OR (United States)
The fast Fourier transform (FFT) is a challenging algorithm to implement efficiently on a parallel computer. Recent algorithmic advances have led to greatly improved FFT performance on parallel vector computers such as the CRAY-2 and CRAY Y-MP. Variations on these techniques can be used to extend this improved performance to other parallel architectures. A simple evaluation reveals that the FFT inherently has a relatively high degree of computation per data word, or compute intensity. This high compute intensity is lost when the FFT computation is reduced to simple vector operations. Viewing the algorithm from a high level and exploiting compute intensity is the key to achieving high performance on parallel computers such as the CRAY APP. This paper describes how programming techniques that preserve high compute intensity, combined with algorithms in the literature, can result in efficient single- and multidimensional FFTs on large numbers of processors on the CRAY APP. The CRAY APP is a shared-memory parallel computer based on the Intel i860 microprocessor. It incorporates up to 84 i860s in an architecture that allows for very efficient gang scheduling and barrier synchronization. FFT performance figures for various data set sizes and processor configurations are included.
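The abstract's claim about compute intensity can be illustrated with a rough count. The sketch below is not taken from the paper. It assumes a radix-2 complex FFT with the usual textbook cost of 10 real floating-point operations per butterfly, counts a complex number as two data words, and ignores twiddle-factor traffic. The function names are hypothetical. It compares two memory models: one in which the data are read and written once for the whole transform, and one in which every butterfly pass is a separate vector operation that streams the whole array through memory.

```python
# Assumed textbook cost of one radix-2 butterfly:
# 1 complex multiply (6 real flops) + 2 complex adds (4 real flops).
FLOPS_PER_BUTTERFLY = 10
WORDS_PER_COMPLEX = 2  # real and imaginary parts


def num_stages(n):
    """Return log2(n) for a power-of-two transform length n >= 2."""
    if n < 2 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two and at least 2")
    return n.bit_length() - 1


def fft_flops(n):
    """Real flops in a radix-2 complex FFT: (n/2) butterflies per stage."""
    return FLOPS_PER_BUTTERFLY * (n // 2) * num_stages(n)


def intensity_whole_transform(n):
    """Flops per word if the data are read once and written once."""
    words_moved = 2 * WORDS_PER_COMPLEX * n  # one read plus one write
    return fft_flops(n) / words_moved


def intensity_per_pass(n):
    """Flops per word if each stage reads and writes the whole array."""
    words_moved = 2 * WORDS_PER_COMPLEX * n * num_stages(n)
    return fft_flops(n) / words_moved


if __name__ == "__main__":
    for n in (64, 1024, 65536):
        print(n, intensity_whole_transform(n), intensity_per_pass(n))
```

Under these assumptions the whole-transform intensity grows as 1.25 log2(n) flops per word (12.5 for n = 1024), while the pass-by-pass intensity stays at 1.25 regardless of n. That gap is one way to read the statement that compute intensity is lost when the FFT is reduced to simple vector operations.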
- OSTI ID: 46262
- Report Number(s): CONF-931115--
- Country of Publication: United States
- Language: English
Similar Records
- A high-performance FFT algorithm for vector supercomputers
  Journal Article · Thu Dec 31 23:00:00 EST 1987 · Int. J. Supercomput. Appl.; (United States) · OSTI ID:6933065
- Ocean predictability studies in a parallel computing environment. Final report
  Technical Report · Sat Oct 31 23:00:00 EST 1992 · OSTI ID:505314
- Parallelization and performance analysis of the Cooley-Tukey FFT algorithm for shared-memory architectures
  Journal Article · Fri May 01 00:00:00 EDT 1987 · IEEE Trans. Comput.; (United States) · OSTI ID:6595452