A high performance parallel algorithm for 1-D FFT
Book · OSTI ID:87608
- IBM T.J. Watson Research Center, Yorktown Heights, NY (United States)
In this paper, the authors propose a high-performance parallel FFT algorithm based on a multi-dimensional formulation. They use it to solve a commonly encountered FFT-based kernel on a distributed-memory parallel machine, the IBM scalable parallel system, SP1. The kernel requires a forward FFT of an input sequence, multiplication of the transformed data by a coefficient array, and finally an inverse FFT of the resulting data. They show that the multi-dimensional formulation reduces communication costs and also improves single-node performance by using the node's memory system effectively. They implemented this kernel on the IBM SP1 and observed a performance of 1.25 GFLOPS on a 64-node machine.
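As an illustration of the kernel described in the abstract, the following is a minimal serial sketch in Python. It assumes the standard two-dimensional ("four-step") reformulation of a length-N transform with N = n1 * n2; the abstract does not give the authors' actual decomposition, data distribution, or SP1-specific tuning, so none of that is reproduced here. The function names (`dft`, `fft_2d_formulation`, `fft_kernel`) are hypothetical, and a naive DFT stands in for the optimized sub-transforms a real implementation would use.

```python
import cmath


def dft(x, sign=-1):
    """Naive O(n^2) DFT. sign=-1 is the forward transform, sign=+1 the
    unnormalized inverse. Stand-in for an optimized sub-FFT (assumption)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_2d_formulation(x, n1, n2, sign=-1):
    """Length n1*n2 DFT computed as a 2-D problem (four-step scheme).

    Input index is split as j = j1 + n1*j2, output index as k = k2 + n2*k1.
    """
    n = n1 * n2
    assert len(x) == n
    # Step 1: n1 independent length-n2 transforms over j2 (one per j1).
    rows = [dft([x[j1 + n1 * j2] for j2 in range(n2)], sign) for j1 in range(n1)]
    # Step 2: multiply by the twiddle factors w^(j1*k2), w = exp(sign*2*pi*i/n).
    for j1 in range(n1):
        for k2 in range(n2):
            rows[j1][k2] *= cmath.exp(sign * 2j * cmath.pi * j1 * k2 / n)
    # Step 3: n2 independent length-n1 transforms over j1 (one per k2).
    # Gathering a column is the transpose; on a distributed-memory machine
    # this is typically where the inter-node communication occurs.
    out = [0j] * n
    for k2 in range(n2):
        col = dft([rows[j1][k2] for j1 in range(n1)], sign)
        for k1 in range(n1):
            out[k2 + n2 * k1] = col[k1]
    return out


def fft_kernel(x, coeff, n1, n2):
    """Forward FFT, pointwise multiply by coeff, then normalized inverse FFT."""
    n = n1 * n2
    spectrum = fft_2d_formulation(x, n1, n2, sign=-1)
    product = [s * c for s, c in zip(spectrum, coeff)]
    return [v / n for v in fft_2d_formulation(product, n1, n2, sign=+1)]
```

In this sketch, the sub-transforms in steps 1 and 3 are mutually independent, which is what makes such a formulation amenable to distribution across nodes; the short sub-transforms can also be sized to fit a node's cache, which is one plausible reading of the memory-system benefit the abstract mentions.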
- OSTI ID: 87608
- Report Number(s): CONF-941118--; ISBN 0-8186-6605-6
- Country of Publication: United States
- Language: English
Similar Records
Parallelization and performance analysis of the Cooley-Tukey FFT algorithm for shared-memory architectures
Journal Article · Fri May 01 00:00:00 EDT 1987 · IEEE Trans. Comput.; (United States) · OSTI ID:6595452
fftMPI, a library for performing 2d and 3d FFTs in parallel
Software · Tue Apr 24 20:00:00 EDT 2018 · OSTI ID:code-45665
A scalable parallel block algorithm for band Cholesky factorization
Conference · Thu Nov 30 23:00:00 EST 1995 · OSTI ID:125547