## Massively parallel hypercube FFTs: CM-2 implementation and error analysis of a parallel trigonometric factor generation method

On parallel computers, the way data elements are mapped to processors can have a large effect on the running time of a given algorithm. In a previous paper, we examined several mapping strategies for the ordered radix-2 DIF (decimation-in-frequency) Fast Fourier Transform. In particular, we showed how communication can be reduced by combining the ordering and computational phases through the use of i-cycles. A parallel method was also presented for computing the trigonometric factors; it requires neither trigonometric function evaluation nor interprocessor communication. This paper first reviews some of the experimental results on …
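For orientation, the two phases named in the abstract can be seen in a plain serial sketch of an ordered radix-2 DIF FFT. This is not the CM-2 implementation and it does not use i-cycles or the paper's trigonometric factor method: the function names `dif_fft` and `bit_reverse` are hypothetical, the twiddle factors are evaluated directly with `cmath.exp`, and the ordering phase is kept as a separate bit-reversal pass at the end, which is the step a combined scheme would presumably fold into the computation.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def dif_fft(x):
    """Ordered radix-2 decimation-in-frequency FFT (serial illustration only)."""
    a = [complex(v) for v in x]
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1

    # Computational phase: log2(n) butterfly stages, largest span first.
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for k in range(span):
                i, j = start + k, start + k + span
                # Twiddle factor evaluated directly (assumption of this sketch,
                # not the trig-free method described in the abstract).
                w = cmath.exp(-2j * cmath.pi * k / (2 * span))
                u, v = a[i], a[j]
                a[i] = u + v
                a[j] = (u - v) * w
        span //= 2

    # Ordering phase: the butterflies leave the result in bit-reversed order.
    return [a[bit_reverse(i, bits)] for i in range(n)]
```

Calling `dif_fft` on a unit impulse such as `[1, 0, 0, 0]` returns a flat spectrum, which is a quick sanity check. On a hypercube, each butterfly stage pairs elements whose indices differ in one bit, so the data-to-processor mapping determines which stages require communication.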