Parallel fast Gauss transform
- ORNL
- Siemens Corporate Research
- New York University
We present fast adaptive parallel algorithms to compute the sum of N Gaussians at N points. Direct sequential computation of this sum would take O(N^2) time. The parallel time complexity estimates for our algorithms are O(N/n_p) for uniform point distributions and O((N/n_p) log(N/n_p) + n_p log n_p) for non-uniform distributions using n_p CPUs. We incorporate a plane-wave representation of the Gaussian kernel which permits 'diagonal translation'. We use parallel octrees and a new scheme for translating the plane waves to efficiently handle non-uniform distributions. Computing the transform to six-digit accuracy at 120 billion points took approximately 140 seconds using 4096 cores on the Jaguar supercomputer. Our implementation is 'kernel-independent' and can handle other 'Gaussian-type' kernels even when an explicit analytic expression for the kernel is not known. These algorithms form a new class of core computational machinery for solving parabolic PDEs on massively parallel architectures.
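As a point of reference for the abstract above, the following is a minimal sketch of the direct summation that the fast algorithms are designed to avoid, together with a one-dimensional plane-wave approximation of a Gaussian. It is not the authors' implementation. The kernel convention exp(-|x - y|^2 / delta), the function names, and the parameters `delta`, `n_modes`, and `k_max` are assumptions chosen for illustration. The plane-wave part relies only on the standard Fourier identity exp(-t^2) = (1 / (2 sqrt(pi))) * integral of exp(-k^2 / 4) exp(i k t) dk, discretized with the trapezoidal rule.

```python
import math


def direct_gauss_transform(sources, targets, weights, delta):
    """Direct evaluation of G(x_j) = sum_i w_i * exp(-|x_j - y_i|^2 / delta).

    Cost is proportional to len(sources) * len(targets), i.e. O(N^2)
    when both sets have N points. The kernel convention is an assumption.
    """
    result = []
    for x in targets:
        total = 0.0
        for y, w in zip(sources, weights):
            r2 = sum((a - b) ** 2 for a, b in zip(x, y))
            total += w * math.exp(-r2 / delta)
        result.append(total)
    return result


def plane_wave_gaussian(x, delta, n_modes=20, k_max=12.0):
    """Approximate exp(-x^2 / delta) by a finite sum of plane waves.

    Trapezoidal discretization of the Fourier integral of a Gaussian.
    The sum is symmetric in k, so the imaginary parts cancel and only
    cosines remain. n_modes and k_max are illustrative choices.
    """
    t = x / math.sqrt(delta)
    h = k_max / n_modes  # spacing of the wavenumbers
    total = 0.0
    for m in range(-n_modes, n_modes + 1):
        k = m * h
        total += math.exp(-k * k / 4.0) * math.cos(k * t)
    return h * total / (2.0 * math.sqrt(math.pi))


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0)]
    print(direct_gauss_transform(pts, pts, [1.0, 1.0], delta=1.0))
    print(plane_wave_gaussian(0.7, 0.5), math.exp(-0.49 / 0.5))
```

In a plane-wave basis, shifting the expansion center by a distance d multiplies each coefficient by its own phase factor exp(i k d / sqrt(delta)) without mixing modes, which is one way to read the term 'diagonal translation' used in the abstract.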
- Research Organization:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences (NCCS)
- Sponsoring Organization:
- USDOE Office of Nuclear Energy (NE)
- DOE Contract Number:
- DE-AC05-00OR22725
- OSTI ID:
- 1033540
- Resource Relation:
- Conference: ACM/IEEE Supercomputing, New Orleans, LA, USA, November 13, 2010
- Country of Publication:
- United States
- Language:
- English