Implementation of a cell-wise Block-Gauss-Seidel iterative method for SN transport on a hybrid parallel computer architecture
- Los Alamos National Laboratory
We have implemented a cell-wise, block-Gauss-Seidel (bGS) iterative algorithm, for the solution of the S{sub n} transport equations on the Roadrunner hybrid, parallel computer architecture. A compute node of this massively parallel machine comprises AMD Opteron cores that are linked to a Cell Broadband Engine{trademark} (Cell/B.E.). LAPACK routines have been ported to the Cell/B.E. in order to make use of its parallel Synergistic Processing Elements (SPEs). The bGS algorithm is based on the LU factorization and solution of a linear system that couples the fluxes for all S{sub n} angles and energy groups on a mesh cell. For every cell of a mesh that has been parallel decomposed on the higher-level Opteron processors, a linear system is transferred to the Cell/B.E. and the parallel LAPACK routines are used to compute a solution, which is then transferred back to the Opteron, where the rest of the computations for the S{sub n} transport problem take place. Compared to standard parallel machines, a hundred-fold speedup of the bGS was observed on the hybrid Roadrunner architecture. Numerical experiments with strong and weak parallel scaling demonstrate the bGS method is viable and compares favorably to full parallel sweeps (FPS) on two-dimensional, unstructured meshes when it is applied to optically thick, multi-material problems. As expected, however, it is not as efficient as FPS in optically thin problems.
- Research Organization:
- Los Alamos National Laboratory (LANL)
- Sponsoring Organization:
- DOE
- DOE Contract Number:
- AC52-06NA25396
- OSTI ID:
- 1044156
- Report Number(s):
- LA-UR-10-08284; LA-UR-10-8284
- Country of Publication:
- United States
- Language:
- English
Similar Records
Managing the bottlenecks in parallel Gauss-Seidel type algorithms for power flow analysis
Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes