Implementation of a cell-wise Block-Gauss-Seidel iterative method for SN transport on a hybrid parallel computer architecture
- Los Alamos National Laboratory
We have implemented a cell-wise, block-Gauss-Seidel (bGS) iterative algorithm, for the solution of the S{sub n} transport equations on the Roadrunner hybrid, parallel computer architecture. A compute node of this massively parallel machine comprises AMD Opteron cores that are linked to a Cell Broadband Engine{trademark} (Cell/B.E.). LAPACK routines have been ported to the Cell/B.E. in order to make use of its parallel Synergistic Processing Elements (SPEs). The bGS algorithm is based on the LU factorization and solution of a linear system that couples the fluxes for all S{sub n} angles and energy groups on a mesh cell. For every cell of a mesh that has been parallel decomposed on the higher-level Opteron processors, a linear system is transferred to the Cell/B.E. and the parallel LAPACK routines are used to compute a solution, which is then transferred back to the Opteron, where the rest of the computations for the S{sub n} transport problem take place. Compared to standard parallel machines, a hundred-fold speedup of the bGS was observed on the hybrid Roadrunner architecture. Numerical experiments with strong and weak parallel scaling demonstrate the bGS method is viable and compares favorably to full parallel sweeps (FPS) on two-dimensional, unstructured meshes when it is applied to optically thick, multi-material problems. As expected, however, it is not as efficient as FPS in optically thin problems.
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC52-06NA25396
- OSTI ID:
- 1044156
- Report Number(s):
- LA-UR-10-08284; LA-UR-10-8284; TRN: US201214%%349
- Resource Relation:
- Conference: M&C 2011 ; May 8, 2011 ; Rio de Janeiro, Brazil
- Country of Publication:
- United States
- Language:
- English
Similar Records
Gauss-Seidel Accelerated: Implementing Flow Solvers on Field Programmable Gate Arrays
Two-Stage Gauss-Seidel Preconditioners and Smoothers for Krylov Solvers on a GPU Cluster: Preprint