# Parallel Implementation and Scaling of an Adaptive Mesh Discrete Ordinates Algorithm for Transport

## Abstract

Block-structured adaptive mesh refinement (AMR) uses a mesh structure built up out of locally-uniform rectangular grids. In the BoxLib parallel framework used by the Raptor code, each processor operates on one or more of these grids at each refinement level. The decomposition of the mesh into grids and the distribution of these grids among processors may change every few timesteps as a calculation proceeds. Finer grids use smaller timesteps than coarser grids, requiring additional work to keep the system synchronized and ensure conservation between different refinement levels. In a paper for NECDC 2002 I presented preliminary results on implementation of parallel transport sweeps on the AMR mesh, conjugate gradient acceleration, accuracy of the AMR solution, and scalar speedup of the AMR algorithm compared to a uniform fully-refined mesh. This paper continues with a more in-depth examination of the parallel scaling properties of the scheme, both in single-level and multi-level calculations. Both sweeping and setup costs are considered. The algorithm scales with acceptable performance to several hundred processors. Trends suggest, however, that this is the limit for efficient calculations with traditional transport sweeps, and that modifications to the sweep algorithm will be increasingly needed as job sizes in the thousands of processors become common.

- Authors:
- Howell, L H

- Publication Date:
- 29 Nov 2004
- Research Org.:
- Lawrence Livermore National Lab., Livermore, CA (US)

- Sponsoring Org.:
- US Department of Energy (US)

- OSTI Identifier:
- 15011611

- Report Number(s):
- UCRL-CONF-208312
- TRN: US0501288

- DOE Contract Number:
- W-7405-ENG-48

- Resource Type:
- Conference

- Resource Relation:
- Conference: Presented at: NECDC 2004, Livermore, CA (US), 10/04/2004--10/07/2004; Other Information: PBD: 29 Nov 2004

- Country of Publication:
- United States

- Language:
- English

- Subject:
- 73 NUCLEAR PHYSICS AND RADIATION PHYSICS; 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; ACCELERATION; ACCURACY; ALGORITHMS; DISCRETE ORDINATE METHOD; DISTRIBUTION; IMPLEMENTATION; MODIFICATIONS; PERFORMANCE; SCALARS; TRANSPORT

### Citation Formats

Howell, L H. *Parallel Implementation and Scaling of an Adaptive Mesh Discrete Ordinates Algorithm for Transport*. United States: N. p., 2004. Web.

Howell, L H. *Parallel Implementation and Scaling of an Adaptive Mesh Discrete Ordinates Algorithm for Transport*. United States.

Howell, L H. Mon . "Parallel Implementation and Scaling of an Adaptive Mesh Discrete Ordinates Algorithm for Transport". United States. https://www.osti.gov/servlets/purl/15011611.

```
@article{osti_15011611,
  title = {Parallel Implementation and Scaling of an Adaptive Mesh Discrete Ordinates Algorithm for Transport},
  author = {Howell, L H},
  abstractNote = {Block-structured adaptive mesh refinement (AMR) uses a mesh structure built up out of locally-uniform rectangular grids. In the BoxLib parallel framework used by the Raptor code, each processor operates on one or more of these grids at each refinement level. The decomposition of the mesh into grids and the distribution of these grids among processors may change every few timesteps as a calculation proceeds. Finer grids use smaller timesteps than coarser grids, requiring additional work to keep the system synchronized and ensure conservation between different refinement levels. In a paper for NECDC 2002 I presented preliminary results on implementation of parallel transport sweeps on the AMR mesh, conjugate gradient acceleration, accuracy of the AMR solution, and scalar speedup of the AMR algorithm compared to a uniform fully-refined mesh. This paper continues with a more in-depth examination of the parallel scaling properties of the scheme, both in single-level and multi-level calculations. Both sweeping and setup costs are considered. The algorithm scales with acceptable performance to several hundred processors. Trends suggest, however, that this is the limit for efficient calculations with traditional transport sweeps, and that modifications to the sweep algorithm will be increasingly needed as job sizes in the thousands of processors become common.},
  doi = {},
  journal = {},
  number = {},
  volume = {},
  place = {United States},
  year = {2004},
  month = {11}
}
```