DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Persistent Octrees for Parallel Mesh Refinement Through Non-Volatile Byte-Addressable Memory

Abstract

Octree-based mesh adaptation has enabled simulations of complex physical phenomena. Existing meshing algorithms were designed under the assumption that computer memory is volatile. Consequently, for failure recovery, in-core algorithms must save memory state as snapshots using slow file I/O, while out-of-core algorithms store octants on disk for persistence. However, neither approach was designed to leverage the unique characteristics of non-volatile byte-addressable memory (NVBM). In this paper, we propose a novel data structure, the Distributed Persistent Merged octree (DPM-octree), for both meshing and in-memory storage of persistent octrees using NVBM. It is a multi-version data structure that can recover from failures using an earlier persistent version stored in NVBM. In addition, we design a feature-directed sampling approach to help dynamically transform the DPM-octree layout and so reduce NVBM-induced memory write latency. DPM-octree uses parity trees, created using erasure coding and stored in NVBM, to support low-latency in-memory octant recovery after data loss. DPM-octree has been successfully integrated with the Gerris software for simulation of fluid dynamics. Finally, our experimental results with real-world scientific workloads show that DPM-octree scales up to 1.1 billion mesh elements with 1000 processors on the Titan supercomputer.

Authors:
 Nguyen, Bao [1];  Tan, Hua [1];  Davis, Kei [2];  Zhang, Xuechen [1]
  1. Washington State Univ., Vancouver, WA (United States)
  2. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1469556
Report Number(s):
LA-UR-18-23313
Journal ID: ISSN 1045-9219
Grant/Contract Number:  
AC52-06NA25396
Resource Type:
Accepted Manuscript
Journal Name:
IEEE Transactions on Parallel and Distributed Systems
Additional Journal Information:
Journal Volume: 30; Journal Issue: 3; Journal ID: ISSN 1045-9219
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Octree; Adaptive Mesh Refinement; Non-volatile Byte-addressable Memory

Citation Formats

Nguyen, Bao, Tan, Hua, Davis, Kei, and Zhang, Xuechen. Persistent Octrees for Parallel Mesh Refinement Through Non-Volatile Byte-Addressable Memory. United States: N. p., 2018. Web. doi:10.1109/TPDS.2018.2867867.
Nguyen, Bao, Tan, Hua, Davis, Kei, & Zhang, Xuechen. Persistent Octrees for Parallel Mesh Refinement Through Non-Volatile Byte-Addressable Memory. United States. doi:10.1109/TPDS.2018.2867867.
Nguyen, Bao, Tan, Hua, Davis, Kei, and Zhang, Xuechen. 2018. "Persistent Octrees for Parallel Mesh Refinement Through Non-Volatile Byte-Addressable Memory". United States. doi:10.1109/TPDS.2018.2867867. https://www.osti.gov/servlets/purl/1469556.
@article{osti_1469556,
title = {Persistent Octrees for Parallel Mesh Refinement Through Non-Volatile Byte-Addressable Memory},
author = {Nguyen, Bao and Tan, Hua and Davis, Kei and Zhang, Xuechen},
abstractNote = {We report that octree-based mesh adaptation has enabled simulations of complex physical phenomena. Existing meshing algorithms were proposed with the assumption that computer memory is volatile. Consequently, for failure recovery, the in-core algorithms need to save memory states as snapshots with slow file I/Os. The out-of-core algorithms store octants on disks for persistence. However, neither of them was designed to leverage unique characteristics of non-volatile byte-addressable memory (NVBM). In this paper, we propose a novel data structure Distributed Persistent Merged octree (DPM-octree) for both meshing and in-memory storage of persistent octrees using NVBM. It is a multi-version data structure and can recover from failures using its earlier persistent version stored in NVBM. In addition, we design a feature-directed sampling approach to help dynamically transform the DPM-octree layout for reducing NVBM-induced memory write latency. DPM-octree uses parity trees which are created using erasure coding and stored in NVBM to support low-latency in-memory octant recovery after data loss. DPM-octree has been successfully integrated with Gerris software for simulation of fluid dynamics. Finally, our experimental results with real-world scientific workloads show that DPM-octree scales up to 1.1 billion mesh elements with 1000 processors on the Titan supercomputer.},
doi = {10.1109/TPDS.2018.2867867},
journal = {IEEE Transactions on Parallel and Distributed Systems},
number = 3,
volume = 30,
place = {United States},
year = {2018},
month = {8}
}
