OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: In situ and in-transit analysis of cosmological simulations

Abstract

Modern cosmological simulations have reached the trillion-element scale, rendering data storage and subsequent analysis formidable tasks. To address this circumstance, we present a new MPI-parallel approach for analysis of simulation data while the simulation runs, as an alternative to the traditional workflow consisting of periodically saving large data sets to disk for subsequent ‘offline’ analysis. We demonstrate this approach in the compressible gasdynamics/N-body code Nyx, a hybrid MPI+OpenMP code based on the BoxLib framework, used for large-scale cosmological simulations. We have enabled on-the-fly workflows in two different ways: one is a straightforward approach consisting of all MPI processes periodically halting the main simulation and analyzing each component of data that they own (‘in situ’). The other consists of partitioning processes into disjoint MPI groups, with one performing the simulation and periodically sending data to the other ‘sidecar’ group, which post-processes it while the simulation continues (‘in-transit’). The two groups execute their tasks asynchronously, stopping only to synchronize when a new set of simulation data needs to be analyzed. For both the in situ and in-transit approaches, we experiment with two different analysis suites with distinct performance behavior: one which finds dark matter halos in the simulation using merge trees to calculate the mass contained within iso-density contours, and another which calculates probability distribution functions and power spectra of various fields in the simulation. Both are common analysis tasks for cosmology, and both result in summary statistics significantly smaller than the original data set. We study the behavior of each type of analysis in each workflow in order to determine the optimal configuration for the different data analysis algorithms.
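To make the in-transit layout concrete, the sketch below splits MPI_COMM_WORLD into a simulation group and a ‘sidecar’ analysis group with MPI_Comm_split, in the spirit of the abstract's description. The 1:4 split ratio, the fixed rank-to-sidecar mapping, and the stand-in ‘analysis’ reduction are illustrative assumptions, not Nyx's or BoxLib's actual interface.

// Minimal sketch of the 'in-transit' partitioning described above, using
// plain MPI. Split ratio, message schedule, and the placeholder analysis
// are assumptions for illustration, not the Nyx/BoxLib API.
#include <mpi.h>
#include <cstdio>
#include <numeric>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Reserve roughly a quarter of the ranks as the 'sidecar' analysis group.
    const int nsidecar = (size / 4 > 0) ? size / 4 : 1;
    const int nsim = size - nsidecar;
    const int color = (rank < nsim) ? 0 : 1;   // 0 = simulation, 1 = sidecar

    // Each group gets its own disjoint communicator for internal collectives.
    MPI_Comm group;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &group);

    const int nsteps = 3;    // number of analysis epochs in this toy run
    const int N = 1024;      // doubles per rank, standing in for grid data

    if (color == 0) {
        // Simulation group: advance the solution, periodically ship data out,
        // then continue computing without waiting for the analysis to finish.
        std::vector<double> field(N, 1.0 + rank);
        const int partner = nsim + rank % nsidecar;   // fixed sidecar target
        for (int step = 0; step < nsteps; ++step) {
            // ... advance gas dynamics / N-body here ...
            MPI_Send(field.data(), N, MPI_DOUBLE, partner, step,
                     MPI_COMM_WORLD);
        }
    } else {
        // Sidecar group: receive each epoch's data and post-process it while
        // the simulation group keeps running; tags disambiguate epochs.
        const int s = rank - nsim;                    // sidecar index
        const int nsrc = nsim / nsidecar + (s < nsim % nsidecar ? 1 : 0);
        std::vector<double> buf(N);
        for (int step = 0; step < nsteps; ++step) {
            double total = 0.0;
            for (int k = 0; k < nsrc; ++k) {
                MPI_Recv(buf.data(), N, MPI_DOUBLE, MPI_ANY_SOURCE, step,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                // Halo finding, PDFs, or power spectra would run here; a sum
                // stands in for a reduction to small summary statistics.
                total += std::accumulate(buf.begin(), buf.end(), 0.0);
            }
            std::printf("sidecar %d, step %d: total = %g\n", s, step, total);
        }
    }

    MPI_Comm_free(&group);
    MPI_Finalize();
    return 0;
}

In the paper's setting the sidecar group would run the halo finder or spectra calculations on the received data while the simulation group proceeds, the two groups synchronizing only when the next epoch's data is handed off; the in situ variant instead has every rank pause the time step and analyze the data it already owns, with no communication between groups.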

Authors:
Friesen, Brian [1]; Almgren, Ann [1]; Lukic, Zarija [1]; Weber, Gunther [1]; Morozov, Dmitriy [1]; Beckner, Vincent [1]; Day, Marcus [1]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Publication Date:
August 2016
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21); USDOE Office of Science (SC), High Energy Physics (HEP) (SC-25)
OSTI Identifier:
1328630
Report Number(s):
LBNL-1006104
Journal ID: ISSN 2197-7909; ir:1006104
Grant/Contract Number:
AC02-05CH11231
Resource Type:
Journal Article: Published Article
Journal Name:
Computational Astrophysics and Cosmology
Additional Journal Information:
Journal Volume: 3; Journal Issue: 1; Journal ID: ISSN 2197-7909
Publisher:
Springer
Country of Publication:
United States
Language:
English
Subject:
79 ASTRONOMY AND ASTROPHYSICS; cosmology; post-processing; halo-finding; power spectra; in situ; in-transit

Citation Formats

MLA: Friesen, Brian, Almgren, Ann, Lukic, Zarija, Weber, Gunther, Morozov, Dmitriy, Beckner, Vincent, and Day, Marcus. In situ and in-transit analysis of cosmological simulations. United States: N. p., 2016. Web. doi:10.1186/s40668-016-0017-2.
APA: Friesen, Brian, Almgren, Ann, Lukic, Zarija, Weber, Gunther, Morozov, Dmitriy, Beckner, Vincent, & Day, Marcus. In situ and in-transit analysis of cosmological simulations. United States. doi:10.1186/s40668-016-0017-2.
Chicago: Friesen, Brian, Almgren, Ann, Lukic, Zarija, Weber, Gunther, Morozov, Dmitriy, Beckner, Vincent, and Day, Marcus. 2016. "In situ and in-transit analysis of cosmological simulations". United States. doi:10.1186/s40668-016-0017-2.
BibTeX:
@article{osti_1328630,
title = {In situ and in-transit analysis of cosmological simulations},
author = {Friesen, Brian and Almgren, Ann and Lukic, Zarija and Weber, Gunther and Morozov, Dmitriy and Beckner, Vincent and Day, Marcus},
abstractNote = {Modern cosmological simulations have reached the trillion-element scale, rendering data storage and subsequent analysis formidable tasks. To address this circumstance, we present a new MPI-parallel approach for analysis of simulation data while the simulation runs, as an alternative to the traditional workflow consisting of periodically saving large data sets to disk for subsequent ‘offline’ analysis. We demonstrate this approach in the compressible gasdynamics/N-body code Nyx, a hybrid MPI+OpenMP code based on the BoxLib framework, used for large-scale cosmological simulations. We have enabled on-the-fly workflows in two different ways: one is a straightforward approach consisting of all MPI processes periodically halting the main simulation and analyzing each component of data that they own (‘in situ’). The other consists of partitioning processes into disjoint MPI groups, with one performing the simulation and periodically sending data to the other ‘sidecar’ group, which post-processes it while the simulation continues (‘in-transit’). The two groups execute their tasks asynchronously, stopping only to synchronize when a new set of simulation data needs to be analyzed. For both the in situ and in-transit approaches, we experiment with two different analysis suites with distinct performance behavior: one which finds dark matter halos in the simulation using merge trees to calculate the mass contained within iso-density contours, and another which calculates probability distribution functions and power spectra of various fields in the simulation. Both are common analysis tasks for cosmology, and both result in summary statistics significantly smaller than the original data set. We study the behavior of each type of analysis in each workflow in order to determine the optimal configuration for the different data analysis algorithms.},
doi = {10.1186/s40668-016-0017-2},
journal = {Computational Astrophysics and Cosmology},
number = 1,
volume = 3,
place = {United States},
year = 2016,
month = 8
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record: https://doi.org/10.1186/s40668-016-0017-2

Similar Records:
  • Peculiar velocity fields of galaxies, computed by cosmological N-body simulations, are analyzed in detail. This Monte Carlo approach provides not only the expectation values of the bulk flow velocity and the cosmic Mach number, but their probability distributions. The distribution of the cosmic Mach number is found to be almost Maxwellian in linear and mildly nonlinear regimes, independent of the shape of the density fluctuation spectrum. Applying the resulting probability distribution to the cold dark matter models based on the inflationary scenario, it is found that the constraints derived by Ostriker and Suto (1990) hold at the 90-percent confidence level. 34 refs.
  • We developed a model to simulate a novel inelastic neutron scattering (INS) system for in situ non-destructive analysis of soil using the standard Monte Carlo Neutron Photon (MCNP5a) transport code. The volumes from which 90%, 95%, and 99% of the total signal are detected were estimated to be 0.23 m³, 0.37 m³, and 0.79 m³, respectively. Similarly, we assessed the instrument's sampling footprint and depths. In addition we discuss the impact of the carbon's depth distribution on sampled depth.
  • The generation of short pulses of ion beams through the interaction of an intense laser with a plasma sheath offers the possibility of compact and cheaper ion sources for many applications, from fast ignition and radiography of dense targets to hadron therapy and injection into conventional accelerators. To enable the efficient analysis of large-scale, high-fidelity particle accelerator simulations using the Warp simulation suite, the authors introduce the Warp In situ Visualization Toolkit (WarpIV). WarpIV integrates state-of-the-art in situ visualization and analysis using VisIt with Warp, supports management and control of complex in situ visualization and analysis workflows, and implements integrated analytics to facilitate query- and feature-based data analytics and efficient large-scale data analysis. WarpIV enables for the first time distributed parallel, in situ visualization of the full simulation data using high-performance compute resources as the data is being generated by Warp. The authors describe the application of WarpIV to study and compare large 2D and 3D ion accelerator simulations, demonstrating significant differences in the acceleration process in 2D and 3D simulations. WarpIV is available to the public via https://bitbucket.org/berkeleylab/warpiv. Supplemental material at https://extras.computer.org/extra/mcg2016030022s1.pdf provides more details regarding the memory profiling and optimization and the Yee grid recentering optimization results discussed in the main article.