
Title: Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach

Large-scale simulations can produce tens of terabytes of data per analysis cycle, complicating and limiting the efficiency of workflows. Traditionally, outputs are stored on the file system and analyzed in post-processing. With the rapidly increasing size and complexity of simulations, this approach faces an uncertain future. Emerging techniques instead perform the analysis in situ, using the same resources as the simulation, and/or off-load subsets of the data to a separate compute-intensive analysis system. We introduce an analysis framework developed for HACC, a cosmological N-body code, that combines in situ and co-scheduling approaches to handle petabyte-scale outputs. An initial in situ step reduces the amount of data to be analyzed and separates out the data-intensive tasks handled off-line. The analysis routines are implemented using the PISTON/VTK-m framework, allowing a single implementation of an algorithm to simultaneously target a variety of GPU, multi-core, and many-core architectures.
 [1] ;  [2] ;  [1] ;  [3] ;  [2] ;  [4] ;  [2] ;  [1] ;  [1]
  1. Oak Ridge National Laboratory (ORNL)
  2. Los Alamos National Laboratory (LANL)
  3. Argonne National Laboratory (ANL)
  4. Lawrence Livermore National Laboratory (LLNL)
Resource Relation:
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, Austin, TX, USA, November 15-20, 2015
Research Org:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org:
USDOE Office of Science (SC)
Country of Publication:
United States
Subject:
Workflow; Cosmology; Big data