Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach
- ORNL
- Los Alamos National Laboratory (LANL)
- Argonne National Laboratory (ANL)
- Lawrence Livermore National Laboratory (LLNL)
Large-scale simulations can produce tens of terabytes of data per analysis cycle, complicating workflows and limiting their efficiency. Traditionally, outputs are stored on the file system and analyzed in post-processing, but with the rapidly increasing size and complexity of simulations this approach faces an uncertain future. Emerging techniques instead perform the analysis in situ, on the same resources as the simulation, and/or off-load subsets of the data to a compute-intensive analysis system. We introduce an analysis framework developed for HACC, a cosmological N-body code, that uses both in situ and co-scheduling approaches to handle petabyte-size outputs. An initial in situ step reduces the amount of data to be analyzed and separates out the data-intensive tasks to be handled off-line. The analysis routines are implemented using the PISTON/VTK-m framework, allowing a single implementation of an algorithm to simultaneously target a variety of GPU, multi-core, and many-core architectures.
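The two-stage structure described above can be sketched in a few lines. This is a minimal illustration only, not HACC's or PISTON/VTK-m's actual API: the function names, the mass-cut filter, and the toy statistics are all hypothetical stand-ins for the real in situ reduction and co-scheduled analysis stages.

```python
import random

def simulate_step(n_particles, seed=0):
    """Stand-in for one simulation timestep: random particle
    positions (x, y, z) and masses. A real N-body code would
    produce terabytes of state here."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random(), rng.random(), rng.uniform(1.0, 10.0))
            for _ in range(n_particles)]

def in_situ_reduce(particles, mass_cut):
    """In situ stage, run on the simulation's own resources:
    apply a cheap filter so only a small subset of the data
    is shipped to the analysis system (hypothetical criterion)."""
    return [p for p in particles if p[3] >= mass_cut]

def offline_analysis(reduced):
    """Compute-intensive stage, run off-line on the co-scheduled
    analysis resource: here just summary statistics over the
    reduced particle set (a placeholder for the real analysis)."""
    total = sum(p[3] for p in reduced)
    return {
        "count": len(reduced),
        "total_mass": total,
        "mean_mass": total / len(reduced) if reduced else 0.0,
    }

# Drive the workflow: simulate, reduce in situ, analyze off-line.
full = simulate_step(100_000)
reduced = in_situ_reduce(full, mass_cut=8.0)
stats = offline_analysis(reduced)
print(f"kept {len(reduced)} of {len(full)} particles; "
      f"mean mass {stats['mean_mass']:.2f}")
```

The point of the sketch is the separation of concerns: the filter must be cheap enough to run alongside the simulation, while the expensive statistics run elsewhere on the already-reduced data.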
- Research Organization:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1244184
- Resource Relation:
- Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, Austin, TX, USA, November 15-20, 2015
- Country of Publication:
- United States
- Language:
- English