OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach

Abstract

Large-scale simulations can produce tens of terabytes of data per analysis cycle, complicating and limiting the efficiency of workflows. Traditionally, outputs are stored on the file system and analyzed in post-processing. With the rapidly increasing size and complexity of simulations, this approach faces an uncertain future. Trending techniques consist of performing the analysis in situ, utilizing the same resources as the simulation, and/or off-loading subsets of the data to a compute-intensive analysis system. We introduce an analysis framework developed for HACC, a cosmological N-body code, that uses both in situ and co-scheduling approaches for handling petabyte-size outputs. An initial in situ step is used to reduce the amount of data to be analyzed, and to separate out the data-intensive tasks handled off-line. The analysis routines are implemented using the PISTON/VTK-m framework, allowing a single implementation of an algorithm that simultaneously targets a variety of GPU, multi-core, and many-core architectures.

Authors:
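Messer, Bronson [1]; Sewell, Christopher [2]; Heitmann, Katrin [1]; Finkel, Hal J. [3]; Fasel, Patricia [2]; Zagaris, George [4]; Pope, Adrian [2]; Habib, Salman [1]; Parete-Koon, Suzanne T. [1]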
  1. Oak Ridge National Laboratory (ORNL)
  2. Los Alamos National Laboratory (LANL)
  3. Argonne National Laboratory (ANL)
  4. Lawrence Livermore National Laboratory (LLNL)
Publication Date:
2015
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1244184
DOE Contract Number:
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, Austin, TX, USA, November 15-20, 2015
Country of Publication:
United States
Language:
English
Subject:
Workflow; Cosmology; Big data

Citation Formats

Messer, Bronson, Sewell, Christopher, Heitmann, Katrin, Finkel, Hal J, Fasel, Patricia, Zagaris, George, Pope, Adrian, Habib, Salman, and Parete-Koon, Suzanne T. Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach. United States: N. p., 2015. Web.
Messer, Bronson, Sewell, Christopher, Heitmann, Katrin, Finkel, Hal J, Fasel, Patricia, Zagaris, George, Pope, Adrian, Habib, Salman, & Parete-Koon, Suzanne T. (2015). Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach. United States.
Messer, Bronson, Sewell, Christopher, Heitmann, Katrin, Finkel, Hal J, Fasel, Patricia, Zagaris, George, Pope, Adrian, Habib, Salman, and Parete-Koon, Suzanne T. 2015. "Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach". United States. https://www.osti.gov/servlets/purl/1244184.
@article{osti_1244184,
title = {Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach},
author = {Messer, Bronson and Sewell, Christopher and Heitmann, Katrin and Finkel, Hal J and Fasel, Patricia and Zagaris, George and Pope, Adrian and Habib, Salman and Parete-Koon, Suzanne T},
abstractNote = {Large-scale simulations can produce tens of terabytes of data per analysis cycle, complicating and limiting the efficiency of workflows. Traditionally, outputs are stored on the file system and analyzed in post-processing. With the rapidly increasing size and complexity of simulations, this approach faces an uncertain future. Trending techniques consist of performing the analysis in situ, utilizing the same resources as the simulation, and/or off-loading subsets of the data to a compute-intensive analysis system. We introduce an analysis framework developed for HACC, a cosmological N-body code, that uses both in situ and co-scheduling approaches for handling petabyte-size outputs. An initial in situ step is used to reduce the amount of data to be analyzed, and to separate out the data-intensive tasks handled off-line. The analysis routines are implemented using the PISTON/VTK-m framework, allowing a single implementation of an algorithm that simultaneously targets a variety of GPU, multi-core, and many-core architectures.},
place = {United States},
year = {2015},
month = {1}
}

