OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Co-design Framework for Online Data Analysis and Reduction

Journal Article · Concurrency and Computation. Practice and Experience
DOI: https://doi.org/10.1002/cpe.6519 · OSTI ID: 1817542

Science applications preparing for the exascale era are increasingly exploring in situ computations comprising simulation-analysis-reduction pipelines coupled in-memory. Efficient composition and execution of such complex pipelines for a target platform is a codesign process that evaluates the impact and tradeoffs of various application- and system-specific parameters. In this article, we describe a toolset for automating performance studies of composed HPC applications that perform online data reduction and analysis. We describe Cheetah, a new framework for composing parametric studies on coupled applications, and Savanna, a runtime engine for orchestrating and executing campaigns of codesign experiments. Furthermore, this toolset facilitates understanding the impact of various factors such as process placement, synchronicity of algorithms, and storage versus compute requirements for online analysis of large data. Ultimately, we aim to create a catalog of performance results that can help scientists understand tradeoffs when designing next-generation simulations that make use of online processing techniques. We illustrate the design of Cheetah and Savanna, and present application examples that use this framework to conduct codesign studies on small clusters as well as leadership-class supercomputers.
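
As a rough illustration of what composing a parametric study over such factors can involve, the sketch below enumerates every combination of a few sweep dimensions as one experiment each. It is a generic Cartesian-product sweep written for this note, not Cheetah's or Savanna's actual API; the dimension names (`placement`, `coupling`, `compressor`), their values, and the helper `enumerate_experiments` are all assumptions chosen to echo the factors named in the abstract.

```python
from itertools import product

# Hypothetical sweep dimensions, loosely named after the factors the abstract
# lists. These are illustrative only and are not Cheetah's parameter names.
SWEEP = {
    "placement": ["shared_node", "separate_nodes"],
    "coupling": ["synchronous", "asynchronous"],
    "compressor": ["none", "lossy"],
}


def enumerate_experiments(sweep):
    """Return one dict per point in the Cartesian product of the sweep values."""
    names = list(sweep)
    value_lists = [sweep[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


experiments = enumerate_experiments(SWEEP)
print(f"{len(experiments)} experiments in the campaign")
for experiment in experiments:
    print(experiment)
```

A runtime engine would then map each such dictionary to a concrete job launch. That step depends on the target platform and scheduler and is left out of this sketch.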

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Grant/Contract Number:
AC05-00OR22725; AC02-05CH11231; AC02-06CH11357
OSTI ID:
1817542
Alternate ID(s):
OSTI ID: 1869488; OSTI ID: 1887662
Journal Information:
Concurrency and Computation. Practice and Experience, Vol. 34, Issue 14; ISSN 1532-0626
Publisher:
Wiley
Country of Publication:
United States
Language:
English
