Title: Adaptive elasticity policies for staging-based in situ visualization

Journal Article · Future Generations Computer Systems
[1]; [2]; [3]; [3]; [3]
  1. Rutgers Univ., New Brunswick, NJ (United States)
  2. Argonne National Lab. (ANL), Lemont, IL (United States)
  3. Univ. of Utah, Salt Lake City, UT (United States)

In situ processing aims to alleviate the growing gap between computation and I/O capabilities by performing data processing close to the data source. It is widely used to process data from multiple kinds of sources, including observation data from edge devices or scientific observational facilities and simulation data generated by scientific computation on a high-performance computing (HPC) platform. For a scientific workflow that runs on an HPC platform and is composed of a simulation program and an in situ data analytics or visualization (abbreviated as ana/vis) task, there is an implicit assumption that the computing resources assigned to the workflow remain static during workflow execution. However, with the converging trend between HPC and cloud computing platforms, running the in situ ana/vis task elastically promises to decrease its overhead and improve its resource utilization rate. Resource elasticity is the ability to change resource configurations, such as the number of computing nodes or processes, during workflow execution. An elastic job may adjust its resource configuration dynamically; for example, it may use few resources at the beginning and more toward the end of the job, when interesting data appear. However, it is hard to predict a priori how many computing nodes or processes need to be added or removed during workflow execution to adapt to changing workflow needs. How to efficiently guide elasticity operations, such as growing or shrinking the number of processes used for in situ analysis during workflow execution, is an open research question. In this article, we present adaptive elasticity policies that use runtime information collected during workflow execution to predict how to trigger the addition or removal of processes in order to minimize in situ processing overhead.
Taking in situ visualization tasks as an example, we integrate the presented elasticity policies into a staging-based elastic workflow and evaluate its efficiency in multiple elasticity scenarios. Compared with running without elasticity, or with a static elasticity policy that uses a fixed number of processes for each rescaling operation, the adaptive elasticity policy reduces the overhead of finding a proper resource configuration and improves resource utilization efficiency. In one experiment, the adaptive elasticity policy saves 41% of core-hours compared with running without resource elasticity.
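
The contrast between a static and an adaptive elasticity policy can be illustrated with a small Python sketch. This is not the authors' implementation: it assumes a hypothetical ideal-scaling model in which the ana/vis time per step equals a fixed amount of work divided by the number of processes, and all function names, the step size, and the target time are invented for illustration.

```python
import math

def make_static(step):
    """Static policy: add a fixed number of processes per rescaling operation."""
    return lambda procs, measured_time: procs + step

def make_adaptive(target_time):
    """Adaptive policy (hypothetical model, not the paper's): assume time
    scales as 1/procs and pick the smallest process count predicted to
    bring the measured time down to target_time."""
    def policy(procs, measured_time):
        return max(1, math.ceil(procs * measured_time / target_time))
    return policy

def rescale_until_target(policy, procs, work, target_time, max_ops=100):
    """Apply rescaling operations until the modeled ana/vis time meets the
    target. `work` is process-seconds per step under the ideal-scaling
    assumption. Returns (final process count, number of rescaling operations)."""
    ops = 0
    while work / procs > target_time and ops < max_ops:
        procs = policy(procs, work / procs)  # feed back the measured time
        ops += 1
    return procs, ops

if __name__ == "__main__":
    work, target, start = 64.0, 2.0, 4
    print("static  :", rescale_until_target(make_static(4), start, work, target))
    print("adaptive:", rescale_until_target(make_adaptive(target), start, work, target))
```

Under these assumptions both policies reach the same process count, but the static policy needs several rescaling operations to get there, while the adaptive policy uses the measured time to jump to it directly. A real workflow would not scale ideally, so any such model would need to be fitted to measured runtime data.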

Research Organization:
Univ. of Utah, Salt Lake City, UT (United States); Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR). Scientific Discovery through Advanced Computing (SciDAC)
Grant/Contract Number:
SC0023130; AC02-06CH11357; 17-SC-20-SC
OSTI ID:
1961976
Alternate ID(s):
OSTI ID: 1908025; OSTI ID: 2335715
Journal Information:
Future Generations Computer Systems, Vol. 142; ISSN 0167-739X
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
