
On the Feasibility of Simulation-Driven Portfolio Scheduling for Cyberinfrastructure Runtime Systems

Conference
 [1];  [1];  [2];  [3]
  1. University of Hawaii at Manoa, Honolulu
  2. University of Southern California, Information Sciences Institute
  3. ORNL

Runtime systems that automate the execution of applications on distributed cyberinfrastructures need to make scheduling decisions. Researchers have proposed many scheduling algorithms, but most are designed based on analytical models and assumptions that may not hold in practice. The literature is thus rife with algorithms that have been evaluated only within the scope of their underlying assumptions and whose practical effectiveness is unclear. As a result, it is difficult for developers to decide which algorithm to implement in their runtime systems. To obviate this difficulty, we propose an approach in which the runtime system executes, throughout application execution, simulations of this very execution. Each simulation uses a different algorithm from a scheduling algorithm portfolio, and the best algorithm is selected based on the simulation results. The main objective of this work is to evaluate the feasibility and potential merit of this portfolio scheduling approach, even in the presence of simulation inaccuracy, when compared to the traditional one-algorithm approach. We perform this evaluation via a case study in the context of scientific workflows. Our main finding is that portfolio scheduling can outperform the best one-algorithm approach even in the presence of relatively large simulation inaccuracies.
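
The selection step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the task and host model (independent tasks, hosts with fixed speeds), the three policies in `PORTFOLIO`, and the names `simulate_makespan` and `select_algorithm` are all assumptions made for illustration, and the `error` parameter is a hypothetical stand-in for simulation inaccuracy. It shows a single decision point; the approach in the abstract repeats such decisions throughout application execution.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical toy model: tasks are amounts of work, hosts are speeds.
# A policy only decides the order in which tasks are considered.
Policy = Callable[[Sequence[float]], List[float]]


def simulate_makespan(tasks: Sequence[float], speeds: Sequence[float],
                      policy: Policy) -> float:
    """Simulate list scheduling: take tasks in policy order and place each
    one on the host that would finish it earliest. Returns the makespan."""
    ready = [0.0] * len(speeds)  # time at which each host becomes free
    for work in policy(tasks):
        finish = [ready[h] + work / speeds[h] for h in range(len(speeds))]
        best = min(range(len(speeds)), key=finish.__getitem__)
        ready[best] = finish[best]
    return max(ready)


# Illustrative portfolio (an assumption, not the portfolio used in the paper).
PORTFOLIO: Dict[str, Policy] = {
    "fifo": lambda tasks: list(tasks),
    "largest_first": lambda tasks: sorted(tasks, reverse=True),
    "smallest_first": lambda tasks: sorted(tasks),
}


def select_algorithm(tasks: Sequence[float], speeds: Sequence[float],
                     portfolio: Dict[str, Policy], error: float = 0.0,
                     rng: Optional[random.Random] = None
                     ) -> Tuple[str, Dict[str, float]]:
    """Run one simulation per portfolio entry and pick the lowest estimate.
    Each estimate is perturbed by a relative error drawn uniformly from
    [-error, +error] to mimic an inaccurate simulator."""
    rng = rng or random.Random(0)
    estimates = {}
    for name, policy in portfolio.items():
        exact = simulate_makespan(tasks, speeds, policy)
        estimates[name] = exact * (1.0 + rng.uniform(-error, error))
    return min(estimates, key=estimates.get), estimates


if __name__ == "__main__":
    best, estimates = select_algorithm(
        tasks=[1.0, 1.0, 1.0, 1.0, 4.0], speeds=[1.0, 1.0],
        portfolio=PORTFOLIO)
    print(best, estimates)
```

Raising `error` in this sketch makes it possible to observe when noisy estimates start to change the selected policy, which is the kind of sensitivity the abstract refers to as simulation inaccuracy.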

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1922323
Resource Relation:
Journal Volume: 13592; Conference: 25th Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP 2022), Lyon, France, 6/3/2022
Country of Publication:
United States
Language:
English
