U.S. Department of Energy
Office of Scientific and Technical Information

Modular performance prediction for scientific workflows using Machine Learning

Journal Article · Future Generations Computer Systems
 [1];  [2];  [2];  [2]
  1. Univ. of California, San Diego, La Jolla, CA (United States). San Diego Supercomputer Center; Univ. of California, San Diego, La Jolla, CA (United States)
  2. Univ. of California, San Diego, La Jolla, CA (United States). San Diego Supercomputer Center
Scientific workflows provide an opportunity for declarative computational experiment design in an intuitive and efficient way. A distributed workflow is typically executed on a variety of resources, and it uses a variety of computational algorithms or tools to achieve the desired outcomes. This variety adds complexity to scheduling these workflows on large-scale computers. As computation becomes more distributed, insight into the workload a workflow is expected to present becomes critical for effective resource allocation. In this paper, we present a modular framework that leverages Machine Learning to produce precise performance predictions for a workflow. The central idea is to partition a workflow so that forecasting each atomic unit is manageable and the individual predictions can be combined efficiently. We treat the combination of an executable and a specific physical resource as a single module, which lets us characterize workload and machine power as a single unit of prediction. Overall, by creating atomic modules and using a longest-path approach to estimate workflow performance, the framework can adapt to highly complex nested directed acyclic workflows and scale to new scenarios, since it makes no assumptions about the underlying workflow structure. We present performance estimation results for independent workflow modules executed on the XSEDE SDSC Comet cluster using various Machine Learning algorithms. The results provide insights into the behavior and effectiveness of different algorithms in the context of scientific workflow performance prediction.
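The longest-path idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each module is keyed by a hypothetical (executable, resource) pair, that a per-module runtime prediction is already available (hard-coded here in place of a trained model), and that the workflow estimate is simply the longest path through the DAG, ignoring queue wait and data transfer. The names `estimate_makespan`, `predicted`, and `deps` are illustrative.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def estimate_makespan(deps, predicted):
    """Estimate workflow runtime as the longest path through a DAG.

    deps: maps each module to the set of modules it depends on.
    predicted: maps each module to its predicted runtime in seconds.
    """
    finish = {}
    # static_order() yields every module after all of its dependencies.
    for module in TopologicalSorter(deps).static_order():
        start = max((finish[d] for d in deps.get(module, ())), default=0.0)
        finish[module] = start + predicted[module]
    return max(finish.values(), default=0.0)


# Hypothetical modules: (executable, resource) pairs with assumed predictions.
align = ("align", "standard-node")
filt = ("filter", "standard-node")
assemble = ("assemble", "large-memory-node")
report = ("report", "standard-node")

predicted = {align: 120.0, filt: 30.0, assemble: 300.0, report: 10.0}
deps = {filt: {align}, assemble: {align}, report: {filt, assemble}}

makespan = estimate_makespan(deps, predicted)  # align -> assemble -> report
```

Because the estimate only needs a dependency map and one number per module, the per-module predictor can be swapped (for example, one regression model per module) without touching the combination step.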
Research Organization:
Univ. of California, San Diego, CA (United States)
Sponsoring Organization:
National Institutes of Health (NIH); National Science Foundation (NSF); USDOE; USDOE Office of Science (SC)
Grant/Contract Number:
SC0012630
OSTI ID:
1851724
Alternate ID(s):
OSTI ID: 1776457
Journal Information:
Journal Name: Future Generations Computer Systems; Journal Volume: 114; Journal Issue: C; ISSN 0167-739X
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
