
DOE PAGES

Title: A characterization of workflow management systems for extreme-scale applications

The automation of the execution of computational tasks is at the heart of improving scientific productivity. In recent years, scientific workflows have been established as an important abstraction that captures the data processing and computation of large and complex scientific applications. By allowing scientists to model and express entire data processing steps and their dependencies, workflow management systems relieve scientists of the details of an application and manage its execution on a computational infrastructure. As the resource requirements of today’s computational and data science applications that process vast amounts of data keep increasing, there is a compelling case for a new generation of advances in high-performance computing, commonly termed extreme-scale computing, which will bring forth multiple challenges for the design of workflow applications and management systems. This paper presents a novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications. We classify 15 popular workflow management systems in terms of workflow execution models, heterogeneous computing environments, and data access methods. Finally, the paper surveys workflow applications and identifies gaps for future research on the road to extreme-scale workflows and management systems.
Authors:
 [1] ;  [2] ;  [3] ;  [4] ;  [5] ;  [1]
  1. University of Southern California, Marina del Rey, CA (United States)
  2. British Geological Survey, Lyell Centre, Edinburgh (United Kingdom); University of Edinburgh (United Kingdom). School of Informatics
  3. Univ. of Athens (Greece). Department of Informatics and Telecommunication
  4. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
  5. Univ. of Manchester (United Kingdom). School of Computer Science
Publication Date:
Report Number(s):
LLNL-JRNL-706700
Journal ID: ISSN 0167-739X
Grant/Contract Number:
AC52-07NA27344; SC0012636
Type:
Accepted Manuscript
Journal Name:
Future Generation Computer Systems
Additional Journal Information:
Journal Volume: 75; Journal Issue: C; Journal ID: ISSN 0167-739X
Publisher:
Elsevier
Research Org:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org:
USDOE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Scientific workflows; Workflow management systems; Extreme-scale computing; in situ processing
OSTI Identifier:
1408072