Tigres Workflow Library: Supporting Scientific Pipelines on HPC Systems

Hendrix, Valerie; Fox, James; Ghoshal, Devarshi; Ramakrishnan, Lavanya

doi:10.1109/CCGrid.2016.54

Journal Article · July 21, 2016 · Proceedings - 2016 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016

DOI: https://doi.org/10.1109/CCGrid.2016.54 · OSTI ID: 1379520

Hendrix, Valerie [1]; Fox, James [1]; Ghoshal, Devarshi [1]; Ramakrishnan, Lavanya [1]

[1] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

The growth in scientific data volumes has created a need for new tools that enable users to operate on and analyze data on large-scale resources. In the last decade, a number of scientific workflow tools have emerged. These tools often target distributed environments and often require expert help to compose and execute workflows. Data-intensive workflows are often ad hoc; they involve an iterative development process in which users compose and test their workflows on desktops and then scale up to larger systems. In this paper, we present the design and implementation of Tigres, a workflow library that supports the iterative workflow development cycle of data-intensive workflows. Tigres provides an application programming interface to a set of programming templates (i.e., sequence, parallel, split, and merge) that can be used to compose and execute computational and data pipelines. We discuss the results of our evaluation of scientific and synthetic workflows, showing that Tigres performs with minimal template overheads (mean of 13 seconds over all experiments). We also discuss various factors (e.g., I/O performance, execution mechanisms) that affect the performance of scientific workflows on HPC systems.
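
To make the four template names in the abstract concrete, the following is a minimal Python sketch of what sequence, parallel, split, and merge might look like as plain functions. It is not the Tigres API: every function name, signature, and parameter here (including `max_workers` and the `chunk` helper) is a hypothetical stand-in, and the semantics shown (split as one task fanning out to parallel tasks, merge as parallel tasks fanning in to one task) are one plausible reading of the template names rather than something this record confirms.

```python
from concurrent.futures import ThreadPoolExecutor


def sequence(tasks, value):
    """Run tasks one after another, feeding each output into the next task."""
    for task in tasks:
        value = task(value)
    return value


def parallel(task, inputs, max_workers=4):
    """Apply one task to many inputs concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, inputs))


def split(split_task, parallel_task, value, max_workers=4):
    """Run one task that produces several pieces, then process the pieces in parallel."""
    return parallel(parallel_task, split_task(value), max_workers)


def merge(parallel_task, merge_task, inputs, max_workers=4):
    """Process inputs in parallel, then combine the results with a single task."""
    return merge_task(parallel(parallel_task, inputs, max_workers))


def chunk(numbers, size=3):
    """Hypothetical helper: cut a list into consecutive pieces of at most `size` items."""
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


if __name__ == "__main__":
    data = list(range(1, 10))
    partial_sums = split(chunk, sum, data)   # fan out: one sum per chunk
    total = merge(sum, sum, chunk(data))     # fan in: sum of the chunk sums
    result = sequence([lambda x: x * 2, lambda x: x + 1], total)
    print(partial_sums, total, result)
```

A thread pool is used only to keep the sketch self-contained on a desktop; the abstract notes that execution mechanisms are one of the factors affecting performance on HPC systems, so a real deployment would likely swap in a different backend.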

Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)

Grant/Contract Number: AC02-05CH11231

OSTI ID: 1379520

Journal Information: Proceedings - 2016 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016

Country of Publication: United States

Language: English

Similar Records

Experiences with user-centered design for the tigres workflow API

Conference · Tue Dec 31 23:00:00 EST 2013 · OSTI ID:1797724

Template Interfaces for Agile Parallel Data-Intensive Science

Software · Thu Apr 27 20:00:00 EDT 2017 · OSTI ID:code-55037

MaDaTS: Managing Data on Tiered Storage for Scientific Workflows

Journal Article · Sat Dec 31 23:00:00 EST 2016 · OSTI ID:1544386

Related Subjects

97 MATHEMATICS AND COMPUTING
Arrays
Collaboration
Data Analysis
High Performance Computing
Libraries
Monitoring
Pipelines
Programming
Scientific Workflows
Syntactics
