DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: MaDaTS: Managing Data on Tiered Storage for Scientific Workflows

Abstract

Scientific workflows process large amounts of data through complex simulation and analysis tasks. Meanwhile, the need to minimize I/O costs on next-generation systems and the evolution of new technologies (NVRAMs, SSDs, etc.) are resulting in deeper storage hierarchies on High Performance Computing (HPC) systems. A multi-tiered storage hierarchy introduces complexities in workflow and data management. There is a need for simple and flexible data abstractions that allow users to seamlessly manage workflow data and tasks on HPC systems with multiple storage tiers. MaDaTS (Managing Data on Tiered Storage for Scientific Workflows) provides an API and a command-line tool that allow users to manage their workflows and data on tiered storage (Ghoshal & Ramakrishnan, 2017).

Authors:
Ghoshal, Devarshi [1]; Ramakrishnan, Lavanya [1]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Publication Date:
October 2018
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1582034
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Open Source Software
Additional Journal Information:
Journal Volume: 3; Journal Issue: 30; Journal ID: ISSN 2475-9066
Publisher:
Open Source Initiative - NumFOCUS; Copyright - Open Journals
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Ghoshal, Devarshi, and Ramakrishnan, Lavanya. MaDaTS: Managing Data on Tiered Storage for Scientific Workflows. United States: N. p., 2018. Web. doi:10.21105/joss.00830.
Ghoshal, Devarshi, & Ramakrishnan, Lavanya. (2018). MaDaTS: Managing Data on Tiered Storage for Scientific Workflows. United States. doi:10.21105/joss.00830.
Ghoshal, Devarshi, and Ramakrishnan, Lavanya. 2018. "MaDaTS: Managing Data on Tiered Storage for Scientific Workflows". United States. doi:10.21105/joss.00830. https://www.osti.gov/servlets/purl/1582034.
@article{osti_1582034,
title = {MaDaTS: Managing Data on Tiered Storage for Scientific Workflows},
author = {Ghoshal, Devarshi and Ramakrishnan, Lavanya},
abstractNote = {Scientific workflows process large amounts of data through complex simulation and analysis tasks. Meanwhile, the need to minimize I/O costs on next-generation systems and the evolution of new technologies (NVRAMs, SSDs, etc.) are resulting in deeper storage hierarchies on High Performance Computing (HPC) systems. A multi-tiered storage hierarchy introduces complexities in workflow and data management. There is a need for simple and flexible data abstractions that allow users to seamlessly manage workflow data and tasks on HPC systems with multiple storage tiers. MaDaTS (Managing Data on Tiered Storage for Scientific Workflows) provides an API and a command-line tool that allow users to manage their workflows and data on tiered storage (Ghoshal & Ramakrishnan, 2017).},
doi = {10.21105/joss.00830},
journal = {Journal of Open Source Software},
number = 30,
volume = 3,
place = {United States},
year = {2018},
month = {10}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Works referenced in this record:

MaDaTS: Managing Data on Tiered Storage for Scientific Workflows
conference, January 2017

  • Ghoshal, Devarshi; Ramakrishnan, Lavanya
  • Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing - HPDC '17
  • DOI: 10.1145/3078597.3078611