Title: A Vision for Managing Extreme-Scale Data Hoards

Abstract

Scientific data collections grow ever larger, both in terms of the size of individual data items and of the number and complexity of items. To use and manage them, it is important to directly address issues of robust and actionable provenance. We identify three key drivers as our focus: managing the size and complexity of metadata, lack of a priori information to match usage intents between publishers and consumers of data, and support for campaigns over collections of data driven by multi-disciplinary, collaborating teams. We introduce the Hoarde abstraction as an attempt to formalize a way of looking at collections of data to make them more tractable for later use. Hoarde leverages middleware and systems infrastructures for scientific and technical data management. Through the lens of a select group of challenging data usage scenarios, we discuss some of the aspects of implementation, usage, and forward portability of this new view on data management.

Authors:
Logan, Jeremy [1]; Mehta, Kshitij V. [1]; Heber, Gerd [2]; Klasky, Scott A. [1]; Kurc, Tahsin M. [1]; Podhorszki, Norbert [1]; Widener, Patrick [3]; Wolf, Matthew D. [1]
  1. ORNL
  2. The HDF Group
  3. Sandia National Laboratories (SNL)
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1558517
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: International Conference on Distributed Computing Systems (ICDCS 2019), Dallas, Texas, United States of America, July 7-9, 2019
Country of Publication:
United States
Language:
English

Citation Formats

Logan, Jeremy, Mehta, Kshitij V., Heber, Gerd, Klasky, Scott A., Kurc, Tahsin M., Podhorszki, Norbert, Widener, Patrick, and Wolf, Matthew D. A Vision for Managing Extreme-Scale Data Hoards. United States: N. p., 2019. Web. doi:10.1109/ICDCS.2019.00179.
Logan, Jeremy, Mehta, Kshitij V., Heber, Gerd, Klasky, Scott A., Kurc, Tahsin M., Podhorszki, Norbert, Widener, Patrick, & Wolf, Matthew D. A Vision for Managing Extreme-Scale Data Hoards. United States. https://doi.org/10.1109/ICDCS.2019.00179
Logan, Jeremy, Mehta, Kshitij V., Heber, Gerd, Klasky, Scott A., Kurc, Tahsin M., Podhorszki, Norbert, Widener, Patrick, and Wolf, Matthew D. 2019. "A Vision for Managing Extreme-Scale Data Hoards". United States. https://doi.org/10.1109/ICDCS.2019.00179. https://www.osti.gov/servlets/purl/1558517.
@article{osti_1558517,
title = {A Vision for Managing Extreme-Scale Data Hoards},
author = {Logan, Jeremy and Mehta, Kshitij V. and Heber, Gerd and Klasky, Scott A. and Kurc, Tahsin M. and Podhorszki, Norbert and Widener, Patrick and Wolf, Matthew D.},
abstractNote = {Scientific data collections grow ever larger, both in terms of the size of individual data items and of the number and complexity of items. To use and manage them, it is important to directly address issues of robust and actionable provenance. We identify three key drivers as our focus: managing the size and complexity of metadata, lack of a priori information to match usage intents between publishers and consumers of data, and support for campaigns over collections of data driven by multi-disciplinary, collaborating teams. We introduce the Hoarde abstraction as an attempt to formalize a way of looking at collections of data to make them more tractable for later use. Hoarde leverages middleware and systems infrastructures for scientific and technical data management. Through the lens of a select group of challenging data usage scenarios, we discuss some of the aspects of implementation, usage, and forward portability of this new view on data management.},
doi = {10.1109/ICDCS.2019.00179},
url = {https://www.osti.gov/biblio/1558517},
place = {United States},
year = {2019},
month = {7}
}

