
Title: SDS: A Framework for Scientific Data Services

Abstract

Large-scale scientific applications typically write their data to parallel file systems with organizations designed to achieve fast write speeds. Analysis tasks frequently read the data in a pattern that is different from the write pattern, and therefore experience poor I/O performance. In this paper, we introduce a prototype framework for bridging the performance gap between write and read stages of data access from parallel file systems. We call this framework Scientific Data Services, or SDS for short. This initial implementation of SDS focuses on reorganizing previously written files into data layouts that benefit read patterns, and transparently directs read calls to the reorganized data. SDS follows a client-server architecture. The SDS Server manages partial or full replicas of reorganized datasets and serves SDS Clients' requests for data. The current version of the SDS client library supports HDF5 programming interface for reading data. The client library intercepts HDF5 calls and transparently redirects them to the reorganized data. The SDS client library also provides a querying interface for reading part of the data based on user-specified selective criteria. We describe the design and implementation of the SDS client-server architecture, and evaluate the response time of the SDS Server and the performance benefits of SDS.
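
The redirection idea in the abstract can be illustrated with a small sketch. This is not the SDS API: the `ReplicaCatalog` class, the `resolve_read` function, the read-pattern labels, and the file names are all hypothetical, chosen only to show how a client-side layer might send a read to a reorganized replica when one exists and fall back to the original file otherwise.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ReplicaCatalog:
    """Hypothetical stand-in for the metadata an SDS-like server might keep."""

    # Maps (original file, read pattern) -> path of the reorganized replica.
    replicas: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def register(self, original: str, pattern: str, replica: str) -> None:
        """Record that a reorganized copy exists for this file and read pattern."""
        self.replicas[(original, pattern)] = replica

    def lookup(self, original: str, pattern: str) -> Optional[str]:
        """Return the replica path, or None if no matching replica is known."""
        return self.replicas.get((original, pattern))


def resolve_read(catalog: ReplicaCatalog, path: str, pattern: str) -> str:
    """Pick the file a read should go to: a matching replica, else the original."""
    replica = catalog.lookup(path, pattern)
    return replica if replica is not None else path


# Illustrative data only; these file names do not come from the paper.
catalog = ReplicaCatalog()
catalog.register("particles.h5", "sorted_by_energy", "particles.sorted.h5")
```

In this sketch the application keeps asking for `particles.h5`; the lookup decides which file is actually opened, which is the sense in which the redirection is transparent to the caller.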

Authors:
Dong, Bin; Byna, Surendra; Wu, Kesheng
Publication Date:
2013-10-31
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1164907
Report Number(s):
LBNL-6490E
DOE Contract Number:  
DE-AC02-05CH11231
Resource Type:
Conference
Resource Relation:
Conference: 8th Parallel Data Storage Workshop, held in conjunction with SC13, Denver, CO, November 18, 2013
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Scientific Data Services, SDS

Citation Formats

Dong, Bin, Byna, Surendra, and Wu, Kesheng. SDS: A Framework for Scientific Data Services. United States: N. p., 2013. Web.
Dong, Bin, Byna, Surendra, & Wu, Kesheng. SDS: A Framework for Scientific Data Services. United States.
Dong, Bin, Byna, Surendra, and Wu, Kesheng. 2013. "SDS: A Framework for Scientific Data Services". United States. https://www.osti.gov/servlets/purl/1164907.
@article{osti_1164907,
title = {SDS: A Framework for Scientific Data Services},
author = {Dong, Bin and Byna, Surendra and Wu, Kesheng},
abstractNote = {Large-scale scientific applications typically write their data to parallel file systems with organizations designed to achieve fast write speeds. Analysis tasks frequently read the data in a pattern that is different from the write pattern, and therefore experience poor I/O performance. In this paper, we introduce a prototype framework for bridging the performance gap between write and read stages of data access from parallel file systems. We call this framework Scientific Data Services, or SDS for short. This initial implementation of SDS focuses on reorganizing previously written files into data layouts that benefit read patterns, and transparently directs read calls to the reorganized data. SDS follows a client-server architecture. The SDS Server manages partial or full replicas of reorganized datasets and serves SDS Clients' requests for data. The current version of the SDS client library supports HDF5 programming interface for reading data. The client library intercepts HDF5 calls and transparently redirects them to the reorganized data. The SDS client library also provides a querying interface for reading part of the data based on user-specified selective criteria. We describe the design and implementation of the SDS client-server architecture, and evaluate the response time of the SDS Server and the performance benefits of SDS.},
url = {https://www.osti.gov/biblio/1164907},
place = {United States},
year = {2013},
month = {10}
}

