Title: RADAR: Runtime Asymmetric Data-access Driven Scientific Data Replication

Abstract

Efficient I/O on large-scale spatiotemporal scientific data requires scrutiny of both the logical layout of the data (e.g., row-major vs. column-major) and the physical layout (e.g., distribution on parallel filesystems). For increasingly complex datasets, hand optimization is a difficult matter prone to error and not scalable to the increasing heterogeneity of analysis workloads. Given these factors, we present a partial data replication system called RADAR. We capture datatype- and collective-aware I/O access patterns (indicating logical access) via MPI-IO tracing and use a combination of coarse-grained and fine-grained performance modeling to evaluate and select optimized physical data distributions for the task at hand. Unlike conventional methods, we store all replica data and metadata, along with the original untouched data, under a single file container using the object abstraction in parallel filesystems. Our system results in manyfold improvements in some commonly used subvolume decomposition access patterns. Moreover, the modeling approach can determine whether such optimizations should be undertaken in the first place.

Authors:
Jenkins, John; Zou, Xiaocheng; Tang, Houjun; Kimpe, Dries; Ross, Robert; Samatova, Nagiza F.
Publication Date:
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1498740
DOE Contract Number:  
AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Conference: 2014 International Supercomputing Conference, 06/22/14 - 06/26/14, Leipzig, Germany
Country of Publication:
United States
Language:
English
Subject:
access pattern; data layout; data replication; high performance computing; layout optimization

Citation Formats

Jenkins, John, Zou, Xiaocheng, Tang, Houjun, Kimpe, Dries, Ross, Robert, and Samatova, Nagiza F. RADAR: Runtime Asymmetric Data-access Driven Scientific Data Replication. United States: N. p., 2014. Web. doi:10.1007/978-3-319-07518-1_19.
Jenkins, John, Zou, Xiaocheng, Tang, Houjun, Kimpe, Dries, Ross, Robert, & Samatova, Nagiza F. RADAR: Runtime Asymmetric Data-access Driven Scientific Data Replication. United States. https://doi.org/10.1007/978-3-319-07518-1_19
Jenkins, John, Zou, Xiaocheng, Tang, Houjun, Kimpe, Dries, Ross, Robert, and Samatova, Nagiza F. 2014. "RADAR: Runtime Asymmetric Data-access Driven Scientific Data Replication". United States. https://doi.org/10.1007/978-3-319-07518-1_19. https://www.osti.gov/servlets/purl/1498740.
@article{osti_1498740,
title = {RADAR: Runtime Asymmetric Data-access Driven Scientific Data Replication},
author = {Jenkins, John and Zou, Xiaocheng and Tang, Houjun and Kimpe, Dries and Ross, Robert and Samatova, Nagiza F.},
abstractNote = {Efficient I/O on large-scale spatiotemporal scientific data requires scrutiny of both the logical layout of the data (e.g., row-major vs. column-major) and the physical layout (e.g., distribution on parallel filesystems). For increasingly complex datasets, hand optimization is a difficult matter prone to error and not scalable to the increasing heterogeneity of analysis workloads. Given these factors, we present a partial data replication system called RADAR. We capture datatype- and collective-aware I/O access patterns (indicating logical access) via MPI-IO tracing and use a combination of coarse-grained and fine-grained performance modeling to evaluate and select optimized physical data distributions for the task at hand. Unlike conventional methods, we store all replica data and metadata, along with the original untouched data, under a single file container using the object abstraction in parallel filesystems. Our system results in manyfold improvements in some commonly used subvolume decomposition access patterns. Moreover, the modeling approach can determine whether such optimizations should be undertaken in the first place.},
doi = {10.1007/978-3-319-07518-1_19},
url = {https://www.osti.gov/biblio/1498740},
place = {United States},
year = {2014},
month = {1}
}

