skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: PARLO: PArallel Run-Time Layout Optimization for Scientific Data Explorations with Heterogeneous Access Pattern

Conference ·
OSTI ID:1096350

Download Citation Email Print Request Permissions Save to Project The size and scope of cutting-edge scientific simulations are growing much faster than the I/O and storage capabilities of their run-time environments. The growing gap is exacerbated by exploratory, data-intensive analytics, such as querying simulation data with multivariate, spatio-temporal constraints, which induces heterogeneous access patterns that stress the performance of the underlying storage system. Previous work addresses data layout and indexing techniques to improve query performance for a single access pattern, which is not sufficient for complex analytics jobs. We present PARLO a parallel run-time layout optimization framework, to achieve multi-level data layout optimization for scientific applications at run-time before data is written to storage. The layout schemes optimize for heterogeneous access patterns with user-specified priorities. PARLO is integrated with ADIOS, a high-performance parallel I/O middleware for large-scale HPC applications, to achieve user-transparent, light-weight layout optimization for scientific datasets. It offers simple XML-based configuration for users to achieve flexible layout optimization without the need to modify or recompile application codes. Experiments show that PARLO improves performance by 2 to 26 times for queries with heterogeneous access patterns compared to state-of-the-art scientific database management systems. Compared to traditional post-processing approaches, its underlying run-time layout optimization achieves a 56% savings in processing time and a reduction in storage overhead of up to 50%. PARLO also exhibits a low run-time resource requirement, while also limiting the performance impact on running applications to a reasonable level.

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences (NCCS)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
DE-AC05-00OR22725
OSTI ID:
1096350
Resource Relation:
Conference: Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on, Delft, Netherlands, 20130513, 20130516
Country of Publication:
United States
Language:
English

Similar Records

PARLO: PArallel Run-Time Layout Optimization for Scientific Data Explorations with Heterogeneous Access Patterns
Conference · Tue Jun 25 00:00:00 EDT 2013 · PROCEEDINGS OF THE 2013 13TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID 2013) · OSTI ID:1096350

...And Eat it Too: High Read Performance in Write-Optimized HPC I/O Middleware File Formats
Conference · Thu Jan 01 00:00:00 EST 2009 · OSTI ID:1096350

Automated Cache Performance Analysis And Optimization
Technical Report · Mon Dec 23 00:00:00 EST 2013 · OSTI ID:1096350

Related Subjects