U.S. Department of Energy
Office of Scientific and Technical Information

Design of FastQuery: How to Generalize Indexing and Querying System for Scientific Data

Technical Report · DOI: https://doi.org/10.2172/1051264 · OSTI ID: 1051264

Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies such as FastBit are critical for facilitating interactive exploration of large datasets. These technologies rely on adding auxiliary information to existing datasets to accelerate query processing. To use these indices, we need to match the relational data model used by the indexing systems with the array data model used by most scientific data, and to provide an efficient input and output layer for reading and writing the indices. In this work, we present a flexible design that can be easily applied to most scientific data formats. We demonstrate this flexibility by applying it to two of the most commonly used scientific data formats, HDF5 and NetCDF. We present two case studies using simulation data from the particle accelerator and climate simulation communities. To demonstrate the effectiveness of the new design, we also present a detailed performance study using both synthetic and real scientific workloads.
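As a rough illustration of the mapping the abstract describes, the following minimal sketch views a 2-D array as a single relational column (row id = row-major offset), builds a binned bitmap index over it as auxiliary data, answers a range query from the bitmaps alone, and maps the hits back to array coordinates. It is not FastBit's or FastQuery's API: all function names, the bin edges, and the use of plain Python integers as uncompressed bitsets are assumptions made for this example.

```python
from bisect import bisect_right


def flatten(grid):
    """View a 2-D array as one relational column; row id = row-major offset."""
    ncols = len(grid[0])
    return [v for row in grid for v in row], ncols


def build_bitmap_index(column, bin_edges):
    """Build one bitmap per bin (a Python int used as an uncompressed bitset).

    Bin 0 holds values below bin_edges[0]; bin k (k >= 1) holds values v
    with bin_edges[k-1] <= v, and v < bin_edges[k] when that edge exists.
    """
    bitmaps = [0] * (len(bin_edges) + 1)
    for row_id, v in enumerate(column):
        bitmaps[bisect_right(bin_edges, v)] |= 1 << row_id
    return bitmaps


def query_at_least(bitmaps, bin_edges, threshold):
    """Return row ids with value >= threshold, using only the bitmaps.

    Assumption for this sketch: threshold coincides with a bin edge, so no
    candidate check against the raw data is needed.
    """
    first_bin = bin_edges.index(threshold) + 1
    hits = 0
    for bm in bitmaps[first_bin:]:
        hits |= bm  # OR together every bin at or above the threshold
    return [i for i in range(hits.bit_length()) if (hits >> i) & 1]


def to_coords(row_ids, ncols):
    """Map relational row ids back to (row, col) array coordinates."""
    return [divmod(i, ncols) for i in row_ids]


if __name__ == "__main__":
    grid = [[1.0, 5.0, 9.0],
            [2.0, 7.0, 3.0]]
    edges = [2.0, 4.0, 6.0, 8.0]  # hypothetical bin boundaries
    column, ncols = flatten(grid)
    index = build_bitmap_index(column, edges)
    rows = query_at_least(index, edges, 6.0)
    print(to_coords(rows, ncols))
```

The index is stored separately from the data and the query touches only the bitmaps, which is the sense in which auxiliary information accelerates query processing. A production system would also compress the bitmaps and handle thresholds that fall inside a bin.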

Research Organization: Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
Sponsoring Organization: Computational Research Division
DOE Contract Number: AC02-05CH11231
OSTI ID: 1051264
Report Number(s): LBNL-5088E
Country of Publication: United States
Language: English

Similar Records

FastQuery: A Parallel Indexing System for Scientific Data
Conference · Fri Jul 29 00:00:00 EDT 2011 · OSTI ID:1056551

HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets using Fast Bitmap Indices
Conference · Wed Mar 29 23:00:00 EST 2006 · OSTI ID:881620

HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets Using Fast Bitmap Indices
Conference · Tue Dec 06 23:00:00 EST 2005 · OSTI ID:881619