Title: Final Report: Efficient Databases for MPC Microdata

Technical Report · DOI: https://doi.org/10.2172/1048538 · OSTI ID: 1048538

The purpose of this grant was to develop the theory and practice of high-performance databases for massive streamed datasets. Over the last three years, we have developed fast indexing technology: technology for rapidly ingesting data and storing it so that it can be queried and analyzed efficiently. During this project we advanced the technology to the point where high-bandwidth data streams can be indexed and queried efficiently. It has been shown to work on data sets of tens of billions of rows with data arriving at over 40,000 rows per second, and we achieved these numbers on a single disk driven by two cores. Our work comprised (1) new write-optimized data structures with better asymptotic complexity than traditional structures, (2) implementation, and (3) benchmarking. We also developed a prototype of TokuFS, a middleware layer that handles microdata I/O behind an MPI-IO abstraction.
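
The abstract does not say which write-optimized data structures were developed. Purely as an illustration of the general idea, the sketch below shows one well-known scheme: sorted runs of doubling size that are merged on insert, in the manner of a binary counter. The class name `DoublingRunIndex` and its methods are hypothetical and are not taken from the report or from any Tokutek software. Each insert triggers only sequential merges, so the amortized cost per insert is O(log N) element moves, while a lookup does a binary search in each run.

```python
import bisect
from heapq import merge


class DoublingRunIndex:
    """Toy write-optimized index (illustrative assumption, not the report's design).

    Level k is either empty or holds one sorted run of exactly 2**k keys.
    """

    def __init__(self):
        self.levels = []  # levels[k] is [] or a sorted list of length 2**k

    def insert(self, key):
        # Carry a sorted run upward, merging with each full level on the way,
        # exactly like carrying a bit when incrementing a binary counter.
        carry = [key]
        k = 0
        while True:
            if k == len(self.levels):
                self.levels.append([])
            if not self.levels[k]:
                self.levels[k] = carry
                return
            carry = list(merge(self.levels[k], carry))
            self.levels[k] = []
            k += 1

    def contains(self, key):
        # Binary-search every non-empty run; there are at most log2(N) + 1 runs.
        for run in self.levels:
            i = bisect.bisect_left(run, key)
            if i < len(run) and run[i] == key:
                return True
        return False


idx = DoublingRunIndex()
for k in [5, 3, 9, 1, 7]:
    idx.insert(k)
print(idx.contains(7), idx.contains(4))
print([len(run) for run in idx.levels])
```

In this toy version all runs live in memory; the point is only that inserts are batched into large sequential merges rather than applied as one random update per row, which is the property that makes write-optimized structures attractive for high-rate ingest.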

Research Organization: Tokutek
Sponsoring Organization: USDOE
DOE Contract Number: FG02-08ER25853
OSTI ID: 1048538
Report Number(s): DOE/ER25853-1
Country of Publication: United States
Language: English
