Title: POPM: A distributed query system for high performance analysis of very large persistent object stores

Conference · OSTI ID: 207477

Analysis of large physics data sets is a major computing task at Fermilab. One step in such an analysis involves culling "interesting" events via the use of complex query criteria. What makes this unusual is the scale required: hundreds of gigabytes of event data must be scanned at tens of megabytes per second for the typical queries that are applied, and data must be extracted from tens of terabytes based on the result of the query. The Physics Object Persistency Manager (POPM) system is a solution tailored to this scale of problem. A running POPM environment can support multiple queries in progress, each scanning at rates exceeding 10 megabytes per second, all of which share access to a very large persistent address space distributed across multiple disks on multiple hosts. Specifically, POPM employs the following techniques to permit this scale of performance and access:

Persistent objects: Experimental data to be scanned is "populated" as a data structure into the persistent address space supported by POPM. C++ classes with a few key overloaded operators provide nearly transparent semantics for access to the persistent storage.

Distributed and parallel I/O: The persistent address space is automatically distributed across disks of multiple "I/O nodes" within the POPM system. A striping unit concept is implemented in POPM, permitting fast parallel I/O across the storage nodes, even for small single queries.

Efficient shared access: POPM implements an efficient mechanism for arbitration and multiplexing of I/O access among multiple queries on the same or separate compute nodes.
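
The abstract does not give POPM's actual class names or striping layout, so the following is only an illustrative sketch of two of the ideas it mentions: a striping unit that spreads one address space across several I/O nodes, and a C++ handle whose overloaded operators make access look like ordinary memory access. Everything named here (`locate`, `StripedStore`, `PersistentRef`, the round-robin layout, and the in-memory vectors standing in for disks) is a hypothetical assumption, not POPM's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical location of one byte of the persistent address space.
struct StripeLocation {
    std::size_t node;    // index of the I/O node holding the byte
    std::size_t offset;  // byte offset inside that node's local storage
};

// Assumed round-robin layout: consecutive striping units of `unit` bytes
// are assigned to consecutive nodes, wrapping around after `nodes` units.
inline StripeLocation locate(std::uint64_t addr, std::size_t unit, std::size_t nodes) {
    assert(unit > 0 && nodes > 0);
    const std::uint64_t stripe = addr / unit;  // which striping unit overall
    return { static_cast<std::size_t>(stripe % nodes),
             static_cast<std::size_t>((stripe / nodes) * unit + addr % unit) };
}

// In-memory stand-in for the distributed store: each inner vector plays the
// role of one I/O node's disk. A real system would issue parallel I/O instead.
class StripedStore {
public:
    StripedStore(std::size_t nodes, std::size_t unit, std::size_t bytes_per_node)
        : unit_(unit), nodes_(nodes, std::vector<unsigned char>(bytes_per_node)) {}

    void write(std::uint64_t addr, const void* src, std::size_t n) {
        const unsigned char* p = static_cast<const unsigned char*>(src);
        for (std::size_t i = 0; i < n; ++i) {
            const StripeLocation loc = locate(addr + i, unit_, nodes_.size());
            nodes_.at(loc.node).at(loc.offset) = p[i];  // bounds-checked
        }
    }

    void read(std::uint64_t addr, void* dst, std::size_t n) const {
        unsigned char* p = static_cast<unsigned char*>(dst);
        for (std::size_t i = 0; i < n; ++i) {
            const StripeLocation loc = locate(addr + i, unit_, nodes_.size());
            p[i] = nodes_.at(loc.node).at(loc.offset);
        }
    }

private:
    std::size_t unit_;
    std::vector<std::vector<unsigned char>> nodes_;
};

// Hypothetical handle to a T stored at a persistent address. Overloaded
// operators hide the read/write calls behind pointer-like syntax.
template <typename T>
class PersistentRef {
    static_assert(std::is_trivially_copyable<T>::value,
                  "sketch only supports byte-copyable types");
public:
    PersistentRef(StripedStore& store, std::uint64_t addr)
        : store_(&store), addr_(addr) {}

    // Dereference: fetch a copy of the object from the store.
    T operator*() const {
        T value;
        store_->read(addr_, &value, sizeof(T));
        return value;
    }

    // Assignment from a T: write the object through to the store.
    PersistentRef& operator=(const T& value) {
        store_->write(addr_, &value, sizeof(T));
        return *this;
    }

    // Indexing: handle to the i-th T after this one.
    PersistentRef operator[](std::size_t i) const {
        return PersistentRef(*store_, addr_ + i * sizeof(T));
    }

private:
    StripedStore* store_;
    std::uint64_t addr_;
};
```

With a store built as `StripedStore store(4, 8, 64);` and a handle `PersistentRef<double> energies(store, 4);`, the statement `energies[0] = 1.5;` writes eight bytes that straddle two striping units on two different nodes, and `*energies[0]` reads them back. The byte-at-a-time loops keep the sketch short; a real implementation would transfer whole striping units and issue the per-node requests in parallel.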

Research Organization:
Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
Sponsoring Organization:
USDOE, Washington, DC (United States)
DOE Contract Number:
AC02-76CH03000
OSTI ID:
207477
Report Number(s):
FNAL/C-96/002; CONF-960116-4; ON: DE96005245
Resource Relation:
Conference: Hawaii International Conference on System Sciences, Wailea, HI (United States), 3-6 Jan 1996; Other Information: PBD: Jan 1996
Country of Publication:
United States
Language:
English
