POPM: A distributed query system for high performance analysis of very large persistent object stores
Analysis of large physics data sets is a major computing task at Fermilab. One step in such an analysis involves culling "interesting" events via the use of complex query criteria. What makes this unusual is the scale required: hundreds of gigabytes of event data must be scanned at tens of megabytes per second for the typical queries that are applied, and data must be extracted from tens of terabytes based on the result of the query. The Physics Object Persistency Manager (POPM) system is a solution tailored to this scale of problem. A running POPM environment can support multiple queries in progress, each scanning at rates exceeding 10 megabytes per second, all of which share access to a very large persistent address space distributed across multiple disks on multiple hosts. Specifically, POPM employs the following techniques to permit this scale of performance and access:
- Persistent objects: Experimental data to be scanned is "populated" as a data structure into the persistent address space supported by POPM. C++ classes with a few key overloaded operators provide nearly transparent semantics for access to the persistent storage.
- Distributed and parallel I/O: The persistent address space is automatically distributed across disks of multiple "I/O nodes" within the POPM system. A striping unit concept is implemented in POPM, permitting fast parallel I/O across the storage nodes, even for small single queries.
- Efficient shared access: POPM implements an efficient mechanism for arbitration and multiplexing of I/O access among multiple queries on the same or separate compute nodes.
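The abstract does not say how POPM maps the persistent address space onto its I/O nodes. As a rough illustration of what a striping unit makes possible, the sketch below assumes a plain round-robin layout: consecutive striping units go to consecutive I/O nodes. The names `StripeLayout`, `StripeLocation`, and `locate`, and the layout itself, are hypothetical and are not taken from POPM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout parameters (not POPM's actual configuration).
struct StripeLayout {
    std::uint64_t stripe_unit;  // bytes per striping unit
    std::uint64_t node_count;   // number of I/O nodes sharing the address space
};

// Where one byte of the persistent address space lives under this layout.
struct StripeLocation {
    std::uint64_t node;    // index of the I/O node holding the byte
    std::uint64_t offset;  // byte offset within that node's local storage
};

// Assumed round-robin striping: unit k is stored on node (k mod node_count),
// and is the (k div node_count)-th unit in that node's local storage.
inline StripeLocation locate(const StripeLayout& layout, std::uint64_t address) {
    assert(layout.stripe_unit > 0 && layout.node_count > 0);
    const std::uint64_t unit = address / layout.stripe_unit;    // global unit index
    const std::uint64_t within = address % layout.stripe_unit;  // offset inside the unit
    const std::uint64_t node = unit % layout.node_count;
    const std::uint64_t local_unit = unit / layout.node_count;
    return {node, local_unit * layout.stripe_unit + within};
}
```

Under this assumed layout, with `StripeLayout{1024, 4}`, a read of four consecutive kilobytes touches all four nodes once each, which is how even a small single query could draw on several disks in parallel.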
- Research Organization: Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
- Sponsoring Organization: USDOE, Washington, DC (United States)
- DOE Contract Number: AC02-76CH03000
- OSTI ID: 207477
- Report Number(s): FNAL/C-96/002; CONF-960116-4; ON: DE96005245
- Resource Relation: Conference: Hawaii international conference on system sciences, Wailea, HI (United States), 3-6 Jan 1996; Other Information: PBD: Jan 1996
- Country of Publication: United States
- Language: English