OSTI.GOV — U.S. Department of Energy
Office of Scientific and Technical Information

Title: Distributed data access in the sequential access model at the D0 experiment at Fermilab

Conference

The authors present the Sequential Access Model (SAM), the data handling system for D0, one of the two primary high-energy physics experiments at Fermilab. Over the next several years, the D0 experiment will store a total of about 1 PByte of data, including raw detector data and data processed at various levels. The design of SAM is not specific to the D0 experiment and makes few assumptions about the underlying mass storage layer; its ideas apply to any sequential data access. By definition, in the sequential access mode a user application processes a stream of data, accessing each data unit exactly once, with the order of data units in the stream being irrelevant. The units of data are laid out sequentially in files.

The adopted model allows for significant optimizations of system performance, decreasing user file latency and increasing overall throughput. In particular, caching is done with knowledge of all the files needed in the near future, defined as all the files of already running or submitted jobs. The bulk of the data is stored in files on tape in the mass storage system (MSS) called Enstore [2], also developed at Fermilab. (The tape drives are served by an ADIC AML/2 Automated Tape Library.) At any given time, SAM has a small fraction of the data cached on disk for processing. In the present paper, the authors discuss how data is delivered onto disk and how it is accessed by user applications. They concentrate on data retrieval (consumption) from the MSS; when SAM is used for storing data, the mechanisms are rather symmetrical.

All of the data managed by SAM is cataloged in great detail in a relational database (Oracle). The database also serves as the persistency mechanism for the SAM servers described in this paper. Any client or server in the SAM system that needs to store or retrieve information from the database does so through the interfaces of a CORBA-based database server.
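The caching policy described above — evicting only files that no running or submitted job still needs — can be sketched as follows. This is an illustrative reconstruction, not SAM's actual code; the function name `evict_for`, the dictionary layout of cache entries, and the LRU tie-breaking among unneeded files are all assumptions made for the example.

```python
# Hypothetical sketch of SAM-style cache replacement: files needed by
# running or submitted projects are pinned; only files outside that set
# are eviction candidates, taken in least-recently-used order.

def evict_for(request_bytes, cache, pending_projects):
    """Free at least request_bytes from cache, never touching files
    that any running or submitted project still needs.

    cache: list of {"name", "size", "last_used"} dicts.
    pending_projects: list of {"files": set of file names} dicts.
    Returns (evicted_names, bytes_freed).
    """
    # Knowledge of the near future: union of all files of pending projects.
    needed = set()
    for project in pending_projects:
        needed.update(project["files"])

    # Candidates are cached files no pending project needs, LRU first.
    candidates = sorted(
        (f for f in cache if f["name"] not in needed),
        key=lambda f: f["last_used"],
    )

    freed, evicted = 0, []
    for f in candidates:
        if freed >= request_bytes:
            break
        freed += f["size"]
        evicted.append(f["name"])
    return evicted, freed
```

The key design point the abstract highlights is that, unlike a generic LRU cache, the replacement decision here is informed by the full file lists of submitted jobs, so a file about to be consumed is never evicted and re-staged from tape.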
The users (physicists) use the database to define, based on physics selection criteria, datasets of interest. Once a query is defined and resolved into a set of files, actual data processing, called a project, may begin. Running projects involves data transfer and resource management. The computing facilities, with their CPU, disk, and other hardware resources, are logically partitioned into collections of resources called stations. A station may be a single node, a fraction thereof (some of a machine's disks and/or CPUs may constitute a station), or a collection of smaller nodes. It is equipped with a server, called the station master (SM), that coordinates data delivery and the projects using the data. User requests to run a project proceed through the SM, which determines the amount of cache replacement, if any, needed to run the project. If viable, the user job is submitted into a station-associated batch queue; otherwise the project is rejected and the user may try another station.
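The SM's accept-or-reject decision can be illustrated with a minimal admission check: a project fits if its files fit in the free cache space, possibly after evicting unpinned data. The function name `admit_project` and the byte-accounting interface are assumptions for illustration; the real SM logic is more involved.

```python
# Illustrative admission control a station master (SM) might perform.
# "Pinned" bytes stand for cached files that running or submitted
# projects still need and which therefore cannot be replaced.

def admit_project(project_bytes, cache_capacity, cached_bytes, pinned_bytes):
    """Return (accepted, bytes_to_evict) for a project needing
    project_bytes of cache on this station."""
    free = cache_capacity - cached_bytes
    if project_bytes <= free:
        return True, 0                        # fits without replacement
    evictable = cached_bytes - pinned_bytes   # data no pending project needs
    if project_bytes <= free + evictable:
        return True, project_bytes - free     # fits after cache replacement
    return False, 0                           # reject; user may try another station
```

In this sketch, rejection is immediate rather than queued, matching the abstract's description that a rejected project leaves the user free to submit to a different station.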

Research Organization:
Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
Sponsoring Organization:
USDOE; USDOE Office of Energy Research (ER) (US)
DOE Contract Number:
AC02-76CH03000
OSTI ID:
757586
Report Number(s):
FERMILAB-Conf-00/140-E; TRN: US0003767
Resource Relation:
Conference: Ninth IEEE International Symposium on High-Performance Distributed Computing, Pittsburgh, PA (US), 08/01/2000--08/04/2000; Other Information: PBD: 5 Jul 2000
Country of Publication:
United States
Language:
English