OSTI.GOV: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Grid collector: An event catalog with automated file management

Abstract

High Energy Nuclear Physics (HENP) experiments such as STAR at BNL and ATLAS at CERN produce large amounts of data that are stored as files on mass storage systems in computer centers. In these files, the basic unit of data is an event. Analysis is typically performed on a selected set of events. The files containing these events have to be located, copied from mass storage systems to disks before analysis, and removed when no longer needed. These file management tasks are tedious and time consuming. Typically, all events contained in the files are read into memory before a selection is made. Since the time to read the events dominates the overall execution time, reading unwanted events needlessly increases the analysis time. The Grid Collector is a set of software modules that work together to address these two issues. It automates the file management tasks and provides "direct" access to the selected events for analyses. It is currently integrated with the STAR analysis framework. Users can select events based on tags, such as "production date between March 10 and 20, and the number of charged tracks > 100." The Grid Collector locates the files containing relevant events, transfers the files across the Grid if necessary, and delivers the events to the analysis code through the familiar iterators. There have been some research efforts to address the file management issues; the Grid Collector is unique in that it addresses the event access issue together with the file management issues. This makes it more useful to a large variety of users.

Authors:
Wu, Kesheng; Zhang, Wei-Ming; Sim, Alexander; Gu, Junmin; Shoshani, Arie
Publication Date:
2003-10-17
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Director. Office of Science. Office of Advanced Scientific Computing Research (US)
OSTI Identifier:
835164
Report Number(s):
LBNL-55563
R&D Project: KS3310; TRN: US0407488
DOE Contract Number:  
AC03-76SF00098
Resource Type:
Conference
Resource Relation:
Conference: IEEE Nuclear Science Symposium 2003, Portland, OR (US), 10/19/2003--10/25/2003; Other Information: PBD: 17 Oct 2003
Country of Publication:
United States
Language:
English
Subject:
73 NUCLEAR PHYSICS AND RADIATION PHYSICS; 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; BNL; CERN; COMPUTERS; MANAGEMENT; NUCLEAR PHYSICS; PRODUCTION; STORAGE; GRID EVENT CATALOG DISK CACHE

Citation Formats

Wu, Kesheng, Zhang, Wei-Ming, Sim, Alexander, Gu, Junmin, and Shoshani, Arie. Grid collector: An event catalog with automated file management. United States: N. p., 2003. Web.
Wu, Kesheng, Zhang, Wei-Ming, Sim, Alexander, Gu, Junmin, & Shoshani, Arie. Grid collector: An event catalog with automated file management. United States.
Wu, Kesheng, Zhang, Wei-Ming, Sim, Alexander, Gu, Junmin, and Shoshani, Arie. 2003. "Grid collector: An event catalog with automated file management". United States. https://www.osti.gov/servlets/purl/835164.
@article{osti_835164,
title = {Grid collector: An event catalog with automated file management},
author = {Wu, Kesheng and Zhang, Wei-Ming and Sim, Alexander and Gu, Junmin and Shoshani, Arie},
abstractNote = {High Energy Nuclear Physics (HENP) experiments such as STAR at BNL and ATLAS at CERN produce large amounts of data that are stored as files on mass storage systems in computer centers. In these files, the basic unit of data is an event. Analysis is typically performed on a selected set of events. The files containing these events have to be located, copied from mass storage systems to disks before analysis, and removed when no longer needed. These file management tasks are tedious and time consuming. Typically, all events contained in the files are read into memory before a selection is made. Since the time to read the events dominates the overall execution time, reading unwanted events needlessly increases the analysis time. The Grid Collector is a set of software modules that work together to address these two issues. It automates the file management tasks and provides ''direct'' access to the selected events for analyses. It is currently integrated with the STAR analysis framework. Users can select events based on tags, such as ''production date between March 10 and 20, and the number of charged tracks > 100.'' The Grid Collector locates the files containing relevant events, transfers the files across the Grid if necessary, and delivers the events to the analysis code through the familiar iterators. There have been some research efforts to address the file management issues; the Grid Collector is unique in that it addresses the event access issue together with the file management issues. This makes it more useful to a large variety of users.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2003},
month = {10}
}
