U.S. Department of Energy
Office of Scientific and Technical Information

DZero data-intensive computing on the Open Science Grid

Journal Article
OSTI ID:917082

High energy physics experiments periodically reprocess their data to take advantage of improved understanding of the detector and of the data processing code. Between February and May 2007, the DZero experiment reprocessed a substantial fraction of its dataset: half a billion events, corresponding to about 100 TB of data, organized in 300,000 files. The activity used resources from sites around the world, including a dozen sites participating in the Open Science Grid (OSG) consortium. About 1,500 jobs ran every day across the OSG, consuming and producing hundreds of gigabytes of data. Access to OSG computing and storage resources was coordinated by the SAM-Grid system, which organized job access to a complex topology of data queues and scheduled jobs to clusters through a SAM-Grid-to-OSG job-forwarding infrastructure. For the first time in the lifetime of the experiment, a data-intensive production activity was managed on a general-purpose grid such as OSG. This paper describes the implications of using OSG, where all resources are granted following an opportunistic model, the challenges of operating a data-intensive activity over such a large computing infrastructure, and the lessons learned throughout the project.

Research Organization:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL
Sponsoring Organization:
USDOE
DOE Contract Number:
AC02-07CH11359
OSTI ID:
917082
Report Number(s):
FERMILAB-PUB-07-462-CD
Country of Publication:
United States
Language:
English
