DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Rucio: Scientific Data Management

Abstract

Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be distributed across heterogeneous data centers at widely distributed locations. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and now is continuously extended to support the LHC experiments and other diverse scientific communities. In this article, we detail the fundamental concepts of Rucio, describe the architecture along with implementation details, and report operational experience from production usage.

Authors:
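Barisits, Martin; Beermann, Thomas; Berghaus, Frank; Bockelman, Brian; Bogado, Joaquin; Cameron, David; Christidis, Dimitrios; Ciangottini, Diego; Dimitrov, Gancho; Elsing, Markus; Garonne, Vincent; di Girolamo, Alessandro; Goossens, Luc; Guan, Wen; Guenther, Jaroslav; Javurek, Tomas; Kuhn, Dietmar; Lassnig, Mario; Lopez, Fernando; Magini, Nicolo; Molfetas, Angelos; Nairz, Armin; Ould-Saada, Farid; Prenner, Stefan; Serfon, Cedric; Stewart, Graeme; Vaandering, Eric; Vasileva, Petya; Vigne, Ralph; Wegner, Tobias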
Publication Date:
2019-08-09
Research Org.:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), High Energy Physics (HEP)
Contributing Org.:
CERN Collaboration
OSTI Identifier:
1542972
Report Number(s):
arXiv:1902.09857; ATL-COM-SOFT-2018-100; FERMILAB-PUB-19-095-CD; 2510-2044
Journal ID: ISSN 2510-2036; oai:inspirehep.net:1722117
Grant/Contract Number:  
AC02-07CH11359
Resource Type:
Accepted Manuscript
Journal Name:
Computing and Software for Big Science
Additional Journal Information:
Journal Volume: 3; Journal Issue: 1; Journal ID: ISSN 2510-2036
Publisher:
Springer
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; 96 KNOWLEDGE MANAGEMENT AND PRESERVATION

Citation Formats

Barisits, Martin, Beermann, Thomas, Berghaus, Frank, Bockelman, Brian, Bogado, Joaquin, Cameron, David, Christidis, Dimitrios, Ciangottini, Diego, Dimitrov, Gancho, Elsing, Markus, Garonne, Vincent, di Girolamo, Alessandro, Goossens, Luc, Guan, Wen, Guenther, Jaroslav, Javurek, Tomas, Kuhn, Dietmar, Lassnig, Mario, Lopez, Fernando, Magini, Nicolo, Molfetas, Angelos, Nairz, Armin, Ould-Saada, Farid, Prenner, Stefan, Serfon, Cedric, Stewart, Graeme, Vaandering, Eric, Vasileva, Petya, Vigne, Ralph, and Wegner, Tobias. Rucio: Scientific Data Management. United States: N. p., 2019. Web. doi:10.1007/s41781-019-0026-3.
Barisits, Martin, Beermann, Thomas, Berghaus, Frank, Bockelman, Brian, Bogado, Joaquin, Cameron, David, Christidis, Dimitrios, Ciangottini, Diego, Dimitrov, Gancho, Elsing, Markus, Garonne, Vincent, di Girolamo, Alessandro, Goossens, Luc, Guan, Wen, Guenther, Jaroslav, Javurek, Tomas, Kuhn, Dietmar, Lassnig, Mario, Lopez, Fernando, Magini, Nicolo, Molfetas, Angelos, Nairz, Armin, Ould-Saada, Farid, Prenner, Stefan, Serfon, Cedric, Stewart, Graeme, Vaandering, Eric, Vasileva, Petya, Vigne, Ralph, & Wegner, Tobias. Rucio: Scientific Data Management. United States. https://doi.org/10.1007/s41781-019-0026-3
Barisits, Martin, Beermann, Thomas, Berghaus, Frank, Bockelman, Brian, Bogado, Joaquin, Cameron, David, Christidis, Dimitrios, Ciangottini, Diego, Dimitrov, Gancho, Elsing, Markus, Garonne, Vincent, di Girolamo, Alessandro, Goossens, Luc, Guan, Wen, Guenther, Jaroslav, Javurek, Tomas, Kuhn, Dietmar, Lassnig, Mario, Lopez, Fernando, Magini, Nicolo, Molfetas, Angelos, Nairz, Armin, Ould-Saada, Farid, Prenner, Stefan, Serfon, Cedric, Stewart, Graeme, Vaandering, Eric, Vasileva, Petya, Vigne, Ralph, and Wegner, Tobias. 2019. "Rucio: Scientific Data Management". United States. https://doi.org/10.1007/s41781-019-0026-3. https://www.osti.gov/servlets/purl/1542972.
@article{osti_1542972,
title = {Rucio: Scientific Data Management},
author = {Barisits, Martin and Beermann, Thomas and Berghaus, Frank and Bockelman, Brian and Bogado, Joaquin and Cameron, David and Christidis, Dimitrios and Ciangottini, Diego and Dimitrov, Gancho and Elsing, Markus and Garonne, Vincent and di Girolamo, Alessandro and Goossens, Luc and Guan, Wen and Guenther, Jaroslav and Javurek, Tomas and Kuhn, Dietmar and Lassnig, Mario and Lopez, Fernando and Magini, Nicolo and Molfetas, Angelos and Nairz, Armin and Ould-Saada, Farid and Prenner, Stefan and Serfon, Cedric and Stewart, Graeme and Vaandering, Eric and Vasileva, Petya and Vigne, Ralph and Wegner, Tobias},
abstractNote = {Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be distributed across heterogeneous data centers at widely distributed locations. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and now is continuously extended to support the LHC experiments and other diverse scientific communities. In this article, we detail the fundamental concepts of Rucio, describe the architecture along with implementation details, and report operational experience from production usage.},
doi = {10.1007/s41781-019-0026-3},
journal = {Computing and Software for Big Science},
number = 1,
volume = 3,
place = {United States},
year = {2019},
month = aug
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Figures / Tables:

Fig. 1: The namespace is organized with collections and files. Collections can either be containers or datasets. Containers consist of containers or datasets. Datasets consist of files only. Files can be in multiple datasets.
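
The containment rules in the Fig. 1 caption can be illustrated with a minimal Python sketch. This is not Rucio's API or implementation: the class names `File`, `Dataset`, and `Container`, the `attach` and `list_files` methods, and the example names in `build_example` are all hypothetical, chosen only to show which kinds of objects may nest inside which.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class File:
    """Hypothetical stand-in for a file in the namespace."""
    name: str


@dataclass
class Dataset:
    """A collection that may hold files only."""
    name: str
    files: List[File] = field(default_factory=list)

    def attach(self, item: File) -> None:
        if not isinstance(item, File):
            raise TypeError("a dataset may contain files only")
        self.files.append(item)


@dataclass
class Container:
    """A collection that may hold containers or datasets, never files."""
    name: str
    children: List[Union["Container", Dataset]] = field(default_factory=list)

    def attach(self, item: Union["Container", Dataset]) -> None:
        if not isinstance(item, (Container, Dataset)):
            raise TypeError("a container may contain containers or datasets only")
        self.children.append(item)

    def list_files(self) -> List[str]:
        """Return the names of all reachable files, each listed once."""
        seen: List[str] = []
        for child in self.children:
            if isinstance(child, Container):
                names = child.list_files()
            else:
                names = [f.name for f in child.files]
            for n in names:
                if n not in seen:
                    seen.append(n)
        return seen


def build_example() -> Container:
    """Build a small hierarchy in which one file belongs to two datasets."""
    shared = File("file_a")
    ds1 = Dataset("dataset_1")
    ds1.attach(shared)
    ds1.attach(File("file_b"))
    ds2 = Dataset("dataset_2")
    ds2.attach(shared)  # the same file in a second dataset
    inner = Container("inner")
    inner.attach(ds2)
    top = Container("top")
    top.attach(ds1)
    top.attach(inner)  # containers may nest
    return top
```

Calling `build_example().list_files()` walks the nested container and reports the shared file only once, even though it is attached to two datasets. Attaching a dataset to a dataset, or a file directly to a container, raises `TypeError` in this sketch.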

