DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Accessing Data Federations with CVMFS

Abstract

Data federations have become an increasingly common tool for large collaborations such as CMS and ATLAS to efficiently distribute large data files. Unfortunately, these are typically implemented with weak namespace semantics and a non-POSIX API. On the other hand, CVMFS has provided a POSIX-compliant read-only interface for use cases with a small working set size (such as software distribution). The metadata required for the CVMFS POSIX interface is distributed through a caching hierarchy, allowing it to scale to the level of about a hundred thousand hosts. In this paper, we describe our contributions to CVMFS that merge the data scalability of XRootD-based data federations (such as AAA) with the metadata scalability and POSIX interface of CVMFS. We modified CVMFS so it can serve unmodified files without copying them to the repository server. CVMFS 2.2.0 is also able to redirect requests for data files to servers outside of the CVMFS content distribution network. Finally, we added the ability to manage authorization and authentication using security credentials such as X.509 proxy certificates. We combined these modifications with the OSG's StashCache regional XRootD caching infrastructure to create a cached data distribution network. Here, we show performance metrics for accessing the data federation through CVMFS compared to direct data federation access. Additionally, we discuss the improved user experience of providing access to a data federation through a POSIX filesystem.

Authors:
Weitzel, Derek [1]; Bockelman, Brian [1]; Dykstra, Dave [2]; Blomer, Jakob [3]; Meusel, René [3]
  1. Univ. of Nebraska, Lincoln, NE (United States)
  2. Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
  3. European Organization for Nuclear Research (CERN), Geneva (Switzerland)
Publication Date:
2017-11-23
Research Org.:
Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), High Energy Physics (HEP)
OSTI Identifier:
1399106
Report Number(s):
FERMILAB-CONF-17-407-CD
Journal ID: ISSN 1742-6588; 1630215
Grant/Contract Number:  
AC02-07CH11359
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Physics. Conference Series
Additional Journal Information:
Journal Volume: 898; Journal ID: ISSN 1742-6588
Publisher:
IOP Publishing
Country of Publication:
United States
Language:
English
Subject:
96 KNOWLEDGE MANAGEMENT AND PRESERVATION

Citation Formats

Weitzel, Derek, Bockelman, Brian, Dykstra, Dave, Blomer, Jakob, and Meusel, René. Accessing Data Federations with CVMFS. United States: N. p., 2017. Web. doi:10.1088/1742-6596/898/6/062044.
Weitzel, Derek, Bockelman, Brian, Dykstra, Dave, Blomer, Jakob, & Meusel, René (2017). Accessing Data Federations with CVMFS. United States. https://doi.org/10.1088/1742-6596/898/6/062044
Weitzel, Derek, Bockelman, Brian, Dykstra, Dave, Blomer, Jakob, and Meusel, René. 2017. "Accessing Data Federations with CVMFS". United States. https://doi.org/10.1088/1742-6596/898/6/062044. https://www.osti.gov/servlets/purl/1399106.
@article{osti_1399106,
title = {Accessing Data Federations with CVMFS},
author = {Weitzel, Derek and Bockelman, Brian and Dykstra, Dave and Blomer, Jakob and Meusel, René},
abstractNote = {Data federations have become an increasingly common tool for large collaborations such as CMS and ATLAS to efficiently distribute large data files. Unfortunately, these are typically implemented with weak namespace semantics and a non-POSIX API. On the other hand, CVMFS has provided a POSIX-compliant read-only interface for use cases with a small working set size (such as software distribution). The metadata required for the CVMFS POSIX interface is distributed through a caching hierarchy, allowing it to scale to the level of about a hundred thousand hosts. In this paper, we describe our contributions to CVMFS that merge the data scalability of XRootD-based data federations (such as AAA) with the metadata scalability and POSIX interface of CVMFS. We modified CVMFS so it can serve unmodified files without copying them to the repository server. CVMFS 2.2.0 is also able to redirect requests for data files to servers outside of the CVMFS content distribution network. Finally, we added the ability to manage authorization and authentication using security credentials such as X.509 proxy certificates. We combined these modifications with the OSG's StashCache regional XRootD caching infrastructure to create a cached data distribution network. Here, we show performance metrics for accessing the data federation through CVMFS compared to direct data federation access. Additionally, we discuss the improved user experience of providing access to a data federation through a POSIX filesystem.},
doi = {10.1088/1742-6596/898/6/062044},
journal = {Journal of Physics. Conference Series},
number = {6},
volume = {898},
place = {United States},
year = {2017},
month = {11}
}