DOE PAGES

Title: Grid site availability evaluation and monitoring at CMS

The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) uses distributed grid computing to store, process, and analyse the vast quantity of scientific data recorded every year. The computing resources are grouped into sites and organized in a tiered structure. Each site provides computing and storage to the CMS computing grid. Over a hundred sites worldwide contribute resources, ranging from a hundred to well over ten thousand computing cores and from tens of TBytes to tens of PBytes of storage. In such a large computing setup, scheduled and unscheduled outages occur continually and must not significantly impact data handling, processing, and analysis. Unscheduled capacity and performance reductions need to be detected promptly and corrected. For Run 1 of the LHC, CMS developed a sophisticated site evaluation and monitoring system based on tools of the Worldwide LHC Computing Grid. For Run 2 of the LHC, the site evaluation and monitoring system is being overhauled to enable faster detection of, and reaction to, failures and a more dynamic handling of computing resources. Furthermore, enhancements are planned to better distinguish site issues from central service issues and to make evaluations more transparent and informative to site support staff.
Authors:
 [1] ;  [2] ;  [3] ;  [1] ;  [4]
  1. Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
  2. Vilnius Univ., Vilnius (Lithuania)
  3. Univ. di Pisa & INFN, Pisa (Italy)
  4. European Organization for Nuclear Research (CERN), Geneva (Switzerland)
Publication Date:
Report Number(s):
FERMILAB-CONF-16-752-CD
Journal ID: ISSN 1742-6588; 1638611; TRN: US1800845
Grant/Contract Number:
AC02-07CH11359
Type:
Accepted Manuscript
Journal Name:
Journal of Physics. Conference Series
Additional Journal Information:
Journal Volume: 898; Journal Issue: 9; Journal ID: ISSN 1742-6588
Publisher:
IOP Publishing
Research Org:
Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
Sponsoring Org:
USDOE Office of Science (SC), High Energy Physics (HEP) (SC-25)
Country of Publication:
United States
Language:
English
Subject:
43 PARTICLE ACCELERATORS
OSTI Identifier:
1415641