U.S. Department of Energy
Office of Scientific and Technical Information

The Archive Solution for Distributed Workflow Management Agents of the CMS Experiment at LHC

Journal Article · Computing and Software for Big Science
 [1];  [2];  [3]
  1. Cornell Univ., Ithaca, NY (United States)
  2. Heidelberg Univ. (Germany)
  3. Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
The CMS experiment at the CERN LHC developed the Workflow Management Archive system to persistently store unstructured framework job report documents produced by distributed workflow management agents. In this paper we present its architecture, implementation, deployment, and integration with the CMS and CERN computing infrastructures, such as the central HDFS and Hadoop Spark cluster. The system leverages modern technologies such as a document-oriented database and the Hadoop ecosystem to provide the flexibility needed to reliably process, store, and aggregate $\mathcal{O}$(1M) documents daily. We describe the data transformation, the short- and long-term storage layers, the query language, and the aggregation pipeline developed to visualize performance metrics that help CMS data operators assess the performance of the CMS computing system.
Research Organization:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), High Energy Physics (HEP) (SC-25)
Grant/Contract Number:
AC02-07CH11359
OSTI ID:
1437402
Report Number(s):
FERMILAB-PUB--18-074-CD; arXiv:1801.03872; 1647570
Journal Information:
Computing and Software for Big Science, Vol. 2, Issue 1; ISSN 2510-2036
Publisher:
Springer
Country of Publication:
United States
Language:
English

