XRootD popularity on hadoop clusters
Journal Article · Journal of Physics. Conference Series
- Univ. of Pisa (Italy); Istituto Nazionale di Fisica Nucleare (INFN), Pisa (Italy)
- Istituto Nazionale di Fisica Nucleare (INFN), Pisa (Italy)
- Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
- European Organization for Nuclear Research (CERN), Geneva (Switzerland)
- European Organization for Nuclear Research (CERN), Geneva (Switzerland); CMS Collaboration, et al.
Performance data and metadata of the computing operations at the CMS experiment are collected through a distributed monitoring infrastructure, currently relying on a traditional Oracle database system. This paper shows how to harness Big Data architectures in order to improve the throughput and the efficiency of such monitoring. A large set of operational data (user activities, job submissions, resources, file transfers, site efficiencies, software releases, network traffic, machine logs) is being injected into a readily available Hadoop cluster via several data streamers. The collected metadata is further organized by running fast arbitrary queries; this makes it possible to test several MapReduce-based frameworks and to measure the system speed-up relative to the original database infrastructure. By leveraging a quality Hadoop data store and enabling an analytics framework on top, it is possible to design a mining platform to predict dataset popularity and discover patterns and correlations.
- Research Organization:
- Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), High Energy Physics (HEP)
- Contributing Organization:
- CMS Collaboration
- Grant/Contract Number:
- AC02-07CH11359
- OSTI ID:
- 1831862
- Report Number(s):
- FERMILAB-PUB--17-715-CMS; oai:inspirehep.net:1638557
- Journal Information:
- Journal of Physics. Conference Series, Vol. 898, Issue 7; ISSN 1742-6588
- Publisher:
- IOP Publishing
- Country of Publication:
- United States
- Language:
- English
Similar Records
The Archive Solution for Distributed Workflow Management Agents of the CMS Experiment at LHC
Journal Article · Sun Mar 18 20:00:00 EDT 2018 · Computing and Software for Big Science · OSTI ID:1437402
YARNsim: Simulating Hadoop YARN
Conference · Wed Dec 31 23:00:00 EST 2014 · OSTI ID:1335904
Selective Sampling for Sensor Type Classification in Buildings
Conference · Tue Jun 09 00:00:00 EDT 2020 · 2020 19th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN) · OSTI ID:1822657