The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider
Abstract
We describe the outcome of a data challenge conducted as part of the Dark Machines (https://www.darkmachines.org) initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims to detect signals of new physics at the Large Hadron Collider (LHC) using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of >1 billion simulated LHC events corresponding to 10 fb^{-1} of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.
- Authors:
- European Organization for Nuclear Research
- Rudolf Peierls Centre for Theoretical Physics, University of Oxford
- Queen Mary University of London
- The Ohio State University
- National Institute for Subatomic Physics
- International School for Advanced Studies, National Institute for Nuclear Physics
- Lund University
- University of California, San Diego
- The University of Texas at Arlington
- University of Glasgow
- European Organization for Nuclear Research, Worcester Polytechnic Institute
- Konkuk University
- University of Adelaide
- Institute for Corpuscular Physics
- Rice University
- RWTH Aachen University
- California Institute of Technology, Fermi National Accelerator Laboratory
- Harvard University, The NSF AI Institute for Artificial Intelligence and Fundamental Interactions
- Kyungpook National University
- European Organization for Nuclear Research, National and Kapodistrian University of Athens
- University of Houston
- California Institute of Technology
- University College London
- University of Vienna, European Organization for Nuclear Research
- Publication Date:
- 2022-01-28
- Research Org.:
- Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), High Energy Physics (HEP); European Research Council (ERC); Australian Research Council (ARC); Science and Technology Facilities Council (STFC); National Research Foundation of Korea (NRF); National Science Foundation (NSF)
- OSTI Identifier:
- 1842562
- Alternate Identifier(s):
- OSTI ID: 1824176
- Report Number(s):
- FERMILAB-PUB-21-285-CMS; arXiv:2105.14027
Journal ID: ISSN 2542-4653; 043
- Grant/Contract Number:
- AC02-07CH11359; SC0011726; SC0011925; SC0013607; SC0019227; SC0021187; SC0021396; ST/T000864/1; 2019R1A2C1009419; URF\R1\191524; PHY-2019786; 772369; DP180102209; CE200100008; 788223; ST/P000274/1
- Resource Type:
- Published Article
- Journal Name:
- SciPost Physics
- Additional Journal Information:
- Journal Name: SciPost Physics; Journal Volume: 12; Journal Issue: 1; Journal ID: ISSN 2542-4653
- Publisher:
- Stichting SciPost
- Country of Publication:
- Netherlands
- Language:
- English
- Subject:
- 72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS
Citation Formats
Aarrestad, Thea, van Beekveld, Melissa, Bona, Marcella, Boveia, Antonio, Caron, Sascha, Davies, Joe, de Simone, Andrea, Doglioni, Caterina, Duarte, Javier, Farbin, Amir, Gupta, Honey, Hendriks, Luc, Heinrich, Lukas A., Howarth, James, Jawahar, Pratik, Jueid, Adil, Lastow, Jessica, Leinweber, Adam, Mamuzic, Judita, Merényi, Erzsébet, Morandini, Alessandro, Moskvitina, Polina, Nellist, Clara, Ngadiuba, Jennifer, Ostdiek, Bryan, Pierini, Maurizio, Ravina, Baptiste, Ruiz de Austri, Roberto, Sekmen, Sezen, Touranakou, Mary, Vaškeviciute, Marija, Vilalta, Ricardo, Vlimant, Jean-Roch, Verheyen, Rob, White, Martin, Wulff, Eric, Wallin, Erik, Wozniak, Kinga A., and Zhang, Zhongyi. The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider. Netherlands: N. p., 2022.
Web. doi:10.21468/SciPostPhys.12.1.043.
Aarrestad, Thea, van Beekveld, Melissa, Bona, Marcella, Boveia, Antonio, Caron, Sascha, Davies, Joe, de Simone, Andrea, Doglioni, Caterina, Duarte, Javier, Farbin, Amir, Gupta, Honey, Hendriks, Luc, Heinrich, Lukas A., Howarth, James, Jawahar, Pratik, Jueid, Adil, Lastow, Jessica, Leinweber, Adam, Mamuzic, Judita, Merényi, Erzsébet, Morandini, Alessandro, Moskvitina, Polina, Nellist, Clara, Ngadiuba, Jennifer, Ostdiek, Bryan, Pierini, Maurizio, Ravina, Baptiste, Ruiz de Austri, Roberto, Sekmen, Sezen, Touranakou, Mary, Vaškeviciute, Marija, Vilalta, Ricardo, Vlimant, Jean-Roch, Verheyen, Rob, White, Martin, Wulff, Eric, Wallin, Erik, Wozniak, Kinga A., & Zhang, Zhongyi. The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider. Netherlands. https://doi.org/10.21468/SciPostPhys.12.1.043
Aarrestad, Thea, van Beekveld, Melissa, Bona, Marcella, Boveia, Antonio, Caron, Sascha, Davies, Joe, de Simone, Andrea, Doglioni, Caterina, Duarte, Javier, Farbin, Amir, Gupta, Honey, Hendriks, Luc, Heinrich, Lukas A., Howarth, James, Jawahar, Pratik, Jueid, Adil, Lastow, Jessica, Leinweber, Adam, Mamuzic, Judita, Merényi, Erzsébet, Morandini, Alessandro, Moskvitina, Polina, Nellist, Clara, Ngadiuba, Jennifer, Ostdiek, Bryan, Pierini, Maurizio, Ravina, Baptiste, Ruiz de Austri, Roberto, Sekmen, Sezen, Touranakou, Mary, Vaškeviciute, Marija, Vilalta, Ricardo, Vlimant, Jean-Roch, Verheyen, Rob, White, Martin, Wulff, Eric, Wallin, Erik, Wozniak, Kinga A., and Zhang, Zhongyi. 2022.
"The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider". Netherlands. https://doi.org/10.21468/SciPostPhys.12.1.043.
@article{osti_1842562,
title = {The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider},
author = {Aarrestad, Thea and van Beekveld, Melissa and Bona, Marcella and Boveia, Antonio and Caron, Sascha and Davies, Joe and de Simone, Andrea and Doglioni, Caterina and Duarte, Javier and Farbin, Amir and Gupta, Honey and Hendriks, Luc and Heinrich, Lukas A. and Howarth, James and Jawahar, Pratik and Jueid, Adil and Lastow, Jessica and Leinweber, Adam and Mamuzic, Judita and Merényi, Erzsébet and Morandini, Alessandro and Moskvitina, Polina and Nellist, Clara and Ngadiuba, Jennifer and Ostdiek, Bryan and Pierini, Maurizio and Ravina, Baptiste and Ruiz de Austri, Roberto and Sekmen, Sezen and Touranakou, Mary and Vaškeviciute, Marija and Vilalta, Ricardo and Vlimant, Jean-Roch and Verheyen, Rob and White, Martin and Wulff, Eric and Wallin, Erik and Wozniak, Kinga A. and Zhang, Zhongyi},
abstractNote = {We describe the outcome of a data challenge conducted as part of the Dark Machines (https://www.darkmachines.org) initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims to detect signals of new physics at the Large Hadron Collider (LHC) using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of >1 billion simulated LHC events corresponding to 10 fb^{-1} of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.},
doi = {10.21468/SciPostPhys.12.1.043},
journal = {SciPost Physics},
number = 1,
volume = 12,
place = {Netherlands},
year = {2022},
month = {1}
}
https://doi.org/10.21468/SciPostPhys.12.1.043