Automating the ABCD method with machine learning
Abstract
The ABCD method is one of the most widely used data-driven background estimation techniques in high energy physics. Cuts on two statistically independent classifiers separate signal and background into four regions, so that background in the signal region can be estimated simply using the other three control regions. Typically, the independent classifiers are chosen "by hand" to be intuitive and physically motivated variables. Here, we explore the possibility of automating the design of one or both of these classifiers using machine learning. We show how to use state-of-the-art decorrelation methods to construct powerful yet independent discriminators. Along the way, we uncover a previously unappreciated aspect of the ABCD method: its accuracy hinges on having low signal contamination in control regions not just overall, but relative to the signal fraction in the signal region. We demonstrate the method with three examples: a simple model consisting of three-dimensional Gaussians; boosted hadronic top jet tagging; and a recast search for paired dijet resonances. In all cases, automating the ABCD method with machine learning significantly improves performance in terms of ABCD closure, background rejection, and signal contamination.
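As a minimal sketch of the estimate the abstract describes, the snippet below cuts on two independent features, counts background events in four regions, and predicts the signal-region count from the other three. The region labels (A as the signal region, B, C, and D as control regions), the helper names `abcd_estimate` and `count_regions`, the Gaussian toy background, and the cut values are all assumptions made for illustration; they are not taken from the paper.

```python
import random


def abcd_estimate(n_b, n_c, n_d):
    """Predict the background count in region A from regions B, C, D.

    If the two features are independent for background, the pass/fail
    fractions factorize, which gives N_A = N_B * N_C / N_D.
    """
    if n_d == 0:
        raise ValueError("control region D is empty; estimate undefined")
    return n_b * n_c / n_d


def count_regions(events, cut_f, cut_g):
    """Count events per region for features (f, g).

    A: passes both cuts (signal region, by the convention assumed here)
    B: passes f only, C: passes g only, D: fails both.
    """
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    for f, g in events:
        pass_f, pass_g = f > cut_f, g > cut_g
        if pass_f and pass_g:
            counts["A"] += 1
        elif pass_f:
            counts["B"] += 1
        elif pass_g:
            counts["C"] += 1
        else:
            counts["D"] += 1
    return counts


# Toy background: two independent standard Gaussians (hypothetical stand-ins
# for the two classifiers), generated with a fixed seed for reproducibility.
rng = random.Random(0)
background = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(100_000)]

counts = count_regions(background, cut_f=1.0, cut_g=1.0)
predicted = abcd_estimate(counts["B"], counts["C"], counts["D"])

# Closure: ratio of predicted to observed background in the signal region.
# It should be close to 1 when the features are independent.
closure = predicted / counts["A"]
print(f"observed N_A = {counts['A']}, predicted N_A = {predicted:.1f}, closure = {closure:.3f}")
```

If the two features were correlated for background, the closure ratio would drift away from 1, which is why the abstract stresses independence between the two classifiers.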
- Authors:
- Kasieczka, Gregor; Nachman, Benjamin; Schwartz, Matthew D.; Shih, David
- Publication Date:
- 2021-02-22
- Research Org.:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC); German Research Foundation (DFG)
- OSTI Identifier:
- 1766649
- Alternate Identifier(s):
- OSTI ID: 1811514
- Grant/Contract Number:
- AC02-05CH11231; SC0013607; DOE-SC0010008; SC0010008; 390833306
- Resource Type:
- Published Article
- Journal Name:
- Physical Review D
- Additional Journal Information:
- Journal Name: Physical Review D; Journal Volume: 103; Journal Issue: 3; Journal ID: ISSN 2470-0010
- Publisher:
- American Physical Society
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS; hypothetical particle physics models; artificial neural networks; machine learning
Citation Formats
Kasieczka, Gregor, Nachman, Benjamin, Schwartz, Matthew D., and Shih, David. Automating the ABCD method with machine learning. United States: N. p., 2021.
Web. doi:10.1103/PhysRevD.103.035021.
Kasieczka, Gregor, Nachman, Benjamin, Schwartz, Matthew D., & Shih, David. Automating the ABCD method with machine learning. United States. https://doi.org/10.1103/PhysRevD.103.035021
Kasieczka, Gregor, Nachman, Benjamin, Schwartz, Matthew D., and Shih, David. 2021.
"Automating the ABCD method with machine learning". United States. https://doi.org/10.1103/PhysRevD.103.035021.
@article{osti_1766649,
title = {Automating the ABCD method with machine learning},
author = {Kasieczka, Gregor and Nachman, Benjamin and Schwartz, Matthew D. and Shih, David},
abstractNote = {The ABCD method is one of the most widely used data-driven background estimation techniques in high energy physics. Cuts on two statistically independent classifiers separate signal and background into four regions, so that background in the signal region can be estimated simply using the other three control regions. Typically, the independent classifiers are chosen "by hand" to be intuitive and physically motivated variables. Here, we explore the possibility of automating the design of one or both of these classifiers using machine learning. We show how to use state-of-the-art decorrelation methods to construct powerful yet independent discriminators. Along the way, we uncover a previously unappreciated aspect of the ABCD method: its accuracy hinges on having low signal contamination in control regions not just overall, but relative to the signal fraction in the signal region. We demonstrate the method with three examples: a simple model consisting of three-dimensional Gaussians; boosted hadronic top jet tagging; and a recast search for paired dijet resonances. In all cases, automating the ABCD method with machine learning significantly improves performance in terms of ABCD closure, background rejection, and signal contamination.},
doi = {10.1103/PhysRevD.103.035021},
journal = {Physical Review D},
number = 3,
volume = 103,
place = {United States},
year = {2021},
month = feb
}
https://doi.org/10.1103/PhysRevD.103.035021