DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Automating the ABCD method with machine learning

Abstract

The ABCD method is one of the most widely used data-driven background estimation techniques in high energy physics. Cuts on two statistically independent classifiers separate signal and background into four regions, so that background in the signal region can be estimated simply using the other three control regions. Typically, the independent classifiers are chosen "by hand" to be intuitive and physically motivated variables. Here, we explore the possibility of automating the design of one or both of these classifiers using machine learning. We show how to use state-of-the-art decorrelation methods to construct powerful yet independent discriminators. Along the way, we uncover a previously unappreciated aspect of the ABCD method: its accuracy hinges on having low signal contamination in control regions not just overall, but relative to the signal fraction in the signal region. We demonstrate the method with three examples: a simple model consisting of three-dimensional Gaussians; boosted hadronic top jet tagging; and a recasted search for paired dijet resonances. In all cases, automating the ABCD method with machine learning significantly improves performance in terms of ABCD closure, background rejection, and signal contamination.
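As an illustration of the estimate described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the common labeling in which A is the signal region (both cuts passed) and B, C, D are the control regions, so that the background in A is predicted as N_B * N_C / N_D. The function names, the cut values, and the toy background of two independent Gaussian scores are all hypothetical choices made for this example.

```python
import random


def abcd_estimate(n_b, n_c, n_d):
    """Predict the background count in region A from control regions B, C, D.

    Valid only if the two classifiers are statistically independent for background.
    """
    if n_d == 0:
        raise ValueError("control region D is empty; the ABCD ratio is undefined")
    return n_b * n_c / n_d


def count_regions(events, cut_f, cut_g):
    """Count events in the four regions defined by cuts on classifiers f and g.

    Assumed labeling (hypothetical, for illustration):
      A: f >  cut_f and g >  cut_g   (signal region)
      B: f >  cut_f and g <= cut_g
      C: f <= cut_f and g >  cut_g
      D: f <= cut_f and g <= cut_g
    """
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    for f, g in events:
        if f > cut_f:
            counts["A" if g > cut_g else "B"] += 1
        else:
            counts["C" if g > cut_g else "D"] += 1
    return counts


# Toy background only: two independent standard-normal scores per event.
rng = random.Random(0)
background = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200_000)]

counts = count_regions(background, cut_f=1.0, cut_g=1.0)
predicted = abcd_estimate(counts["B"], counts["C"], counts["D"])

# Closure: ratio of predicted to observed background in A; close to 1 when
# the two scores are independent.
closure = predicted / counts["A"]
print(f"observed A = {counts['A']}, predicted A = {predicted:.1f}, closure = {closure:.3f}")
```

Because the two toy scores are drawn independently, the closure ratio should sit near 1 up to statistical fluctuations. Correlating the scores, or adding signal events to B, C, or D, would bias the prediction, which is the failure mode the abstract refers to.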

Authors:
Kasieczka, Gregor; Nachman, Benjamin; Schwartz, Matthew D.; Shih, David
Publication Date:
2021-02-22
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC); German Research Foundation (DFG)
OSTI Identifier:
1766649
Alternate Identifier(s):
OSTI ID: 1811514
Grant/Contract Number:  
AC02-05CH11231; SC0013607; DOE-SC0010008; SC0010008; 390833306
Resource Type:
Published Article
Journal Name:
Physical Review D
Additional Journal Information:
Journal Name: Physical Review D; Journal Volume: 103; Journal Issue: 3; Journal ID: ISSN 2470-0010
Publisher:
American Physical Society
Country of Publication:
United States
Language:
English
Subject:
72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS; hypothetical particle physics models; artificial neural networks; machine learning

Citation Formats

Kasieczka, Gregor, Nachman, Benjamin, Schwartz, Matthew D., and Shih, David. Automating the ABCD method with machine learning. United States: N. p., 2021. Web. doi:10.1103/PhysRevD.103.035021.
Kasieczka, Gregor, Nachman, Benjamin, Schwartz, Matthew D., & Shih, David. Automating the ABCD method with machine learning. United States. https://doi.org/10.1103/PhysRevD.103.035021
Kasieczka, Gregor, Nachman, Benjamin, Schwartz, Matthew D., and Shih, David. 2021. "Automating the ABCD method with machine learning". United States. https://doi.org/10.1103/PhysRevD.103.035021.
@article{osti_1766649,
title = {Automating the ABCD method with machine learning},
author = {Kasieczka, Gregor and Nachman, Benjamin and Schwartz, Matthew D. and Shih, David},
abstractNote = {The ABCD method is one of the most widely used data-driven background estimation techniques in high energy physics. Cuts on two statistically independent classifiers separate signal and background into four regions, so that background in the signal region can be estimated simply using the other three control regions. Typically, the independent classifiers are chosen "by hand" to be intuitive and physically motivated variables. Here, we explore the possibility of automating the design of one or both of these classifiers using machine learning. We show how to use state-of-the-art decorrelation methods to construct powerful yet independent discriminators. Along the way, we uncover a previously unappreciated aspect of the ABCD method: its accuracy hinges on having low signal contamination in control regions not just overall, but relative to the signal fraction in the signal region. We demonstrate the method with three examples: a simple model consisting of three-dimensional Gaussians; boosted hadronic top jet tagging; and a recasted search for paired dijet resonances. In all cases, automating the ABCD method with machine learning significantly improves performance in terms of ABCD closure, background rejection, and signal contamination.},
doi = {10.1103/PhysRevD.103.035021},
journal = {Physical Review D},
number = 3,
volume = 103,
place = {United States},
year = {2021},
month = {2}
}
