DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: JUNIPR: a framework for unsupervised machine learning in particle physics

Abstract

In applications of machine learning to particle physics, a persistent challenge is how to go beyond discrimination to learn about the underlying physics. To this end, a powerful tool would be a framework for unsupervised learning, where the machine learns the intricate high-dimensional contours of the data upon which it is trained, without reference to pre-established labels. In order to approach such a complex task, an unsupervised network must be structured intelligently, based on a qualitative understanding of the data. In this paper, we scaffold the neural network’s architecture around a leading-order model of the physics underlying the data. In addition to making unsupervised learning tractable, this design actually alleviates existing tensions between performance and interpretability. We call the framework JUNIPR: “Jets from UNsupervised Interpretable PRobabilistic models”. In this approach, the set of particle momenta composing a jet are clustered into a binary tree that the neural network examines sequentially. Training is unsupervised and unrestricted: the network could decide that the data bears little correspondence to the chosen tree structure. However, when there is a correspondence, the network’s output along the tree has a direct physical interpretation. JUNIPR models can perform discrimination tasks, through the statistically optimal likelihood-ratio test, and they permit visualizations of discrimination power at each branching in a jet’s tree. Additionally, JUNIPR models provide a probability distribution from which events can be drawn, providing a data-driven Monte Carlo generator. As a third application, JUNIPR models can reweight events from one (e.g. simulated) data set to agree with distributions from another (e.g. experimental) data set.
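The discrimination and reweighting applications named in the abstract both reduce to ratios of model probabilities. The following is only a minimal sketch of that arithmetic, not the paper's implementation: the 1D Gaussian densities are toy stand-ins for the log-probabilities that two trained models would assign to a jet, and every function name here is hypothetical.

```python
import math


def gaussian_log_prob(x, mean, std):
    """Log density of a 1D Gaussian; a toy stand-in for a learned log P(jet)."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def log_p_a(x):
    """Hypothetical model trained on data set A (assumed, for illustration)."""
    return gaussian_log_prob(x, mean=1.0, std=1.0)


def log_p_b(x):
    """Hypothetical model trained on data set B (assumed, for illustration)."""
    return gaussian_log_prob(x, mean=-1.0, std=1.0)


def log_likelihood_ratio(x, log_p_num, log_p_den):
    """log[P_num(x) / P_den(x)]; thresholding this is the likelihood-ratio test."""
    return log_p_num(x) - log_p_den(x)


def reweight(x, log_p_target, log_p_source):
    """Per-event weight that moves source-distributed events toward the target."""
    return math.exp(log_p_target(x) - log_p_source(x))


# Classify an event as "A-like" when the log ratio exceeds a chosen threshold.
llr = log_likelihood_ratio(0.5, log_p_a, log_p_b)
is_a_like = llr > 0.0

# Weight for an event drawn from B so that the weighted sample resembles A.
weight = reweight(0.5, log_p_target=log_p_a, log_p_source=log_p_b)
```

For these two unit-width Gaussians the log ratio is simply 2x, so an event sitting midway between the two means gets a log ratio of zero and a weight of one.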

Authors:
Andreassen, Anders; Feige, Ilya; Frye, Christopher; Schwartz, Matthew D.
Publication Date:
2019-02-01
Research Org.:
Harvard Univ., Cambridge, MA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1619354
Alternate Identifier(s):
OSTI ID: 1612013
Grant/Contract Number:  
SC0013607
Resource Type:
Published Article
Journal Name:
European Physical Journal. C, Particles and Fields
Additional Journal Information:
Journal Name: European Physical Journal. C, Particles and Fields; Journal Volume: 79; Journal Issue: 2; Journal ID: ISSN 1434-6044
Publisher:
Springer
Country of Publication:
Germany
Language:
English
Subject:
72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS; Physics

Citation Formats

Andreassen, Anders, Feige, Ilya, Frye, Christopher, and Schwartz, Matthew D. JUNIPR: a framework for unsupervised machine learning in particle physics. Germany: N. p., 2019. Web. doi:10.1140/epjc/s10052-019-6607-9.
Andreassen, Anders, Feige, Ilya, Frye, Christopher, & Schwartz, Matthew D. JUNIPR: a framework for unsupervised machine learning in particle physics. Germany. https://doi.org/10.1140/epjc/s10052-019-6607-9
Andreassen, Anders, Feige, Ilya, Frye, Christopher, and Schwartz, Matthew D. 2019. "JUNIPR: a framework for unsupervised machine learning in particle physics". Germany. https://doi.org/10.1140/epjc/s10052-019-6607-9.
@article{osti_1619354,
title = {JUNIPR: a framework for unsupervised machine learning in particle physics},
author = {Andreassen, Anders and Feige, Ilya and Frye, Christopher and Schwartz, Matthew D.},
abstractNote = {In applications of machine learning to particle physics, a persistent challenge is how to go beyond discrimination to learn about the underlying physics. To this end, a powerful tool would be a framework for unsupervised learning, where the machine learns the intricate high-dimensional contours of the data upon which it is trained, without reference to pre-established labels. In order to approach such a complex task, an unsupervised network must be structured intelligently, based on a qualitative understanding of the data. In this paper, we scaffold the neural network’s architecture around a leading-order model of the physics underlying the data. In addition to making unsupervised learning tractable, this design actually alleviates existing tensions between performance and interpretability. We call the framework JUNIPR: “Jets from UNsupervised Interpretable PRobabilistic models”. In this approach, the set of particle momenta composing a jet are clustered into a binary tree that the neural network examines sequentially. Training is unsupervised and unrestricted: the network could decide that the data bears little correspondence to the chosen tree structure. However, when there is a correspondence, the network’s output along the tree has a direct physical interpretation. JUNIPR models can perform discrimination tasks, through the statistically optimal likelihood-ratio test, and they permit visualizations of discrimination power at each branching in a jet’s tree. Additionally, JUNIPR models provide a probability distribution from which events can be drawn, providing a data-driven Monte Carlo generator. As a third application, JUNIPR models can reweight events from one (e.g. simulated) data set to agree with distributions from another (e.g. experimental) data set.},
doi = {10.1140/epjc/s10052-019-6607-9},
journal = {European Physical Journal. C, Particles and Fields},
number = 2,
volume = 79,
place = {Germany},
year = {2019},
month = {2}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1140/epjc/s10052-019-6607-9

Citation Metrics:
Cited by: 72 works
Citation information provided by
Web of Science
