DOE PAGES
U.S. Department of Energy, Office of Scientific and Technical Information

Title: End-to-End Jet Classification of Boosted Top Quarks with CMS Open Data

Abstract

We describe a novel application of the end-to-end deep learning technique to the task of discriminating top quark-initiated jets from those originating from the hadronization of a light quark or a gluon. The end-to-end deep learning technique combines deep learning algorithms with a low-level detector representation of the high-energy collision event. In this study, we use low-level detector information from the simulated CMS Open Data samples to construct the top jet classifiers. To optimize classifier performance, we progressively add low-level information from the CMS tracking detector, including pixel detector reconstructed hits and impact parameters, and demonstrate the value of additional tracking information even when no new spatial structures are added. Relying only on calorimeter energy deposits and reconstructed pixel detector hits, the end-to-end classifier achieves a ROC-AUC score of 0.975±0.002 for the task of classifying boosted top quark jets. After adding derived track quantities, the classifier ROC-AUC score increases to 0.9824±0.0013, serving as the first performance benchmark for these CMS Open Data samples.

Authors:
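Andrews, Michael [1]; Burkle, Bjorn [2]; Chaudhari, Shravan [3]; DiCroce, Davide [4]; Gleyzer, Sergei [4]; Heintz, Ulrich [2]; Narain, Meenakshi [2]; Paulini, Manfred [1]; Usai, Emanuele [2]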
  1. Carnegie Mellon Univ., Pittsburgh, PA (United States)
  2. Brown Univ., Providence, RI (United States)
  3. BITS Pilani, Goa (India)
  4. Univ. of Alabama, Tuscaloosa, AL (United States)
Publication Date:
2021-08-23
Research Org.:
Carnegie Mellon Univ., Pittsburgh, PA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), High Energy Physics (HEP)
OSTI Identifier:
1837676
Grant/Contract Number:  
SC0010118
Resource Type:
Accepted Manuscript
Journal Name:
EPJ Web of Conferences (Online)
Additional Journal Information:
Journal Name: EPJ Web of Conferences (Online); Journal Volume: 251; Journal ID: ISSN 2100-014X
Publisher:
EDP Sciences
Country of Publication:
United States
Language:
English
Subject:
72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS

Citation Formats

Andrews, Michael, Burkle, Bjorn, Chaudhari, Shravan, DiCroce, Davide, Gleyzer, Sergei, Heintz, Ulrich, Narain, Meenakshi, Paulini, Manfred, and Usai, Emanuele. End-to-End Jet Classification of Boosted Top Quarks with CMS Open Data. United States: N. p., 2021. Web. doi:10.1051/epjconf/202125104030.
Andrews, Michael, Burkle, Bjorn, Chaudhari, Shravan, DiCroce, Davide, Gleyzer, Sergei, Heintz, Ulrich, Narain, Meenakshi, Paulini, Manfred, & Usai, Emanuele. End-to-End Jet Classification of Boosted Top Quarks with CMS Open Data. United States. https://doi.org/10.1051/epjconf/202125104030
Andrews, Michael, Burkle, Bjorn, Chaudhari, Shravan, DiCroce, Davide, Gleyzer, Sergei, Heintz, Ulrich, Narain, Meenakshi, Paulini, Manfred, and Usai, Emanuele. 2021. "End-to-End Jet Classification of Boosted Top Quarks with CMS Open Data". United States. https://doi.org/10.1051/epjconf/202125104030. https://www.osti.gov/servlets/purl/1837676.
@article{osti_1837676,
title = {End-to-End Jet Classification of Boosted Top Quarks with CMS Open Data},
author = {Andrews, Michael and Burkle, Bjorn and Chaudhari, Shravan and DiCroce, Davide and Gleyzer, Sergei and Heintz, Ulrich and Narain, Meenakshi and Paulini, Manfred and Usai, Emanuele},
abstractNote = {We describe a novel application of the end-to-end deep learning technique to the task of discriminating top quark-initiated jets from those originating from the hadronization of a light quark or a gluon. The end-to-end deep learning technique combines deep learning algorithms with a low-level detector representation of the high-energy collision event. In this study, we use low-level detector information from the simulated CMS Open Data samples to construct the top jet classifiers. To optimize classifier performance, we progressively add low-level information from the CMS tracking detector, including pixel detector reconstructed hits and impact parameters, and demonstrate the value of additional tracking information even when no new spatial structures are added. Relying only on calorimeter energy deposits and reconstructed pixel detector hits, the end-to-end classifier achieves a ROC-AUC score of 0.975±0.002 for the task of classifying boosted top quark jets. After adding derived track quantities, the classifier ROC-AUC score increases to 0.9824±0.0013, serving as the first performance benchmark for these CMS Open Data samples.},
doi = {10.1051/epjconf/202125104030},
journal = {EPJ Web of Conferences (Online)},
volume = 251,
place = {United States},
year = {2021},
month = aug
}
