DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Training a Quantum Annealing Based Restricted Boltzmann Machine on Cybersecurity Data

Abstract

A restricted Boltzmann machine (RBM) is a generative model that can be used to balance a cybersecurity dataset, because the synthetic data an RBM generates follows the probability distribution of the training data. RBM training can be performed using contrastive divergence (CD) or quantum annealing (QA). QA-based RBM training is fundamentally different from CD and requires samples from a quantum computer. We present a real-world application that uses a quantum computer: we train an RBM using QA for cybersecurity applications, with QA implemented on the D-Wave 2000Q. RBMs are trained on the ISCX data, a benchmark dataset for cybersecurity. For comparison, RBMs are also trained using CD, a commonly used method for RBM training. Our analysis of the ISCX data shows that the dataset is imbalanced. We present two schemes to balance the training dataset before feeding it to a classifier. The first scheme is based on undersampling of benign instances: the imbalanced training dataset is divided into five sub-datasets that are trained separately, and a majority vote is then taken to obtain the result. The majority vote increases the classification accuracy from 90.24% to 95.68% in the case of CD, and from 74.14% to 80.04% in the case of QA. In the second scheme, an RBM is used to generate synthetic data to balance the training dataset. We show that both QA-trained and CD-trained RBMs can generate useful synthetic data. The balanced training data is used to evaluate several classifiers. Among the classifiers investigated, K-Nearest Neighbor (KNN) and Neural Network (NN) perform better than the other classifiers; both reach an accuracy of 93%. Our results provide a proof of concept that a QA-based RBM can be trained on a 64-bit binary dataset. This illustrative example suggests that many practical classification problems could be migrated to QA-based techniques. Further, we show that synthetic data generated from an RBM can be used to balance the original dataset.
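
The first balancing scheme in the abstract (undersample the benign class into five sub-datasets, train on each, then take a majority vote) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say how each sub-dataset is assembled or which model is trained on it, so two things here are assumptions: each benign chunk is paired with the full set of attack rows, and a simple nearest-centroid rule stands in for the RBM-based classifier. All function names are hypothetical, and the 8-bit toy rows stand in for the 64-bit binary records.

```python
import random
from collections import Counter

def split_majority(benign, n_parts=5, seed=0):
    """Shuffle the benign (majority) rows and deal them into n_parts chunks."""
    rows = list(benign)
    random.Random(seed).shuffle(rows)
    return [rows[i::n_parts] for i in range(n_parts)]

def centroid(rows):
    """Mean value of each bit position across the given rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def train_nearest_centroid(benign_rows, attack_rows):
    """Stand-in classifier (assumption): label 1 = attack, 0 = benign."""
    c0, c1 = centroid(benign_rows), centroid(attack_rows)

    def predict(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return 1 if d1 < d0 else 0

    return predict

def build_ensemble(benign, attack, n_parts=5, seed=0):
    """Train one model per benign chunk, each paired with all attack rows."""
    return [train_nearest_centroid(part, attack)
            for part in split_majority(benign, n_parts, seed)]

def majority_vote(models, x):
    """Return the label predicted by most models (odd count avoids ties)."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

# Toy imbalanced data: 50 benign rows (one bit set), 8 attack rows (seven bits set).
benign = [[1 if k == i % 8 else 0 for k in range(8)] for i in range(50)]
attack = [[0 if k == j else 1 for k in range(8)] for j in range(8)]

models = build_ensemble(benign, attack)
print(majority_vote(models, [0] * 8), majority_vote(models, [1] * 8))  # prints: 0 1
```

Using an odd number of sub-models with binary labels means the vote can never tie, which may be one reason a five-way split is convenient.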

Authors:
Dixit, Vivek [1]; Selvarajan, Raja [1]; Aldwairi, Tamer [2]; Koshka, Yaroslav [3]; Novotny, Mark A. [3]; Humble, Travis S. [4]; Alam, Muhammad A. [1]; Kais, Sabre [1]
  1. Purdue University, West Lafayette, IN (United States)
  2. Temple Univ., Philadelphia, PA (United States)
  3. Mississippi State Univ., Mississippi State, MS (United States)
  4. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Quantum Computing Institute
Publication Date:
2022-06-01
Research Org.:
Purdue Univ., West Lafayette, IN (United States); Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Basic Energy Sciences (BES)
OSTI Identifier:
1871582
Alternate Identifier(s):
OSTI ID: 1870258
Grant/Contract Number:  
SC0019215; AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
IEEE Transactions on Emerging Topics in Computational Intelligence
Additional Journal Information:
Journal Volume: 6; Journal Issue: 3; Journal ID: ISSN 2471-285X
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Dixit, Vivek, Selvarajan, Raja, Aldwairi, Tamer, Koshka, Yaroslav, Novotny, Mark A., Humble, Travis S., Alam, Muhammad A., and Kais, Sabre. Training a Quantum Annealing Based Restricted Boltzmann Machine on Cybersecurity Data. United States: N. p., 2022. Web. doi:10.1109/tetci.2021.3074916.
Dixit, Vivek, Selvarajan, Raja, Aldwairi, Tamer, Koshka, Yaroslav, Novotny, Mark A., Humble, Travis S., Alam, Muhammad A., & Kais, Sabre. Training a Quantum Annealing Based Restricted Boltzmann Machine on Cybersecurity Data. United States. https://doi.org/10.1109/tetci.2021.3074916
Dixit, Vivek, Selvarajan, Raja, Aldwairi, Tamer, Koshka, Yaroslav, Novotny, Mark A., Humble, Travis S., Alam, Muhammad A., and Kais, Sabre. 2022. "Training a Quantum Annealing Based Restricted Boltzmann Machine on Cybersecurity Data". United States. https://doi.org/10.1109/tetci.2021.3074916. https://www.osti.gov/servlets/purl/1871582.
@article{osti_1871582,
title = {Training a Quantum Annealing Based Restricted Boltzmann Machine on Cybersecurity Data},
author = {Dixit, Vivek and Selvarajan, Raja and Aldwairi, Tamer and Koshka, Yaroslav and Novotny, Mark A. and Humble, Travis S. and Alam, Muhammad A. and Kais, Sabre},
doi = {10.1109/tetci.2021.3074916},
journal = {IEEE Transactions on Emerging Topics in Computational Intelligence},
number = 3,
volume = 6,
place = {United States},
year = {2022},
month = {6}
}
