DOE PAGES

Title: Order priors for Bayesian network discovery with an application to malware phylogeny

Here, Bayesian networks have been used extensively to model and discover dependency relationships among sets of random variables. We learn Bayesian network structure with a combination of human knowledge about the partial ordering of variables and statistical inference of conditional dependencies from observed data. Our approach leverages complementary information from human knowledge and inference from observed data to produce networks that reflect human beliefs about the system as well as to fit the observed data. Applying prior beliefs about partial orderings of variables is an approach distinctly different from existing methods that incorporate prior beliefs about direct dependencies (or edges) in a Bayesian network. We provide an efficient implementation of the partial-order prior in a Bayesian structure discovery learning algorithm, as well as an edge prior, showing that both priors meet the local modularity requirement necessary for an efficient Bayesian discovery algorithm. In benchmark studies, the partial-order prior improves the accuracy of Bayesian network structure learning as well as the edge prior, even though order priors are more general. Our primary motivation is in characterizing the evolution of families of malware to aid cyber security analysts. For the problem of malware phylogeny discovery, we find that our algorithm, compared to existing malware phylogeny algorithms, more accurately discovers true dependencies that are missed by other algorithms.
Authors:
Oyen, Diane [1]; Anderson, Blake [2]; Sentz, Kari [1]; Anderson-Cook, Christine Michaela [1]
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  2. Cisco Systems Inc., Durham, NC (United States)
Publication Date:
Report Number(s):
LA-UR-16-23891
Journal ID: ISSN 1932-1864
Grant/Contract Number:
AC52-06NA25396
Type:
Accepted Manuscript
Journal Name:
Statistical Analysis and Data Mining
Additional Journal Information:
Journal Volume: 10; Journal Issue: 5; Journal ID: ISSN 1932-1864
Publisher:
Wiley
Research Org:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org:
USDOE Laboratory Directed Research and Development (LDRD) Program
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Bayesian networks; cyber security; malware; probabilistic graphical models
OSTI Identifier:
1398911

Oyen, Diane, Anderson, Blake, Sentz, Kari, and Anderson-Cook, Christine Michaela. Order priors for Bayesian network discovery with an application to malware phylogeny. United States: N. p., 2017. Web. doi:10.1002/sam.11364.
Oyen, Diane, Anderson, Blake, Sentz, Kari, & Anderson-Cook, Christine Michaela. Order priors for Bayesian network discovery with an application to malware phylogeny. United States. doi:10.1002/sam.11364.
Oyen, Diane, Anderson, Blake, Sentz, Kari, and Anderson-Cook, Christine Michaela. 2017. "Order priors for Bayesian network discovery with an application to malware phylogeny". United States. doi:10.1002/sam.11364. https://www.osti.gov/servlets/purl/1398911.
@article{osti_1398911,
title = {Order priors for Bayesian network discovery with an application to malware phylogeny},
author = {Oyen, Diane and Anderson, Blake and Sentz, Kari and Anderson-Cook, Christine Michaela},
abstractNote = {Here, Bayesian networks have been used extensively to model and discover dependency relationships among sets of random variables. We learn Bayesian network structure with a combination of human knowledge about the partial ordering of variables and statistical inference of conditional dependencies from observed data. Our approach leverages complementary information from human knowledge and inference from observed data to produce networks that reflect human beliefs about the system as well as to fit the observed data. Applying prior beliefs about partial orderings of variables is an approach distinctly different from existing methods that incorporate prior beliefs about direct dependencies (or edges) in a Bayesian network. We provide an efficient implementation of the partial-order prior in a Bayesian structure discovery learning algorithm, as well as an edge prior, showing that both priors meet the local modularity requirement necessary for an efficient Bayesian discovery algorithm. In benchmark studies, the partial-order prior improves the accuracy of Bayesian network structure learning as well as the edge prior, even though order priors are more general. Our primary motivation is in characterizing the evolution of families of malware to aid cyber security analysts. For the problem of malware phylogeny discovery, we find that our algorithm, compared to existing malware phylogeny algorithms, more accurately discovers true dependencies that are missed by other algorithms.},
doi = {10.1002/sam.11364},
journal = {Statistical Analysis and Data Mining},
number = 5,
volume = 10,
place = {United States},
year = {2017},
month = {9}
}