Title: Order priors for Bayesian network discovery with an application to malware phylogeny

Journal Article · Statistical Analysis and Data Mining
DOI: https://doi.org/10.1002/sam.11364 · OSTI ID: 1398911

Bayesian networks have been used extensively to model and discover dependency relationships among sets of random variables. We learn Bayesian network structure by combining human knowledge about the partial ordering of variables with statistical inference of conditional dependencies from observed data. The approach draws on these complementary sources of information to produce networks that both reflect human beliefs about the system and fit the observed data. Applying prior beliefs about partial orderings of variables is distinctly different from existing methods, which incorporate prior beliefs about direct dependencies (or edges) in a Bayesian network. We provide an efficient implementation of the partial-order prior, as well as of an edge prior, in a Bayesian structure discovery algorithm, and show that both priors meet the local modularity requirement necessary for an efficient Bayesian discovery algorithm. In benchmark studies, the partial-order prior improves the accuracy of Bayesian network structure learning as well as the edge prior does, even though order priors are more general. Our primary motivation is characterizing the evolution of families of malware to aid cyber security analysts. For the problem of malware phylogeny discovery, we find that our algorithm discovers true dependencies more accurately than existing malware phylogeny algorithms and recovers dependencies that those algorithms miss.
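
The local modularity requirement mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the representation of beliefs as (earlier, later) pairs, and the fixed penalty of log(0.1) per violated belief are all assumptions made for illustration. The point it shows is that an unnormalized log prior over a total order can be written as a sum of per-node terms, each depending only on a node and the set of variables that precede it.

```python
import math
from typing import FrozenSet, Sequence, Set, Tuple

# Hypothetical representation: a belief (a, b) means "a is believed to precede b".
Belief = Tuple[str, str]


def local_log_prior(
    node: str,
    predecessors: FrozenSet[str],
    beliefs: Set[Belief],
    log_penalty: float = math.log(0.1),  # assumed penalty per violated belief
) -> float:
    """Per-node term: depends only on the node and the set of nodes before it."""
    violations = sum(
        1 for earlier, later in beliefs
        if later == node and earlier not in predecessors
    )
    return violations * log_penalty


def order_log_prior(
    order: Sequence[str],
    beliefs: Set[Belief],
    log_penalty: float = math.log(0.1),
) -> float:
    """Unnormalized log prior of a total order as a sum of local terms."""
    total = 0.0
    seen: Set[str] = set()
    for node in order:
        total += local_log_prior(node, frozenset(seen), beliefs, log_penalty)
        seen.add(node)
    return total


# Illustrative beliefs over three hypothetical variables.
beliefs = {("A", "B"), ("B", "C")}
consistent = order_log_prior(["A", "B", "C"], beliefs)
reversed_order = order_log_prior(["C", "B", "A"], beliefs)
```

In this sketch, an order that agrees with every belief scores 0, and each violated belief lowers the score by the same assumed amount. Because each term involves only one node and its predecessors, the terms can be computed independently per node, which is the kind of decomposition that efficient structure discovery algorithms rely on.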

Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE Laboratory Directed Research and Development (LDRD) Program
Grant/Contract Number:
AC52-06NA25396
OSTI ID:
1398911
Report Number(s):
LA-UR-16-23891
Journal Information:
Statistical Analysis and Data Mining, Vol. 10, Issue 5; ISSN 1932-1864
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 1 work
Citation information provided by
Web of Science

