 
Summary: Heuristic Greedy Search Algorithms for Latent Variable Models
Peter Spirtes (Department of Philosophy, Carnegie Mellon University, ps7z@andrew.cmu.edu),
Thomas Richardson (Department of Statistics, University of Washington), and Chris Meek (Microsoft)
I. Introduction
A Bayesian network consists of two distinct parts: a directed acyclic graph (DAG or belief network
structure) and a set of parameters for the DAG. The DAG in a Bayesian network can be used to represent
both causal hypotheses and sets of probability distributions. Under the causal interpretation, a DAG
represents the causal relations in a given population with a set of vertices V when there is an edge from A
to B if and only if A is a direct cause of B relative to V. (We adopt the convention that sets of variables are
capitalized and boldfaced, and individual variables are capitalized and italicized.) Under the statistical
interpretation, a DAG G can be taken to represent the set of all distributions that share the
conditional independence relations entailed by satisfying a local directed Markov property (defined
below).
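The statistical reading can be made concrete with a small sketch. It assumes the usual form of the local directed Markov property, in which each vertex is independent of its non-descendants (other than its parents) given its parents. The four-vertex graph and the names PARENTS, descendants and local_markov_statements are illustrative assumptions, not part of the original text.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical example DAG: A -> B, A -> C, B -> D, C -> D.
# Each key is a vertex; each value is the set of that vertex's parents.
PARENTS: Dict[str, Set[str]] = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def descendants(parents: Dict[str, Set[str]], v: str) -> Set[str]:
    """Return every vertex reachable from v along directed edges (v excluded)."""
    children = {u: {w for w, ps in parents.items() if u in ps} for u in parents}
    found: Set[str] = set()
    stack = list(children[v])
    while stack:
        u = stack.pop()
        if u not in found:
            found.add(u)
            stack.extend(children[u])
    return found


def local_markov_statements(
    parents: Dict[str, Set[str]],
) -> List[Tuple[str, List[str], List[str]]]:
    """List triples (X, S, Pa) read as: X is independent of S given Pa.

    S holds the non-descendants of X that are not parents of X.
    Vertices for which S is empty are skipped.
    """
    statements = []
    for v in sorted(parents):
        others = set(parents) - {v} - parents[v] - descendants(parents, v)
        if others:
            statements.append((v, sorted(others), sorted(parents[v])))
    return statements


if __name__ == "__main__":
    for x, s, pa in local_markov_statements(PARENTS):
        print(f"{x} is independent of {s} given {pa}")
```

For this example graph the sketch yields three statements: B is independent of C given A, C is independent of B given A, and D is independent of A given B and C.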
Assumptions linking the statistical and causal interpretations of DAGs are discussed in Spirtes, Glymour and
Scheines (1993). For a particular set of parameters Θ for a DAG G, G(Θ) is a parametric family of
distributions. Many familiar parametric models, such as recursive structural equation models with
uncorrelated errors, factor analytic models, item response models, etc., are special cases of parameterized
DAGs. Bayesian networks have proved useful in expert systems, particularly for classification problems
(see references in Pearl 1988), and in predicting the effects of interventions in given causal systems
(Spirtes et al. 1993; Pearl 1995).
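As a rough illustration of a parameterized DAG, the sketch below treats a linear structural equation model with independent Gaussian errors over a three-vertex chain. The chain A -> B -> C, the coefficient and error values, and the names ORDER, COEF, ERROR_SD and sample are all assumptions made for the example. Each choice of coefficients and error standard deviations picks out one distribution from the family associated with the graph.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical DAG A -> B -> C, with vertices listed in a topological order.
ORDER: List[str] = ["A", "B", "C"]

# Hypothetical parameters: one linear coefficient per edge (parent, child)
# and one error standard deviation per vertex.
COEF: Dict[Tuple[str, str], float] = {("A", "B"): 0.8, ("B", "C"): -0.5}
ERROR_SD: Dict[str, float] = {"A": 1.0, "B": 1.0, "C": 1.0}


def sample(
    order: List[str],
    coef: Dict[Tuple[str, str], float],
    error_sd: Dict[str, float],
    rng: random.Random,
) -> Dict[str, float]:
    """Draw one joint observation from the linear Gaussian model.

    Each vertex is a weighted sum of its parents' values plus an
    independent Gaussian error, so parents must precede children in order.
    """
    values: Dict[str, float] = {}
    for v in order:
        mean = sum(c * values[p] for (p, child), c in coef.items() if child == v)
        values[v] = mean + rng.gauss(0.0, error_sd[v])
    return values


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample(ORDER, COEF, ERROR_SD, rng))
```

Changing COEF or ERROR_SD while keeping ORDER and the edge set fixed moves to a different member of the same family, which is the sense in which the graph and its parameters are distinct parts of the network.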
