 
Summary: Stochastic Attribute-Value Grammars
Steven P. Abney
AT&T Laboratories
Probabilistic analogues of regular and context-free grammars are well-known in
computational linguistics, and currently the subject of intensive research. To date, however, no
satisfactory probabilistic analogue of attribute-value grammars has been proposed:
previous attempts have failed to define an adequate parameter-estimation algorithm.
In the present paper, I define stochastic attribute-value grammars and give an
algorithm for computing the maximum-likelihood estimate of their parameters. The estimation
algorithm is adapted from (Della Pietra, Della Pietra, and Lafferty, 1995). To estimate
model parameters, it is necessary to compute the expectations of certain functions under
random fields. In the application discussed by Della Pietra, Della Pietra, and Lafferty
(representing English orthographic constraints), Gibbs sampling can be used to estimate
the needed expectations. The fact that attribute-value grammars generate constrained
languages makes Gibbs sampling inapplicable, but I show that sampling can be done using
the more general Metropolis-Hastings algorithm.
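
As a rough illustration of the sampling idea (not the algorithm developed in this paper), the sketch below estimates the expectation of a feature under a small log-linear random field whose support is restricted to a set of well-formed items. Everything in it is assumed for illustration: the toy set VALID, the feature functions, the weights LAMBDAS, and the uniform proposal over well-formed items. Because the proposal is symmetric, the Metropolis-Hastings acceptance probability reduces to the ratio of unnormalized weights.

```python
import math
import random

# Hypothetical toy "language": only these strings count as well-formed.
VALID = ["ab", "aab", "abb", "aabb"]

# Hypothetical feature weights for a log-linear (random field) model.
LAMBDAS = [0.5, -0.3]


def features(x):
    """Hypothetical features: number of a's and number of b's."""
    return [x.count("a"), x.count("b")]


def weight(x):
    """Unnormalized probability exp(sum_i lambda_i * f_i(x))."""
    return math.exp(sum(l * f for l, f in zip(LAMBDAS, features(x))))


def mh_expectation(f, n_steps, seed=0):
    """Estimate E[f(x)] with Metropolis-Hastings over the well-formed items.

    The proposal draws uniformly from VALID, so it is symmetric and the
    acceptance probability is min(1, weight(candidate) / weight(current)).
    """
    rng = random.Random(seed)
    x = rng.choice(VALID)
    total = 0.0
    for _ in range(n_steps):
        candidate = rng.choice(VALID)
        if rng.random() < min(1.0, weight(candidate) / weight(x)):
            x = candidate  # accept the move; otherwise keep the current item
        total += f(x)
    return total / n_steps


def exact_expectation(f):
    """Exact E[f(x)] by enumeration, feasible only for this toy set."""
    z = sum(weight(x) for x in VALID)
    return sum(weight(x) * f(x) for x in VALID) / z


def count_a(x):
    return x.count("a")


if __name__ == "__main__":
    print("sampled:", mh_expectation(count_a, 20000))
    print("exact:  ", exact_expectation(count_a))
```

On a set this small the expectation can be computed exactly, which makes it easy to check the sampler; the sampler is useful when the set of well-formed items is too large to enumerate.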
1. Introduction
Stochastic versions of regular grammars and context-free grammars have received a great
deal of attention in computational linguistics for the last several years, and basic
techniques of stochastic parsing and parameter estimation have been known for decades.
