MEASURES AND MODELS
FOR PHRASE RECOGNITION
Bell Communications Research
445 South Street
Morristown, NJ 07960
I present an entropy measure for evaluating parser performance.
The measure is fine-grained and permits us to evaluate
performance at the level of individual phrases. The parsing
problem is characterized as statistically approximating the Penn
Treebank annotations. I consider a series of models to "calibrate"
the measure by determining what scores can be achieved using
the most obvious kinds of information. I also relate the entropy
measure to measures of recall/precision and grammar coverage.
Entropy measures of parser performance have focussed on
the parser's contribution to word prediction. This is
appropriate for evaluating a parser as a language model for
speech recognition, but it is less appropriate for evaluating