Summary: Using MDL for grammar induction
Pieter Adriaans and Ceriel Jacobs
Abstract
In this paper we study the application of the Minimum Description
Length principle (or two-part-code optimization) to grammar induction
in the light of recent developments in Kolmogorov complexity theory. We
focus on issues that are important for the construction of effective compression
algorithms. We define an independent measure for the quality of a
theory given a data set: the randomness deficiency. This is a measure
of how typical the data set is for the theory. It cannot be computed,
but in many relevant cases it can be approximated. An optimal theory
has minimal randomness deficiency. Using results from Vereshchagin and
Vitanyi [2004] and Adriaans and Vitanyi [2005], we show that:
- Shorter code does not necessarily lead to better theories. We prove that,
  in DFA induction, a single deterministic merge of two nodes can already
  cause the randomness deficiency and the MDL code length to diverge.
- Contrary to what is suggested by the results of Gold [1967], there is
  no fundamental difference between positive and negative data from
  an MDL perspective.
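
For orientation, the two quantities contrasted in the first claim can be written out as follows. This is only a sketch and assumes the standard single-object definitions from the Kolmogorov complexity literature. The symbols are not fixed in this abstract and are introduced here for illustration: $S$ is a finite set playing the role of the theory, $x \in S$ is the data, and $K$ denotes Kolmogorov complexity. The notation used in the paper itself may differ, in particular where the data is a set of samples rather than a single object.

```latex
% Two-part (MDL) code length of the data x under the theory S:
% first describe S, then give the index of x within S.
\Lambda(x, S) = K(S) + \log |S|

% Randomness deficiency of x with respect to S (defined for x \in S):
% the gap between the index length and the shortest description of x given S.
\delta(x \mid S) = \log |S| - K(x \mid S)

% Since K(x \mid S) \le \log |S| + O(1), the deficiency is non-negative
% up to an additive constant; x is typical for S when \delta(x \mid S) is small.
```

Under these assumed definitions, MDL selects a theory by minimizing the first quantity, while the quality of the theory is judged by the second. Because $K$ is not computable, neither quantity can be evaluated exactly, which is consistent with the remark above that the randomness deficiency can only be approximated.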