Minimum Message Length Grouping of Ordered Data

Leigh J. Fitzgibbon, Lloyd Allison, and David L. Dowe
School of Computer Science and Software Engineering
Monash University, Clayton, VIC 3168 Australia
Abstract. Explicit segmentation is the partitioning of data into homogeneous
regions by specifying cut-points. W. D. Fisher (1958) gave an early example of
explicit segmentation based on the minimisation of squared error. Fisher called
this the grouping problem and came up with a polynomial time Dynamic
Programming Algorithm (DPA). Oliver, Baxter and colleagues (1996, 1997, 1998)
have applied the information-theoretic Minimum Message Length (MML) principle
to explicit segmentation. They have derived formulas for specifying cut-points
imprecisely and have empirically shown their criterion to be superior to other
segmentation methods (AIC, MDL and BIC). We use a simple MML criterion and
Fisher's DPA to perform numerical Bayesian (summing and) integration (using
message lengths) over the cut-point location parameters. This gives an estimate
of the number of segments, which we then
use to estimate the cut-point positions and segment parameters by min-
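
As an illustration of the grouping problem mentioned in the abstract, the following is a minimal sketch of a dynamic programme that splits an ordered sequence into k contiguous groups so as to minimise total squared error. It covers only the squared-error criterion attributed to Fisher, not the MML criterion or the Bayesian integration over cut-points. The function name fisher_grouping, its signature, and the convention of reporting each cut-point as the index at which a new group starts are assumptions made for this example and do not come from the paper.

```python
from typing import List, Tuple


def fisher_grouping(xs: List[float], k: int) -> Tuple[float, List[int]]:
    """Split ordered data xs into k contiguous groups with minimum total
    squared error about each group's mean (hypothetical helper).

    Returns (total squared error, cut-points), where each cut-point is the
    index at which a new group starts.
    """
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(xs)")

    # Prefix sums of x and x^2 give the cost of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for idx, x in enumerate(xs):
        s1[idx + 1] = s1[idx] + x
        s2[idx + 1] = s2[idx] + x * x

    def sse(i: int, j: int) -> float:
        """Squared error of xs[i:j] about its own mean (requires i < j)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[g][j]: minimum cost of splitting xs[:j] into g groups.
    # back[g][j]: start index of the last group in that optimal split.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for g in range(1, k + 1):
        for j in range(g, n + 1):
            for i in range(g - 1, j):
                cost = best[g - 1][i] + sse(i, j)
                if cost < best[g][j]:
                    best[g][j] = cost
                    back[g][j] = i

    # Walk the back-pointers from the last group to recover the k-1 cuts.
    cuts: List[int] = []
    j = n
    for g in range(k, 1, -1):
        j = back[g][j]
        cuts.append(j)
    cuts.reverse()
    return best[k][n], cuts
```

The three nested loops give O(k n^2) time for n data points, which is the sense in which such an algorithm is polynomial. For example, fisher_grouping([1.0, 1.0, 1.0, 5.0, 5.0, 9.0, 9.0, 9.0], 3) places the cuts at indices 3 and 5, where the values change.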



