 
Summary: Minimum Message Length Grouping of Ordered Data
Leigh J. Fitzgibbon, Lloyd Allison, and David L. Dowe
School of Computer Science and Software Engineering
Monash University, Clayton, VIC 3168 Australia
{leighf,lloyd,dld}@csse.monash.edu.au
Abstract. Explicit segmentation is the partitioning of data into
homogeneous regions by specifying cutpoints. W. D. Fisher (1958) gave
an early example of explicit segmentation based on the minimisation of
squared error. Fisher called this the grouping problem and came up with
a polynomial time Dynamic Programming Algorithm (DPA). Oliver,
Baxter and colleagues (1996, 1997, 1998) have applied the
information-theoretic Minimum Message Length (MML) principle to explicit
segmentation. They have derived formulas for specifying cutpoints
imprecisely and have empirically shown their criterion to be superior to other
segmentation methods (AIC, MDL and BIC). We use a simple MML
criterion and Fisher's DPA to perform numerical Bayesian (summing and)
integration (using message lengths) over the cutpoint location
parameters. This gives an estimate of the number of segments, which we then
use to estimate the cutpoint positions and segment parameters by min
