Summary: MML Inference of Single-layer Neural Networks
Enes Makalic, Lloyd Allison and David L. Dowe
Abstract
Inference of the optimal neural network architecture for a specific dataset is a long-standing
and difficult problem. Although a number of researchers have proposed various model selection
procedures, the problem remains largely unsolved. The architecture of the neural network
(the number of hidden layers, hidden neurons, inputs, etc.) directly affects its performance. A
network that is too simple will not learn the problem sufficiently well, resulting in poor performance.
Conversely, a complex network can overfit and exhibit poor generalisation capabilities. This paper
introduces a novel selection criterion, based on Minimum Message Length (MML), for inference
of single hidden layer, fully-connected, feedforward neural networks. The criterion's performance is
demonstrated on several artificial and real datasets. Furthermore, the MML criterion is compared
against an MDL-based criterion and variations of Akaike's Information Criterion (AIC) and the
Bayesian Information Criterion (BIC). In all tests considered, the MML criterion never overfitted
and performed as well as, and often better than, other model selection criteria.
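To make the comparison criteria concrete, the following minimal sketch scores candidate numbers of hidden neurons with the textbook forms of AIC and BIC. It is an illustration only: the paper compares against variations of these criteria, which are not reproduced here, and the MML criterion itself is not shown. The function names, the parameter count (which assumes one bias per hidden and output neuron) and the negative log-likelihood values are all hypothetical.

```python
import math


def n_params(n_inputs, n_hidden, n_outputs):
    """Free parameters of a single hidden layer, fully-connected network.

    Assumption: every hidden and output neuron has one bias term.
    """
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


def aic(neg_log_lik, k):
    """Textbook AIC: 2k - 2 ln L, written in terms of -ln L."""
    return 2.0 * neg_log_lik + 2.0 * k


def bic(neg_log_lik, k, n_samples):
    """Textbook BIC: k ln n - 2 ln L, written in terms of -ln L."""
    return 2.0 * neg_log_lik + k * math.log(n_samples)


def select(candidates, n_inputs, n_outputs, n_samples):
    """Return the hidden-layer sizes minimising AIC and BIC, plus all scores.

    `candidates` maps a number of hidden neurons to the negative
    log-likelihood of the trained network on the training data.
    """
    scores = {}
    for n_hidden, nll in candidates.items():
        k = n_params(n_inputs, n_hidden, n_outputs)
        scores[n_hidden] = (aic(nll, k), bic(nll, k, n_samples))
    best_aic = min(scores, key=lambda h: scores[h][0])
    best_bic = min(scores, key=lambda h: scores[h][1])
    return best_aic, best_bic, scores


# Made-up negative log-likelihoods for networks with 1, 2, 4 and 8 hidden
# neurons; these are illustrative values, not results from the paper.
candidates = {1: 140.0, 2: 95.0, 4: 90.0, 8: 88.0}
best_aic, best_bic, scores = select(candidates, n_inputs=2, n_outputs=1, n_samples=100)
print("AIC selects", best_aic, "hidden neurons; BIC selects", best_bic)
```

Both criteria trade goodness of fit against the number of parameters. Larger networks lower the negative log-likelihood but pay a growing penalty, which is the same tension between underfitting and overfitting described above.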
1 Introduction
Artificial neural networks are an efficient tool for classification and regression problems. At present,
the most popular neural network type in use is the Multilayer Perceptron (MLP) [11, 10]. MLPs
are characterised by the number of hidden layers, hidden neurons and connections between the layers.
The architecture of a network must be determined separately for each problem - there is no single,