Summary: Model Selection in Omnivariate Decision Trees
Olcay Taner Yildiz and Ethem Alpaydin
Department of Computer Engineering,
Boğaziçi University, TR-34342, Istanbul, Turkey
Abstract. We propose an omnivariate decision tree architecture which
contains univariate, multivariate linear or nonlinear nodes, matching the
complexity of the node to the complexity of the data reaching that node.
We compare the use of different model selection techniques including
AIC, BIC, and CV to choose between the three types of nodes on stan-
dard datasets from the UCI repository and see that such omnivariate
trees with a small percentage of multivariate nodes close to the root gen-
eralize better than pure trees with the same type of node everywhere.
CV produces simpler trees than AIC and BIC without sacrificing expected error. The only disadvantage of CV is its longer training time.
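The penalized-likelihood criteria mentioned above can be sketched in a few lines. The formulas are the standard ones, AIC = -2 log L + 2d and BIC = -2 log L + d log N; the candidate node names, log-likelihoods, and parameter counts below are made-up illustrative values, not results from the paper:

```python
import math

def aic(log_likelihood, num_params):
    # AIC = -2 log L + 2d; lower is better
    return -2.0 * log_likelihood + 2.0 * num_params

def bic(log_likelihood, num_params, n):
    # BIC = -2 log L + d log N; penalizes parameters more heavily as N grows
    return -2.0 * log_likelihood + num_params * math.log(n)

def select_node_type(candidates, n):
    # candidates maps a node type to (log-likelihood, parameter count);
    # pick the type with the smallest BIC score
    scores = {name: bic(ll, d, n) for name, (ll, d) in candidates.items()}
    return min(scores, key=scores.get)

# Hypothetical fits at one node with N = 200 training instances and p = 10:
candidates = {
    "univariate":           (-120.0, 2),   # one attribute plus a threshold
    "multivariate_linear":  (-105.0, 11),  # weighted sum of all attributes
    "multivariate_nonlinear": (-100.0, 66) # e.g. all pairwise quadratic terms
}
print(select_node_type(candidates, 200))  # prints "univariate"
```

With these toy numbers AIC scores the candidates 244, 232, and 332, preferring the linear node, while BIC (250.6, 268.3, 549.7) prefers the univariate one, which illustrates BIC's stronger complexity penalty.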
A decision tree is made up of internal decision nodes and terminal leaves. The
input vector is composed of p attributes, $x = [x_1, \ldots, x_p]^T$, and the aim in
classification is to assign x to one of K mutually exclusive and exhaustive classes.
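The three node types differ only in the discriminant each internal node evaluates on x. A minimal sketch of the three split functions follows; the function names, the NumPy representation, and the choice of a quadratic form for the nonlinear node are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def univariate_split(x, i, w0):
    # univariate node: compare a single attribute x_i to a threshold w0
    return x[i] > w0

def linear_split(x, w, w0):
    # multivariate linear node: weighted sum of all p attributes
    return np.dot(w, x) > w0

def quadratic_split(x, W, w, w0):
    # nonlinear node (here a quadratic form): adds pairwise attribute products
    return x @ W @ x + np.dot(w, x) > w0

x = np.array([1.0, 2.0])
print(univariate_split(x, 0, 0.5))                       # True: 1.0 > 0.5
print(linear_split(x, np.array([1.0, 1.0]), 2.5))        # True: 3.0 > 2.5
print(quadratic_split(x, np.eye(2), np.zeros(2), 4.0))   # True: 5.0 > 4.0
```

Each candidate is fit to the instances reaching the node, and the model selection criterion then decides which of the three discriminants the node keeps.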