Creating ensembles of decision trees through sampling
Recent work in classification indicates that significant improvements in accuracy can be obtained by growing an ensemble of classifiers and having them vote for the most popular class. This paper focuses on ensembles of decision trees that are created with a randomized procedure based on sampling. Randomization can be introduced either by drawing random samples of the training data (as in bagging or arcing) and running a conventional tree-building algorithm on each sample, or by randomizing the induction algorithm itself. The objective of this paper is to describe our first experiences with a novel randomized tree induction method that uses only a subset of the training instances at a node to determine the split. Our empirical results show that ensembles generated using this approach are competitive in accuracy and have a lower computational cost.
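The node-level sampling idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the splitting criterion, the sample size, or how the sample is drawn, so everything specific here is an assumption made for illustration: Gini impurity as the criterion, uniform sampling without replacement, and the hypothetical names `sampled_split`, `sample_fraction`, and `min_sample`.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def sampled_split(rows, labels, sample_fraction=0.1, min_sample=10, rng=None):
    """Choose (feature_index, threshold) from a random subset of a node's instances.

    Assumed details (not taken from the paper): Gini impurity, uniform
    sampling without replacement, midpoint thresholds.
    Returns None if no feature separates the sampled instances.
    """
    rng = rng or random.Random()
    n = len(rows)
    # Sample size: a fraction of the node, at least min_sample, at most n.
    k = min(n, max(min_sample, int(sample_fraction * n)))
    idx = rng.sample(range(n), k)

    best = None  # (weighted impurity, feature index, threshold)
    for f in range(len(rows[0])):
        # Sort only the sampled instances by this feature's value.
        pairs = sorted(((rows[i][f], labels[i]) for i in idx), key=lambda p: p[0])
        for j in range(1, k):
            lo, hi = pairs[j - 1][0], pairs[j][0]
            if lo == hi:
                continue  # no threshold fits between equal values
            left = [y for _, y in pairs[:j]]
            right = [y for _, y in pairs[j:]]
            score = (len(left) * gini(left) + len(right) * gini(right)) / k
            if best is None or score < best[0]:
                best = (score, f, (lo + hi) / 2.0)
    return None if best is None else (best[1], best[2])
```

In this sketch the cost of evaluating candidate splits at a node scales with the sample size rather than with the number of instances that reach the node, and each tree in an ensemble would see a different random sample, which is where both the randomization and the potential savings come from.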
- Research Organization: Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization: US Department of Energy (US)
- DOE Contract Number: W-7405-ENG-48
- OSTI ID: 15005459
- Report Number(s): UCRL-JC-142268; TRN: US200322%%454
- Resource Relation: Conference: International Conference on Machine Learning, Williams College, MA (US), 06/28/2001--07/01/2001; Other Information: PBD: 2 Feb 2001
- Country of Publication: United States
- Language: English
Similar Records
Approximate Splitting for Ensembles of Trees using Histograms
Classification of Bent-Double Galaxies: Experiences with Ensembles of Decision Trees