Title: Feature Subset Selection, Class Separability, and Genetic Algorithms

Conference

The performance of classification algorithms in machine learning depends on the features used to describe the labeled examples presented to the inducers, so the problem of feature subset selection has received considerable attention. Genetic approaches to this problem usually follow the wrapper approach: they treat the inducer as a black box that is used to evaluate candidate feature subsets. These evaluations can take considerable time, and the traditional approach may be impractical for large data sets. This paper describes a hybrid of a simple genetic algorithm and a method based on class separability, applied to the selection of feature subsets for classification problems. The proposed hybrid was compared against each of its components and against two other widely used feature selection wrappers. The objective is to determine whether the hybrid offers advantages over the other methods in accuracy or speed. The experiments used a Naive Bayes classifier on public-domain and artificial data sets. The results suggest that the hybrid usually finds compact feature subsets that give the most accurate results, while running faster than the other wrappers.
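
The wrapper approach mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's hybrid: the abstract does not specify the class separability method, the genetic operators, or the parameter settings, so everything below is an assumption made for illustration. The function names (`gaussian_nb_accuracy`, `ga_wrapper`, `make_data`), the Gaussian form of Naive Bayes, binary tournament selection, uniform crossover, the population size, and the synthetic data are all hypothetical choices. The sketch only shows the general idea of a genetic algorithm that scores each candidate feature mask by calling the inducer as a black box.

```python
import math
import random


def gaussian_nb_accuracy(train, test, mask):
    """Black-box inducer (assumed Gaussian Naive Bayes): train on the
    features selected by `mask`, return accuracy on `test`."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    by_class = {}
    for x, y in train:
        by_class.setdefault(y, []).append([x[j] for j in cols])
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[y] = (math.log(n / len(train)), means, variances)

    def log_posterior(xs, params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (a - m) ** 2 / (2 * v)
            for a, m, v in zip(xs, means, variances))

    correct = 0
    for x, y in test:
        xs = [x[j] for j in cols]
        pred = max(model, key=lambda c: log_posterior(xs, model[c]))
        correct += pred == y
    return correct / len(test)


def ga_wrapper(fitness, n_features, pop_size=20, generations=15, seed=0):
    """Simple generational GA over bit masks; `fitness` is the black box."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        new_pop = []
        while len(new_pop) < pop_size:
            # Binary tournament selection (an assumed choice).
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Uniform crossover followed by bit-flip mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [1 - g if rng.random() < 1.0 / n_features else g
                     for g in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop + [best], key=fitness)  # remember the best mask seen
    return best


def make_data(n, rng):
    """Synthetic data: feature 0 separates the classes, features 1-4 are noise."""
    data = []
    for i in range(n):
        y = i % 2
        x = [rng.gauss(4.0 * y, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(4)]
        data.append((x, y))
    return data


data_rng = random.Random(1)
train, test = make_data(100, data_rng), make_data(100, data_rng)
best_mask = ga_wrapper(lambda m: gaussian_nb_accuracy(train, test, m), 5)
print(best_mask, gaussian_nb_accuracy(train, test, best_mask))
```

Every fitness call retrains and re-evaluates the classifier, which is why wrappers become expensive on large data sets. The sketch also scores masks on a single held-out split for brevity; a careful experiment would use cross-validation and a separate final test set.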

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
15013963
Report Number(s):
UCRL-CONF-202041; TRN: US200803%%795
Resource Relation:
Journal Volume: 3102; Conference: Presented at: Genetic and Evolutionary Computation Conference, Seattle, WA, United States, Jun 26 - Jun 30, 2004
Country of Publication:
United States
Language:
English
