Performing effective feature selection by investigating the deep structure of the data
- Centro Studi e Laboratori Telecomunicazioni, Torino (Italy)
This paper introduces ADHOC (Automatic Discoverer of Higher-Order Correlation), an algorithm that combines the advantages of both filter and feedback models to enhance the understanding of the given data and to increase the efficiency of the feature selection process. ADHOC partitions the observed features into a number of groups, called factors, that reflect the major dimensions of the phenomenon under consideration. The set of learned factors defines the starting point of the search for the best-performing feature subset. A genetic algorithm is used to explore the feature space induced by the factors and to determine the set of most informative feature configurations. The feature subset evaluation function is the performance of the induction algorithm. This approach offers three main advantages: (i) the likelihood of selecting well-performing features grows; (ii) the complexity of the search decreases considerably; (iii) the risk of selecting a poor feature subset because of overfitting decreases. Extensive experiments on real-world data have been conducted to demonstrate the effectiveness of ADHOC as a data reduction technique as well as a feature selection method.
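The two-stage idea in the abstract (group the features into factors, then let a genetic algorithm search over the factors with the induction algorithm's performance as fitness) can be illustrated with a minimal Python sketch. It is not the ADHOC implementation: the abstract does not say how factors are discovered, so the sketch assumes a greedy grouping by absolute Pearson correlation, and it assumes leave-one-out 1-nearest-neighbour accuracy as the induction algorithm. All function names, the threshold, the GA parameters, and the toy data are hypothetical.

```python
import random
from statistics import mean

def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def group_into_factors(columns, threshold=0.8):
    """Assumed stand-in for factor discovery: a feature joins the first group
    whose seed feature it correlates with (in absolute value) above threshold."""
    factors = []
    for j, col in enumerate(columns):
        for group in factors:
            if abs(pearson(col, columns[group[0]])) >= threshold:
                group.append(j)
                break
        else:
            factors.append([j])
    return factors

def loo_1nn_accuracy(rows, labels, subset):
    """Wrapper-style evaluation: leave-one-out accuracy of 1-nearest-neighbour."""
    if not subset:
        return 0.0
    hits = 0
    for i, r in enumerate(rows):
        nearest = min((k for k in range(len(rows)) if k != i),
                      key=lambda k: sum((r[f] - rows[k][f]) ** 2 for f in subset))
        hits += labels[nearest] == labels[i]
    return hits / len(rows)

def ga_select(rows, labels, factors, pop_size=12, generations=15, seed=0):
    """Each gene picks one feature from its factor, or -1 to skip the factor."""
    rng = random.Random(seed)

    def random_gene(g):
        return rng.randrange(-1, len(factors[g]))

    def decode(ind):
        return [factors[g][c] for g, c in enumerate(ind) if c >= 0]

    def fitness(ind):
        return loo_1nn_accuracy(rows, labels, decode(ind))

    pop = [[random_gene(g) for g in range(len(factors))] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            if rng.random() < 0.3:             # mutate one gene
                g = rng.randrange(len(child))
                child[g] = random_gene(g)
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return decode(best), fitness(best)

# Toy data (hypothetical): features 0 and 1 carry the class signal and are
# redundant with each other; features 2 and 3 are redundant noise.
data_rng = random.Random(1)
labels = [i % 2 for i in range(40)]
rows = []
for y in labels:
    signal = 4.0 * y + data_rng.gauss(0, 0.5)
    noise = data_rng.gauss(0, 1)
    rows.append([signal, 2 * signal + data_rng.gauss(0, 0.1),
                 noise, -noise + data_rng.gauss(0, 0.1)])

factors = group_into_factors(list(zip(*rows)))
subset, acc = ga_select(rows, labels, factors)
print("factors:", factors)
print("selected features:", subset, "accuracy:", acc)
```

Because each gene chooses at most one feature per factor, the search space in this sketch shrinks from every subset of the features to one choice per factor, which is one way to read the abstract's claim about reduced search complexity.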
- OSTI ID: 421318
- Report Number(s): CONF-960830-; TRN: 96:005928-0073
- Resource Relation: Conference: 2. international conference on knowledge discovery and data mining, Portland, OR (United States), 2-4 Aug 1996; Other Information: PBD: 1996; Related Information: Is Part Of Proceedings of the second international conference on knowledge discovery & data mining; Simoudis, E.; Han, J.; Fayyad, U. [eds.]; PB: 405 p.
- Country of Publication: United States
- Language: English
Similar Records
Feature Subset Selection, Class Separability, and Genetic Algorithms
GENIE: A HYBRID GENETIC ALGORITHM FOR FEATURE CLASSIFICATION IN MULTI-SPECTRAL IMAGES