OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Performing effective feature selection by investigating the deep structure of the data

Conference · OSTI ID: 421318
;  [1]
  1. Centro Studi e Laboratori Telecomunicazioni, Torino (Italy)

This paper introduces ADHOC (Automatic Discoverer of Higher-Order Correlation), an algorithm that combines the advantages of both filter and feedback models to enhance the understanding of the given data and to increase the efficiency of the feature selection process. ADHOC partitions the observed features into a number of groups, called factors, that reflect the major dimensions of the phenomenon under consideration. The set of learned factors defines the starting point of the search for the best-performing feature subset. A genetic algorithm is used to explore the feature space generated by the factors and to determine the set of most informative feature configurations. The feature subset evaluation function is the performance of the induction algorithm. This approach offers three main advantages: (i) the likelihood of selecting well-performing features grows; (ii) the complexity of the search decreases considerably; (iii) the possibility of selecting a bad feature subset due to overfitting decreases. Extensive experiments on real-world data have been conducted to demonstrate the effectiveness of ADHOC as a data reduction technique as well as a feature selection method.
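The pipeline described in the abstract (group features into factors, then let a genetic algorithm search feature configurations scored by the induction algorithm) can be illustrated with a minimal Python sketch. It is not the ADHOC implementation: the abstract does not say how factors are discovered or how the genetic algorithm encodes a configuration, so everything below is an assumption made for illustration. Factors are approximated by greedy grouping on absolute Pearson correlation, each chromosome picks at most one feature per factor, the induction algorithm is a leave-one-out 1-nearest-neighbour classifier, and all function names, thresholds, and the synthetic data are hypothetical.

```python
import random
from statistics import mean, pstdev

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb, sa, sb = mean(a), mean(b), pstdev(a), pstdev(b)
    if sa == 0 or sb == 0:
        return 0.0
    return mean((x - ma) * (y - mb) for x, y in zip(a, b)) / (sa * sb)

def group_into_factors(columns, threshold=0.7):
    """ASSUMED stand-in for factor discovery: a feature joins the first group whose
    seed feature it correlates with strongly, otherwise it seeds a new group."""
    factors = []
    for j, col in enumerate(columns):
        for group in factors:
            if abs(pearson(col, columns[group[0]])) >= threshold:
                group.append(j)
                break
        else:
            factors.append([j])
    return factors

def wrapper_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier restricted to `subset`."""
    if not subset:
        return 0.0
    hits = 0
    for i, r in enumerate(rows):
        nearest = min((k for k in range(len(rows)) if k != i),
                      key=lambda k: sum((r[f] - rows[k][f]) ** 2 for f in subset))
        hits += labels[nearest] == labels[i]
    return hits / len(rows)

def select_features(rows, labels, factors, generations=15, pop_size=12, seed=0):
    """Tiny genetic algorithm. ASSUMED encoding: gene g is an index into factors[g],
    or None to leave that factor out of the subset."""
    rng = random.Random(seed)

    def random_gene(g):
        return rng.choice([None] + list(range(len(factors[g]))))

    def decode(chrom):
        return [factors[g][i] for g, i in enumerate(chrom) if i is not None]

    def fitness(chrom):
        return wrapper_accuracy(rows, labels, decode(chrom))

    pop = [[random_gene(g) for g in range(len(factors))] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            if rng.random() < 0.3:              # occasional single-gene mutation
                g = rng.randrange(len(child))
                child[g] = random_gene(g)
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return decode(best), fitness(best)

def make_demo_data(n=60, seed=1):
    """Synthetic data: features 0-1 carry the class signal, 2-3 share one noise source,
    and feature 4 is independent noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        signal = 2.0 * y + rng.gauss(0, 0.5)
        noise = rng.gauss(0, 1)
        rows.append([signal, signal + rng.gauss(0, 0.2),
                     noise, noise + rng.gauss(0, 0.2), rng.gauss(0, 1)])
        labels.append(y)
    return rows, labels

rows, labels = make_demo_data()
factors = group_into_factors(list(zip(*rows)))
subset, accuracy = select_features(rows, labels, factors)
print("factors:", factors)
print("selected features:", subset, "leave-one-out accuracy:", round(accuracy, 3))
```

Restricting each chromosome to one feature per factor is one way the search space could shrink, in the spirit of advantage (ii): five raw features give 2^5 = 32 subsets, whereas the three factors in this toy example give 3 × 3 × 2 = 18 configurations. Whether ADHOC uses this encoding is not stated in the abstract.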

OSTI ID:
421318
Report Number(s):
CONF-960830-; TRN: 96:005928-0073
Resource Relation:
Conference: 2. international conference on knowledge discovery and data mining, Portland, OR (United States), 2-4 Aug 1996; Other Information: PBD: 1996; Related Information: Is Part Of Proceedings of the second international conference on knowledge discovery & data mining; Simoudis, E.; Han, J.; Fayyad, U. [eds.]; PB: 405 p.
Country of Publication:
United States
Language:
English