OSTI.GOV
U.S. Department of Energy, Office of Scientific and Technical Information

Title: Grafting: Fast, Incremental Feature Selection Using Gradient Descent in Function Space

Abstract

We present a novel and flexible approach to the problem of feature selection, called grafting. Rather than considering feature selection as separate from learning, grafting treats the selection of suitable features as an integral part of learning a predictor in a regularized learning framework. To make this regularized learning process sufficiently fast for large-scale problems, grafting operates in an incremental, iterative fashion, gradually building up a feature set while training a predictor model using gradient descent. At each iteration, a fast gradient-based heuristic is used to quickly assess which feature is most likely to improve the existing model; that feature is then added to the model, and the model is incrementally optimized using gradient descent. The algorithm scales linearly with the number of data points and at most quadratically with the number of features. Grafting can be used with a variety of predictor model classes, both linear and non-linear, and can be used for both classification and regression. Experiments are reported here on a variant of grafting for classification, using both linear and non-linear models, and using a logistic regression-inspired loss function. Results on a variety of synthetic and real world data sets are presented. Finally the relationship between grafting, stagewise additive modelling, and boosting is explored. Keywords: Feature selection, functional gradient descent, loss functions, margin space, boosting.
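To make the procedure in the abstract concrete, the following is a minimal sketch of grafting for a linear classifier with a logistic loss and an L1 penalty. The function name, the lambda-threshold stopping rule, and the use of scipy's L-BFGS-B optimizer are illustrative assumptions, not the paper's exact formulation; the loop structure (score inactive features by gradient magnitude, add the best one, re-optimize the active weights) follows the description above.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def grafting(X, y, lam=0.1):
    """Sketch of grafting: incrementally select features by gradient
    magnitude, then re-optimize the growing model.  X is (n, d); y is
    in {-1, +1}.  Objective (assumed): mean logistic loss + lam*||w||_1."""
    n, d = X.shape
    active = []              # indices of features added so far
    w = np.zeros(0)          # weights for the active features

    def penalized_loss(v):
        z = X[:, active] @ v
        return np.logaddexp(0.0, -y * z).mean() + lam * np.abs(v).sum()

    while len(active) < d:
        # Gradient of the unpenalized loss w.r.t. every feature weight,
        # evaluated at the current model (inactive weights are zero).
        z = X[:, active] @ w
        p = expit(y * z)                      # P(correct label) per point
        g = -(X.T @ (y * (1.0 - p))) / n      # d(mean loss)/d(w_j), all j
        # Fast heuristic: the inactive feature whose gradient has the
        # largest magnitude is the one most likely to improve the model.
        inactive = [j for j in range(d) if j not in active]
        j_best = max(inactive, key=lambda j: abs(g[j]))
        if abs(g[j_best]) <= lam:
            break            # the L1 penalty would pin this weight at zero
        active.append(j_best)
        w = np.append(w, 0.0)
        # Incrementally re-optimize all active weights, warm-started at the
        # previous solution (L-BFGS-B on the non-smooth L1 term is a
        # pragmatic choice for a sketch, not a recommendation).
        w = minimize(penalized_loss, w, method="L-BFGS-B").x
    return active, w

# Example usage on synthetic data where only 3 of 50 features matter:
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = np.sign(X[:, 0] + 2 * X[:, 1] - X[:, 2] + 0.1 * rng.standard_normal(200))
features, weights = grafting(X, y, lam=0.05)
print(features)   # typically recovers features 0, 1, 2 first

Because only the active weights are re-optimized at each step and each feature is scored with a single gradient evaluation, each iteration costs O(nd), which is consistent with the abstract's claim of linear scaling in the number of data points.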

Authors:
Perkins, Simon J.; Lacker, Kevin; Theiler, James
Publication Date:
July 2002
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM
Sponsoring Org.:
USDOE
OSTI Identifier:
810782
Report Number(s):
LA-UR-02-4162
Journal ID: ISSN 1532-4435
DOE Contract Number:  
W-7405-ENG-36
Resource Type:
Journal Article
Journal Name:
Journal of Machine Learning Research
Additional Journal Information:
Journal Volume: 3; Related Information: http://www.jmlr.org/papers/volume3/perkins03a/perkins03a.pdf; Journal ID: ISSN 1532-4435
Publisher:
JMLR
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE

Citation Formats

Perkins, Simon J., Lacker, Kevin, and Theiler, James. Grafting: Fast, Incremental Feature Selection Using Gradient Descent in Function Space. United States: N. p., 2002. Web.
Perkins, Simon J., Lacker, Kevin, & Theiler, James. Grafting: Fast, Incremental Feature Selection Using Gradient Descent in Function Space. United States.
Perkins, Simon J., Lacker, Kevin, and Theiler, James. 2002. "Grafting: Fast, Incremental Feature Selection Using Gradient Descent in Function Space". United States.
@article{osti_810782,
title = {Grafting: Fast, Incremental Feature Selection Using Gradient Descent in Function Space},
author = {Perkins, Simon J. and Lacker, Kevin and Theiler, James},
abstractNote = {We present a novel and flexible approach to the problem of feature selection, called grafting. Rather than considering feature selection as separate from learning, grafting treats the selection of suitable features as an integral part of learning a predictor in a regularized learning framework. To make this regularized learning process sufficiently fast for large scale problems, grafting operates in an incremental iterative fashion, gradually building up a feature set while training a predictor model using gradient descent. At each iteration, a fast gradient-based heuristic is used to quickly assess which feature is most likely to improve the existing model, that feature is then added to the model, and the model is incrementally optimized using gradient descent. The algorithm scales linearly with the number of data points and at most quadratically with the number of features. Grafting can be used with a variety of predictor model classes, both linear and non-linear, and can be used for both classification and regression. Experiments are reported here on a variant of grafting for classification, using both linear and non-linear models, and using a logistic regression-inspired loss function. Results on a variety of synthetic and real world data sets are presented. Finally the relationship between grafting, stagewise additive modelling, and boosting is explored. Keywords: Feature selection, functional gradient descent, loss functions, margin space, boosting.},
journal = {Journal of Machine Learning Research},
issn = {1532-4435},
volume = 3,
place = {United States},
year = {2002},
month = {7}
}