U.S. Department of Energy
Office of Scientific and Technical Information

Efficient Active Learning for Gaussian Process Classification by Error Reduction

Journal Article · Advances in Neural Information Processing Systems
OSTI ID:1842011
 [1];  [1];  [2];  [3];  [2]
  1. Texas A & M Univ., College Station, TX (United States)
  2. Texas A & M Univ., College Station, TX (United States); Brookhaven National Lab. (BNL), Upton, NY (United States)
  3. Brookhaven National Lab. (BNL), Upton, NY (United States)
Active learning sequentially selects the best instance for labeling by optimizing an acquisition function to enhance data/label efficiency. The selection can be made either from a discrete instance set (the pool-based scenario) or from a continuous instance space (the query synthesis scenario). In this work, we study both active learning scenarios for Gaussian Process Classification (GPC). Existing active learning strategies that maximize the Estimated Error Reduction (EER) aim to reduce the classification error after training with the newly acquired instance in a one-step-look-ahead manner. Computing EER-based acquisition functions is typically prohibitive, as it requires retraining the GPC with every candidate query. Moreover, because the EER is not smooth, it cannot be combined with gradient-based optimization techniques to efficiently explore the continuous instance space for query synthesis. To overcome these critical limitations, we develop computationally efficient algorithms for EER-based active learning with GPC. We derive the joint predictive distribution of label pairs as a one-dimensional integral, so that evaluating the acquisition function no longer requires retraining the GPC for each query, which markedly reduces the computational overhead. We also derive the gradient chain rule to efficiently calculate the gradient of the acquisition function, which yields the first query synthesis active learning algorithm implementing EER-based strategies. Our experiments clearly demonstrate the computational efficiency of the proposed algorithms. Benchmarks on both synthetic and real-world datasets show superior sampling efficiency compared with existing state-of-the-art algorithms.
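For context on the computational bottleneck described in the abstract, below is a minimal, hypothetical Python sketch of the naive one-step-look-ahead EER strategy in the pool-based scenario, written with scikit-learn's GaussianProcessClassifier. It is not the paper's method: the double loop retrains a GPC for every candidate query and hypothesized label, which is exactly the overhead the proposed algorithms avoid. The function name, kernel choice, and error estimate are illustrative assumptions.

```python
# Naive pool-based EER (one-step look-ahead) for GP classification.
# Assumes X_lab/y_lab already contain examples of both classes.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

def naive_eer_query(X_lab, y_lab, X_pool):
    """Return the pool index whose acquisition minimizes the expected error."""
    base = GaussianProcessClassifier(kernel=RBF(1.0)).fit(X_lab, y_lab)
    probs = base.predict_proba(X_pool)          # p(y | x, D) for each candidate
    best_idx, best_err = None, np.inf
    for i, x in enumerate(X_pool):
        expected_err = 0.0
        for c, p_c in zip(base.classes_, probs[i]):
            # One-step look-ahead: retrain with the hypothesized labeled pair (x, c).
            X_aug = np.vstack([X_lab, x[None, :]])
            y_aug = np.append(y_lab, c)
            model = GaussianProcessClassifier(kernel=RBF(1.0)).fit(X_aug, y_aug)
            # Expected 0/1 error, estimated over the remaining unlabeled pool.
            p_rest = model.predict_proba(np.delete(X_pool, i, axis=0))
            expected_err += p_c * np.mean(1.0 - p_rest.max(axis=1))
        if expected_err < best_err:
            best_idx, best_err = i, expected_err
    return best_idx
```

With a pool of size N, this sketch fits O(N) GP classifiers per query; the paper's contribution is an acquisition computation that avoids these retrainings and, via the gradient chain rule, also supports query synthesis over a continuous instance space.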
Research Organization:
Brookhaven National Laboratory (BNL), Upton, NY (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF)
Grant/Contract Number:
SC0012704; SC0019303
OSTI ID:
1842011
Report Number(s):
BNL-222619-2022-JAAM
Journal Information:
Advances in Neural Information Processing Systems, Vol. 34; ISSN 1049-5258
Publisher:
Association for Computing Machinery (ACM)
Country of Publication:
United States
Language:
English

Similar Records

Batch Active Learning for Multispectral and Hyperspectral Image Segmentation Using Similarity Graphs
Journal Article · Jul 20, 2023 · Communications on Applied Mathematics and Computation · OSTI ID:1991631