OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Multi-View Budgeted Learning under Label and Feature Constraints Using Label-Guided Graph-Based Regularization

Conference · OSTI ID: 1018622

Budgeted learning under constraints on both the amount of labeled information and the availability of features at test time applies to many real-world problems. Ideas from multi-view learning, semi-supervised learning, and even active learning are applicable, but a common framework whose assumptions fit these problem spaces is non-trivial to construct. We draw on graph regularizers from these fields to construct a robust framework for learning from labeled and unlabeled samples in multiple views that are non-independent and that include features inaccessible at the time the model would be applied. We describe example applications that fit this scenario, and we provide experimental results that demonstrate the effectiveness of knowledge carryover from training-only views.

As learning algorithms are applied to more complex applications, relevant information appears in a wider variety of forms, and the relationships between these information sources are often quite complex. The assumptions that underlie most learning algorithms do not readily or realistically permit the incorporation of many available data sources, despite an implicit understanding that useful information exists in them. When multiple information sources are available, they are often partially redundant, highly interdependent, and contain noise as well as information irrelevant to the problem under study. In this paper, we focus on a framework whose assumptions match this reality, as well as the reality that labeled information is usually sparse.

Most significantly, we are interested in a framework that can also leverage information in scenarios where many features that would be useful for learning a model are unavailable when the resulting model is applied. As with constraints on labels, there are many practical limitations on the acquisition of potentially useful features. A key difference in the case of feature acquisition is that the same constraints often do not apply to the training samples. This difference allows features that are impractical in an applied setting to nevertheless add value during model building. Unfortunately, few machine learning frameworks are built on assumptions that allow effective use of features available only at training time.

In this paper we formulate a knowledge carryover framework for the budgeted learning scenario with constraints on features and labels. The approach is based on multi-view and semi-supervised learning methods that use graph-encoded regularization. Our main contributions are the following: (1) we propose and justify a methodology for ensuring that changes to the graph regularizer made using alternate views are target-concept specific, allowing value to be obtained from noisy views; and (2) we demonstrate how this general set-up can be used to improve models by leveraging features unavailable at test time.

The rest of the paper is structured as follows. In Section 2, we outline real-world problems to motivate the approach and describe relevant prior work. Section 3 describes the graph construction process and the learning methodologies employed. Section 4 provides a preliminary discussion of the theoretical motivation for the method. In Section 5, the effectiveness of the approach is demonstrated in a series of experiments employing modified versions of two well-known semi-supervised learning algorithms. Section 6 concludes the paper.
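The abstract does not spell out the graph construction or the two semi-supervised algorithms that were modified, so the following is only a rough illustration of the general idea of graph-encoded regularization over multiple views, not the authors' method. It assumes an RBF affinity per view, an equal-weight average of the two affinity matrices, and a standard closed-form label propagation over the normalized graph. The function names, the synthetic data, and the parameters `gamma` and `alpha` are all hypothetical.

```python
import numpy as np

def rbf_affinity(X, gamma=1.0):
    """Dense RBF affinity matrix with a zeroed diagonal (assumed graph choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-gamma * d2)
    np.fill_diagonal(W, 0.0)
    return W

def propagate_labels(W, y, n_classes, alpha=0.9):
    """Closed-form label propagation on a symmetrically normalized graph.

    Entries of y equal to -1 mark unlabeled samples.
    """
    d = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    Y = np.zeros((len(y), n_classes))
    idx = np.flatnonzero(y >= 0)
    Y[idx, y[idx]] = 1.0
    # Solve (I - alpha * S) F = Y instead of forming an explicit inverse.
    F = np.linalg.solve(np.eye(len(y)) - alpha * S, Y)
    return F.argmax(axis=1)

# Synthetic two-class data in two views (purely illustrative).
rng = np.random.default_rng(0)
n_per = 30
classes = np.repeat([0, 1], n_per)
# View available at test time: classes overlap heavily.
X_test_view = rng.normal(loc=classes[:, None] * 1.0, scale=1.0, size=(2 * n_per, 2))
# View available only during training: classes are better separated.
X_train_only = rng.normal(loc=classes[:, None] * 4.0, scale=0.5, size=(2 * n_per, 2))

# Only two labeled samples per class; everything else is unlabeled (-1).
y = np.full(2 * n_per, -1)
labeled = [0, 1, n_per, n_per + 1]
y[labeled] = classes[labeled]

# Assumed combination rule: equal-weight average of the per-view affinities.
W = 0.5 * rbf_affinity(X_test_view) + 0.5 * rbf_affinity(X_train_only)
pred = propagate_labels(W, y, n_classes=2)
```

Under these assumptions, the training-only view influences the result only through the graph `W`. One plausible way to carry that knowledge over would be to fit a model on `X_test_view` alone using the propagated labels in `pred`, so that nothing from `X_train_only` is needed once the model is applied.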

Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization: USDOE Laboratory Directed Research and Development (LDRD) Program
DOE Contract Number: DE-AC05-00OR22725
OSTI ID: 1018622
Resource Relation: Conference: Workshop on Combining Learning Strategies to Reduce Label Cost at the 28th International Conference on Machine Learning (ICML 2011), Bellevue, WA, USA, June 28 to July 2, 2011
Country of Publication: United States
Language: English

Similar Records

Error-Bounded Graph Construction for Semi-supervised Manifold Learning
Conference · August 1, 2018 · OSTI ID:1018622

Event‐Based Training in Label‐Limited Regimes
Journal Article · January 18, 2022 · Journal of Geophysical Research. Solid Earth · OSTI ID:1018622

Semisupervised Learning for Seismic Monitoring Applications
Journal Article · October 21, 2020 · Seismological Research Letters · OSTI ID:1018622