Graph Learning in Knowledge Bases
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
The amount of text data has been growing exponentially in recent years, giving rise to automatic information extraction methods that store text annotations in a database. The output of current state-of-the-art structured prediction methods, however, is likely to contain errors, so it is important to be able to manage the overall uncertainty of the database. At the same time, the advent of crowdsourcing has enabled humans to aid machine algorithms at scale. As part of this project we introduced pi-CASTLE, a system that optimizes and integrates human and machine computing as applied to a complex structured prediction problem involving conditional random fields (CRFs). We proposed strategies grounded in information theory to select a token subset, formulate questions for the crowd to label, and integrate these labelings back into the database using a method of constrained inference. On both a text segmentation task over academic citations and a named entity recognition task over tweets, we showed an order of magnitude improvement in accuracy gain over baseline methods.
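The abstract does not say which information-theoretic criteria the system uses for token selection. As a rough illustration only, one common choice is to rank tokens by the Shannon entropy of their per-token label marginals and send the most uncertain ones to the crowd. The sketch below assumes the marginals are already available from a CRF; the function names, label names, and probabilities are hypothetical and are not taken from the report.

```python
import math


def token_entropy(marginal):
    """Shannon entropy (in bits) of one token's label distribution."""
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


def select_tokens(marginals, k):
    """Return the indices of the k tokens with the highest marginal entropy."""
    ranked = sorted(
        range(len(marginals)),
        key=lambda i: token_entropy(marginals[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical per-token label marginals, as a CRF might produce them
# for three tokens of an academic citation (assumed values).
marginals = [
    {"AUTHOR": 0.98, "TITLE": 0.01, "VENUE": 0.01},  # confident
    {"AUTHOR": 0.50, "TITLE": 0.50, "VENUE": 0.00},  # two-way tie
    {"AUTHOR": 0.40, "TITLE": 0.30, "VENUE": 0.30},  # most uncertain
]

# Indices of the two tokens that would be sent to the crowd first.
chosen = select_tokens(marginals, k=2)
```

Under this assumed criterion, crowd answers for the chosen tokens could then be treated as fixed labels when inference is rerun, which is the general idea behind constrained inference as the abstract describes it.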
- Research Organization:
- Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- DOE Contract Number:
- AC04-94AL85000
- OSTI ID:
- 1390764
- Report Number(s):
- SAND2017--9706R; 656886
- Country of Publication:
- United States
- Language:
- English