U.S. Department of Energy
Office of Scientific and Technical Information

Graph Learning in Knowledge Bases

Technical Report · DOI: https://doi.org/10.2172/1390764 · OSTI ID: 1390764
 [1];  [1]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
The amount of text data has been growing exponentially in recent years, giving rise to automatic information extraction methods that store text annotations in a database. The output of current state-of-the-art structured prediction methods, however, is likely to contain errors, so it is important to be able to manage the overall uncertainty of the database. At the same time, the advent of crowdsourcing has enabled humans to aid machine algorithms at scale. As part of this project we introduced pi-CASTLE, a system that optimizes and integrates human and machine computing as applied to a complex structured prediction problem involving conditional random fields (CRFs). We proposed strategies grounded in information theory to select a token subset, formulate questions for the crowd to label, and integrate these labelings back into the database using a method of constrained inference. On both a text segmentation task over academic citations and a named entity recognition task over tweets, we showed an order-of-magnitude improvement in accuracy gain over baseline methods.
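The abstract does not spell out the selection criteria used in pi-CASTLE. As a rough illustration of what an information-theoretic token selection step could look like, the sketch below ranks tokens by the Shannon entropy of their marginal label distributions and picks the most uncertain ones to send to the crowd. The function names, the `marginals` input, and the entropy criterion itself are assumptions made for this example, not the report's method; in practice the marginals would come from CRF inference.

```python
import math


def token_entropy(marginal):
    """Shannon entropy (in bits) of one token's label distribution."""
    return -sum(p * math.log2(p) for p in marginal if p > 0)


def select_tokens(marginals, k):
    """Return the indices of the k tokens with the highest entropy.

    `marginals` is a list with one probability distribution over labels
    per token (hypothetical input; assumed to come from a trained CRF).
    """
    ranked = sorted(
        range(len(marginals)),
        key=lambda i: token_entropy(marginals[i]),
        reverse=True,
    )
    return ranked[:k]


# Made-up marginals for a four-token sequence with three possible labels.
marginals = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # very uncertain
    [0.50, 0.50, 0.00],  # split between two labels
    [0.90, 0.05, 0.05],  # fairly confident
]

selected = select_tokens(marginals, k=2)
print(selected)
```

With these made-up numbers, the two tokens whose labels are least certain are chosen, which is where a crowd answer would be expected to reduce the database's uncertainty the most under this simplified criterion.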
Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
AC04-94AL85000
OSTI ID:
1390764
Report Number(s):
SAND2017--9706R; 656886
Country of Publication:
United States
Language:
English

Similar Records

Improve Learning from Crowds via Generative Augmentation
Conference · Sat Aug 14 00:00:00 EDT 2021 · Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining · OSTI ID:1822655

Text Mining for Process–Structure–Properties Relationships in Metals
Journal Article · Thu Sep 25 20:00:00 EDT 2025 · Integrating Materials and Manufacturing Innovation · OSTI ID:3014104

Learning from crowds with variational Gaussian processes
Journal Article · Mon Nov 19 19:00:00 EST 2018 · Pattern Recognition · OSTI ID:1488416
