OSTI.GOV: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Graph Learning in Knowledge Bases

Abstract

The amount of text data has been growing exponentially in recent years, giving rise to automatic information extraction methods that store text annotations in a database. The current state-of-the-art structured prediction methods, however, are likely to produce errors, and it is important to be able to manage the overall uncertainty of the database. On the other hand, the advent of crowdsourcing has enabled humans to aid machine algorithms at scale. As part of this project we introduced pi-CASTLE, a system that optimizes and integrates human and machine computing as applied to a complex structured prediction problem involving conditional random fields (CRFs). We proposed strategies grounded in information theory to select a token subset, formulate questions for the crowd to label, and integrate these labelings back into the database using a method of constrained inference. On both a text segmentation task over academic citations and a named entity recognition task over tweets, we showed an order of magnitude improvement in accuracy gain over baseline methods.

Authors:
Goldberg, Sean [1]; Wang, Daisy Zhe [1]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Publication Date:
2017-09-01
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1390764
Report Number(s):
SAND2017-9706R
656886
DOE Contract Number:  
AC04-94AL85000
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Goldberg, Sean, and Wang, Daisy Zhe. Graph Learning in Knowledge Bases. United States: N. p., 2017. Web. doi:10.2172/1390764.
Goldberg, Sean, & Wang, Daisy Zhe. (2017). Graph Learning in Knowledge Bases. United States. doi:10.2172/1390764.
Goldberg, Sean, and Wang, Daisy Zhe. 2017. "Graph Learning in Knowledge Bases". United States. doi:10.2172/1390764. https://www.osti.gov/servlets/purl/1390764.
@article{osti_1390764,
title = {Graph Learning in Knowledge Bases},
author = {Goldberg, Sean and Wang, Daisy Zhe},
abstractNote = {The amount of text data has been growing exponentially in recent years, giving rise to automatic information extraction methods that store text annotations in a database. The current state-of-the-art structured prediction methods, however, are likely to produce errors, and it is important to be able to manage the overall uncertainty of the database. On the other hand, the advent of crowdsourcing has enabled humans to aid machine algorithms at scale. As part of this project we introduced pi-CASTLE, a system that optimizes and integrates human and machine computing as applied to a complex structured prediction problem involving conditional random fields (CRFs). We proposed strategies grounded in information theory to select a token subset, formulate questions for the crowd to label, and integrate these labelings back into the database using a method of constrained inference. On both a text segmentation task over academic citations and a named entity recognition task over tweets, we showed an order of magnitude improvement in accuracy gain over baseline methods.},
doi = {10.2172/1390764},
place = {United States},
year = {2017},
month = {9}
}
