OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Survey of Current State of the Art Entity-Relation Extraction Tools

Technical Report · DOI: https://doi.org/10.2172/1630263 · OSTI ID: 1630263
 [1];  [1];  [1]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

In the area of information extraction from text data, a number of tools can extract entities, topics, and the relationships among them from both structured and unstructured text sources. Such information is useful in many domains; however, the methods for obtaining it are still in their early stages and have room for improvement. The topic has been explored from a research perspective by academic institutions and through formal tool development by corporations, but it has not advanced much since the early 2000s. Overall, entity extraction, and the related topic of entity linking, is common among these tools, though with varying degrees of accuracy, while relationship extraction is harder to find and appears to be limited to same-sentence analysis. In this report, we examine the leading state-of-the-art tools currently available and identify their capabilities, strengths, and weaknesses. We explore the algorithms common to the successful approaches to entity extraction and their ability to handle both structured and unstructured text data efficiently. Finally, we highlight some of the issues common to these tools and summarize the current ability to extract relationship information.

Research Organization:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
AC04-94AL85000
OSTI ID:
1630263
Report Number(s):
SAND-2020-4866; 686082
Country of Publication:
United States
Language:
English

Similar Records

Survey of Current State of the Art Entity-Relation Extraction Tools
Technical Report · July 1, 2019 · OSTI ID: 1630263

Extraction of information from unstructured text
Technical Report · November 1, 1995 · OSTI ID: 1630263

Named Entity Recognition and Normalization Applied to Large-Scale Information Extraction from the Materials Science Literature
Journal Article · July 30, 2019 · Journal of Chemical Information and Modeling · OSTI ID: 1630263