U.S. Department of Energy
Office of Scientific and Technical Information

Extraction of information from unstructured text

Technical Report · DOI: https://doi.org/10.2172/148697 · OSTI ID: 148697

Extracting information from unstructured text has received increasing emphasis in recent years because of the large amount of text now available electronically. This status report describes the findings and the work completed by the end of the first year of a two-year LDRD. The requirements included that the approach model information in a domain-independent way: unlike current systems, it would not rely on previously built domain knowledge, and it would do more than identify keywords. Three areas are discussed that are expected to contribute to a solution: (1) identifying key entities through document-level profiling and preprocessing, (2) identifying relationships between entities through sentence-level syntax, and (3) combining the first two with semantic knowledge about the terms.
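As a rough illustration of areas (1) and (2), the following minimal Python sketch profiles a document for frequent terms and then counts which of those terms appear together in a sentence. It is not the method used in the report: word frequency stands in for document-level profiling, same-sentence co-occurrence stands in for sentence-level syntax, and area (3), semantic knowledge, is omitted. The function names, the stopword list, and the sample text are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def profile_entities(text, top_n=5):
    """Stand-in for document-level profiling: the top_n most frequent terms."""
    counts = Counter(w for s in split_sentences(text) for w in tokenize(s))
    return [w for w, _ in counts.most_common(top_n)]


def cooccurring_pairs(text, entities):
    """Stand-in for sentence-level syntax: count entity pairs sharing a sentence."""
    wanted = set(entities)
    pairs = Counter()
    for sentence in split_sentences(text):
        present = sorted(wanted & set(tokenize(sentence)))
        pairs.update(combinations(present, 2))
    return pairs


if __name__ == "__main__":
    sample = "The reactor heats water. The reactor drives the turbine. The turbine spins."
    entities = profile_entities(sample, top_n=2)
    print(entities)
    print(cooccurring_pairs(sample, entities))
```

Neither step uses domain knowledge, which mirrors the domain-independence requirement, but a co-occurrence count only says that two terms are related, not how; recovering the kind of relationship is what the sentence-level syntax and semantic knowledge in the abstract are expected to provide.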

Research Organization:
Sandia National Labs., Albuquerque, NM (United States)
Sponsoring Organization:
USDOE, Washington, DC (United States)
DOE Contract Number:
AC04-94AL85000
OSTI ID:
148697
Report Number(s):
SAND--95-2532; ON: DE96003216
Country of Publication:
United States
Language:
English

Similar Records

Domain-independent information extraction in unstructured text
Technical Report · Sun Sep 01 00:00:00 EDT 1996 · OSTI ID:378821

Information Extraction from Unstructured Text for the Biodefense Knowledge Center
Conference · Fri Apr 29 00:00:00 EDT 2005 · OSTI ID:877921

Overview of the Penman text generation system
Technical Report · Thu Mar 31 23:00:00 EST 1983 · OSTI ID:6755287