skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Enhanced Named Entity Extraction via Error-Driven Aggregation

Conference ·
OSTI ID:1009212

Despite recent advances in named entity extraction technologies, state-of-the-art extraction tools achieve insufficient accuracy rates for practical use in many operational settings. However, they are not generally prone to the same types of error, suggesting that substantial improvements may be achieved via appropriate combinations of existing tools, provided their behavior can be accurately characterized and quantified. In this paper, we present an inference methodology for the aggregation of named entity extraction technologies that is founded upon a black-box analysis of their respective error processes. This method has been shown to produce statistically significant improvements in extraction relative to standard performance metrics and to mitigate the weak performance of entity extractors operating under suboptimal conditions. Moreover, this approach provides a framework for quantifying uncertainty and has demonstrated the ability to reconstruct the truth when majority voting fails.

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
1009212
Report Number(s):
LLNL-CONF-424662; TRN: US201106%%476
Resource Relation:
Conference: Presented at: International Conference on Data Mining 2010, Las Vegas, NV, United States, Jul 12 - Jul 15, 2010
Country of Publication:
United States
Language:
English