Enhanced Named Entity Extraction via Error-Driven Aggregation
Despite recent advances in named entity extraction technologies, state-of-the-art extraction tools achieve insufficient accuracy rates for practical use in many operational settings. However, they are not generally prone to the same types of error, suggesting that substantial improvements may be achieved via appropriate combinations of existing tools, provided their behavior can be accurately characterized and quantified. In this paper, we present an inference methodology for the aggregation of named entity extraction technologies that is founded upon a black-box analysis of their respective error processes. This method has been shown to produce statistically significant improvements in extraction relative to standard performance metrics and to mitigate the weak performance of entity extractors operating under suboptimal conditions. Moreover, this approach provides a framework for quantifying uncertainty and has demonstrated the ability to reconstruct the truth when majority voting fails.
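To make the idea of error-driven aggregation concrete, here is a minimal Python sketch. It is not the authors' methodology, which this record does not describe in detail. It assumes a naive-Bayes combination in which each black-box extractor has a known miss rate and false-alarm rate and the extractors err independently. The function names, the rates, and the prior are all hypothetical values chosen for illustration.

```python
import math

def posterior_entity(votes, miss_rates, false_alarm_rates, prior=0.5):
    """P(token is an entity | votes), assuming extractors err independently.

    votes[i]             -- 1 if extractor i flagged the token, else 0
    miss_rates[i]        -- assumed P(extractor i says 0 | token is an entity)
    false_alarm_rates[i] -- assumed P(extractor i says 1 | token is not an entity)
    """
    log_odds = math.log(prior / (1.0 - prior))
    for vote, miss, false_alarm in zip(votes, miss_rates, false_alarm_rates):
        if vote:
            # A flag counts for more when the extractor rarely raises false alarms.
            log_odds += math.log((1.0 - miss) / false_alarm)
        else:
            # Silence counts for less when the extractor often misses entities.
            log_odds += math.log(miss / (1.0 - false_alarm))
    return 1.0 / (1.0 + math.exp(-log_odds))

def majority_vote(votes):
    """True if strictly more than half of the extractors flagged the token."""
    return 2 * sum(votes) > len(votes)

# Hypothetical scenario: one precise extractor flags a token,
# while two extractors that frequently miss entities stay silent.
votes = [1, 0, 0]
miss_rates = [0.3, 0.6, 0.6]
false_alarm_rates = [0.01, 0.05, 0.05]

p = posterior_entity(votes, miss_rates, false_alarm_rates, prior=0.1)
print(majority_vote(votes), round(p, 3))
```

With these made-up rates, majority voting rejects the token, but the error-weighted posterior stays above 0.5 because the single positive vote comes from an extractor that rarely raises false alarms. The posterior also serves as a simple measure of uncertainty for the aggregated decision.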
- Research Organization: Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: W-7405-ENG-48
- OSTI ID: 1009212
- Report Number(s): LLNL-CONF-424662; TRN: US201106%%476
- Resource Relation: Conference: Presented at: International Conference on Data Mining 2010, Las Vegas, NV, United States, Jul 12 - Jul 15, 2010
- Country of Publication: United States
- Language: English