
Title: Information Extraction from Unstructured Text for the Biodefense Knowledge Center

Conference · OSTI ID: 877921

The Bio-Encyclopedia at the Biodefense Knowledge Center (BKC) is being constructed to allow early detection of emerging biological threats to homeland security. It requires highly structured information extracted from a variety of data sources. However, the quantity of new and vital information available from everyday sources cannot be assimilated by hand, so reliable high-throughput information extraction techniques are needed. In support of the BKC, Lawrence Livermore National Laboratory and Oak Ridge National Laboratory, together with the University of Utah, are developing an information extraction system built around the bioterrorism domain. This paper reports on two important pieces of our effort that are integrated in the system: key phrase extraction and semantic tagging. The two key phrase extraction technologies developed during the course of the project help identify relevant texts, while our state-of-the-art semantic tagging system can pinpoint phrases related to emerging biological threats. We are also enhancing and tailoring the Bio-Encyclopedia by augmenting semantic dictionaries and extracting details of important events, such as suspected disease outbreaks. Some of these technologies have already been applied to large corpora of free-text sources vital to the BKC mission, including ProMED-mail, PubMed abstracts, and the DHS's Information Analysis and Infrastructure Protection (IAIP) news clippings. To address the challenges involved in incorporating such large amounts of unstructured text, the overall system focuses on precise extraction of the most relevant information for inclusion in the BKC.

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
877921
Report Number(s):
UCRL-CONF-213354; TRN: US200608%%756
Resource Relation:
Conference: Presented at: Working Together: Research & Development Partnerships in Homeland Security, Boston, MA, United States, Apr 25 - Apr 26, 2005
Country of Publication:
United States
Language:
English
