Text-based Analytics for Biosurveillance

Charles, Lauren E.; Smith, William P.; Rounds, Jeremiah; Mendoza, Joshua A.

doi:10.1007/978-3-319-77911-9

Book · May 16, 2018

DOI: https://doi.org/10.1007/978-3-319-77911-9 · OSTI ID: 1440619

The ability to prevent, mitigate, or control a biological threat depends on how quickly the threat is identified and characterized. Ensuring the timely delivery of data and analytics is an essential aspect of providing adequate situational awareness in the face of a disease outbreak. This chapter outlines an analytic pipeline for supporting an advanced early warning system that can integrate multiple data sources and provide situational awareness of potential and occurring disease situations. The pipeline includes real-time automated data analysis founded on natural language processing (NLP), semantic concept matching, and machine learning techniques to enrich content with metadata related to biosurveillance. Online news articles are presented as an example use case for the pipeline, but the processes can be generalized to any textual data. The chapter briefly discusses the mechanics of a streaming pipeline and the major steps required to provide targeted situational awareness. The text-based analytic pipeline includes various processing steps, among them identifying an article's relevance to biosurveillance (e.g., a relevance algorithm) and extracting article features (who, what, where, why, how, and when).
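
As a rough illustration of the kind of enrichment the abstract describes (scoring an article's relevance, then attaching extracted features as metadata), the following is a minimal sketch only. It is not the chapter's method: the term lists, the `EnrichedArticle` type, the function names, and the 0.5 threshold are all hypothetical, and simple keyword matching stands in for the NLP, semantic concept matching, and machine learning steps.

```python
import re
from dataclasses import dataclass, field

# Hypothetical concept vocabularies (assumptions for illustration only);
# a real system would rely on curated ontologies and trained models.
DISEASE_TERMS = {"cholera", "ebola", "influenza", "measles"}
EVENT_TERMS = {"outbreak", "epidemic", "cases", "infected", "quarantine"}
LOCATION_TERMS = {"kenya", "brazil", "vietnam"}


@dataclass
class EnrichedArticle:
    text: str
    relevance: float = 0.0
    metadata: dict = field(default_factory=dict)


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_relevance(tokens):
    """Toy relevance score: fraction of the two concept groups
    (disease, event) that appear at least once in the article."""
    has_disease = any(t in DISEASE_TERMS for t in tokens)
    has_event = any(t in EVENT_TERMS for t in tokens)
    return (has_disease + has_event) / 2


def extract_features(tokens):
    """Toy feature extraction covering only 'what' and 'where'."""
    unique = set(tokens)
    return {
        "what": sorted(unique & DISEASE_TERMS),
        "where": sorted(unique & LOCATION_TERMS),
    }


def enrich(text, threshold=0.5):
    """Score an article and attach metadata only if it looks relevant."""
    tokens = tokenize(text)
    article = EnrichedArticle(text=text, relevance=score_relevance(tokens))
    if article.relevance >= threshold:
        article.metadata = extract_features(tokens)
    return article
```

Under these assumptions, calling `enrich` on a sentence that mentions both a listed disease and a listed event term yields a relevance of 1.0 and populated `what` and `where` fields, while unrelated text scores 0.0 and receives no metadata.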

Research Organization: Pacific Northwest National Laboratory (PNNL), Richland, WA (US)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-76RL01830

OSTI ID: 1440619

Report Number(s): PNNL-SA-126938

Country of Publication: United States

Language: English

Similar Records

Machine Learning for Identifying Relevance to Biosurveillance in Multilingual Text

Journal Article · Tue May 22 00:00:00 EDT 2018 · Online Journal of Public Health Informatics · OSTI ID:1629052

An Overview of Internet biosurveillance

Journal Article · Fri Jun 21 00:00:00 EDT 2013 · Clinical Microbiology and Infection, 19(11):1006-1013 · OSTI ID:1132700

The Landscape of International Biosurveillance

Journal Article · Sun Jan 31 23:00:00 EST 2010 · Emerging Health Threats Journal, 3:e3 · OSTI ID:992356

Related Subjects

Biomedical NLP
Machine Learning
Ontology
Text Analytics
biosurveillance
feature extraction
