OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A novel web informatics approach for automated surveillance of cancer mortality trends

Abstract

Cancer surveillance data are collected every year in the United States via the National Program of Cancer Registries (NPCR) and the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute (NCI). General trends are closely monitored to measure the nation’s progress against cancer. The objective of this study was to apply a novel web informatics approach for enabling fully automated monitoring of cancer mortality trends. The approach involves automated collection and text mining of online obituaries to derive the age distribution, geospatial, and temporal trends of cancer deaths in the US. Using breast and lung cancer as examples, we mined 23,850 cancer-related and 413,024 general online obituaries spanning the timeframe 2008–2012. There was high correlation between the web-derived mortality trends and the official surveillance statistics reported by NCI with respect to the age distribution (ρ = 0.981 for breast; ρ = 0.994 for lung), the geospatial distribution (ρ = 0.939 for breast; ρ = 0.881 for lung), and the annual rates of cancer deaths (ρ = 0.661 for breast; ρ = 0.839 for lung). Additional experiments investigated the effect of sample size on the consistency of the web-based findings. Altogether, our study findings support web informatics as a promising, cost-effective way to dynamically monitor spatiotemporal cancer mortality trends.
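
The comparison reported in the abstract, a correlation ρ between a web-derived distribution and the official one, can be illustrated with a small sketch. The abstract does not say which correlation coefficient was used; the code below assumes a Spearman rank correlation purely for illustration. The function names and the two toy lists of age-group shares are hypothetical and are not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant inputs)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy shares of deaths per age group (NOT data from the study).
web_share = [0.02, 0.05, 0.12, 0.24, 0.31, 0.26]
official_share = [0.01, 0.06, 0.14, 0.22, 0.27, 0.30]

rho = spearman_rho(web_share, official_share)
print(f"rho = {rho:.3f}")
```

The same function could be applied to per-state counts or annual rates; only the paired input sequences change. A rank-based coefficient is used here because it does not depend on the two sources sharing a scale, which is one plausible reason to prefer it when comparing obituary counts with official statistics.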

Authors:
Tourassi, Georgia [1]; Yoon, Hong-Jun [1]; Xu, Songhua [2]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  2. New Jersey Institute of Technology, Newark, NJ (United States)
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
Work for Others (WFO); USDOE
OSTI Identifier:
1361096
Alternate Identifier(s):
OSTI ID: 1334473; OSTI ID: 1342973
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Journal Article: Published Article
Journal Name:
Journal of Biomedical Informatics
Additional Journal Information:
Journal Volume: 61; Journal Issue: C; Journal ID: ISSN 1532-0464
Country of Publication:
United States
Language:
English
Subject:
web informatics; web mining; digital epidemiology; cancer mortality; breast cancer; lung cancer

Citation Formats

Tourassi, Georgia, Yoon, Hong-Jun, and Xu, Songhua. A novel web informatics approach for automated surveillance of cancer mortality trends. United States: N. p., 2016. Web. doi:10.1016/j.jbi.2016.03.027.
Tourassi, Georgia, Yoon, Hong-Jun, & Xu, Songhua. A novel web informatics approach for automated surveillance of cancer mortality trends. United States. doi:10.1016/j.jbi.2016.03.027.
Tourassi, Georgia, Yoon, Hong-Jun, and Xu, Songhua. Fri . "A novel web informatics approach for automated surveillance of cancer mortality trends". United States. doi:10.1016/j.jbi.2016.03.027.
@article{osti_1361096,
title = {A novel web informatics approach for automated surveillance of cancer mortality trends},
author = {Tourassi, Georgia and Yoon, Hong-Jun and Xu, Songhua},
abstractNote = {Cancer surveillance data are collected every year in the United States via the National Program of Cancer Registries (NPCR) and the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute (NCI). General trends are closely monitored to measure the nation’s progress against cancer. The objective of this study was to apply a novel web informatics approach for enabling fully automated monitoring of cancer mortality trends. The approach involves automated collection and text mining of online obituaries to derive the age distribution, geospatial, and temporal trends of cancer deaths in the US. Using breast and lung cancer as examples, we mined 23,850 cancer-related and 413,024 general online obituaries spanning the timeframe 2008–2012. There was high correlation between the web-derived mortality trends and the official surveillance statistics reported by NCI with respect to the age distribution (ρ = 0.981 for breast; ρ = 0.994 for lung), the geospatial distribution (ρ = 0.939 for breast; ρ = 0.881 for lung), and the annual rates of cancer deaths (ρ = 0.661 for breast; ρ = 0.839 for lung). Additional experiments investigated the effect of sample size on the consistency of the web-based findings. Altogether, our study findings support web informatics as a promising, cost-effective way to dynamically monitor spatiotemporal cancer mortality trends.},
doi = {10.1016/j.jbi.2016.03.027},
journal = {Journal of Biomedical Informatics},
issn = {1532-0464},
number = {C},
volume = 61,
place = {United States},
year = {2016},
month = {4}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record at 10.1016/j.jbi.2016.03.027

Citation Metrics:
Cited by: 2 works
Citation information provided by
Web of Science