DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Global disease monitoring and forecasting with Wikipedia

Abstract

Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically-focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data, such as social media and search queries, are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof-of-concept yields models with r² up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art.

Authors:
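 Generous, Nicholas [1]; Fairchild, Geoffrey [1]; Deshpande, Alina [1]; Del Valle, Sara Y. [1]; Priedhorsky, Reid [1]; Salathé, Marcel [1]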
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
2014-11-13
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1214710
Grant/Contract Number:  
AC52-06NA25396
Resource Type:
Accepted Manuscript
Journal Name:
PLoS Computational Biology (Online)
Additional Journal Information:
Journal Name: PLoS Computational Biology (Online); Journal Volume: 10; Journal Issue: 11; Journal ID: ISSN 1553-7358
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; online encyclopedias; forecasting; influenza; disease surveillance; language; internet; infectious disease surveillance; machine learning algorithms

Citation Formats

Generous, Nicholas, Fairchild, Geoffrey, Deshpande, Alina, Del Valle, Sara Y., Priedhorsky, Reid, and Salathé, Marcel. Global disease monitoring and forecasting with Wikipedia. United States: N. p., 2014. Web. doi:10.1371/journal.pcbi.1003892.
Generous, Nicholas, Fairchild, Geoffrey, Deshpande, Alina, Del Valle, Sara Y., Priedhorsky, Reid, & Salathé, Marcel. Global disease monitoring and forecasting with Wikipedia. United States. https://doi.org/10.1371/journal.pcbi.1003892
Generous, Nicholas, Fairchild, Geoffrey, Deshpande, Alina, Del Valle, Sara Y., Priedhorsky, Reid, and Salathé, Marcel. 2014. "Global disease monitoring and forecasting with Wikipedia". United States. https://doi.org/10.1371/journal.pcbi.1003892. https://www.osti.gov/servlets/purl/1214710.
@article{osti_1214710,
title = {Global disease monitoring and forecasting with Wikipedia},
author = {Generous, Nicholas and Fairchild, Geoffrey and Deshpande, Alina and Del Valle, Sara Y. and Priedhorsky, Reid and Salathé, Marcel},
abstractNote = {Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically-focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data, such as social media and search queries, are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof-of-concept yields models with $r^2$ up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art.},
doi = {10.1371/journal.pcbi.1003892},
journal = {PLoS Computational Biology (Online)},
number = 11,
volume = 10,
place = {United States},
year = {2014},
month = {11}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 95 works
Citation information provided by
Web of Science

  • DOI: 10.3390/vetsci6020040

Evolution of Wikipedia’s medical content: past, present and future
text, January 2021


Forecasting the 2013–2014 Influenza Season using Wikipedia
text, January 2014


Combining Search, Social Media, and Traditional Data Sources to Improve Influenza Surveillance
text, January 2015


Digital Pharmacovigilance and Disease Surveillance: Combining Traditional and Big-Data Systems for Better Public Health
journal, November 2016


Using internet search data to predict new HIV diagnoses in China: a modelling study
journal, October 2018


Enhancing disease surveillance with novel data streams: challenges and opportunities
journal, October 2015


Clinical Age-Specific Seasonal Conjunctivitis Patterns and Their Online Detection in Twitter, Blog, Forum, and Comment Social Media Posts
journal, February 2018

  • Deiner, Michael S.; McLeod, Stephen D.; Chodosh, James
  • Investigative Ophthalmology & Visual Science, Vol. 59, Issue 2
  • DOI: 10.1167/iovs.17-22818

Using electronic health records and Internet search information for accurate influenza forecasting
journal, May 2017

  • Yang, Shihao; Santillana, Mauricio; Brownstein, John S.
  • BMC Infectious Diseases, Vol. 17, Issue 1
  • DOI: 10.1186/s12879-017-2424-7

Automated Real-Time Collection of Pathogen-Specific Diagnostic Data: Syndromic Infectious Disease Epidemiology
journal, April 2018

  • Meyers, Lindsay; Ginocchio, Christine C.; Faucett, Aimie N.
  • JMIR Public Health and Surveillance, Vol. 4, Issue 3
  • DOI: 10.2196/preprints.9876.a

Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies
journal, January 2018

  • Park, Hyeoun-Ae; Jung, Hyesil; On, Jeongah
  • Healthcare Informatics Research, Vol. 24, Issue 4
  • DOI: 10.4258/hir.2018.24.4.253

Enhancement of Epidemiological Models for Dengue Fever Based on Twitter Data
preprint, January 2017


Epidemiological data challenges: planning for a more robust future through data standards
text, January 2018


Collective response to the media coverage of COVID-19 Pandemic on Reddit and Wikipedia
preprint, January 2020


Does the blue bird get the flu? using Twitter for flu surveillance
text, January 2017


Design Choices for Automated Disease Surveillance in the Social Web
journal, September 2018

  • Magumba, Mark Abraham; Nabende, Peter; Mwebaze, Ernest
  • Online Journal of Public Health Informatics, Vol. 10, Issue 2
  • DOI: 10.5210/ojphi.v10i2.9312