DOE PAGES
U.S. Department of Energy
Office of Scientific and Technical Information

Title: Zika discourse in the Americas: A multilingual topic analysis of Twitter

Abstract

This work examines Twitter discussion surrounding the 2015 outbreak of Zika, a virus that is most often mild but has been associated with serious birth defects and neurological syndromes. We introduce and analyze a collection of 3.9 million tweets mentioning Zika geolocated to North and South America, where the virus is most prevalent. Using a multilingual topic model, we automatically identify and extract the key topics of discussion across the dataset in English, Spanish, and Portuguese. We examine the variation in Twitter activity across time and location, finding that rises in activity tend to follow major events, and geographic rates of Zika-related discussion are moderately correlated with Zika incidence (ρ = .398).

Authors:
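Pruss, Dasha [1]; Daughton, Ashlynn Rae [2]; Paul, Michael J. [3]; Fujinuma, Yoshinari [3]; Arnot, Brad [3]; Szafir, Danielle Albers [3]; Boyd-Graber, Jordan [4]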
  1. Univ. of Pittsburgh, PA (United States)
  2. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  3. Univ. of Colorado, Boulder, CO (United States)
  4. Univ. of Maryland, College Park, MD (United States)
Publication Date:
2019-05-23
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1526952
Report Number(s):
LA-UR-18-25885
Journal ID: ISSN 1932-6203
Grant/Contract Number:  
89233218CNA000001
Resource Type:
Accepted Manuscript
Journal Name:
PLoS ONE
Additional Journal Information:
Journal Volume: 14; Journal Issue: 5; Journal ID: ISSN 1932-6203
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES; Biological Science; Information Science

Citation Formats

Pruss, Dasha, Daughton, Ashlynn Rae, Paul, Michael J., Fujinuma, Yoshinari, Arnot, Brad, Szafir, Danielle Albers, and Boyd-Graber, Jordan. Zika discourse in the Americas: A multilingual topic analysis of Twitter. United States: N. p., 2019. Web. doi:10.1371/journal.pone.0216922.
Pruss, Dasha, Daughton, Ashlynn Rae, Paul, Michael J., Fujinuma, Yoshinari, Arnot, Brad, Szafir, Danielle Albers, & Boyd-Graber, Jordan. Zika discourse in the Americas: A multilingual topic analysis of Twitter. United States. https://doi.org/10.1371/journal.pone.0216922
Pruss, Dasha, Daughton, Ashlynn Rae, Paul, Michael J., Fujinuma, Yoshinari, Arnot, Brad, Szafir, Danielle Albers, and Boyd-Graber, Jordan. 2019. "Zika discourse in the Americas: A multilingual topic analysis of Twitter". United States. https://doi.org/10.1371/journal.pone.0216922. https://www.osti.gov/servlets/purl/1526952.
@article{osti_1526952,
title = {Zika discourse in the Americas: A multilingual topic analysis of Twitter},
author = {Pruss, Dasha and Daughton, Ashlynn Rae and Paul, Michael J. and Fujinuma, Yoshinari and Arnot, Brad and Szafir, Danielle Albers and Boyd-Graber, Jordan},
abstractNote = {This work examines Twitter discussion surrounding the 2015 outbreak of Zika, a virus that is most often mild but has been associated with serious birth defects and neurological syndromes. We introduce and analyze a collection of 3.9 million tweets mentioning Zika geolocated to North and South America, where the virus is most prevalent. Using a multilingual topic model, we automatically identify and extract the key topics of discussion across the dataset in English, Spanish, and Portuguese. We examine the variation in Twitter activity across time and location, finding that rises in activity tend to follow major events, and geographic rates of Zika-related discussion are moderately correlated with Zika incidence (ρ = .398).},
doi = {10.1371/journal.pone.0216922},
journal = {PLoS ONE},
number = 5,
volume = 14,
place = {United States},
year = {2019},
month = {5}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 20 works
Citation information provided by
Web of Science

Figures / Tables:

Table 1: The number of tweets from each country or territory in our Americas dataset, along with the percentage in each language.

Works referencing / citing this record:

Public Sphere in Crisis Mode: How the COVID-19 Pandemic Influenced Public Discourse and User Behaviour in the Swiss Twitter-sphere
journal, April 2021


Figures/Tables have been extracted from DOE-funded journal article accepted manuscripts.