Mining and Validating Social Media Data for COVID-19–Related Human Behaviors Between January and July 2020: Infodemiology Study
Abstract
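Background: Health authorities can minimize the impact of an emergent infectious disease outbreak through effective and timely risk communication, which can build trust and adherence to subsequent behavioral messaging. Monitoring the psychological impacts of an outbreak, as well as public adherence to such messaging, is also important for minimizing long-term effects of an outbreak. Objective: We used social media data from Twitter to identify human behaviors relevant to COVID-19 transmission, as well as the perceived impacts of COVID-19 on individuals, as a first step toward real-time monitoring of public perceptions to inform public health communications. Methods: We developed a coding schema for 6 categories and 11 subcategories, which included both a wide range of behaviors as well as codes focused on the impacts of the pandemic (eg, economic and mental health impacts). We used this schema to develop training data and supervised learning classifiers for classes with sufficient labels. Classifiers that performed adequately were applied to our remaining corpus, and temporal and geospatial trends were assessed. We compared the classified patterns to ground truth mobility data and actual COVID-19 confirmed cases to assess the signal achieved here. Results: We applied our labeling schema to approximately 7200 tweets. The worst-performing classifiers had F1 scores of only 0.18 to 0.28 when trying to identify tweets about monitoring symptoms and testing. Classifiers about social distancing, however, were much stronger, with F1 scores of 0.64 to 0.66. We applied the social distancing classifiers to over 228 million tweets. We showed temporal patterns consistent with real-world events, and we showed correlations of up to –0.5 between social distancing signals on Twitter and ground truth mobility throughout the United States. Conclusions: Behaviors discussed on Twitter are exceptionally varied. Twitter can provide useful information for parameterizing models that incorporate human behavior, as well as for informing public health communication strategies by describing awareness of and compliance with suggested behaviors.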
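The Methods and Results above rest on two standard measures: the F1 score used to judge each classifier, and the correlation between a Twitter-derived signal and mobility data. The following is a minimal sketch of both, not the authors' pipeline. It assumes binary labels (1 = tweet is about the behavior) and Pearson correlation; the record does not state which correlation coefficient the study used. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from math import sqrt


def f1_score(true_labels, predicted_labels):
    """F1 for the positive class (label 1) of a binary classifier."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data, NOT taken from the study.
true_labels = [1, 1, 1, 0, 0, 1, 0, 0]       # human-coded labels
predicted_labels = [1, 0, 1, 0, 1, 1, 0, 0]  # classifier output
f1 = f1_score(true_labels, predicted_labels)

# Hypothetical daily share of tweets classified as social distancing,
# paired with a hypothetical mobility index for the same days.
distancing_share = [0.1, 0.2, 0.3, 0.4]
mobility_index = [4.0, 3.0, 2.0, 1.0]
r = pearson_r(distancing_share, mobility_index)

print(f"F1 = {f1:.2f}")
print(f"Pearson r = {r:.2f}")
```

A negative coefficient means that mobility falls as discussion of social distancing rises. The toy series is perfectly linear by construction, so its correlation is far stronger than the values of up to –0.5 reported in this record.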
- Authors:
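- Daughton, Ashlynn R.; Shelley, Courtney D.; Barnard, Martha; Gerts, Dax; Watson Ross, Chrysm; Crooker, Isabel; Nadiga, Gopal; Mukundan, Nilesh; Vaquera Chavez, Nidia Yadira; Parikh, Nidhi; Pitts, Travis; Fairchild, Geoffrey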
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Univ. of New Mexico, Albuquerque, NM (United States)
- Publication Date:
- 2021-05-25
- Research Org.:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Org.:
- USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA)
- OSTI Identifier:
- 1827580
- Report Number(s):
- LA-UR-21-20074
- Journal ID: ISSN 1438-8871
- Grant/Contract Number:
- 89233218CNA000001
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Journal of Medical Internet Research
- Additional Journal Information:
- Journal Volume: 23; Journal Issue: 5; Journal ID: ISSN 1438-8871
- Publisher:
- JMIR Publications
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 60 APPLIED LIFE SCIENCES; information science; Twitter; social media; human behavior; infectious disease; COVID-19; coronavirus; infodemiology; infoveillance; social distancing; shelter-in-place; mobility; COVID-19 intervention
Citation Formats
Daughton, Ashlynn R., Shelley, Courtney D., Barnard, Martha, Gerts, Dax, Watson Ross, Chrysm, Crooker, Isabel, Nadiga, Gopal, Mukundan, Nilesh, Vaquera Chavez, Nidia Yadira, Parikh, Nidhi, Pitts, Travis, and Fairchild, Geoffrey. Mining and Validating Social Media Data for COVID-19–Related Human Behaviors Between January and July 2020: Infodemiology Study. United States: N. p., 2021. Web. doi:10.2196/27059.
Daughton, Ashlynn R., Shelley, Courtney D., Barnard, Martha, Gerts, Dax, Watson Ross, Chrysm, Crooker, Isabel, Nadiga, Gopal, Mukundan, Nilesh, Vaquera Chavez, Nidia Yadira, Parikh, Nidhi, Pitts, Travis, & Fairchild, Geoffrey. Mining and Validating Social Media Data for COVID-19–Related Human Behaviors Between January and July 2020: Infodemiology Study. United States. https://doi.org/10.2196/27059
Daughton, Ashlynn R., Shelley, Courtney D., Barnard, Martha, Gerts, Dax, Watson Ross, Chrysm, Crooker, Isabel, Nadiga, Gopal, Mukundan, Nilesh, Vaquera Chavez, Nidia Yadira, Parikh, Nidhi, Pitts, Travis, and Fairchild, Geoffrey. 2021. "Mining and Validating Social Media Data for COVID-19–Related Human Behaviors Between January and July 2020: Infodemiology Study". United States. https://doi.org/10.2196/27059. https://www.osti.gov/servlets/purl/1827580.
@article{osti_1827580,
title = {Mining and Validating Social Media Data for COVID-19–Related Human Behaviors Between January and July 2020: Infodemiology Study},
author = {Daughton, Ashlynn R. and Shelley, Courtney D. and Barnard, Martha and Gerts, Dax and Watson Ross, Chrysm and Crooker, Isabel and Nadiga, Gopal and Mukundan, Nilesh and Vaquera Chavez, Nidia Yadira and Parikh, Nidhi and Pitts, Travis and Fairchild, Geoffrey},
abstractNote = {Background: Health authorities can minimize the impact of an emergent infectious disease outbreak through effective and timely risk communication, which can build trust and adherence to subsequent behavioral messaging. Monitoring the psychological impacts of an outbreak, as well as public adherence to such messaging, is also important for minimizing long-term effects of an outbreak. Objective: We used social media data from Twitter to identify human behaviors relevant to COVID-19 transmission, as well as the perceived impacts of COVID-19 on individuals, as a first step toward real-time monitoring of public perceptions to inform public health communications. Methods: We developed a coding schema for 6 categories and 11 subcategories, which included both a wide range of behaviors as well as codes focused on the impacts of the pandemic (eg, economic and mental health impacts). We used this schema to develop training data and supervised learning classifiers for classes with sufficient labels. Classifiers that performed adequately were applied to our remaining corpus, and temporal and geospatial trends were assessed. We compared the classified patterns to ground truth mobility data and actual COVID-19 confirmed cases to assess the signal achieved here. Results: We applied our labeling schema to approximately 7200 tweets. The worst-performing classifiers had F1 scores of only 0.18 to 0.28 when trying to identify tweets about monitoring symptoms and testing. Classifiers about social distancing, however, were much stronger, with F1 scores of 0.64 to 0.66. We applied the social distancing classifiers to over 228 million tweets. We showed temporal patterns consistent with real-world events, and we showed correlations of up to –0.5 between social distancing signals on Twitter and ground truth mobility throughout the United States. Conclusions: Behaviors discussed on Twitter are exceptionally varied. Twitter can provide useful information for parameterizing models that incorporate human behavior, as well as for informing public health communication strategies by describing awareness of and compliance with suggested behaviors.},
doi = {10.2196/27059},
journal = {Journal of Medical Internet Research},
number = 5,
volume = 23,
place = {United States},
year = {2021},
month = {5}
}