Title: Forecasting influenza-like illness dynamics for military populations using neural networks and social media

This work is the first to take advantage of recurrent neural networks to predict influenza-like illness (ILI) dynamics from various linguistic signals extracted from social media data. Unlike other approaches that rely on time series analysis of historical ILI data [1, 2] and state-of-the-art machine learning models [3, 4], we build and evaluate the predictive power of Long Short-Term Memory (LSTM) architectures capable of nowcasting (predicting in "real-time") and forecasting (predicting the future) ILI dynamics in the 2011–2014 influenza seasons. To build our models, we integrate information that people post on social media, e.g., topics, stylistic and syntactic patterns, emotions and opinions, and communication behavior. We then quantitatively evaluate the predictive power of different social media signals and contrast the performance of state-of-the-art regression models with that of neural networks. Finally, we combine ILI and social media signals to build joint neural network models for ILI dynamics prediction. Unlike the majority of existing work, we focus on developing models for local rather than national ILI surveillance [1], and for military rather than general populations [3], in 26 U.S. and six international locations. Our approach demonstrates several advantages: (a) Neural network models learned from social media data yield the best performance compared to previously used regression models. (b) Previously under-explored language and communication behavior features are more predictive of ILI dynamics than syntactic and stylistic signals expressed in social media. (c) Neural network models learned exclusively from social media signals yield performance comparable to or better than that of models learned from historical ILI data; thus, signals from social media can potentially be used to accurately forecast ILI dynamics for regions where historical ILI data is not available. (d) Neural network models learned from combined ILI and social media signals significantly outperform models that rely solely on historical ILI data, which adds to the great potential of alternative public sources for ILI dynamics prediction. (e) Location-specific models outperform previously used location-independent models, e.g., U.S. only. (f) Prediction results vary significantly across geolocations depending on the amount of social media data available and ILI activity patterns.
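The distinction the abstract draws between nowcasting and forecasting, and the idea of combining ILI and social media signals into one model input, can be illustrated with a small data-preparation sketch. This is not the authors' pipeline: the function name `make_windows`, the `lookback` and `horizon` parameters, the choice to pair each week's social media features with the previous week's ILI value, and all numbers are assumptions made for illustration. The output is a list of fixed-length weekly sequences of the kind a recurrent model such as an LSTM could consume.

```python
from typing import List, Sequence, Tuple


def make_windows(
    ili: Sequence[float],
    social: Sequence[Sequence[float]],
    lookback: int,
    horizon: int,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Build (input window, target) pairs from aligned weekly series.

    ili[t]    : ILI rate for week t (hypothetical values).
    social[t] : social media feature vector for week t (hypothetical).
    horizon=0 : nowcasting, the target is ILI in the window's last week.
    horizon=k : forecasting, the target is ILI k weeks after the window.
    """
    if len(ili) != len(social):
        raise ValueError("ili and social must cover the same weeks")
    if lookback < 1 or horizon < 0:
        raise ValueError("lookback must be >= 1 and horizon >= 0")

    inputs: List[List[List[float]]] = []
    targets: List[float] = []
    # 'end' is the last week of the input window. It starts at 'lookback'
    # so that every week in the window has a previous-week ILI value.
    for end in range(lookback, len(ili) - horizon):
        window = []
        for t in range(end - lookback + 1, end + 1):
            # Assumption: week t's ILI value is not yet known at week t,
            # so each step uses the previous week's ILI plus week t's
            # social media features. This keeps the target out of the input.
            window.append([ili[t - 1]] + list(social[t]))
        inputs.append(window)
        targets.append(ili[end + horizon])
    return inputs, targets


# Toy data: five weeks, one social media feature per week.
ili_rates = [1.0, 2.0, 3.0, 4.0, 5.0]
social_features = [[10.0], [20.0], [30.0], [40.0], [50.0]]

nowcast_x, nowcast_y = make_windows(ili_rates, social_features, lookback=2, horizon=0)
forecast_x, forecast_y = make_windows(ili_rates, social_features, lookback=2, horizon=1)
```

Dropping the ILI column from each step would correspond to a model learned exclusively from social media signals, and dropping the social media columns would correspond to one learned from historical ILI data alone.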
  1. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Publication Date:
Report Number(s):
Journal ID: ISSN 1932-6203; 453040142
Grant/Contract Number:
Accepted Manuscript
Journal Name:
Additional Journal Information:
Journal Volume: 12; Journal Issue: 12; Journal ID: ISSN 1932-6203
Public Library of Science
Research Org:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org:
Country of Publication:
United States
60 APPLIED LIFE SCIENCES; neural networks; social media; forecasting influenza dynamics; natural language processing; machine learning
OSTI Identifier: