Title: Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter

Abstract

Pew research polls report 62 percent of U.S. adults get news on social media (Gottfried and Shearer, 2016). In a December poll, 64 percent of U.S. adults said that “made-up news” has caused a “great deal of confusion” about the facts of current events (Barthel et al., 2016). Fabricated stories spread in social media, ranging from deliberate propaganda to hoaxes and satire, contribute to this confusion in addition to having serious effects on global stability. In this work we build predictive models to classify 130 thousand news tweets as suspicious or verified, and predict four subtypes of suspicious news – satire, hoaxes, clickbait and propaganda. We demonstrate that neural network models trained on tweet content and social network interactions outperform lexical models. Unlike previous work on deception detection, we find that adding syntax and grammar features to our models decreases performance. Incorporating linguistic features, including bias and subjectivity, improves classification results; however, social interaction features are most informative for finer-grained separation between our four types of suspicious news posts.

Authors:
Volkova, Svitlana; Shaffer, Kyle J.; Jang, Jin Yea; Hodas, Nathan O.
Publication Date:
2017-07-30
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1373869
Report Number(s):
PNNL-SA-123856
453040300
DOE Contract Number:  
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Conference: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, July 30-August 4, 2017, Vancouver, BC, Canada, 2:647-653; Paper No. 10.18653/v1/P17-2102
Country of Publication:
United States
Language:
English
Subject:
neural networks; linguistic models; social graph embeddings; classification

Citation Formats

Volkova, Svitlana, Shaffer, Kyle J., Jang, Jin Yea, and Hodas, Nathan O. Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter. United States: N. p., 2017. Web. doi:10.18653/v1/P17-2102.
Volkova, Svitlana, Shaffer, Kyle J., Jang, Jin Yea, & Hodas, Nathan O. Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter. United States. doi:10.18653/v1/P17-2102.
Volkova, Svitlana, Shaffer, Kyle J., Jang, Jin Yea, and Hodas, Nathan O. 2017. "Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter". United States. doi:10.18653/v1/P17-2102.
@article{osti_1373869,
title = {Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter},
author = {Volkova, Svitlana and Shaffer, Kyle J. and Jang, Jin Yea and Hodas, Nathan O.},
abstractNote = {Pew research polls report 62 percent of U.S. adults get news on social media (Gottfried and Shearer, 2016). In a December poll, 64 percent of U.S. adults said that “made-up news” has caused a “great deal of confusion” about the facts of current events (Barthel et al., 2016). Fabricated stories spread in social media, ranging from deliberate propaganda to hoaxes and satire, contribute to this confusion in addition to having serious effects on global stability. In this work we build predictive models to classify 130 thousand news tweets as suspicious or verified, and predict four subtypes of suspicious news – satire, hoaxes, clickbait and propaganda. We demonstrate that neural network models trained on tweet content and social network interactions outperform lexical models. Unlike previous work on deception detection, we find that adding syntax and grammar features to our models decreases performance. Incorporating linguistic features, including bias and subjectivity, improves classification results; however, social interaction features are most informative for finer-grained separation between our four types of suspicious news posts.},
doi = {10.18653/v1/P17-2102},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2017},
month = {jul}
}
