Understanding and Explicitly Measuring Linguistic and Stylistic Properties of Deception via Generation and Translation

Conference · December 23, 2020

OSTI ID: 1866984

Saldanha, Emily G. [1]; Garimella, Aparna [1]; Volkova, Svitlana [1]

[1] BATTELLE (PACIFIC NW LAB)

Large-scale digital disinformation is one of the main risks facing modern society. Hundreds of models and linguistic analyses have been developed to compare and contrast misleading and credible content online. However, most models do not control for the confounding factor of topic or narrative during training, so they learn a clear topical separation between misleading and credible content. We study the feasibility of two strategies for disentangling topic bias from the models in order to understand and explicitly measure the linguistic and stylistic properties of misleading versus credible content. First, we develop conditional generative models to create news content that is characteristic of different credibility levels. Using machine translation metrics and classification models, we perform a multi-dimensional evaluation of how well the models mimic both the stylistic and linguistic differences that distinguish news of different credibility levels. We show that even though generative models are able to imitate both the style and language of the original content, additional conditioning on both the news category and the topic leads to reduced performance. Second, we perform deception style "transfer" by translating deceptive content into the style of credible content and vice versa. Extending earlier studies, we demonstrate that, when conditioned on a topic, deceptive content is shorter, less readable, more biased, and more subjective than credible content, and that transferring style from deceptive to credible content is more challenging than the reverse direction.
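As an illustration of what explicitly measuring properties such as length and readability can look like, the following is a minimal sketch and not the authors' implementation. It assumes the Flesch Reading Ease score as the readability measure and a rough vowel-group heuristic for counting syllables; the function names `count_syllables`, `style_features`, and `style_gap`, and the two sample sentences, are hypothetical and invented for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per vowel group, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def style_features(text: str) -> dict:
    """Compute simple length and readability features for one text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_sent = max(1, len(sentences))   # guard against division by zero
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    # Flesch Reading Ease: higher scores indicate easier-to-read text.
    flesch = 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (syllables / n_words)
    return {
        "words": len(words),
        "sentences": len(sentences),
        "flesch_reading_ease": flesch,
    }


def style_gap(text_a: str, text_b: str) -> dict:
    """Feature-wise difference (text_a minus text_b) for two texts on one topic."""
    fa, fb = style_features(text_a), style_features(text_b)
    return {name: fa[name] - fb[name] for name in fa}


if __name__ == "__main__":
    # Invented sample texts on the same topic, for illustration only.
    text_a = "Officials hid everything. Nobody is safe!"
    text_b = "The agency released its annual report on Tuesday. It describes three incidents."
    print(style_gap(text_a, text_b))
```

Comparing texts that share a topic, as `style_gap` does, keeps topical vocabulary from dominating the measured difference. Bias and subjectivity would need additional lexicons or classifiers that this sketch does not cover.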

OSTI does not have a digital full text copy available.

Research Organization: Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-76RL01830

OSTI ID: 1866984

Report Number(s): PNNL-SA-152898

Country of Publication: United States

Language: English

Similar Records

Misleading or Falsification? Inferring Deceptive Strategies and Types in Online News and Social Media

Conference · April 27, 2018 · OSTI ID: 1435892

Evaluating Deception Detection Model Robustness To Linguistic Variation

Conference · June 10, 2021 · OSTI ID: 1894779

Machine Intelligence to Detect, Characterise, and Defend against Influence Operations in the Information Environment

Journal Article · April 14, 2021 · Journal of Information Warfare · OSTI ID: 1777157