Mitigative Strategies for Recovering From Large Language Model Trust Violations

Martell, Max J.; Baweja, Jessica A.; Dreslin, Brandon D.

doi:10.1177/15553434241303577

Journal Article · Tue Dec 03 23:00:00 EST 2024 · Journal of Cognitive Engineering and Decision Making

DOI: https://doi.org/10.1177/15553434241303577 · OSTI ID: 2560812

Martell, Max J. ^[1]; Baweja, Jessica A. ^[1]; Dreslin, Brandon D. ^[1]

Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)

In this study, we investigated strategies to address trust issues arising from errors in large language models (LLMs). The study examined the impact of confidence scores, system capability explanations, and user feedback on restoring trust after an error. Sixty-eight participants viewed an LLM’s responses to 20 general trivia questions, with an error introduced on the third trial. Each participant was presented with one mitigation strategy. Participants rated their overall trust in the model and the reliability of each answer. Results showed an immediate drop in trust after the error; however, trust recovery did not differ across the three strategies. Further, trust recovery after the error followed a logarithmic trend in all conditions. Differences in overall trust were predicted by the perceived reliability of the answer, suggesting that participants were evaluating results critically and using those evaluations to inform their trust in the model. Qualitative data supported this finding; participants expressed lasting distrust despite the LLM’s later accuracy. These results underscore the need to prioritize accuracy in LLM deployment, because early errors may irrevocably damage user trust calibration and later adoption.

Research Organization: Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)

Sponsoring Organization: USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE Office of Science (SC), Office of Workforce Development for Teachers & Scientists (WDTS)

Grant/Contract Number: AC05-76RL01830

OSTI ID: 2560812

Alternate ID(s): OSTI ID: 2479660

Report Number(s): PNNL-SA--193711

Journal Information: Journal of Cognitive Engineering and Decision Making, Vol. 19, Issue 1; ISSN 1555-3434

Publisher: Sage Publications

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
