Addressing Uncertainty in LLM Outputs for Trust Calibration Through Visualization and User Interface Design
Helen Armstrong^a, Ashley L. Anderson^a,b, Rebecca Planchart^a, Kweku Baidoo^a, and Matthew Peterson^a
a: Department of Graphic Design and Industrial Design, North Carolina State University, Raleigh, NC, USA; b: School of Visual Arts, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
Corresponding author: Helen Armstrong (hsarmstr[at]ncsu.edu)
Abstract: Large language models (LLMs) are becoming ubiquitous in knowledge work. However, the uncertainty inherent to LLM summary generation limits the efficacy of human-machine teaming, especially when users are unable to properly calibrate their trust in automation. Visual conventions for signifying uncertainty and interface design strategies for engaging users are needed to realize the full potential of LLMs. We report on an exploratory interdisciplinary project that resulted in four main contributions to explainable artificial intelligence in and beyond an intelligence analysis context. First, we provide and evaluate eight potential visual conventions for representing uncertainty in LLM summaries. Second, we describe a framework for uncertainty specific to LLM technology. Third, we specify 10 features for a proposed LLM validation system, the Multiple Agent Validation System (MAVS), that utilizes the visual conventions, the framework, and three virtual agents to aid in language analysis. Fourth, we provide and describe four MAVS prototypes, one as an interactive simulation interface and the others as narrative interface videos. All four utilize a language analysis scenario to educate users on the potential of LLM technology in human-machine teams. To demonstrate the applicability of the contributions beyond intelligence analysis, we also consider LLM-derived uncertainty in clinical decision-making in medicine and in climate forecasting. Ultimately, this investigation makes a case for the importance of visual and interface design in shaping the development of LLM technology.
Implications for practice: This article focuses on the role and responsibilities of the emerging AI designer in modern product design and development. The distinction between AI for efficiency and AI for augmentation (Section 2.3) suggests a comprehensive framework that can help AI designers apply these categories and advocate for user and societal needs in the rush to incorporate AI functions into existing services. The discussion of user feedback loops (Section 2.6) characterizes good feedback systems as being granular, contextual, and actionable, with a palette of available UX patterns including inline corrections for refinement, transparent confidence scores, and feedback tagging. Empirical research is needed to provide AI designers with a generalized understanding of how these UI characteristics and UX patterns impact human understanding, and how they interact.
Keywords: explainable AI; human-machine teaming; intelligence analysis; large language models; trust calibration; uncertainty; user interface design; visual representation
DOI: being generated
Cite this article:
Armstrong, H., Anderson, A. L., Planchart, R., Baidoo, K., & Peterson, M. (2025). Addressing uncertainty in LLM outputs for trust calibration through visualization and user interface design. Visible Language, 59(2), 176–217. https://www.visible-language.org/journal/issue-59-2-addressing-uncertainty-in-llm-outputs-for-trust-calibration-through-visualization-and-user-interface-design
First published online August 15, 2025. © 2025 Visible Language. This article is open access, published under the CC BY-NC-ND 4.0 license.
https://www.visible-language.org/journal
Visible Language Consortium:
University of Leeds (UK)
University of Cincinnati (USA)
North Carolina State University (USA)