SUVA: A Probabilistic Framework for Auditing LLMs with an Application to Social Preferences

Published Online: https://doi.org/10.1287/isre.2024.0857

References

  • Abdel-Karim BM, Pfeuffer N, Carl KV, Hinz O (2023) How AI-based systems can induce reflections: The case of AI-augmented diagnostic work. MIS Quart. 47(4):1395–1424.
  • Adam M, Roethke K, Benlian A (2023) Human vs. automated sales agents: How and why customer responses shift across sales stages. Inform. Systems Res. 34(3):1148–1168.
  • Aher GV, Arriaga RI, Kalai AT (2023) Using large language models to simulate multiple humans and replicate human subject studies. Internat. Conf. Machine Learn. (PMLR, New York), 337–371.
  • Akata E, Schulz L, Coda-Forno J, Oh SJ, Bethge M, Schulz E (2025) Playing repeated games with large language models. Nature Human Behav., 1–11.
  • Andreas J (2022) Language models as agent models. Findings of the Association for Computational Linguistics: EMNLP 2022, 5769–5779.
  • Andreoni J (1990) Impure altruism and donations to public goods: A theory of warm-glow giving. The Econom. J. 100(401):464–477.
  • Bail CA (2024) Can generative AI improve social science? Proc. Natl. Acad. Sci. USA 121(21):e2314021121.
  • Baird A, Maruping LM (2021) The next generation of research on IS use: A theoretical framework of delegation to and from agentic IS artifacts. MIS Quart. 45(1):315–341.
  • Berente N, Gu B, Recker J, Santhanam R (2021) Managing artificial intelligence. MIS Quart. 45(3):1433–1450.
  • Binz M, Schulz E (2023) Using cognitive psychology to understand GPT-3. Proc. Natl. Acad. Sci. USA 120(6):e2218523120.
  • Bolton GE, Ockenfels A (2000) ERC: A theory of equity, reciprocity, and competition. Amer. Econom. Rev. 90(1):166–193.
  • Brand J, Israeli A, Ngwe D (2023) Using GPT for market research. Preprint, submitted March 30, https://doi.org/10.2139/ssrn.4395751.
  • Bratman M (1987) Intention, plans, and practical reason (Harvard University Press, Cambridge, MA).
  • Brookins P, DeBacker JM (2023) Playing games with GPT: What can we learn about a large language model from canonical strategic games? Preprint, submitted July 10, https://doi.org/10.2139/ssrn.4493398.
  • Brown TB (2020) Language models are few-shot learners. Preprint, submitted May 28, https://arxiv.org/abs/2005.14165.
  • Chalmers DJ (2023) Could a large language model be conscious? Preprint, submitted March 4, https://arxiv.org/abs/2303.07103.
  • Chan A, Salganik R, Markelius A, Pang C, Rajkumar N, Krasheninnikov D, Langosco L, et al. (2023) Harms from increasingly agentic algorithmic systems. Proc. 2023 ACM Conf. Fairness Accountability Transparency (Association for Computing Machinery, New York), 651–666.
  • Charness G, Rabin M (2002) Understanding social preferences with simple tests. Quart. J. Econom. 117(3):817–869.
  • Chen Y, Li SX (2009) Group identity and social preferences. Amer. Econom. Rev. 99(1):431–457.
  • Chen Y, Liu TX, Shan Y, Zhong S (2023) The emergence of economic rationality of GPT. Proc. Natl. Acad. Sci. USA 120(51):e2316205120.
  • Chen Y, Kirshner SN, Ovchinnikov A, Andiappan M, Jenkin T (2025) A manager and an AI walk into a bar: Does ChatGPT make biased decisions like we do? Manufacturing Service Oper. Management 27(2):354–368.
  • Chew R, Bollenbacher J, Wenger M, Speer J, Kim A (2023) LLM-assisted content analysis: Using large language models to support deductive coding. Preprint, submitted June 23, https://arxiv.org/abs/2306.14924.
  • Chiang CH, Lee HY (2023) Can large language models be an alternative to human evaluations? Proc. 61st Annual Meeting Assoc. Comput. Linguistics (Vol. 1 Long Papers), 15607–15631.
  • Christiano PF, Leike J, Brown T, Martic M, Legg S, Amodei D (2017) Deep reinforcement learning from human preferences. Adv. Neural Inform. Processing Systems, vol. 30 (Curran Associates Inc., Red Hook, NY).
  • Dlugosz J, Gam YK, Gopalan R, Skrastins J (2024) Decision-making delegation in banks. Management Sci. 70(5):3281–3301.
  • Falk A, Fischbacher U (2006) A theory of reciprocity. Games Econom. Behav. 54(2):293–315.
  • Fehr E, Gächter S (2000) Fairness and retaliation: The economics of reciprocity. J. Econom. Perspect. 14(3):159–182.
  • Fehr E, Schmidt KM (1999) A theory of fairness, competition, and cooperation. Quart. J. Econom. 114(3):817–868.
  • Floridi L, Chiriatti M (2020) GPT-3: Its nature, scope, limits, and consequences. Minds Machines 30:681–694.
  • Frederick S, Loewenstein G, O’Donoghue T (2002) Time discounting and time preference: A critical review. J. Econom. Literature 40(2):351–401.
  • Fügener A, Grahl J, Gupta A, Ketter W (2022) Cognitive challenges in human–artificial intelligence collaboration: Investigating the path toward productive delegation. Inform. Systems Res. 33(2):678–696.
  • Gabriel I (2020) Artificial intelligence, values, and alignment. Minds Machines 30(3):411–437.
  • Gao C, Lan X, Lu Z, Mao J, Piao J, Wang H, Jin D, Li Y (2023) S3: Social-network simulation system with large language model-empowered agents. Preprint, submitted July 27, https://arxiv.org/abs/2307.14984.
  • Georgeff M, Pell B, Pollack M, Tambe M, Wooldridge M (1999) The belief-desire-intention model of agency. Intelligent Agents V Agents Theories Architectures Languages Fifth Internat. Workshop Proc., vol. 5 (Springer, Berlin, Heidelberg), 1–10.
  • Gnewuch U, Morana S, Hinz O, Kellner R, Maedche A (2023) More than a bot? The impact of disclosing human involvement on customer interactions with hybrid service agents. Inform. Systems Res. 35(3):936–955.
  • Goli A, Singh A (2024) Can LLMs capture human preferences? Marketing Sci. 43(4):709–722.
  • Han E, Yin D, Zhang H (2023) Bots with feelings: Should AI agents express positive emotion in customer service? Inform. Systems Res. 34(3):1296–1311.
  • Holzmeister F, Holmén M, Kirchler M, Stefan M, Wengström E (2023) Delegation decisions in finance. Management Sci. 69(8):4828–4844.
  • Hong R, Zhang H, Zhao H, Yu D, Zhang C (2023) Faithful question answering with Monte-Carlo planning. Preprint, submitted May 4, https://arxiv.org/abs/2305.02556.
  • Horton JJ (2023) Large language models as simulated economic agents: What can we learn from homo silicus? NBER Working Paper No. 31122, National Bureau of Economic Research, Cambridge, MA.
  • Jie YW, Satapathy R, Goh R, Cambria E (2024) How interpretable are reasoning explanations from prompting large language models? Findings Assoc. Comput. Linguistics, 2148–2164.
  • Jussupow E, Spohrer K, Heinzl A, Gawlitza J (2021) Augmenting medical diagnosis decisions? An investigation into physicians’ decision-making process with artificial intelligence. Inform. Systems Res. 32(3):713–735.
  • Lanham T, Chen A, Radhakrishnan A, Steiner B, Denison C, Hernandez D, Li D, et al. (2023) Measuring faithfulness in chain-of-thought reasoning. Preprint, submitted July 17, https://arxiv.org/abs/2307.13702.
  • Leng Y (2024) Can LLMs mimic human-like mental accounting and behavioral biases? Proc. 25th ACM Conf. Econom. Comput., 581.
  • Leng Y, Nguyen T (2025) Latent neural coupling of risk and time preferences in LLMs mirrors human biases. Proc. 26th ACM Conf. Econom. Comput., vol. 542 (ACM, New York).
  • Leng Y, Sang Y, Agarwal A (2024) Reduce disparity between LLMs and humans: Optimal LLM sample calibration. Preprint, submitted April 23, https://doi.org/10.2139/ssrn.4802019.
  • Li P, Castelo N, Katona Z, Sarvary M (2024) Frontiers: Determining the validity of large language models for automated perceptual analysis. Marketing Sci. 43(2):254–266.
  • Linneberg MS, Korsgaard S (2019) Coding qualitative data: A synthesis guiding the novice. Qualitative Res. J. 19(3):259–270.
  • Lyu Q, Havaldar S, Stein A, Zhang L, Rao D, Wong E, Apidianaki M, Callison-Burch C (2023) Faithful chain-of-thought reasoning. Preprint, submitted January 31, https://arxiv.org/abs/2301.13379.
  • Mei Q, Xie Y, Yuan W, Jackson MO (2024) A Turing test of whether AI chatbots are behaviorally similar to humans. Proc. Natl. Acad. Sci. USA 121(9):e2313925121.
  • Miotto M, Rossberg N, Kleinberg B (2022) Who is GPT-3? An exploration of personality, values and demographics. Preprint, submitted September 28, https://arxiv.org/abs/2209.14338.
  • Naveed H, Khan AU, Qiu S, Saqib M, Anwar S, Usman M, Akhtar N, Barnes N, Mian A (2023) A comprehensive overview of large language models. Preprint, submitted July 12, https://arxiv.org/abs/2307.06435.
  • Nistor C, Selove M (2024) Influencers: The power of comments. Marketing Sci. 43(6):1153–1167.
  • Nowak MA, Sigmund K (2005) Evolution of indirect reciprocity. Nature 437(7063):1291–1298.
  • Ouyang L, Wu J, Jiang X, Almeida D, Wainwright C, Mishkin P, Zhang C, et al. (2022) Training language models to follow instructions with human feedback. Adv. Neural Inform. Processing Systems, vol. 35 (Curran Associates Inc., Red Hook, NY), 27730–27744.
  • Park JS, O’Brien J, Cai CJ, Morris MR, Liang P, Bernstein MS (2023) Generative agents: Interactive simulacra of human behavior. Proc. 36th Annual ACM Sympos. User Interface Software Tech. (ACM, New York), 1–22.
  • Pellert M, Lechner CM, Wagner C, Rammstedt B, Strohmaier M (2023) AI psychometrics: Using psychometric inventories to obtain psychological profiles of large language models. OSF preprint.
  • Pinski M, Adam M, Benlian A (2023) AI knowledge: Improving AI delegation through human enablement. Proc. 2023 CHI Conf. Human Factors Comput. Systems (ACM, New York), 1–17.
  • Raghu T, Jayaraman B, Rao HR (2004) Toward an integration of agent-and activity-centric approaches in organizational process modeling: Incorporating incentive mechanisms. Inform. Systems Res. 15(4):316–335.
  • Rahwan I, Cebrian M, Obradovich N, Bongard J, Bonnefon JF, Breazeal C, Crandall JW, et al. (2019) Machine behaviour. Nature 568(7753):477–486.
  • Raza S, Sapkota R, Karkee M, Emmanouilidis C (2025) TRiSM for agentic AI: A review of trust, risk, and security management in LLM-based agentic multi-agent systems. Preprint, submitted June 4, https://arxiv.org/abs/2506.04133.
  • Russell S (2019) Human Compatible: AI and the Problem of Control (Penguin UK, London).
  • Schanke S, Burtch G, Ray G (2021) Estimating the impact of “humanizing” customer service chatbots. Inform. Systems Res. 32(3):736–751.
  • Schneider J (2024) Explainable generative AI (GenXAI): A survey, conceptualization, and research agenda. Artificial Intelligence Rev. 57(11):289.
  • Seymour M, Yuan L, Riemer K, Dennis AR (2024) Less artificial, more intelligent: Understanding affinity, trustworthiness, and preference for digital humans. Inform. Systems Res. 36(2):1096–1128.
  • Shiffrin R, Mitchell M (2023) Probing the psychology of AI models. Proc. Natl. Acad. Sci. USA 120(10):e2300963120.
  • Shojaee P, Mirzadeh I, Alizadeh K, Horton M, Bengio S, Farajtabar M (2025) The illusion of thinking: Understanding the strengths and limitations of reasoning models via the lens of problem complexity. Preprint, submitted June 16, https://arxiv.org/pdf/2506.09250.
  • Sierra C, Osman N, Noriega P, Sabater-Mir J, Perelló A (2021) Value alignment: A formal approach. Preprint, submitted October 18, https://arxiv.org/abs/2110.09240.
  • Singh C, Inala JP, Galley M, Caruana R, Gao J (2024) Rethinking interpretability in the era of large language models. Preprint, submitted January 30, https://arxiv.org/abs/2402.01761.
  • Sprague Z, Yin F, Rodriguez JD, Jiang D, Wadhwa M, Singhal P, Zhao X, Ye X, Mahowald K, Durrett G (2024) To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning. Preprint, submitted September 18, https://arxiv.org/abs/2409.12183.
  • Strachan JW, Albergo D, Borghini G, Pansardi O, Scaliti E, Gupta S, Saxena K, et al. (2024) Testing theory of mind in large language models and humans. Nature Human Behav. 8(7):1285–1295.
  • Tai RH, Bentley LR, Xia X, Sitt JM, Fankhauser SC, Chicas-Mosier AM, Monteith BG (2024) An examination of the use of large language models to aid analysis of textual data. Internat. J. Qualitative Methods 23:16094069241231168.
  • Tamkin A, Brundage M, Clark J, Ganguli D (2021) Understanding the capabilities, limitations, and societal impact of large language models. Preprint, submitted February 4, https://arxiv.org/abs/2102.02503.
  • Tong Y, Tan CH, Teo HH (2017) Direct and indirect information system use: A multimethod exploration of social power antecedents in healthcare. Inform. Systems Res. 28(4):690–710.
  • Tversky A, Kahneman D (1991) Loss aversion in riskless choice: A reference-dependent model. Quart. J. Econom. 106(4):1039–1061.
  • Tversky A, Kahneman D (1992) Advances in prospect theory: Cumulative representation of uncertainty. J. Risk Uncertainty 5:297–323.
  • Wang W, Gao G, Agarwal R (2023) Friend or foe? Teaming between artificial intelligence and workers with variation in experience. Management Sci. 70(9):5753–5775.
  • Wang L, Ma C, Feng X, Zhang Z, Yang H, Zhang J, Chen Z, et al. (2024) A survey on large language model based autonomous agents. Frontiers Comput. Sci. 18(6):186345.
  • Webb T, Holyoak KJ, Lu H (2023) Emergent analogical reasoning in large language models. Nature Human Behav. 7(9):1526–1541.
  • Wei J, Wang X, Schuurmans D, Bosma M, Xia F, Chi E, Le QV, Zhou D, et al. (2022) Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural Inform. Processing Systems, vol. 35 (Curran Associates Inc., Red Hook, NY), 24824–24837.
  • Xie C, Chen C, Jia F, Ye Z, Shu K, Bibi A, Hu Z, Torr P, Ghanem B, Li G (2024) Can large language model agents simulate human trust behaviors? Preprint, submitted February 7, https://arxiv.org/abs/2402.04559.