A Manager and an AI Walk into a Bar: Does ChatGPT Make Biased Decisions Like We Do?
Published Online: 31 Jan 2025
https://doi.org/10.1287/msom.2023.0279
References
- (2022) Prediction Machines, Updated and Expanded: The Simple Economics of Artificial Intelligence (Harvard Business Press, Cambridge, MA).
- (2023) Playing repeated games with large language models. Preprint, submitted May 26, https://arxiv.org/abs/2305.16867.
- (2023) Out of one, many: Using language models to simulate human samples. Political Anal. 31(3):337–351.
- (2017) Behavioral anomalies in consumer wait-or-buy decisions and their implications for markdown management. Oper. Res. 65(2):357–378.
- (2013) Designing buyback contracts for irrational but predictable newsvendors. Management Sci. 59(8):1800–1816.
- (2018) A note on the risk aversion of informed newsvendors. J. Oper. Res. Soc. 69(7):1135–1145.
- (2023) Using cognitive psychology to understand GPT-3. Proc. Natl. Acad. Sci. USA 120(6):e2218523120.
- (2008) Learning by doing in the newsvendor problem: A laboratory investigation of the role of experience and feedback. Manufacturing Service Oper. Management 10(3):519–538.
- (2023) Using LLMs for market research. Preprint, submitted March 30, http://dx.doi.org/10.2139/ssrn.4395751.
- (2023) Playing games with GPT: What can we learn about a large language model from canonical strategic games? Preprint, submitted July 10, http://dx.doi.org/10.2139/ssrn.4493398.
- (2023) The emergence of economic rationality of GPT. Preprint, submitted May 22, https://arxiv.org/abs/2305.12763.
- (2022) Language models show human-like content effects on reasoning tasks. Preprint, submitted July 14, https://arxiv.org/abs/2207.07051.
- (2018) Biases in individual decision-making. Donohue K, Katok E, Leider S, eds. The Handbook of Behavioral Operations (Wiley, Hoboken, NJ), 149–198.
- (2024) OM Forum—The best of both worlds: Machine learning and behavioral science in operations management. Manufacturing Service Oper. Management 26(5):1605–1621.
- (2023) A replication study of operations management experiments in management science. Management Sci. 69(9):4977–4991.
- (2023) Exploring GPT-3 model’s capability in passing the Sally-Anne test: A preliminary study in two languages. Preprint, submitted February 9, https://doi.org/10.31219/osf.io/8r3ma.
- (2023) Action identification characteristics and priming effects in ChatGPT. Preprint, submitted May 12, https://doi.org/10.31234/osf.io/aqbvk.
- (2016) Statistical tests, P values, confidence intervals, and power: A guide to misinterpretations. Eur. J. Epidemiology 31(4):337–350.
- (2023) Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT. Nature Comput. Sci. 3(10):833–838.
- (2023) Large language models as simulated economic agents: What can we learn from Homo silicus? Preprint, submitted January 18, https://arxiv.org/abs/2301.07543.
- (2024) Generative artificial intelligence in supply chain and operations management: A capability-based framework for analysis and implementation. Internat. J. Production Res. 62(17):6120–6145.
- (2024a) Artificial agents and operations management decision-making. Preprint, submitted March 13, http://dx.doi.org/10.2139/ssrn.4726933.
- (2024b) GPT and CLT: The impact of ChatGPT’s level of abstraction on consumer recommendations. J. Retailing Consumer Services 76:103580.
- (2016) Inferring quality from wait time. Management Sci. 62(10):3023–3038.
- (2024) Can LLMs mimic human-like mental accounting and behavioral biases? Preprint, submitted February 13, http://dx.doi.org/10.2139/ssrn.4705130.
- (2024) Do LLM agents exhibit social behavior? Preprint, submitted December 23, https://arxiv.org/abs/2312.15198.
- (2024) Frontiers: Determining the validity of large language models for automated perceptual analysis. Marketing Sci. 43(2):254–266.
- (2015) Prospect theory explains newsvendor behavior: The role of reference points. Management Sci. 61(12):3009–3012.
- (2023) Is ChatGPT humanly irrational? Preprint, submitted September 22, https://doi.org/10.21203/rs.3.rs-3220513/v1.
- (2024) (Ir)rationality and cognitive biases in large language models. Preprint, submitted February 14, https://arxiv.org/abs/2402.09193.
- (2024) A Turing test of whether AI chatbots are behaviorally similar to humans. Proc. Natl. Acad. Sci. USA 121(9):e2313925121.
- (2024) AI emerges as the frontier in behavioral science. Proc. Natl. Acad. Sci. USA 121(10):e2401336121.
- (2023) Experimental evidence on the productivity effects of generative artificial intelligence. Science 381(6654):187–192.
- (2015) How to compete against a behavioral newsvendor. Production Oper. Management 24(11):1783–1793.
- (2016) Markdown or everyday low price? The role of behavioral motives. Management Sci. 62(2):326–346.
- (2024) Diminished diversity-of-thought in a standard large language model. Behav. Res. Methods 56(6):5754–5770.
- (2023) The Machine Psychology of Cooperation: Can GPT models operationalise prompts for altruism, cooperation, competitiveness and selfishness in economic games? Preprint, submitted May 13, https://arxiv.org/abs/2305.07970.
- (2013) Overconfidence in newsvendor orders: An experimental study. Management Sci. 59(11):2502–2517.
- (2023) Can AI solve newsvendor problem without making biased decisions? A behavioral experimental study. Preprint, submitted September 14, http://dx.doi.org/10.2139/ssrn.4567157.
- (2024) Do large language models show decision heuristics similar to humans? A case study using GPT-3.5. J. Experiment. Psych. General 153(4):1066–1075.
- (2023) Would ChatGPT3 get a Wharton MBA? A prediction based on its performance in the operations management course. Mack Institute for Innovation Management at the Wharton School, University of Pennsylvania, Philadelphia.
- (2023) Are both generative AI and ChatGPT game changers for 21st-century operations and supply chain excellence? Internat. J. Production Econom. 265:109015.
- (2024) Will the real Linda please stand up … to large language models? Examining the representativeness heuristic in LLMs. Preprint, submitted April 1, https://arxiv.org/abs/2404.01461.
- (2016) The ASA statement on p-values: Context, process, and purpose. Amer. Statistician 70(2):129–133.
- et al. (2024) Can large language model agents simulate human trust behaviors? Preprint, submitted February 7, https://arxiv.org/abs/2402.04559.
- (2024) AI for social science and social science of AI: A survey. Inform. Processing Management 61(3):103665.

