Improving Human Sequential Decision Making with Reinforcement Learning

Published Online: https://doi.org/10.1287/mnsc.2022.02455

References

  • Akşin Z, Deo S, Jónasson JO, Ramdas K (2021) Learning from many: Partner exposure and team familiarity in fluid teams. Management Sci. 67(2):854–874.
  • Allon G, Cohen MC, Moon K, Sinchaisri WP (2023) Managing multihoming workers in the gig economy. Preprint, submitted July 16, http://dx.doi.org/10.2139/ssrn.4502968.
  • Argote L (2012) Organizational Learning: Creating, Retaining and Transferring Knowledge (Springer Science & Business Media, New York).
  • Bastani O, Pu Y, Solar-Lezama A (2018) Verifiable reinforcement learning via policy extraction. NIPS’18 Proc. 32nd Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates, Inc., Red Hook, NY), 2499–2509.
  • Bavafa H, Jónasson JO (2021) Recovering from critical incidents: Evidence from paramedic performance. Manufacturing Service Oper. Management 23(4):914–932.
  • Bertsimas D, Dunn J (2017) Optimal classification trees. Machine Learn. 106(7):1039–1082.
  • Brattland H, Høiseth JR, Burkeland O, Inderhaug TS, Binder PE, Iversen VC (2018) Learning from clients: A qualitative investigation of psychotherapists’ reactions to negative verbal feedback. Psychotherapy Res. 28(4):545–559.
  • Breiman L (2001) Random forests. Machine Learn. 45(1):5–32.
  • Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and Regression Trees (CRC Press, Boca Raton, FL).
  • Buciluǎ C, Caruana R, Niculescu-Mizil A (2006) Model compression. KDD’06 Proc. 12th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 535–541.
  • Chan TY, Li J, Pierce L (2014) Learning from peers: Knowledge transfer and sales force productivity growth. Marketing Sci. 33(4):463–484.
  • Chandrasekaran A, Prabhu V, Yadav D, Chattopadhyay P, Parikh D (2018) Do explanations make VQA models more predictable to a human? Preprint, submitted October 29, https://arxiv.org/abs/1810.12366.
  • Chandrasekaran A, Yadav D, Chattopadhyay P, Prabhu V, Parikh D (2017) It takes two to tango: Towards theory of AI’s mind. Preprint, submitted April 3, https://arxiv.org/abs/1704.00717.
  • Chui M, Manyika J, Bughin J (2012) The social economy: Unlocking value and productivity through social technologies. Technical report, McKinsey Global Institute, New York.
  • Dietvorst BJ, Simmons JP, Massey C (2015) Algorithm aversion: People erroneously avoid algorithms after seeing them err. J. Experiment. Psych. General 144(1):114–126.
  • Dietvorst BJ, Simmons JP, Massey C (2018) Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them. Management Sci. 64(3):1155–1170.
  • Dorn B, Guzdial M (2010) Learning on the job: Characterizing the programming knowledge and learning strategies of web designers. CHI’10 Proc. SIGCHI Conf. Human Factors Comput. Systems (Association for Computing Machinery, New York), 703–712.
  • Doshi-Velez F, Kim B (2017) Towards a rigorous science of interpretable machine learning. Preprint, submitted February 28, https://arxiv.org/abs/1702.08608.
  • Eastwood J, Snook B, Luther K (2012) What people want from their professionals: Attitudes toward decision-making strategies. J. Behav. Decision Making 25(5):458–468.
  • Fudenberg D, Liang A (2019) Predicting and understanding initial play. Amer. Econom. Rev. 109(12):4112–4141.
  • Fudenberg D, Kleinberg J, Liang A, Mullainathan S (2022) Measuring the completeness of economic models. J. Political Econom. 130(4):956–990.
  • Fügener A, Grahl J, Gupta A, Ketter W (2022) Cognitive challenges in human–artificial intelligence collaboration: Investigating the path toward productive delegation. Inform. Systems Res. 33(2):678–696.
  • Gleicher M (2016) A framework for considering comprehensibility in modeling. Big Data 4(2):75–88.
  • Green B, Chen Y (2019) The principles and limits of algorithm-in-the-loop decision making. Proc. ACM Human-Comput. Interaction, vol. 3 (Association for Computing Machinery, New York), 1–24.
  • Gurvich I, O’Leary KJ, Wang L, Van Mieghem JA (2020) Collaboration, interruptions, and changeover times: Workflow model and empirical study of hospitalist charting. Manufacturing Service Oper. Management 22(4):754–774.
  • Herkenhoff K, Lise J, Menzio G, Phillips G (2018) Knowledge diffusion in the workplace. Technical report, University of Minnesota, Minneapolis.
  • Hinton G, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. Preprint, submitted March 9, https://arxiv.org/abs/1503.02531.
  • Huckman RS, Pisano GP (2006) The firm specificity of individual performance: Evidence from cardiac surgery. Management Sci. 52(4):473–488.
  • Ibanez MR, Clark JR, Huckman RS, Staats BR (2018) Discretionary task ordering: Queue management in radiological services. Management Sci. 64(9):4389–4407.
  • Jarosch G, Oberfield E, Rossi-Hansberg E (2021) Learning from coworkers. Econometrica 89(2):647–676.
  • Kagan E, Leider S, Sahin O (2021) Dynamic decision-making in operations management. Johns Hopkins Carey Business School Research Paper No. 21-13, Johns Hopkins Carey Business School, Baltimore.
  • Kc DS, Staats BR (2012) Accumulating a portfolio of experience: The effect of focal and related experience on surgeon performance. Manufacturing Service Oper. Management 14(4):618–633.
  • Kim SH, Tong J, Peden C (2020) Admission control biases in hospital unit capacity management: How occupancy information hurdles and decision noise impact utilization. Management Sci. 66(11):5151–5170.
  • Kleinberg J, Ludwig J, Mullainathan S, Obermeyer Z (2015) Prediction policy problems. Amer. Econom. Rev. 105(5):491–495.
  • Kneusel RT, Mozer MC (2017) Improving human-machine cooperative visual search with soft highlighting. ACM Trans. Appl. Perception (TAP) 15(1):1–21.
  • Lage I, Ross AS, Kim B, Gershman SJ, Doshi-Velez F (2018) Human-in-the-loop interpretability prior. NIPS’18 Proc. 32nd Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates, Inc., Red Hook, NY), 10180–10189.
  • Lai V, Tan C (2019) On human predictions with explanations and predictions of machine learning models: A case study on deception detection. FAT’19 Proc. Conf. Fairness Accountability Transparency (Association for Computing Machinery, New York), 29–38.
  • Letham B, Rudin C, McCormick TH, Madigan D (2015) Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model. Ann. Appl. Statist. 9(3):1350–1371.
  • Logg JM, Minson JA, Moore DA (2019) Algorithm appreciation: People prefer algorithmic to human judgment. Organ. Behav. Human Decision Processes 151:90–103.
  • Lu J, Lee D, Kim TW, Danks D (2019) Good explanation for algorithmic transparency. Preprint, submitted November 11, http://dx.doi.org/10.2139/ssrn.3503603.
  • Marshall A (2020) Uber changes its rules, and drivers adjust their strategies. Wired (February 18), https://www.wired.com/story/uber-changes-rules-drivers-adjust-strategies/.
  • McIlroy-Young R, Sen S, Kleinberg J, Anderson A (2020) Aligning superhuman AI with human behavior: Chess as a model system. KDD’20 Proc. 26th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 1677–1687.
  • Meyer G, Adomavicius G, Johnson PE, Elidrisi M, Rush WA, Sperl-Hillen JM, O’Connor PJ (2014) A machine learning approach to improving dynamic decision making. Inform. Systems Res. 25(2):239–263.
  • Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, et al. (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533.
  • Nonaka I, Takeuchi H (1995) The Knowledge-Creating Company: How Japanese Companies Create the Dynamics of Innovation (Oxford University Press, Oxford, UK).
  • Pfeffer J, Sutton RI (2000) The Knowing-Doing Gap: How Smart Companies Turn Knowledge into Action (Harvard Business School Press, Boston).
  • Puiutta E, Veith EM (2020) Explainable reinforcement learning: A survey. Holzinger A, Kieseberg P, Tjoa A, Weippl E, eds. Machine Learning Knowledge Extraction. CD-MAKE 2020, Lecture Notes in Computer Science, vol. 12279 (Springer, Cham, Switzerland), 77–95.
  • Ramdas K, Saleh K, Stern S, Liu H (2017) Variety and experience: Learning and forgetting in the use of surgical devices. Management Sci. 64(6):2590–2608.
  • Ribeiro MT, Singh S, Guestrin C (2016) “Why should I trust you?”: Explaining the predictions of any classifier. KDD’16 Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 1135–1144.
  • Ross S, Gordon G, Bagnell D (2011) A reduction of imitation learning and structured prediction to no-regret online learning. Proc. 14th Internat. Conf. Artificial Intelligence Statistics (JMLR), 627–635.
  • Rudin C (2019) Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 1(5):206–215.
  • Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, et al. (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529(7587):484–489.
  • Song H, Tucker AL, Murrell KL, Vinson DR (2017) Closing the productivity gap: Improving worker productivity through public relative performance feedback and validation of best practices. Management Sci. 64(6):2628–2649.
  • Spear SJ (2005) Fixing health care from the inside, today. Harvard Bus. Rev. 83(9):78–91.
  • Stites MC, Nyre-Yu M, Moss B, Smutz C, Smith MR (2021) Sage advice? The impacts of explanations for machine learning models on human decision-making in spam detection. Degen H, Ntoa S, eds. Artificial Intelligence HCI. HCII 2021, Lecture Notes in Computer Science, vol. 12797 (Springer, Cham, Switzerland), 269–284.
  • Sull DN, Eisenhardt KM (2015) Simple Rules: How to Thrive in a Complex World (Houghton Mifflin Harcourt, Boston).
  • Sun J, Zhang DJ, Hu H, Van Mieghem JA (2022) Predicting human discretion to adjust algorithmic prescription: A large-scale field experiment in warehouse operations. Management Sci. 68(2):846–865.
  • Sutton RS, Barto AG (2018) Reinforcement Learning: An Introduction (MIT Press, Cambridge, MA).
  • Sutton RS, McAllester DA, Singh SP, Mansour Y (2000) Policy gradient methods for reinforcement learning with function approximation. NIPS’99 Proc. 13th Internat. Conf. Adv. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 1057–1063.
  • Szulanski G (1996) Exploring internal stickiness: Impediments to the transfer of best practice within the firm. Strategic Management J. 17(S2):27–43.
  • Tan TF, Netessine S (2019) When you work with a superman, will you also fly? An empirical study of the impact of coworkers on performance. Management Sci. 65(8):3495–3517.
  • Tucker AL, Edmondson AC, Spear S (2002) When problem solving prevents organizational learning. J. Organ. Change Management 15(2):122–137.
  • Verma A, Murali V, Singh R, Kohli P, Chaudhuri S (2018) Programmatically interpretable reinforcement learning. Internat. Conf. Machine Learn. (PMLR), 5045–5054.
  • Wang F, Rudin C (2015) Falling rule lists. Artificial Intelligence Statist. (PMLR), 1013–1022.
  • Watkins CJ, Dayan P (1992) Q-learning. Machine Learn. 8(3–4):279–292.