Understanding Managers’ Trade-Offs Between Exploration and Exploitation

Published Online: https://doi.org/10.1287/mksc.2021.1304
