Dynamic Coupon Targeting Using Batch Deep Reinforcement Learning: An Application to Livestream Shopping

Xiao Liu
https://orcid.org/0000-0002-7093-8534
Stern School of Business, New York University, New York, New York 10012

Published Online: 20 Oct 2022
https://doi.org/10.1287/mksc.2022.1403

Volume 42, Issue 4

July-August 2023

Pages 637-837, iii

Information

Received: September 03, 2020
Accepted: June 23, 2022
Published Online: October 20, 2022

Cite as

Xiao Liu (2022) Dynamic Coupon Targeting Using Batch Deep Reinforcement Learning: An Application to Livestream Shopping. Marketing Science 42(4):637-658.

https://doi.org/10.1287/mksc.2022.1403

Acknowledgments

The author thanks seminar participants at The University of Arizona, Bocconi University, Central European University, University of Chicago, Dartmouth College, Korea Advanced Institute of Science & Technology, London Business School, London School of Economics, Nanyang Technological University, New York University, Peking University, Spotify, Stanford University, and University of Southern California for helpful comments; and Jiawen Yan, Zexi Ye, Xinyu Wei, Gaomin Wu, and Qi Zhao for excellent research assistance.
