Deep Reinforcement Learning for Sequential Targeting
Published Online: 21 Dec 2022
https://doi.org/10.1287/mnsc.2022.4621

