Proxy-Aided Demand Learning with an Application to Various Pricing Problems

Published Online: https://doi.org/10.1287/opre.2025.1793

References

  • Araman VF, Caldentey R (2009) Dynamic pricing for nonperishable products with demand learning. Oper. Res. 57(5):1169–1188.
  • Ban GY, Keskin NB (2021) Personalized dynamic pricing with machine learning: High-dimensional features and heterogeneous elasticity. Management Sci. 67(9):5549–5568.
  • Bennett A, Kallus N (2023) Proximal reinforcement learning: Efficient off-policy evaluation in partially observed Markov decision processes. Oper. Res. 72(3):1071–1086.
  • Bernstein F, Modaresi S, Sauré D (2018) A dynamic clustering approach to data-driven assortment personalization. Management Sci. 65(5):2095–2115.
  • Bertsekas DP (2014) Constrained Optimization and Lagrange Multiplier Methods (Academic Press, New York).
  • Bertsimas D, Kallus N (2023) The power and limits of predictive approaches to observational data-driven optimization: The case of pricing. INFORMS J. Optim. 5(1):110–129.
  • Bertsimas D, Vayanos P (2017) Data-driven learning in dynamic pricing using adaptive optimization. Working paper, Massachusetts Institute of Technology, Cambridge.
  • Besbes O, Zeevi A (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407–1420.
  • Besbes O, Zeevi A (2015) On the (surprising) sufficiency of linear models for dynamic pricing with demand learning. Management Sci. 61(4):723–739.
  • Besbes O, Phillips R, Zeevi A (2010) Testing the validity of a demand model: An operations perspective. Manufacturing Service Oper. Management 12(1):162–183.
  • Bhattacharya D, Dupas P, Kanaya S (2024) Demand and welfare analysis in discrete choice models with social interactions. Rev. Econom. Stud. 91(2):748–784.
  • Bijmolt TH, Van Heerde HJ, Pieters RG (2005) New empirical generalizations on the determinants of price elasticity. J. Marketing Res. 42(2):141–156.
  • Blanchet JH, Glynn PW, Pei Y (2019) Unbiased multilevel Monte Carlo: Stochastic optimization, steady-state simulation, quantiles, and other applications. Preprint, submitted April 22, https://arxiv.org/abs/1904.09929.
  • Broder J, Rusmevichientong P (2012) Dynamic pricing under a general parametric choice model. Oper. Res. 60(4):965–980.
  • Bu J, Simchi-Levi D, Wang L (2023) Offline pricing and demand learning with censored data. Management Sci. 69(2):885–903.
  • Cai H, Shi C, Song R, Lu W (2023) Jump interval-learning for individualized decision making with continuous treatments. J. Machine Learn. Res. 24(140):1–92.
  • Chen X, Wang Y (2023) Robust dynamic pricing with demand learning in the presence of outlier customers. Oper. Res. 71(4):1362–1386.
  • Chen J, Bhattacharya R, Keith K (2024) Proximal causal inference with text data. Globerson A, Mackey L, Belgrave D, Fan A, Paquet U, Tomczak J, Zhang C, eds. Proc. 38th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 135983–136017.
  • Chen B, Chao X, Ahn HS (2019) Coordinating pricing and inventory replenishment with nonparametric demand learning. Oper. Res. 67(4):1035–1052.
  • Chen B, Chao X, Shi C (2021) Nonparametric learning algorithms for joint pricing and inventory control with lost sales and censored demand. Math. Oper. Res. 46(2):726–756.
  • Chen B, Chao X, Wang Y (2020) Data-based dynamic pricing and inventory control with censored demand and limited price changes. Oper. Res. 68(5):1445–1456.
  • Chen X, Simchi-Levi D, Wang Y (2022) Privacy-preserving dynamic personalized pricing with demand learning. Management Sci. 68(7):4878–4898.
  • Chen G, Zeng D, Kosorok MR (2016) Personalized dose finding using outcome weighted learning. J. Amer. Statist. Assoc. 111(516):1509–1521.
  • Cheung WC, Simchi-Levi D, Wang H (2017) Dynamic pricing and demand learning with limited price experimentation. Oper. Res. 65(6):1722–1731.
  • Cohen MC, Lobel I, Paes Leme R (2020) Feature-based dynamic pricing. Management Sci. 66(11):4921–4943.
  • Cohen MC, Miao S, Wang Y (2021) Dynamic pricing with fairness constraints. Preprint, submitted September 25, https://doi.org/10.2139/ssrn.3930622.
  • Cohen MC, Leung NHZ, Panchamgam K, Perakis G, Smith A (2017) The impact of linear optimization on promotion planning. Oper. Res. 65(2):446–468.
  • Cui Y, Pu H, Shi X, Miao W, Tchetgen Tchetgen E (2024) Semiparametric proximal causal inference. J. Amer. Statist. Assoc. 119(546):1348–1359.
  • Dai YH (2002) Convergence properties of the BFGS algorithm. SIAM J. Optim. 13(3):693–701.
  • den Boer AV, Keskin NB (2022) Dynamic pricing with demand learning and reference effects. Management Sci. 68(10):7112–7130.
  • Dikkala N, Lewis G, Mackey L, Syrgkanis V (2020) Minimax estimation of conditional moment models. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 12248–12262.
  • Facchinei F, Kungurtsev V (2023) Stochastic approximation for expectation objective and expectation inequality-constrained nonconvex optimization. Preprint, submitted July 6, https://arxiv.org/abs/2307.02943.
  • Fan J, Guo Y, Yu M (2022) Policy optimization using semiparametric models for dynamic pricing. J. Amer. Statist. Assoc. 119(545):552–564.
  • Ghassami A, Shpitser I, Tchetgen ET (2023) Partial identification of causal effects using proxy variables. Preprint, submitted April 10, https://arxiv.org/abs/2304.04374.
  • Ghassami A, Ying A, Shpitser I, Tchetgen ET (2022) Minimax kernel machine learning for a class of doubly robust functionals with application to proximal causal inference. Camps-Valls G, Ruiz FJR, Valera I, eds. Proc. 25th Internat. Conf. Artificial Intelligence Statist., Proceedings of the Machine Learning Research, vol. 151 (PMLR, New York), 7210–7239.
  • Golrezaei N, Jaillet P, Liang JCN (2023) Incentive-aware contextual pricing with non-parametric market noise. Ruiz F, Dy J, van de Meent J-W, eds. Proc. 26th Internat. Conf. Artificial Intelligence Statist., Proceedings of the Machine Learning Research, vol. 206 (PMLR, New York), 9331–9361.
  • Gupta S, Kohavi R, Tang D, Xu Y, Andersen R, Bakshy E, Cardin N, et al. (2019) Top challenges from the first practical online controlled experiments summit. ACM SIGKDD Explorations Newsletter 21(1):20–35.
  • Hartford J, Lewis G, Leyton-Brown K, Taddy M (2017) Deep IV: A flexible approach for counterfactual prediction. Precup D, Teh YW, eds. Proc. 34th Internat. Conf. Machine Learn., Proceedings of the Machine Learning Research, vol. 70 (PMLR, New York), 1414–1423.
  • Hernán MA, Robins JM (2020) Causal Inference: What If (Chapman & Hall/CRC, Boca Raton, FL).
  • Hildenbrand W (1983) On the “law of demand.” Econometrica 51(4):997–1019.
  • Ito S, Fujimaki R (2016) Large-scale price optimization via network flow. Lee DD, von Luxburg U, Garnett R, Sugiyama M, Guyon I, eds. Proc. 30th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 3862–3870.
  • Javanmard A, Nazerzadeh H (2019) Dynamic pricing in high-dimensions. J. Machine Learn. Res. 20(1):315–363.
  • Kallus N, Zhou A (2018) Policy evaluation and optimization with continuous treatments. Storkey A, Perez-Cruz F, eds. Proc. 21st Internat. Conf. Artificial Intelligence Statist., Proceedings of the Machine Learning Research, vol. 84 (PMLR, New York), 1243–1251.
  • Kallus N, Mao X, Uehara M (2021) Causal inference under unmeasured confounding with negative controls: A minimax learning approach. Preprint, submitted March 25, https://arxiv.org/abs/2103.14029.
  • Kegenbekov Z, Jackson I (2021) Adaptive supply chain: Demand-supply synchronization using deep reinforcement learning. Algorithms 14(8):240.
  • Keskin NB, Zeevi A (2014) Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. Oper. Res. 62(5):1142–1167.
  • Kompa B, Bellamy D, Kolokotrones T, Robins JM, Beam A (2022) Deep learning methods for proximal inference via maximum moment restriction. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. Proc. 36th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 11189–11201.
  • Kress R (1989) Linear Integral Equations, Applied Mathematical Sciences, vol. 82 (Springer, Berlin).
  • Kuroki M, Pearl J (2014) Measurement bias and effect restoration in causal inference. Biometrika 101(2):423–437.
  • Lau HS, Lau AHL (1999) Manufacturer’s pricing strategy and return policy for a single-period commodity. Eur. J. Oper. Res. 116(2):291–304.
  • Lee S, Homem-de Mello T, Kleywegt AJ (2012) Newsvendor-type models with decision-dependent uncertainty. Math. Methods Oper. Res. 76(2):189–221.
  • Li X, Zheng Z (2023) Dynamic pricing with external information and inventory constraint. Management Sci. 70(9):5985–6001.
  • Lin KY (2006) Dynamic pricing with real-time demand learning. Eur. J. Oper. Res. 174(1):522–538.
  • Liu A, Lau VK, Kananian B (2019) Stochastic successive convex approximation for non-convex constrained stochastic optimization. IEEE Trans. Signal Processing 67(16):4189–4203.
  • Liu J, Park C, Li K, Tchetgen Tchetgen EJ (2024) Regression-based proximal causal inference. Amer. J. Epidemiology 194(7):2030–2036.
  • Liu P, Yang Z, Wang Z, Sun WW (2025) Contextual dynamic pricing with strategic buyers. J. Amer. Statist. Assoc. 120(550):896–908.
  • Luo Y, Sun WW, Liu Y (2024) Distribution-free contextual dynamic pricing. Math. Oper. Res. 49(1):599–618.
  • Mastouri A, Zhu Y, Gultchin L, Korba A, Silva R, Kusner M, Gretton A, Muandet K (2021) Proximal causal learning with kernels: Two-stage estimation and moment restriction. Meila M, Zhang T, eds. Proc. 38th Internat. Conf. Machine Learn., Proceedings of the Machine Learning Research, vol. 139 (PMLR, New York), 7512–7523.
  • Miao W, Geng Z, Tchetgen Tchetgen EJ (2018) Identifying causal effects with proxy variables of an unmeasured confounder. Biometrika 105(4):987–993.
  • Miao R, Qi Z, Shi C, Lin L (2023) Personalized pricing with invalid instrumental variables: Identification, estimation, and policy learning. Preprint, submitted February 24, https://arxiv.org/abs/2302.12670.
  • Miao W, Shi X, Li Y, Tchetgen Tchetgen EJ (2024) A confounding bridge approach for double negative control inference on causal effects. Statist. Theory Related Fields 8(4):262–273.
  • Pearl J (2009) Causal inference in statistics: An overview. Statist. Surveys 3:96–146.
  • Perakis G, Singhvi D (2023) Dynamic pricing with unknown nonparametric demand and limited price changes. Oper. Res. 72(6):2726–2744.
  • Qi Z, Miao R, Zhang X (2024) Proximal learning for individualized treatment regimes under unmeasured confounding. J. Amer. Statist. Assoc. 119(546):915–928.
  • Qian M, Murphy SA (2011) Performance guarantees for individualized treatment rules. Ann. Statist. 39(2):1180–1210.
  • Rakshit P, Shi X, Tchetgen ET (2025) Adaptive proximal causal inference with some invalid proxies. Preprint, submitted July 25, https://www.arxiv.org/abs/2507.19623.
  • Shah V, Johari R, Blanchet J (2019) Semi-parametric dynamic contextual pricing. Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, eds. Proc. 33rd Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 2363–2373.
  • Shen T, Cui Y (2023) Optimal treatment regimes for proximal causal learning. Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S, eds. Proc. 37th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 47735–47748.
  • Singh R, Sahani M, Gretton A (2019) Kernel instrumental variable regression. Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, eds. Proc. 33rd Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 4593–4605.
  • Talluri K, Van Ryzin G (2004) Revenue management under a general discrete choice model of consumer behavior. Management Sci. 50(1):15–33.
  • Tang J, Qi Z, Fang E, Shi C (2025) Offline feature-based pricing under censored demand: A causal inference approach. Manufacturing Service Oper. Management 27(2):535–553.
  • Tchetgen Tchetgen EJ, Ying A, Cui Y, Shi X, Miao W (2024) An introduction to proximal causal inference. Statist. Sci. 39(3):375–390.
  • Wang J, Qi Z, Shi C (2022) Blessing from experts: Super reinforcement learning in confounded environments. Preprint, submitted September 29, https://arxiv.org/abs/2209.15448.
  • Wang Y, Chen X, Chang X, Ge D (2021) Uncertainty quantification for demand prediction in contextual dynamic pricing. Production Oper. Management 30(6):1703–1717.
  • Wang CH, Wang Z, Sun WW, Cheng G (2023) Online regularization toward always-valid high-dimensional dynamic pricing. J. Amer. Statist. Assoc. 119(548):2895–2907.
  • Wu Y, Fu Y, Wang S, Sun X (2023) Doubly robust proximal causal learning for continuous treatments. 12th Internat. Conf. Learn. Representations (Vienna).
  • Wu S, Hitt LM, Chen P, Anandalingam G (2008) Customized bundle pricing for information goods: A nonlinear mixed-integer programming approach. Management Sci. 54(3):608–622.
  • Zhang J, Tchetgen Tchetgen E (2024) On identification of dynamic treatment regimes with proxies of hidden confounders. Preprint, submitted February 22, https://arxiv.org/abs/2402.14942.
  • Zhang B, Tsiatis AA, Laber EB, Davidian M (2012) A robust method for estimating optimal treatment regimes. Biometrics 68(4):1010–1018.
  • Zhao P, Chambaz A, Josse J, Yang S (2024) Positivity-free policy learning with observational data. Dasgupta S, Mandt S, Li Y, eds. Proc. 27th Internat. Conf. Artificial Intelligence Statist., Proceedings of the Machine Learning Research, vol. 238 (PMLR, New York), 1918–1926.
  • Zhao Y, Zeng D, Rush AJ, Kosorok MR (2012) Estimating individualized treatment rules using outcome weighted learning. J. Amer. Statist. Assoc. 107(499):1106–1118.
  • Zimmert J, Seldin Y (2019) An optimal algorithm for stochastic and adversarial bandits. Chaudhuri K, Sugiyama M, eds. Proc. 22nd Internat. Conf. Artificial Intelligence Statist., Proceedings of the Machine Learning Research, vol. 89 (PMLR, New York), 467–475.