Pricing Experimental Design: Causal Effect, Expected Revenue and Tail Risk

