Doubly High-Dimensional Contextual Bandits: An Interpretable Model for Joint Assortment-Pricing

