Online Learning and Decision Making Under Generalized Linear Model with High-Dimensional Data

Published Online:https://doi.org/10.1287/mnsc.2022.01557

References

  • Abbasi-Yadkori Y, Szepesvari C (2012) Online Learning for Linearly Parametrized Control Problems (University of Alberta, Edmonton).Google Scholar
  • Abbasi-Yadkori Y, Pál D, Szepesvári C (2011) Improved algorithms for linear stochastic bandits. Shawe-Taylor J, Zemel R, Bartlett P, Pereira F, Weinberger KQ, eds. Improved Algorithms for Linear Stochastic Bandits, Advances in Neural Information Processing Systems, vol. 24 (Curran Associates, Inc., Red Hook, NY), 2312–2320.Google Scholar
  • Agarwal A, Hsu D, Kale S, Langford J, Li L, Schapire R (2014) Taming the monster: A fast and simple algorithm for contextual bandits. Xing EP, Jebara T, eds. Proc. 31st Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 32 (PMLR, New York), 1638–1646.Google Scholar
  • Agrawal S, Goyal N (2013) Thompson sampling for contextual bandits with linear payoffs. Dasgupta S, McAllester D, eds. Proc. 30th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 28 (PMLR, New York), 127–135.Google Scholar
  • Ariu K, Abe K, Proutière A (2022) Thresholded lasso bandit. Chaudhuri K, Jegelka S, Song L, Szepesvari C, Niu G, Sabato S, eds. Proc. 39th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 162 (PMLR, New York), 878–928.Google Scholar
  • Auer P (2002) Using confidence bounds for exploitation-exploration trade-offs. J. Machine Learn. Res. 3(Nov):397–422.Google Scholar
  • Auer P, Cesa-Bianchi N, Freund Y, Schapire RE (2002) The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1):48–77.CrossrefGoogle Scholar
  • Bagnoli M, Bergstrom T (2005) Log-concave probability and its applications. Econom. Theory 26(2):445–469.CrossrefGoogle Scholar
  • Bastani H, Bayati M (2020) Online decision making with high-dimensional covariates. Oper. Res. 68(1):276–294.LinkGoogle Scholar
  • Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183–202.CrossrefGoogle Scholar
  • Beygelzimer A, Langford J, Li L, Reyzin L, Schapire R (2011) Contextual bandit algorithms with supervised learning guarantees. Proc. 14th Internat. Conf. Artificial Intelligence Statist., Proceedings of Machine Learning Research, vol. 15 (PMLR, New York), 19–26.Google Scholar
  • Boyd S, Boyd SP, Vandenberghe L (2004) Convex Optimization (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Bühlmann P, Van De Geer S (2011) Statistics for High-Dimensional Data: Methods, Theory and Applications (Springer Science & Business Media, Boston).CrossrefGoogle Scholar
  • Candes EJ, Wakin MB, Boyd SP (2008) Enhancing sparsity by reweighted l1 minimization. J. Fourier Anal. Appl. 14(5–6):877–905.CrossrefGoogle Scholar
  • Carpentier A, Munos R (2012) Bandit theory meets compressed sensing for high dimensional stochastic linear bandit. Lawrence ND, Girolami M, eds. Artificial Intelligence and Statistics (PMLR, La Palma, Spain), 190–198.Google Scholar
  • Dani V, Hayes TP, Kakade SM (2008) Stochastic linear optimization under bandit feedback. 21st Annual Conf. Learn. Theory - COLT 2008 (Helsinki, Finland), 355?366.Google Scholar
  • Deshpande Y, Montanari A (2012) Linear bandits in high dimension and recommendation systems. Proc. 50th Annual Allerton Conf. Comm. Control Comput. (Allerton) (IEEE, Piscataway, NJ), 1750–1754.Google Scholar
  • Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96(456):1348–1360.CrossrefGoogle Scholar
  • Fan J, Han F, Liu H (2014a) Challenges of big data analysis. National Sci. Rev. 1(2):293–314.CrossrefGoogle Scholar
  • Fan J, Xue L, Zou H (2014b) Strong oracle optimality of folded concave penalized estimation. Ann. Statist. 42(3):819.CrossrefGoogle Scholar
  • Fan J, Liu H, Sun Q, Zhang T (2018) I-lamm for sparse learning: Simultaneous control of algorithmic complexity and statistical error. Ann. Statist. 46(2):814.CrossrefGoogle Scholar
  • Filippi S, Cappe O, Garivier A, Szepesvári C (2010) Parametric bandits: The generalized linear case. Lafferty J, Williams C, Shawe-Taylor J, Zemel R, Culotta A, eds. Advances in Neural Information Processing Systems, vol. 23 (Curran Associates Inc., Red Hook, NY), 586–594.Google Scholar
  • Goldenshluger A, Zeevi A (2013) A linear response bandit problem. Stochastic Systems 3(1):230–261.LinkGoogle Scholar
  • Hao B, Lattimore T, Wang M (2020) High-dimensional sparse linear bandits. Preprint, submitted November 8, https://arxiv.org/abs/2011.04020.Google Scholar
  • Huang J, Ma S, Zhang CH (2008) Adaptive lasso for sparse high-dimensional regression models. Statist. Sinica 18(4):1603–1618.Google Scholar
  • Kim GS, Paik MC (2019) Doubly-robust lasso bandit. Wallach H, Larochelle H, Beygelzimer A, D’Alché-Buc F, Fox E, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates Inc., Red Hook, NY), 5869–5879.Google Scholar
  • Kuzborskij I, Cella L, Cesa-Bianchi N (2018) Efficient linear bandits through matrix sketching. Preprint, submitted September 28, https://arxiv.org/abs/1809.11033.Google Scholar
  • Li L, Lu Y, Zhou D (2017) Provably optimal algorithms for generalized linear contextual bandits. Proc. 34th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 70 (PMLR, New York), 2071–2080.Google Scholar
  • Liu H, Yao T, Li R (2016) Global solutions to folded concave penalized nonconvex learning. Ann. Statist. 44(2):629–659.CrossrefGoogle Scholar
  • Liu H, Ye Y, Lee HY (2022) High-dimensional learning under approximate sparsity with applications to nonsmooth estimation and regularized neural networks. Oper. Res. 70(6):3176–3197.LinkGoogle Scholar
  • Liu H, Yao T, Li R, Ye Y (2017) Folded concave penalized sparse linear regression: Sparsity, statistical performance, and algorithmic theory for local solutions. Math. Programming 166:207–240.Google Scholar
  • Loh PL, Wainwright MJ (2013) Regularized m-estimators with nonconvexity: Statistical and algorithmic theory for local optima. Burges CJ, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 26 (Curran Associates Inc., Red Hook, NY), 476–484.Google Scholar
  • McCullagh P, Nelder J (1989) Generalized Linear Models (Chapman and Hall/CRC, Boca Raton, FL).CrossrefGoogle Scholar
  • Meinshausen N (2007) Relaxed lasso. Comput. Statist. Data Anal. 52(1):374–393.CrossrefGoogle Scholar
  • Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34(3):1436–1462.CrossrefGoogle Scholar
  • Meinshausen N, Yu B (2009) Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37(1):246–270.CrossrefGoogle Scholar
  • Mitchell J (2012) How Google search really works. Accessed October 22, 2018, https://readwrite.com/2012/02/29/interview_changing_engines_mid-flight_qa_with_goog/#awesm=∼oiNkM4tAX3xhbP.Google Scholar
  • Montgomery DC, Peck EA, Vining GG (2012) Introduction to Linear Regression Analysis, vol. 821 (John Wiley & Sons, Hoboken, NJ).Google Scholar
  • Negahban S, Yu B, Wainwright MJ, Ravikumar PK (2009) A unified framework for high-dimensional analysis of m-estimators with decomposable regularizers. Bengio Y, Schuurmans D, Lafferty J, Williams C, Culotta A, eds. Advances in Neural Information Processing Systems, vol. 22 (Curran Associates Inc., Red Hook, NY), 1348–1356.Google Scholar
  • Oh Mh, Iyengar G, Zeevi A (2021) Sparsity-agnostic lasso bandit. Proc. Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 139 (PMLR, New York), 8271–8280.Google Scholar
  • OxfordDictionaries (2018) How many words are there in the English language? Accessed October 22, 2018, https://en.oxforddictionaries.com/explore/how-many-words-are-there-in-the-english-language/.Google Scholar
  • Ren Z, Zhou Z (2020) Dynamic batch learning in high-dimensional sparse linear contextual bandits. Preprint, submitted August 27, https://arxiv.org/abs/2008.11918.Google Scholar
  • Rigollet P, Zeevi A (2010) Nonparametric bandits with covariates. Preprint, submitted March 8, https://arxiv.org/abs/1003.1630.Google Scholar
  • Robbins H (1952) Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. (New Series) 58(5):527–535.CrossrefGoogle Scholar
  • Rusmevichientong P, Tsitsiklis JN (2010) Linearly parameterized bandits. Math. Oper. Res. 35(2):395–411.LinkGoogle Scholar
  • Russo D, Van Roy B (2014) Learning to optimize via posterior sampling. Math. Oper. Res. 39(4):1221–1243.LinkGoogle Scholar
  • Shewan D (2017) The comprehensive guide to online advertising costs. Accessed October 22, 2018, https://www.wordstream.com/blog/ws/2017/07/05/online-advertising-costs.Google Scholar
  • Slivkins A (2014) Contextual bandits with similarity information. J. Machine Learn. Res. 15(1):2533–2568.Google Scholar
  • Tencent (2012) Predict the click-through rate of ads given the query and user information. Accessed October 22, 2018, https://www.kaggle.com/c/kddcup2012-track2.Google Scholar
  • Tibshirani R (1996) Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc. B 58(1):267–288.CrossrefGoogle Scholar
  • Tsybakov AB (2004) Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32(1):135–166.CrossrefGoogle Scholar
  • Van de Geer SA (2008) High-dimensional generalized linear models and the lasso. Ann. Statist. 36(2):614–645.CrossrefGoogle Scholar
  • Wang Y, Chen Y, Fang EX, Wang Z, Li R (2020) Nearly dimension-independent sparse linear bandit over small action spaces via best subset selection. Preprint, submitted September 4, https://arxiv.org/abs/2009.02003.Google Scholar
  • WordStream (2017) Average ctr (click-through rate): Learn how your ctr compares. Accessed October 22, 2018, https://www.wordstream.com/average-ctr.Google Scholar
  • Yu X, Lyu MR, King I (2017) Cbrap: Contextual bandits with random projection. Proc. 31st AAAI Conf. Artificial Intelligence (AAAI Press, Palo Alto, CA), 2859–2866.Google Scholar
  • Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 38(2):894–942.CrossrefGoogle Scholar
  • Zhang CH, Huang J (2008) The sparsity and bias of the lasso selection in high-dimensional linear regression. Ann. Statist. 36(4):1567–1594.CrossrefGoogle Scholar
  • Zhang CH, Zhang T (2012) A general theory of concave regularization for high-dimensional sparse estimation problems. Statist. Sci. 27(4):576–593.CrossrefGoogle Scholar
  • Zhao T, Liu H, Zhang T (2014) Pathwise coordinate optimization for sparse learning: Algorithm and theory. Preprint, submitted December 23, https://arxiv.org/abs/1412.7477.Google Scholar
  • Zhao T, Liu H, Zhang T (2018) Pathwise coordinate optimization for sparse learning: Algorithm and theory. Ann. Statist. 46(1):180–218.CrossrefGoogle Scholar
  • Zou H (2006) The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101(476):1418–1429.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.