Online Learning and Decision Making Under Generalized Linear Model with High-Dimensional Data

Xue Wang
Xue Wang
[email protected]
https://orcid.org/0009-0004-2296-9688
DAMO Academy, Alibaba Group, Bellevue, Washington 98004
Search for more papers by this author
,
Mike Mingcheng Wei
Mike Mingcheng Wei
[email protected]
https://orcid.org/0000-0002-7867-2646
Operations Management and Strategy, School of Management, University at Buffalo, Buffalo, New York 14260
Search for more papers by this author
,
Tao Yao
Corresponding Author
Tao Yao
[email protected]
https://orcid.org/0000-0002-2124-5678
Data and Business Intelligence, Antai College of Economics and Management, Shanghai Jiao Tong University, Shanghai 20003, China
Search for more papers by this author

DAMO Academy, Alibaba Group, Bellevue, Washington 98004

Search for more papers by this author

Mike Mingcheng Wei

[email protected]

https://orcid.org/0000-0002-7867-2646

Operations Management and Strategy, School of Management, University at Buffalo, Buffalo, New York 14260

Search for more papers by this author

Tao Yao

Corresponding Author

Tao Yao

[email protected]

https://orcid.org/0000-0002-2124-5678

Data and Business Intelligence, Antai College of Economics and Management, Shanghai Jiao Tong University, Shanghai 20003, China

Search for more papers by this author

Published Online:13 Nov 2024https://doi.org/10.1287/mnsc.2022.01557

References

Abbasi-Yadkori Y, Szepesvari C (2012) Online Learning for Linearly Parametrized Control Problems (University of Alberta, Edmonton).Google Scholar
Abbasi-Yadkori Y, Pál D, Szepesvári C (2011) Improved algorithms for linear stochastic bandits. Shawe-Taylor J, Zemel R, Bartlett P, Pereira F, Weinberger KQ, eds. Improved Algorithms for Linear Stochastic Bandits, Advances in Neural Information Processing Systems, vol. 24 (Curran Associates, Inc., Red Hook, NY), 2312–2320.Google Scholar
Agarwal A, Hsu D, Kale S, Langford J, Li L, Schapire R (2014) Taming the monster: A fast and simple algorithm for contextual bandits. Xing EP, Jebara T, eds. Proc. 31st Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 32 (PMLR, New York), 1638–1646.Google Scholar
Agrawal S, Goyal N (2013) Thompson sampling for contextual bandits with linear payoffs. Dasgupta S, McAllester D, eds. Proc. 30th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 28 (PMLR, New York), 127–135.Google Scholar
Ariu K, Abe K, Proutière A (2022) Thresholded lasso bandit. Chaudhuri K, Jegelka S, Song L, Szepesvari C, Niu G, Sabato S, eds. Proc. 39th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 162 (PMLR, New York), 878–928.Google Scholar
Auer P (2002) Using confidence bounds for exploitation-exploration trade-offs. J. Machine Learn. Res. 3(Nov):397–422.Google Scholar
Auer P, Cesa-Bianchi N, Freund Y, Schapire RE (2002) The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1):48–77.Crossref, Google Scholar
Bagnoli M, Bergstrom T (2005) Log-concave probability and its applications. Econom. Theory 26(2):445–469.Crossref, Google Scholar
Bastani H, Bayati M (2020) Online decision making with high-dimensional covariates. Oper. Res. 68(1):276–294.Link, Google Scholar
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183–202.Crossref, Google Scholar
Beygelzimer A, Langford J, Li L, Reyzin L, Schapire R (2011) Contextual bandit algorithms with supervised learning guarantees. Proc. 14th Internat. Conf. Artificial Intelligence Statist., Proceedings of Machine Learning Research, vol. 15 (PMLR, New York), 19–26.Google Scholar
Boyd S, Boyd SP, Vandenberghe L (2004) Convex Optimization (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
Bühlmann P, Van De Geer S (2011) Statistics for High-Dimensional Data: Methods, Theory and Applications (Springer Science & Business Media, Boston).Crossref, Google Scholar
Candes EJ, Wakin MB, Boyd SP (2008) Enhancing sparsity by reweighted l1 minimization. J. Fourier Anal. Appl. 14(5–6):877–905.Crossref, Google Scholar
Carpentier A, Munos R (2012) Bandit theory meets compressed sensing for high dimensional stochastic linear bandit. Lawrence ND, Girolami M, eds. Artificial Intelligence and Statistics (PMLR, La Palma, Spain), 190–198.Google Scholar
Dani V, Hayes TP, Kakade SM (2008) Stochastic linear optimization under bandit feedback. 21st Annual Conf. Learn. Theory - COLT 2008 (Helsinki, Finland), 355?366.Google Scholar
Deshpande Y, Montanari A (2012) Linear bandits in high dimension and recommendation systems. Proc. 50th Annual Allerton Conf. Comm. Control Comput. (Allerton) (IEEE, Piscataway, NJ), 1750–1754.Google Scholar
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96(456):1348–1360.Crossref, Google Scholar
Fan J, Han F, Liu H (2014a) Challenges of big data analysis. National Sci. Rev. 1(2):293–314.Crossref, Google Scholar
Fan J, Xue L, Zou H (2014b) Strong oracle optimality of folded concave penalized estimation. Ann. Statist. 42(3):819.Crossref, Google Scholar
Fan J, Liu H, Sun Q, Zhang T (2018) I-lamm for sparse learning: Simultaneous control of algorithmic complexity and statistical error. Ann. Statist. 46(2):814.Crossref, Google Scholar
Filippi S, Cappe O, Garivier A, Szepesvári C (2010) Parametric bandits: The generalized linear case. Lafferty J, Williams C, Shawe-Taylor J, Zemel R, Culotta A, eds. Advances in Neural Information Processing Systems, vol. 23 (Curran Associates Inc., Red Hook, NY), 586–594.Google Scholar
Goldenshluger A, Zeevi A (2013) A linear response bandit problem. Stochastic Systems 3(1):230–261.Link, Google Scholar
Hao B, Lattimore T, Wang M (2020) High-dimensional sparse linear bandits. Preprint, submitted November 8, https://arxiv.org/abs/2011.04020.Google Scholar
Huang J, Ma S, Zhang CH (2008) Adaptive lasso for sparse high-dimensional regression models. Statist. Sinica 18(4):1603–1618.Google Scholar
Kim GS, Paik MC (2019) Doubly-robust lasso bandit. Wallach H, Larochelle H, Beygelzimer A, D’Alché-Buc F, Fox E, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates Inc., Red Hook, NY), 5869–5879.Google Scholar
Kuzborskij I, Cella L, Cesa-Bianchi N (2018) Efficient linear bandits through matrix sketching. Preprint, submitted September 28, https://arxiv.org/abs/1809.11033.Google Scholar
Li L, Lu Y, Zhou D (2017) Provably optimal algorithms for generalized linear contextual bandits. Proc. 34th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 70 (PMLR, New York), 2071–2080.Google Scholar
Liu H, Yao T, Li R (2016) Global solutions to folded concave penalized nonconvex learning. Ann. Statist. 44(2):629–659.Crossref, Google Scholar
Liu H, Ye Y, Lee HY (2022) High-dimensional learning under approximate sparsity with applications to nonsmooth estimation and regularized neural networks. Oper. Res. 70(6):3176–3197.Link, Google Scholar
Liu H, Yao T, Li R, Ye Y (2017) Folded concave penalized sparse linear regression: Sparsity, statistical performance, and algorithmic theory for local solutions. Math. Programming 166:207–240.Google Scholar
Loh PL, Wainwright MJ (2013) Regularized m-estimators with nonconvexity: Statistical and algorithmic theory for local optima. Burges CJ, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 26 (Curran Associates Inc., Red Hook, NY), 476–484.Google Scholar
McCullagh P, Nelder J (1989) Generalized Linear Models (Chapman and Hall/CRC, Boca Raton, FL).Crossref, Google Scholar
Meinshausen N (2007) Relaxed lasso. Comput. Statist. Data Anal. 52(1):374–393.Crossref, Google Scholar
Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34(3):1436–1462.Crossref, Google Scholar
Meinshausen N, Yu B (2009) Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37(1):246–270.Crossref, Google Scholar
Mitchell J (2012) How Google search really works. Accessed October 22, 2018, https://readwrite.com/2012/02/29/interview_changing_engines_mid-flight_qa_with_goog/#awesm=∼oiNkM4tAX3xhbP.Google Scholar
Montgomery DC, Peck EA, Vining GG (2012) Introduction to Linear Regression Analysis, vol. 821 (John Wiley & Sons, Hoboken, NJ).Google Scholar
Negahban S, Yu B, Wainwright MJ, Ravikumar PK (2009) A unified framework for high-dimensional analysis of m-estimators with decomposable regularizers. Bengio Y, Schuurmans D, Lafferty J, Williams C, Culotta A, eds. Advances in Neural Information Processing Systems, vol. 22 (Curran Associates Inc., Red Hook, NY), 1348–1356.Google Scholar
Oh Mh, Iyengar G, Zeevi A (2021) Sparsity-agnostic lasso bandit. Proc. Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 139 (PMLR, New York), 8271–8280.Google Scholar
OxfordDictionaries (2018) How many words are there in the English language? Accessed October 22, 2018, https://en.oxforddictionaries.com/explore/how-many-words-are-there-in-the-english-language/.Google Scholar
Ren Z, Zhou Z (2020) Dynamic batch learning in high-dimensional sparse linear contextual bandits. Preprint, submitted August 27, https://arxiv.org/abs/2008.11918.Google Scholar
Rigollet P, Zeevi A (2010) Nonparametric bandits with covariates. Preprint, submitted March 8, https://arxiv.org/abs/1003.1630.Google Scholar
Robbins H (1952) Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. (New Series) 58(5):527–535.Crossref, Google Scholar
Rusmevichientong P, Tsitsiklis JN (2010) Linearly parameterized bandits. Math. Oper. Res. 35(2):395–411.Link, Google Scholar
Russo D, Van Roy B (2014) Learning to optimize via posterior sampling. Math. Oper. Res. 39(4):1221–1243.Link, Google Scholar
Shewan D (2017) The comprehensive guide to online advertising costs. Accessed October 22, 2018, https://www.wordstream.com/blog/ws/2017/07/05/online-advertising-costs.Google Scholar
Slivkins A (2014) Contextual bandits with similarity information. J. Machine Learn. Res. 15(1):2533–2568.Google Scholar
Tencent (2012) Predict the click-through rate of ads given the query and user information. Accessed October 22, 2018, https://www.kaggle.com/c/kddcup2012-track2.Google Scholar
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc. B 58(1):267–288.Crossref, Google Scholar
Tsybakov AB (2004) Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32(1):135–166.Crossref, Google Scholar
Van de Geer SA (2008) High-dimensional generalized linear models and the lasso. Ann. Statist. 36(2):614–645.Crossref, Google Scholar
Wang Y, Chen Y, Fang EX, Wang Z, Li R (2020) Nearly dimension-independent sparse linear bandit over small action spaces via best subset selection. Preprint, submitted September 4, https://arxiv.org/abs/2009.02003.Google Scholar
WordStream (2017) Average ctr (click-through rate): Learn how your ctr compares. Accessed October 22, 2018, https://www.wordstream.com/average-ctr.Google Scholar
Yu X, Lyu MR, King I (2017) Cbrap: Contextual bandits with random projection. Proc. 31st AAAI Conf. Artificial Intelligence (AAAI Press, Palo Alto, CA), 2859–2866.Google Scholar
Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 38(2):894–942.Crossref, Google Scholar
Zhang CH, Huang J (2008) The sparsity and bias of the lasso selection in high-dimensional linear regression. Ann. Statist. 36(4):1567–1594.Crossref, Google Scholar
Zhang CH, Zhang T (2012) A general theory of concave regularization for high-dimensional sparse estimation problems. Statist. Sci. 27(4):576–593.Crossref, Google Scholar
Zhao T, Liu H, Zhang T (2014) Pathwise coordinate optimization for sparse learning: Algorithm and theory. Preprint, submitted December 23, https://arxiv.org/abs/1412.7477.Google Scholar
Zhao T, Liu H, Zhang T (2018) Pathwise coordinate optimization for sparse learning: Algorithm and theory. Ann. Statist. 46(1):180–218.Crossref, Google Scholar
Zou H (2006) The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101(476):1418–1429.Crossref, Google Scholar

Volume 71, Issue 8

August 2025

Pages iv-vi, 6319-7222

Article Information

Supplemental Material

Metrics

Information

Received:June 03, 2022
Accepted:January 17, 2024
Published Online:November 13, 2024

Cite as

Xue Wang, Mike Mingcheng Wei, Tao Yao (2024) Online Learning and Decision Making Under Generalized Linear Model with High-Dimensional Data. Management Science 71(8):6647-6665.

https://doi.org/10.1287/mnsc.2022.01557

Keywords

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Online Learning and Decision Making Under Generalized Linear Model with High-Dimensional Data

References

Volume 71, Issue 8

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News