Technical Note—On Adaptivity in Nonstationary Stochastic Optimization with Bandit Feedback

Yining Wang
https://orcid.org/0000-0001-9410-0392
Naveen Jindal School of Management, University of Texas at Dallas, Richardson, Texas 75080

Published Online: 31 Jul 2023
https://doi.org/10.1287/opre.2022.0576

Abstract

In this paper, we study the nonstationary stochastic optimization problem with bandit feedback and dynamic regret measures. The seminal work of Besbes et al. (2015) shows that, when the aggregated function changes are known a priori, a simple restarting algorithm attains the optimal dynamic regret. In this work, we design a stochastic optimization algorithm with fixed step sizes, which, combined with the multiscale sampling framework in existing research, achieves the optimal dynamic regret in nonstationary stochastic optimization without prior knowledge of the function-change budget, thereby closing a question that has been open for a while. We also establish an additional result showing that any algorithm achieving good regret against stationary benchmarks with high probability can be automatically converted into an algorithm that achieves good regret against dynamic benchmarks (for problems that admit $\tilde{O} (\sqrt{T})$ regret against stationary benchmarks in fully adversarial settings, a dynamic regret of $\tilde{O} (V_{T}^{1 / 3} T^{2 / 3})$ is expected), which is potentially applicable to a wide class of bandit convex optimization and other types of bandit algorithms.
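
As a rough illustration of where the $V_{T}^{1 / 3} T^{2 / 3}$ rate comes from, the following is a heuristic balancing argument for a restarting scheme in which the budget $V_T$ is known. It is a sketch under stated assumptions, not the adaptive algorithm of this paper: the batch length $\Delta$ and the per-batch variation $V_j$ are notation introduced here for illustration, constants and logarithmic factors are ignored, and each batch is assumed to be handled by an algorithm with $\tilde{O}(\sqrt{\Delta})$ regret against the best fixed point in that batch.

```latex
% Assumed setup: split the horizon T into T/\Delta batches of length \Delta,
% restart the stationary algorithm at the start of each batch, and let V_j be
% the function variation inside batch j, so that \sum_j V_j \le V_T.
\begin{align*}
\text{dynamic regret}
  &\lesssim \sum_{j=1}^{T/\Delta}
     \Bigl( \underbrace{\sqrt{\Delta}}_{\text{stationary regret in batch } j}
          + \underbrace{\Delta \, V_j}_{\text{drift of the benchmark in batch } j} \Bigr) \\
  &\le \frac{T}{\sqrt{\Delta}} + \Delta \, V_T .
\end{align*}
% Balancing the two terms: T / \sqrt{\Delta} = \Delta V_T
% gives \Delta = (T / V_T)^{2/3}, and therefore
\begin{equation*}
\frac{T}{\sqrt{\Delta}} = \Delta \, V_T = V_T^{1/3} \, T^{2/3} .
\end{equation*}
```

The balancing choice of $\Delta$ depends on $V_T$, which is why a plain restarting scheme needs the budget in advance; the abstract's contribution concerns removing that requirement.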

Volume 73, Issue 2

March-April 2025

Pages iii-viii, 583-1150, C2-C3

Article Information

Received: October 28, 2022
Accepted: June 12, 2023
Published Online: July 31, 2023

Cite as

Yining Wang (2023) Technical Note—On Adaptivity in Nonstationary Stochastic Optimization with Bandit Feedback. Operations Research 73(2):819-828.

https://doi.org/10.1287/opre.2022.0576

Acknowledgments

The author thanks the department editor, the associate editor, and two anonymous referees for offering valuable comments and suggestions that greatly improved the paper. The author also thanks Haipeng Luo and Chen-Yu Wei for providing helpful comments and suggestions that improved the positioning and exposition of this paper.
