Technical Note—Nonstationary Stochastic Optimization Under Lp,q-Variation Measures

Published Online:https://doi.org/10.1287/opre.2019.1843

References

  • Agarwal A, Dekel O, Xiao L (2010) Optimal algorithms for online convex optimization with multi-point bandit feedback. Kalai AT, Mohri M, eds. Proc. 23rd Conf. Learn. Theory (Omnipress, Madison, WI), 2840.Google Scholar
  • Agarwal A, Foster DP, Hsu D, Kakade SM, Rakhlin A (2013) Stochastic convex optimization with bandit feedback. SIAM J. Optim. 23(1):213–240.CrossrefGoogle Scholar
  • Besbes O, Gur Y, Zeevi A (2015) Non-stationary stochastic optimization. Oper. Res. 63(5):1227–1244.LinkGoogle Scholar
  • Boyd S, Vandenberghe L (2004) Convex Optimization (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Bubeck S, Eldan R, Lee YT (2017) Kernel-based methods for bandit convex optimization. Hatami H, McKenzie P, eds. Proc. 49th Annual ACM SIGACT Sympos. Theory Comput. (STOC 2017) (ACM, New York), 7285.CrossrefGoogle Scholar
  • Castro RM, Nowak RD (2008) Minimax bounds for active learning. IEEE Trans. Inform. Theory 54(5):2339–2353.CrossrefGoogle Scholar
  • Cover T, Thomas J (2006) Elements of Information Theory, 2nd ed. (John Wiley & Sons, Hoboken, NJ).Google Scholar
  • Daniely A, Gonen A, Shalev-Shwartz S (2015) Strongly adaptive online learning. Bach F, Blei D, eds. J. Machine Learn. Res. 37: 1405–1411.Google Scholar
  • den Boer AV (2015) Dynamic pricing and learning: Historical origins, current research, and new directions. Surveys Oper. Res. Management Sci. 20(1):1–18.CrossrefGoogle Scholar
  • den Boer AV, Zwart B (2015) Dynamic pricing and learning with finite inventories. Oper. Res. 63(4):965–978.LinkGoogle Scholar
  • Flaxman AD, Kalai AT, McMahan HB (2005) Online convex optimization in the bandit setting: Gradient descent without a gradient. Buchsbaum A, ed. Proc. 16th Annual ACM-SIAM Sympos. Discrete Algorithms (SODA ‘05) (SIAM, Philadelphia), 385394.Google Scholar
  • Gur Y (2014) Sequential optimization in changing environments: theory and application to online content recommendation services. PhD thesis, Columbia University, New York.Google Scholar
  • Hall EC, Willett RM (2015) Online convex optimization in dynamic environments. IEEE J. Selected Topics Signal Processing 9(4):647–662.CrossrefGoogle Scholar
  • Hazan E (2016) Introduction to online convex optimization. Foundations Trends Optim. 2(3–4):157–325.CrossrefGoogle Scholar
  • Hazan E, Kale S (2014) Beyond the regret minimization barrier: Optimal algorithms for stochastic strongly-convex optimization. J. Machine Learn. Res. 15(July):2489–2512.Google Scholar
  • Hazan E, Levy K (2014) Bandit convex optimization: Towards tight bounds. Gharamani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. Proc. Adv. Neural Inform. Processing Systems 27 (NIPS 2014) (Curran Associates, Red Hook, NY), 784792.Google Scholar
  • Hazan E, Agarwal A, Kale S (2007) Logarithmic regret algorithms for online convex optimization. Machine Learn. 69(2):169–192.CrossrefGoogle Scholar
  • Ibragimov IA, Has’minskii RZ (1981) Statistical Estimation: Asymptotic Theory (Springer-Verlag, New York).CrossrefGoogle Scholar
  • Jadbabaie A, Rakhlin A, Shahrampour S, Sridharan K (2015) Online optimization: Competing with dynamic comparators. Lebanon G, Vishwanathan SVN, eds. Proc. 18th Internat. Conf. Artificial Intelligence Statist. (AISTATS), San Diego, CA, 398–406.Google Scholar
  • Jamieson KG, Nowak R, Recht B (2012) Query complexity of derivative-free optimization. Pereira F, Burges CJC, Bottou L, Weinberger KQ, eds. Proc. Adv. Neural Inform. Processing Systems 25 (NIPS 2012) (Curran Associates, Red Hook, NY), 26722680.Google Scholar
  • Karnin ZS, Anava O (2016) Multi-armed bandits: Competing with optimal sequences. Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R, eds. Proc. Adv. Neural Inform. Processing Systems 29 (NIPS 2016) (Curran Associates, Red Hook, NY), 199207.Google Scholar
  • Keskin NB, Zeevi A (2017) Chasing demand: Learning and earning in a changing environment. Math. Oper. Res. 42(2):277–307.LinkGoogle Scholar
  • McMahan HB (2017) A survey of algorithms and analysis for adaptive online learning. J. Machine Learn. Res. 18(90):1–50.Google Scholar
  • Mokhtari A, Shahrampour S, Jadbabaie A, Ribeiro A (2016) Online optimization in dynamic environments: Improved regret rates for strongly convex problems. Proc. IEEE Conf. Decision Control (CDC) (IEEE, Piscataway, NJ), 7196–7201.CrossrefGoogle Scholar
  • Nesterov Y, Nemirovskii A (1994) Interior-Point Polynomial Algorithms in Convex Programming (SIAM, Philadelphia).CrossrefGoogle Scholar
  • Shamir O (2017) An optimal algorithm for bandit and zero-order convex optimization with two-point feedback. J. Machine Learn. Res. 18(1):1703–1713.Google Scholar
  • Tsybakov AB (2009) Introduction to Nonparametric Estimation, Springer Series in Statistics (Springer, New York).CrossrefGoogle Scholar
  • Yu B (1997) Assouad, Fano, and Le Cam. Pollard D, Torgersen E, Yang GL, eds. Festschrift for Lucien Le Cam (Springer, New York), 423–435.CrossrefGoogle Scholar
  • Yudin D, Nemirovskii A (1983) Problem Complexity and Method Efficiency in Optimization (John Wiley & Sons, New York).Google Scholar
  • Zhang L, Yang T, Jin R, Zhou ZH (2018) Dynamic regret of strongly adaptive models. Dy J, Krause A, eds. Proc. 35th Internat. Conf. Machine Learn. (ICML), Stockholm, Sweden, 5877–5886.Google Scholar
  • Zinkevich M (2003) Online convex programming and generalized infinitesimal gradient ascent. Fawcett T, Mishra N, eds. Proc. 20th Internat. Conf. Machine Learn. (ICML-2003) (AAAI Press, Menlo Park, CA), 928–935.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.