Non-Stationary Stochastic Optimization

Published Online:https://doi.org/10.1287/opre.2015.1408

References

  • Agarwal A, Dekel O, Xiao L (2010) Optimal algorithms for online convex optimization with multi-point bandit feedback. Proc. 23rd Ann. Conf. Learn. Theory, COLT ’10 (Omnipress, Madison, WI), 28–40.Google Scholar
  • Agarwal A, Foster DP, Hsu D, Kakade SM, Rakhlin A (2013) Stochastic convex optimization with bandit feedback. SIAM J. Optim. 23:213–240.CrossrefGoogle Scholar
  • Araman VF, Caldentey R (2011) Revenue management with incomplete demand information. Cochran JJ, Cox LA Jr., Keskinocak P, Kharoufeh JP, Smith JC, eds. Wiley Encyclopedia Operations Research and Management Science (John Wiley & Sons, Hoboken, NJ), 1–17.CrossrefGoogle Scholar
  • Ben-Tal A, Nemirovski A (1998) Robust convex optimization. Math. Oper. Res. 23(4):769–805.LinkGoogle Scholar
  • Benveniste A, Priouret P, Metivier M (1990) Adaptive Algorithms and Stochastic Approximations (Springer, New York).CrossrefGoogle Scholar
  • Bertsimas D, Brown DB, Caramanis C (2011) Theory and applications of robust optimization. SIAM Rev. 53:464–501.CrossrefGoogle Scholar
  • Besbes O, Muharremoglu A (2013) On implications of demand censoring in the newsvendor problem. Management Sci. 59(6):1407–1424.LinkGoogle Scholar
  • Besbes O, Zeevi A (2011) On the minimax complexity of pricing in a changing environment. Oper. Res. 59(1):66–79.LinkGoogle Scholar
  • Blackwell D (1956) An analog of the minimax theorem for vector payoffs. Pacific J. Math. 6:1–8.CrossrefGoogle Scholar
  • Broder J, Rusmevichientong P (2012) Dynamic pricing under a general parametric choice model. Oper. Res. 60(4):965–980.LinkGoogle Scholar
  • Cesa-Bianchi N, Lugosi G (2006) Prediction, Learning, and Games (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Cope EW (2009) Regret and convergence bounds for a class of continuum-armed bandit problems. IEEE Trans. Automatic Control 54:1243–1253.CrossrefGoogle Scholar
  • den Boer A, Zwart B (2014) Simultaneously learning and optimizing using controlled variance pricing. Management Sci. 60(3):770–783.LinkGoogle Scholar
  • Flaxman AD, Kalai AT, McMahan HB (2005) Online convex optimization in the bandit setting: Gradient descent without gradient. Proc. Sixteenth Ann. ACM-SIAM Sympos. Discrete Algorithms (SIAM, Philadelphia), 385–394.Google Scholar
  • Hannan J (1957) Approximation to Bayes risk in repeated play. Contubotions to the Theory of Games, Vol. 3 (Princeton University Press, Princeton, NJ), 97–139.Google Scholar
  • Haykin SS (2001) Kalman Filtering and Neural Networks (John Wiley & Sons, New York).CrossrefGoogle Scholar
  • Hazan E, Kale S (2010) Extracting certainty from uncertainty: Regret bounded by variation in costs. Machine Learn. 80:165–188.CrossrefGoogle Scholar
  • Hazan E, Kale S (2011) Beyond the regret minimization barrier: An optimal algorithm for stochastic strongly-convex optimization. J. Machine Learn. Res. Proceedings Track 19:421–436.Google Scholar
  • Hazan E, Agarwal A, Kale S (2007) Logarithmic regret algorithms for online convex optimization. Machine Learn. 69:169–192.CrossrefGoogle Scholar
  • Huh WT, Rusmevichientong P (2009) A non-parametric asymptotic analysis of inventory planning with censored demand. Math. Oper. Res. 34(1):103–123.LinkGoogle Scholar
  • Kalai A, Vempala S (2003) Efficient algorithms for online decision problems. Learning Theory and Kernel Machines (Springer, Berlin),26–40.CrossrefGoogle Scholar
  • Kalman RE (1960) A new approach to linear filtering and prediction problems. J. Fluids Engrg. 81:35–45.Google Scholar
  • Keller G, Rady S (1999) Optimal experimentation in a changing environment. Rev. Econom. Stud. 66:475–507.CrossrefGoogle Scholar
  • Keskin BN, Zeevi A (2014) Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. Oper. Res. 62(5):1142–1167.LinkGoogle Scholar
  • Kiefer J, Wolfowitz J (1952) Stochastic estimation of the maximum of a regression function. Ann. Math. Statist. 23:462–466.CrossrefGoogle Scholar
  • Kleinberg RD (2004) Nearly tight bounds for the continuum-armed bandit problem. Saul L, Weiss Y, Bottou L, eds. Adv. Neural Inform. Processing Systems 17 (MIT Press, Cambridge, MA), 697–704.Google Scholar
  • Kushner HJ, Yin GG (2003) Stochastic Approximation and Recursive Algorithms and Applications (Springer, New York).Google Scholar
  • Lai TL (2003) Stochastic approximation. Ann. Statist. 31:391–406.CrossrefGoogle Scholar
  • McMahan HB, Blum A (2004) Online geometric optimization in the bandit setting against an adaptive adversary. Learning Theory (Springer, Berlin), 109–123.CrossrefGoogle Scholar
  • Nemirovski A, Yudin D (1983) Problem Complexity and Method Efficiency in Optimization (John Wiley & Sons, New York).Google Scholar
  • Robbins H, Monro S (1951) A stochastic approximatiom method. Ann. Math. Statist. 22:400–407.CrossrefGoogle Scholar
  • Tsybakov AB (2008) Introduction to Nonparametric Estimation (Springer, New York).Google Scholar
  • Zinkevich M (2003) Online convex programming and generalized infinitesimal gradient ascent. 20th Internat. Conf. Machine Learn. (AAAI, Palo Alto, CA), 928–936.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.