Online Stochastic Optimization with Wasserstein-Based Nonstationarity

Published Online:https://doi.org/10.1287/mnsc.2020.03850

References

  • Adelman D (2007) Dynamic bid prices in revenue management. Oper. Res. 55(4):647–661.LinkGoogle Scholar
  • Agrawal S, Devanur NR (2014a) Bandits with concave rewards and convex knapsacks. Proc. 15th ACM Conf. Econom. Comput. (Association for Computing Machinery, New York), 989–1006.Google Scholar
  • Agrawal S, Devanur NR (2014b) Fast algorithms for online stochastic convex programming. Proc. 26th Ann. ACM-SIAM Sympos. Discrete Algorithms (SIAM, Philadelphia), 1405–1424.Google Scholar
  • Agrawal S, Wang Z, Ye Y (2014) A dynamic near-optimal algorithm for online linear programming. Oper. Res. 62(4):876–890.LinkGoogle Scholar
  • Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (JMLR.org), 214–223.Google Scholar
  • Arlotto A, Gurvich I (2019) Uniformly bounded regret in the multisecretary problem. Stochastic Systems 9(3):231–260.LinkGoogle Scholar
  • Arlotto A, Xie X (2020) Logarithmic regret in the dynamic and stochastic knapsack problem with equal rewards. Stochastic Systems 10(2):170–191.LinkGoogle Scholar
  • Asadpour A, Wang X, Zhang J (2020) Online resource allocation with limited flexibility. Management Sci. 66(2):642–666.LinkGoogle Scholar
  • Badanidiyuru A, Kleinberg R, Slivkins A (2013) Bandits with knapsacks. Proc. IEEE 54th Ann. Sympos. Foundations Comput. Sci. (IEEE, Piscataway, NJ), 207–216.Google Scholar
  • Balseiro S, Lu H, Mirrokni V (2020) The best of many worlds: Dual mirror descent for online allocation problems. Oper. Res. 71(1):iii–v, 1–395, C2–C3.Google Scholar
  • Banerjee S, Freund D (2020) Uniform loss algorithms for online stochastic decision-making with applications to bin packing. SIGMETRICS '20: ACM SIGMETRICS / Internat. Conf. Measurement Modeling Comput. Systems (Association for Computing Machinery, New York), 1–2.Google Scholar
  • Banerjee S, Freund D (2024) Good prophets know when the end is near. Management Sci., ePub ahead of print September 23, https://doi.org/10.1287/mnsc.2023.04307.Google Scholar
  • Bemporad A, Morari M (1999) Robust model predictive control: A survey. Robustness in Identification and Control (Springer, London), 207–226.CrossrefGoogle Scholar
  • Besbes O, Zeevi A (2012) Blind network revenue management. Oper. Res. 60(6):1537–1550.LinkGoogle Scholar
  • Besbes O, Gur Y, Zeevi A (2014) Stochastic multi-armed-bandit problem with non-stationary rewards. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates, Inc., Red Hook, NY), 199–207.Google Scholar
  • Besbes O, Gur Y, Zeevi A (2015) Non-stationary stochastic optimization. Oper. Res. 63(5):1227–1244.LinkGoogle Scholar
  • Blanchet J, Kang Y, Murthy K (2019) Robust Wasserstein profile inference and applications to machine learning. J. Appl. Probab. 56(3):830–857.CrossrefGoogle Scholar
  • Buchbinder N, Naor J (2009) Online primal-dual algorithms for covering and packing. Math. Oper. Res. 34(2):270–286.LinkGoogle Scholar
  • Bumpensanti P, Wang H (2020) A re-solving heuristic with uniformly bounded loss for network revenue management. Management Sci. 66(7):2993–3009.LinkGoogle Scholar
  • Cheung WC, Simchi-Levi D, Zhu R (2019) Non-stationary reinforcement learning: The blessing of (more) optimism. Preprint, submitted June 17, https://dx.doi.org/10.2139/ssrn.3397818.Google Scholar
  • Cooper WL (2002) Asymptotic behavior of an allocation policy for revenue management. Oper. Res. 50(4):720–727.LinkGoogle Scholar
  • Correa J, Foncea P, Pizarro D, Verdugo V (2019) From pricing to prophets, and back!. Oper. Res. Lett. 47(1):25–29.CrossrefGoogle Scholar
  • Devanur NR, Jain K, Sivan B, Wilkens CA (2019) Near optimal online algorithms and fast approximation algorithms for resource allocation problems. J. ACM 66(1):7.CrossrefGoogle Scholar
  • Esfahani PM, Kuhn D (2018) Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Math. Programming 171(1–2):115–166.CrossrefGoogle Scholar
  • Ferguson TS (1989) Who solved the secretary problem? Statist. Sci. 4(3):282–289.CrossrefGoogle Scholar
  • Galichon A (2018) Optimal Transport Methods in Economics (Princeton University Press, Princeton, NJ).Google Scholar
  • Gallego G, Topaloglu H (2019) Revenue Management and Pricing Analytics, vol. 209 (Springer, New York).CrossrefGoogle Scholar
  • Gallego G, Van Ryzin G (1994) Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Sci. 40(8):999–1020.LinkGoogle Scholar
  • Garivier A, Moulines E (2008) On upper-confidence bound policies for non-stationary bandit problems. Preprint, submitted May 22, https://arxiv.org/abs/0805.3415.Google Scholar
  • Golrezaei N, Nazerzadeh H, Rusmevichientong P (2014) Real-time optimization of personalized assortments. Management Sci. 60(6):1532–1551.LinkGoogle Scholar
  • Gupta A, Molinaro M (2014) How experts can solve LPS online. Proc. Eur. Sympos. Algorithms (Springer, Berlin, Heidelberg), 517–529.Google Scholar
  • Hall E, Willett R (2013) Dynamical models and tracking regret in online convex programming. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 579–587.Google Scholar
  • Hazan E (2016) Introduction to online convex optimization. Frontiers Optim. 2(3–4):157–325.Google Scholar
  • Immorlica N, Abinav Sankararaman K, Schapire R, Aleksandrs S (2019) Adversarial bandits with knapsacks. Proc. IEEE 60th Ann. Sympos, Foundations Comput. Sci. (IEEE, Piscataway, NJ), 202–219.Google Scholar
  • Jadbabaie A, Rakhlin A, Shahrampour S, Sridharan K (2015) Online optimization: Competing with dynamic comparators. Artificial Intelligence and Statistics (PMLR, New York), 398–406.Google Scholar
  • Jasin S (2015) Performance of an LP-based control for revenue management with unknown demand parameters. Oper. Res. 63(4):909–915.LinkGoogle Scholar
  • Jasin S, Kumar S (2012) A re-solving heuristic with bounded revenue loss for network revenue management with customer choice. Math. Oper. Res. 37(2):313–345.LinkGoogle Scholar
  • Jasin S, Kumar S (2013) Analysis of deterministic LP-based booking limit and bid price controls for revenue management. Oper. Res. 61(6):1312–1320.LinkGoogle Scholar
  • Jenatton R, Huang J, Archambeau C (2016) Adaptive algorithms for online convex optimization with long-term constraints. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 402–411.Google Scholar
  • Jiang J, Ma W, Zhang J (2022) Degeneracy is ok: Logarithmic regret for network revenue management with indiscrete distributions. Preprint, submitted October 14, https://arxiv.org/abs/2210.07996.Google Scholar
  • Jiang J, Wang S, Zhang J (2023) Achieving high individual service levels without safety stock? Optimal rationing policy of pooled resources. Oper. Res. 71(1):358–377.LinkGoogle Scholar
  • Jiang J, Zhang J (2020) Online resource allocation with stochastic resource consumption. Preprint, submitted December 14, https://arxiv.org/abs/2012.07933.Google Scholar
  • Kunnumkal S, Talluri K (2016) On a piecewise-linear approximation for network revenue management. Math. Oper. Res. 41(1):72–91.LinkGoogle Scholar
  • Lattimore T, Szepesvári C (2020) Bandit Algorithms (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Lecarpentier E, Rachelson E (2019) Non-stationary Markov decision processes, a worst-case approach using model-based reinforcement learning. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates, Inc., Red Hook, NY), 7216–7225.Google Scholar
  • Li X, Sun C, Ye Y (2020) Simple and fast algorithm for binary integer and online linear programming. Preprint, submitted March 5, https://arxiv.org/abs/2003.02513.Google Scholar
  • Lu H, Balseiro S, Mirrokni V (2020) Dual mirror descent for online allocation problems. Preprint, submitted November 18, https://arxiv.org/abs/2002.10421.Google Scholar
  • Ma Y, Rusmevichientong P, Sumida M, Topaloglu H (2020) An approximation algorithm for network revenue management under nonstationary arrivals. Oper. Res. 68(3):834–855.LinkGoogle Scholar
  • Mehta A, Saberi A, Vazirani U, Vazirani V (2005) Adwords and generalized on-line matching. Proc. 46th Ann. IEEE Sympos. Foundations Comput. Sci. (IEEE, Piscataway, NJ), 264–273.Google Scholar
  • Miao S, Wang Y, Zhang J (2021) A general framework for resource constrained revenue management with demand learning and large action space. Preprint, submitted May 10, https://dx.doi.org/10.2139/ssrn.3841273.Google Scholar
  • Molinaro M, Ravi R (2013) The geometry of online packing linear programs. Math. Oper. Res. 39(1):46–59.LinkGoogle Scholar
  • Neely MJ, Yu H (2017) Online convex optimization with time-varying constraints. Preprint, submitted February 15, https://arxiv.org/abs/1702.04783.Google Scholar
  • Rangi A, Franceschetti M, Tran-Thanh L (2018) Unifying the stochastic and the adversarial bandits with knapsack. Preprint, submitted October 23, https://arxiv.org/abs/1811.12253.Google Scholar
  • Rawlings JB (2000) Tutorial overview of model predictive control. IEEE Control Systems Magazine 20(3):38–52.CrossrefGoogle Scholar
  • Reiman MI, Wang Q (2008) An asymptotically optimal policy for a quantity-based network revenue management problem. Math. Oper. Res. 33(2):257–282.LinkGoogle Scholar
  • Sun R, Wang X, Zhou Z (2020) Near-optimal primal-dual algorithms for quantity-based network revenue management. Preprint, submitted November 12, https://arxiv.org/abs/2011.06327.Google Scholar
  • Talluri K, Van Ryzin G (1998) An analysis of bid-price controls for network revenue management. Management Sci. 44(11-part-1):1577–1593.LinkGoogle Scholar
  • Talluri K, Van Ryzin G (2004) Revenue management under a general discrete choice model of consumer behavior. Management Sci. 50(1):15–33.LinkGoogle Scholar
  • Talluri KT, Van Ryzin GJ (2006) The Theory and Practice of Revenue Management, vol. 68 (Springer Science & Business Media, New York).Google Scholar
  • Vanderbei RJ (1998) Linear programming: foundations and extensions. J. Oper. Res. Soc. 49(1):94.Google Scholar
  • Vera A, Banerjee S (2021) The Bayesian prophet: A low-regret framework for online decision making. Management Sci. 67(3):1329–1992, iii–iv.LinkGoogle Scholar
  • Vera A, Banerjee S, Gurvich I (2019) Online allocation and pricing: Constant regret via bellman inequalities. Preprint, submitted June 14, https://arxiv.org/abs/1906.06361.Google Scholar
  • Villani C (2008) Optimal Transport: Old and New, vol. 338 (Springer Science & Business Media, Berlin, Heidelberg).Google Scholar
  • Wagener N, Cheng C-A, Sacks J, Boots B (2019) An online learning approach to model predictive control. Preprint, submitted February 24, https://arxiv.org/abs/1902.08967.Google Scholar
  • Yi X, Li X, Yang T, Xie L, Chai T, Johansson K (2021) Regret and cumulative constraint violation analysis for online convex optimization with long term constraints. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 11998–12008.Google Scholar
  • Yuan J, Lamperski A (2018) Online convex optimization for cumulative constraints. Advances in Neural Information Processing Systems, vol. 31 (Curran Associates Inc., Red Hook, NY), 6137–6146.Google Scholar
  • Zhang D, Adelman D (2009) An approximate dynamic programming approach to network revenue management with customer choice. Transportation Sci. 43(3):381–394.LinkGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.