Online Stochastic Optimization with Wasserstein-Based Nonstationarity
Published Online:3 Mar 2025https://doi.org/10.1287/mnsc.2020.03850
References
- (2007) Dynamic bid prices in revenue management. Oper. Res. 55(4):647–661.Link, Google Scholar
- (2014a) Bandits with concave rewards and convex knapsacks. Proc. 15th ACM Conf. Econom. Comput. (Association for Computing Machinery, New York), 989–1006.Google Scholar
- (2014b) Fast algorithms for online stochastic convex programming. Proc. 26th Ann. ACM-SIAM Sympos. Discrete Algorithms (SIAM, Philadelphia), 1405–1424.Google Scholar
- (2014) A dynamic near-optimal algorithm for online linear programming. Oper. Res. 62(4):876–890.Link, Google Scholar
- (2017) Wasserstein generative adversarial networks. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (JMLR.org), 214–223.Google Scholar
- (2019) Uniformly bounded regret in the multisecretary problem. Stochastic Systems 9(3):231–260.Link, Google Scholar
- (2020) Logarithmic regret in the dynamic and stochastic knapsack problem with equal rewards. Stochastic Systems 10(2):170–191.Link, Google Scholar
- (2020) Online resource allocation with limited flexibility. Management Sci. 66(2):642–666.Link, Google Scholar
- (2013) Bandits with knapsacks. Proc. IEEE 54th Ann. Sympos. Foundations Comput. Sci. (IEEE, Piscataway, NJ), 207–216.Google Scholar
- (2020) The best of many worlds: Dual mirror descent for online allocation problems. Oper. Res. 71(1):iii–v, 1–395, C2–C3.Google Scholar
- (2020) Uniform loss algorithms for online stochastic decision-making with applications to bin packing. SIGMETRICS '20: ACM SIGMETRICS / Internat. Conf. Measurement Modeling Comput. Systems (Association for Computing Machinery, New York), 1–2.Google Scholar
- (2024) Good prophets know when the end is near. Management Sci., ePub ahead of print September 23, https://doi.org/10.1287/mnsc.2023.04307.Google Scholar
- (1999) Robust model predictive control: A survey. Robustness in Identification and Control (Springer, London), 207–226.Crossref, Google Scholar
- (2012) Blind network revenue management. Oper. Res. 60(6):1537–1550.Link, Google Scholar
- (2014) Stochastic multi-armed-bandit problem with non-stationary rewards. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates, Inc., Red Hook, NY), 199–207.Google Scholar
- (2015) Non-stationary stochastic optimization. Oper. Res. 63(5):1227–1244.Link, Google Scholar
- (2019) Robust Wasserstein profile inference and applications to machine learning. J. Appl. Probab. 56(3):830–857.Crossref, Google Scholar
- (2009) Online primal-dual algorithms for covering and packing. Math. Oper. Res. 34(2):270–286.Link, Google Scholar
- (2020) A re-solving heuristic with uniformly bounded loss for network revenue management. Management Sci. 66(7):2993–3009.Link, Google Scholar
- (2019) Non-stationary reinforcement learning: The blessing of (more) optimism. Preprint, submitted June 17, https://dx.doi.org/10.2139/ssrn.3397818.Google Scholar
- (2002) Asymptotic behavior of an allocation policy for revenue management. Oper. Res. 50(4):720–727.Link, Google Scholar
- (2019) From pricing to prophets, and back!. Oper. Res. Lett. 47(1):25–29.Crossref, Google Scholar
- (2019) Near optimal online algorithms and fast approximation algorithms for resource allocation problems. J. ACM 66(1):7.Crossref, Google Scholar
- (2018) Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Math. Programming 171(1–2):115–166.Crossref, Google Scholar
- (1989) Who solved the secretary problem? Statist. Sci. 4(3):282–289.Crossref, Google Scholar
- (2018) Optimal Transport Methods in Economics (Princeton University Press, Princeton, NJ).Google Scholar
- (2019) Revenue Management and Pricing Analytics, vol. 209 (Springer, New York).Crossref, Google Scholar
- (1994) Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Sci. 40(8):999–1020.Link, Google Scholar
- (2008) On upper-confidence bound policies for non-stationary bandit problems. Preprint, submitted May 22, https://arxiv.org/abs/0805.3415.Google Scholar
- (2014) Real-time optimization of personalized assortments. Management Sci. 60(6):1532–1551.Link, Google Scholar
- (2014) How experts can solve LPS online. Proc. Eur. Sympos. Algorithms (Springer, Berlin, Heidelberg), 517–529.Google Scholar
- (2013) Dynamical models and tracking regret in online convex programming. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 579–587.Google Scholar
- (2016) Introduction to online convex optimization. Frontiers Optim. 2(3–4):157–325.Google Scholar
- (2019) Adversarial bandits with knapsacks. Proc. IEEE 60th Ann. Sympos, Foundations Comput. Sci. (IEEE, Piscataway, NJ), 202–219.Google Scholar
- (2015) Online optimization: Competing with dynamic comparators. Artificial Intelligence and Statistics (PMLR, New York), 398–406.Google Scholar
- (2015) Performance of an LP-based control for revenue management with unknown demand parameters. Oper. Res. 63(4):909–915.Link, Google Scholar
- (2012) A re-solving heuristic with bounded revenue loss for network revenue management with customer choice. Math. Oper. Res. 37(2):313–345.Link, Google Scholar
- (2013) Analysis of deterministic LP-based booking limit and bid price controls for revenue management. Oper. Res. 61(6):1312–1320.Link, Google Scholar
- (2016) Adaptive algorithms for online convex optimization with long-term constraints. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 402–411.Google Scholar
- (2022) Degeneracy is ok: Logarithmic regret for network revenue management with indiscrete distributions. Preprint, submitted October 14, https://arxiv.org/abs/2210.07996.Google Scholar
- (2023) Achieving high individual service levels without safety stock? Optimal rationing policy of pooled resources. Oper. Res. 71(1):358–377.Link, Google Scholar
- (2020) Online resource allocation with stochastic resource consumption. Preprint, submitted December 14, https://arxiv.org/abs/2012.07933.Google Scholar
- (2016) On a piecewise-linear approximation for network revenue management. Math. Oper. Res. 41(1):72–91.Link, Google Scholar
- (2020) Bandit Algorithms (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
- (2019) Non-stationary Markov decision processes, a worst-case approach using model-based reinforcement learning. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates, Inc., Red Hook, NY), 7216–7225.Google Scholar
- (2020) Simple and fast algorithm for binary integer and online linear programming. Preprint, submitted March 5, https://arxiv.org/abs/2003.02513.Google Scholar
- (2020) Dual mirror descent for online allocation problems. Preprint, submitted November 18, https://arxiv.org/abs/2002.10421.Google Scholar
- (2020) An approximation algorithm for network revenue management under nonstationary arrivals. Oper. Res. 68(3):834–855.Link, Google Scholar
- (2005) Adwords and generalized on-line matching. Proc. 46th Ann. IEEE Sympos. Foundations Comput. Sci. (IEEE, Piscataway, NJ), 264–273.Google Scholar
- (2021) A general framework for resource constrained revenue management with demand learning and large action space. Preprint, submitted May 10, https://dx.doi.org/10.2139/ssrn.3841273.Google Scholar
- (2013) The geometry of online packing linear programs. Math. Oper. Res. 39(1):46–59.Link, Google Scholar
- (2017) Online convex optimization with time-varying constraints. Preprint, submitted February 15, https://arxiv.org/abs/1702.04783.Google Scholar
- (2018) Unifying the stochastic and the adversarial bandits with knapsack. Preprint, submitted October 23, https://arxiv.org/abs/1811.12253.Google Scholar
- (2000) Tutorial overview of model predictive control. IEEE Control Systems Magazine 20(3):38–52.Crossref, Google Scholar
- (2008) An asymptotically optimal policy for a quantity-based network revenue management problem. Math. Oper. Res. 33(2):257–282.Link, Google Scholar
- (2020) Near-optimal primal-dual algorithms for quantity-based network revenue management. Preprint, submitted November 12, https://arxiv.org/abs/2011.06327.Google Scholar
- (1998) An analysis of bid-price controls for network revenue management. Management Sci. 44(11-part-1):1577–1593.Link, Google Scholar
- (2004) Revenue management under a general discrete choice model of consumer behavior. Management Sci. 50(1):15–33.Link, Google Scholar
- (2006) The Theory and Practice of Revenue Management, vol. 68 (Springer Science & Business Media, New York).Google Scholar
- Vanderbei RJ (1998) Linear programming: foundations and extensions. J. Oper. Res. Soc. 49(1):94.Google Scholar
- (2021) The Bayesian prophet: A low-regret framework for online decision making. Management Sci. 67(3):1329–1992, iii–iv.Link, Google Scholar
- (2019) Online allocation and pricing: Constant regret via bellman inequalities. Preprint, submitted June 14, https://arxiv.org/abs/1906.06361.Google Scholar
- (2008) Optimal Transport: Old and New, vol. 338 (Springer Science & Business Media, Berlin, Heidelberg).Google Scholar
- (2019) An online learning approach to model predictive control. Preprint, submitted February 24, https://arxiv.org/abs/1902.08967.Google Scholar
- (2021) Regret and cumulative constraint violation analysis for online convex optimization with long term constraints. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 11998–12008.Google Scholar
- (2018) Online convex optimization for cumulative constraints. Advances in Neural Information Processing Systems, vol. 31 (Curran Associates Inc., Red Hook, NY), 6137–6146.Google Scholar
- (2009) An approximate dynamic programming approach to network revenue management with customer choice. Transportation Sci. 43(3):381–394.Link, Google Scholar

