Online Planning in Nonstationary Environments

Published Online: https://doi.org/10.1287/opre.2022.0604

References

  • Agrawal S, Devanur NR (2015) Fast algorithms for online stochastic convex programming. Indyk P, ed. Proc. 26th Ann. ACM-SIAM Sympos. Discrete Algorithms (SIAM, Philadelphia), 1405–1424.
  • Agrawal S, Wang Z, Ye Y (2014) A dynamic near-optimal algorithm for online linear programming. Oper. Res. 62(4):876–890.
  • Alaei S, Hajiaghayi M, Liaghat V (2012) Online prophet-inequality matching with applications to ad allocation. Faltings B, Leyton-Brown K, Ipeirotis P, eds. Proc. 13th ACM Conf. Electronic Commerce (Association for Computing Machinery (ACM), New York), 18–35.
  • Asadpour A, Wang X, Zhang J (2020) Online resource allocation with limited flexibility. Management Sci. 66(2):642–666.
  • Aviv Y, Federgruen A (2001) Capacitated multi-item inventory systems with random and seasonally fluctuating demands: Implications for postponement strategies. Management Sci. 47(4):512–531.
  • Balseiro SR, Lu H, Mirrokni V (2023) The best of many worlds: Dual mirror descent for online allocation problems. Oper. Res. 71(1):101–119.
  • Besbes O, Zeevi A (2012) Blind network revenue management. Oper. Res. 60(6):1537–1550.
  • Besbes O, Gur Y, Zeevi A (2015) Non-stationary stochastic optimization. Oper. Res. 63(5):1227–1244.
  • Bubeck S (2015) Convex optimization: Algorithms and complexity. Foundations Trends Machine Learn. 8(3–4):231–357.
  • Bumpensanti P, Wang H (2020) A re-solving heuristic with uniformly bounded loss for network revenue management. Management Sci. 66(7):2993–3009.
  • Chen Q, Jasin S, Duenyas I (2019a) Nonparametric self-adjusting control for joint learning and optimization of multiproduct pricing with finite resource capacity. Math. Oper. Res. 44(2):601–631.
  • Chen X, Wang Y, Wang Y (2019b) Non-stationary stochastic optimization under l_{p,q}-variation measures. Oper. Res. 67(6):1752–1765.
  • Cheung WC, Simchi-Levi D, Zhu R (2022) Hedging the drift: Learning to optimize under nonstationarity. Management Sci. 68(3):1696–1713.
  • Chou MC, Chua GA, Teo C-P, Zheng H (2010) Design for process flexibility: Efficiency of the long chain and sparse structure. Oper. Res. 58(1):43–58.
  • Ehrenthal JCF, Honhon D, Van Woensel T (2014) Demand seasonality in retail inventory management. Eur. J. Oper. Res. 238(2):527–539.
  • Gong X-Y, Simchi-Levi D (2024) Bandits atop reinforcement learning: Tackling online inventory models with cyclic demands. Management Sci. 70(9):6139–6157.
  • Guo H, Liu X, Wei H, Ying L (2022) Online convex optimization with hard constraints: Towards the best of two worlds and beyond. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. Proc. 36th Internat. Conf. Neural Inform. Processing Systems, Advances in Neural Information Processing Systems 35 (Curran Associates Inc., Red Hook, NY), 36426–36439.
  • Hong JL, Jiang G (2019) Offline simulation online application: A new framework of simulation-based decision making. Asia-Pacific J. Oper. Res. 36(6):1940015.
  • Huh WT, Rusmevichientong P (2014) Online sequential optimization with biased gradients: Theory and applications to censored demand. INFORMS J. Comput. 26(1):150–159.
  • Immorlica N, Sankararaman K, Schapire R, Slivkins A (2022) Adversarial bandits with knapsacks. J. ACM 69(6):1–47.
  • Jiang G, Hong JL, Nelson BL (2020) Online risk monitoring using offline simulation. INFORMS J. Comput. 32(2):356–375.
  • Jiang J, Li X, Zhang J (2025) Online stochastic optimization with Wasserstein based non-stationarity. Management Sci., ePub ahead of print March 3, https://doi.org/10.1287/mnsc.2020.03850.
  • Jiang J, Wang S, Zhang J (2023) Achieving high individual service levels without safety stock? Optimal rationing policy of pooled resources. Oper. Res. 71(1):358–377.
  • Jordan WC, Graves SC (1995) Principles on the benefits of manufacturing process flexibility. Management Sci. 41(4):577–594.
  • Li X, Ye Y (2022) Online linear programming: Dual convergence, new algorithms, and regret bounds. Oper. Res. 70(5):2948–2966.
  • Li X, Rong Y, Zhang R, Zheng H (2025) Online advertisement allocation under customer choices and algorithmic fairness. Management Sci. 71(1):825–843.
  • Liang Y, Luo H, Duan H, Li D, Liao H, Feng J, Zhao J, et al. (2024) Meituan’s real-time intelligent dispatching algorithms build the world’s largest minute-level delivery network. INFORMS J. Appl. Analysis 54(1):84–101.
  • Lyu G (2019) Online resource allocation: Theory and applications. PhD thesis, National University of Singapore, Singapore.
  • Lyu G, Cheung WC, Teo C-P, Wang H (2024) Multiobjective stochastic optimization: A case of real-time matching in ride-sourcing markets. Manufacturing Service Oper. Management 26(2):500–518.
  • Lyu G, Cheung W-C, Chou MC, Teo C-P, Zheng Z, Zhong Y (2019) Capacity allocation in flexible production networks: Theory and applications. Management Sci. 65(11):5091–5109.
  • Ma Y, Rusmevichientong P, Sumida M, Topaloglu H (2020) An approximation algorithm for network revenue management under nonstationary arrivals. Oper. Res. 68(3):834–855.
  • Mahdavi M, Jin R, Yang T (2012) Trading regret for efficiency: Online convex optimization with long term constraints. J. Machine Learn. Res. 13(1):2503–2528.
  • Nemirovskij AS, Yudin DB (1983) Problem Complexity and Method Efficiency in Optimization (Wiley-Interscience, Hoboken, NJ).
  • Papadimitriou C, Pollner T, Saberi A, Wajc D (2021) Online stochastic max-weight bipartite matching: Beyond prophet inequalities. Biró P, Chawla S, Echenique F, eds. Proc. 22nd ACM Conf. Econom. Comput. (Association for Computing Machinery (ACM), New York), 763–764.
  • Rakhlin A, Sridharan K (2013) Online learning with predictable sequences. Shalev-Shwartz S, Steinwart I, eds. Proc. Conf. Learn. Theory (PMLR, Princeton, NJ), 993–1019.
  • Scroccaro PZ, Kolarijani AS, Esfahani PM (2023) Adaptive composite online optimization: Predictions in static and dynamic environments. IEEE Trans. Automatic Control 68(5):2906–2921.
  • Shalev-Shwartz S (2012) Online learning and online convex optimization. Foundations Trends Machine Learn. 4(2):107–194.
  • Shalev-Shwartz S, Shamir O, Srebro N, Sridharan K (2009) Stochastic convex optimization. Proc. 22nd Ann. Conf. Learn. Theory (Montreal, Quebec).
  • Shi C, Wei Y, Zhong Y (2019) Process flexibility for multiperiod production systems. Oper. Res. 67(5):1300–1320.
  • Wang X, Zhang J (2015) Process flexibility: A distribution-free bound on the performance of k-chain. Oper. Res. 63(3):555–571.
  • Wang X, Truong V-A, Bank D (2018) Online advance admission scheduling for services with customer preferences. Preprint, submitted May 26, https://arxiv.org/abs/1805.10412.
  • Xu Z, Zhang H, Zhang J, Zhang RQ (2020) Online demand fulfillment under limited flexibility. Management Sci. 66(10):4667–4685.
  • Yu P-L (1973) A class of solutions for group decision problems. Management Sci. 19(8):936–946.
  • Yu H, Neely M, Wei X (2017) Online convex optimization with stochastic constraints. von Luxburg U, Guyon I, Bengio S, Wallach H, Fergus R, eds. Proc. 31st Internat. Conf. Neural Inform. Processing Systems, Advances in Neural Information Processing Systems, vol. 30 (Curran Associates Inc., Red Hook, NY), 1427–1437.
  • Zhang H, Chao X, Shi C (2018) Perishable inventory systems: Convexity results for base-stock policies and learning algorithms under censored demand. Oper. Res. 66(5):1276–1286.
  • Zhong Y, Zheng Z, Chou MC, Teo C-P (2018) Resource pooling and allocation policies to deliver differentiated service. Management Sci. 64(4):1555–1573.
  • Zinkevich M (2003) Online convex programming and generalized infinitesimal gradient ascent. Fawcett T, Mishra N, eds. Proc. 20th Internat. Conf. Machine Learn. (AAAI Press, Washington, DC), 928–936.
  • Zipkin P (1989) Critical number policies for inventory models with periodic data. Management Sci. 35(1):71–80.