An Online Mirror Descent Learning Algorithm for Multiproduct Inventory Systems

