Online Learning for Dual-Index Policies in Dual-Sourcing Systems

Jingwen Tang
https://orcid.org/0009-0008-5612-3313
Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48109

Boxiao Chen
https://orcid.org/0000-0002-5967-4822
College of Business Administration, University of Illinois Chicago, Chicago, Illinois 60607

Cong Shi (Corresponding Author)
https://orcid.org/0000-0003-3564-3391
Management Science, Miami Herbert Business School, University of Miami, Coral Gables, Florida 33146

Published Online: 12 Dec 2023
https://doi.org/10.1287/msom.2022.0323

Manufacturing & Service Operations Management

Volume 26, Issue 2

March-April 2024

Pages 407-795, C2

Received: July 01, 2022
Accepted: August 10, 2023
Published Online: December 12, 2023

Cite as

Jingwen Tang, Boxiao Chen, Cong Shi (2023) Online Learning for Dual-Index Policies in Dual-Sourcing Systems. Manufacturing & Service Operations Management 26(2):758-774.

https://doi.org/10.1287/msom.2022.0323
