Inventory Management with Transformer: Automated Decision Making for Order Timing and Quantity

Mo Liu (Corresponding Author)
https://orcid.org/0000-0002-7598-066X
Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, North Carolina 27599

Yumo Bai
Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, North Carolina 27599

Meng Qi
SC Johnson College of Business, Cornell University, Ithaca, New York 14850

Zuo-Jun (Max) Shen
Department of Data and Systems Engineering, University of Hong Kong, Hong Kong, China

Published Online: 7 Apr 2026
https://doi.org/10.1287/serv.2024.0236

Articles In Advance

Information

Received: November 29, 2024
Accepted: February 23, 2026
Published Online: April 7, 2026

Cite as

Mo Liu, Yumo Bai, Meng Qi, Zuo-Jun (Max) Shen (2026) Inventory Management with Transformer: Automated Decision Making for Order Timing and Quantity. Service Science 0(0).

https://doi.org/10.1287/serv.2024.0236
