Inventory Management with Transformer: Automated Decision Making for Order Timing and Quantity

Published Online: https://doi.org/10.1287/serv.2024.0236

References

  • Aksen D, Altınkemer K, Chand S (2003) The single-item lot-sizing problem with immediate lost sales. Eur. J. Oper. Res. 147(3):558–566.
  • Alp O, Erkip NK, Güllü R (2003) Optimal lot-sizing/vehicle-dispatching policies under stochastic lead times and stepwise fixed costs. Oper. Res. 51(1):160–166.
  • Alvo M, Russo D, Kanoria Y (2023) Neural inventory control in networks via hindsight differentiable policy optimization. Preprint, submitted June 20, https://arxiv.org/abs/2306.11246v1.
  • Amazon (2021) Global selling. Sell worldwide with Amazon. Accessed October 15, 2021, https://sell.amazon.com/global-selling.
  • Ban GY, Rudin C (2019) The big data newsvendor: Practical insights from machine learning. Oper. Res. 67(1):90–108.
  • Bertsimas D, Kallus N (2020) From predictive to prescriptive analytics. Management Sci. 66(3):1025–1044.
  • Bertsimas D, McCord C (2019) From predictions to prescriptions in multistage optimization problems. Preprint, submitted April 26, https://arxiv.org/abs/1904.11637.
  • Beutel AL, Minner S (2012) Safety stock planning under causal demand forecasting. Internat. J. Production Econom. 140(2):637–645.
  • Boute RN, Gijsbrechts J, van Jaarsveld W, Vanvuchelen N (2022) Deep reinforcement learning for inventory control: A roadmap. Eur. J. Oper. Res. 298(2):401–412.
  • Brahimi N, Dauzere-Peres S, Najid NM, Nordli A (2006) Single item lot sizing problems. Eur. J. Oper. Res. 168(1):1–16.
  • BusinessInsider (2025) Walmart’s AI-assisted distribution centres aim to cut food waste and boost profits. (September 17), https://www.businessinsider.com/ai-robotics-walmart-distribution-supply-chain-efficiency-2025-9.
  • Chen Z, Chan J (2024) Large language model in creative work: The role of collaboration modality and user expertise. Management Sci. 70(12):9101–9117.
  • Chen L, Lu K, Rajeswaran A, Lee K, Grover A, Laskin M, Abbeel P, Srinivas A, Mordatch I (2021) Decision transformer: Reinforcement learning via sequence modeling. Adv. Neural Inform. Processing Systems 34:15084–15097.
  • Cheung WC, Simchi-Levi D (2019) Sampling-based approximation schemes for capacitated stochastic inventory control models. Math. Oper. Res. 44(2):668–692.
  • Chu LY, Shanthikumar JG, Shen ZJM (2008) Solving operational statistics via a Bayesian analysis. Oper. Res. Lett. 36(1):110–116.
  • Cohen MC, Dai T, eds. (2025) AI in Supply Chains: Perspectives from Global Thought Leaders, Springer Series in Supply Chain Management, vol. 27 (Springer, Cham, Switzerland).
  • Cristian R, Harsha P, Perakis G, Quanz BL, Spantidakis I (2023) End-to-end learning via constraint-enforcing approximators for linear programs with applications to supply chains. Proc. AAAI Conf. Artificial Intelligence 37(6):7253–7260.
  • Dai T, Singh S (2021) Artificial intelligence on call: The physician’s decision of whether to use AI in clinical practice. J. Marketing Res. 62(5):854–875.
  • Dai T, Tayur S (2022) Designing AI-augmented healthcare delivery systems for physician buy-in and patient acceptance. Production Oper. Management 31(12):4443–4451.
  • Donti P, Amos B, Kolter JZ (2017) Task-based end-to-end model learning in stochastic optimization. Adv. Neural Inform. Processing Systems 30:5484–5494.
  • Dubey A, Jauhri A, Pandey A, Kadian A, Al-Dahle A, Letman A, Mathur A, et al. (2024) The Llama 3 herd of models. Preprint, submitted July 31, https://arxiv.org/abs/2407.21783.
  • Ekambaram V, Jati A, Dayama P, Mukherjee S, Nguyen N, Gifford WM, Reddy C, Kalagnanam J (2024) Tiny time mixers (TTMs): Fast pre-trained models for enhanced zero/few-shot forecasting of multivariate time series. Adv. Neural Inform. Processing Systems 37:74147–74181.
  • Elmachtoub AN, Grigas P (2022) Smart “predict, then optimize.” Management Sci. 68(1):9–26.
  • Er C, Liu M (2025) Decision-focused bias correction for fluid approximation. Preprint, submitted December 4, https://arxiv.org/abs/2512.15726.
  • Gijsbrechts J, Boute RN, Van Mieghem JA, Zhang DJ (2022) Can deep reinforcement learning improve inventory management? Performance on lost sales, dual-sourcing, and multi-echelon problems. Manufacturing Service Oper. Management 24(3):1349–1368.
  • Guan Y, Miller AJ (2008) Polynomial-time algorithms for stochastic uncapacitated lot-sizing problems. Oper. Res. 56(5):1172–1183.
  • Halman N, Orlin JB, Simchi-Levi D (2012) Approximating the nonlinear newsvendor and single-item stochastic lot-sizing problems when data is given by an oracle. Oper. Res. 60(2):429–446.
  • Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput. 9(8):1735–1780.
  • Ho-Nguyen N, Kılınç-Karzan F (2020) Risk guarantees for end-to-end prediction and optimization processes. Management Sci. 68(12):8680–8698.
  • Huang K, Küçükyavuz S (2008) On stochastic lot-sizing problems with random lead times. Oper. Res. Lett. 36(3):303–308.
  • Iglehart DL (1963) Optimality of (s, S) policies in the infinite horizon dynamic inventory problem. Management Sci. 9(2):259–267.
  • invent.ai (2025) Five Below deploys invent.ai’s platform for forecasting, inventory optimization and replenishment. (June 3), https://www.invent.ai/news/five-below-deploys-invent-ai-platform-for-forecasting-inventory-optimization-and-replenishment.
  • Janner M, Li Q, Levine S (2021) Offline reinforcement learning as one big sequence modeling problem. Adv. Neural Inform. Processing Systems 34:1273–1286.
  • JD (2021) JD.com announces fourth quarter and full year 2020 results. Accessed October 15, 2021, https://ir.jd.com/news-releases/news-release-details/jdcom-announces-fourth-quarter-and-full-year-2020-results.
  • Jiang R, Guan Y (2011) An O(n²)-time algorithm for the stochastic uncapacitated lot-sizing problem with random lead times. Oper. Res. Lett. 39(1):74–77.
  • Klabjan D, Simchi-Levi D, Song M (2013) Robust stochastic lot-sizing by means of histograms. Production Oper. Management 22(3):691–710.
  • Levi R, Shi C (2013) Approximation algorithms for the stochastic lot-sizing problem with order lead times. Oper. Res. 61(3):593–602.
  • Levi R, Janakiraman G, Nagarajan M (2008a) A 2-approximation algorithm for stochastic inventory control models with lost sales. Math. Oper. Res. 33(2):351–374.
  • Levi R, Perakis G, Uichanco J (2015) The data-driven newsvendor problem: New bounds and insights. Oper. Res. 63(6):1294–1306.
  • Levi R, Roundy RO, Shmoys DB (2007a) Provably near-optimal sampling-based policies for stochastic inventory control models. Math. Oper. Res. 32(4):821–839.
  • Levi R, Pál M, Roundy RO, Shmoys DB (2007b) Approximation algorithms for stochastic inventory control models. Math. Oper. Res. 32(2):284–302.
  • Levi R, Roundy RO, Shmoys DB, Truong VA (2008b) Approximation algorithms for capacitated stochastic inventory control models. Oper. Res. 56(5):1184–1199.
  • Lin L, Bai Y, Mei S (2023) Transformers as decision makers: Provable in-context reinforcement learning via supervised pretraining. Preprint, submitted October 12, https://arxiv.org/abs/2310.08566.
  • Liu S, Cai Z, Chen G, Li X (2025) Towards better understanding of in-context learning ability from in-context uncertainty quantification. Trans. Machine Learn. Res.
  • Liyanage LH, Shanthikumar JG (2005) A practical inventory control policy using operational statistics. Oper. Res. Lett. 33(4):341–348.
  • Madeka D, Torkkola K, Eisenach C, Luo A, Foster DP, Kakade SM (2022) Deep inventory management. Preprint, submitted October 6, https://arxiv.org/abs/2210.03137.
  • Meller J, Taigel F (2019) Machine learning for inventory management: Analyzing two concepts to get from data to decisions. Preprint, submitted November 11, https://doi.org/10.2139/ssrn.3256643.
  • Oroojlooyjadid A, Nazari M, Snyder LV, Takáč M (2022) A deep Q-network for the beer game: Deep reinforcement learning for inventory optimization. Manufacturing Service Oper. Management 24(1):285–304.
  • Qi M, Shen Z-JM (2022) Integrating prediction/estimation and optimization with applications in operations management. Tutorials Oper. Res. Emerging Impactful Topics Oper. 2022:36–58.
  • Qi M, Mak HY, Shen ZJM (2020) Data-driven research in retail operations—A review. Naval Res. Logist. 67(8):595–616.
  • Qi M, Shi Y, Qi Y, Ma C, Yuan R, Wu D, Shen ZJ (2023) A practical end-to-end inventory management model with deep learning. Management Sci. 69(2):759–773.
  • Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I (2019) Language models are unsupervised multitask learners. Technical report, OpenAI, https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf.
  • Sadana U, Chenreddy A, Delage E, Forel A, Frejinger E, Vidal T (2025) A survey of contextual optimization methods for decision-making under uncertainty. Eur. J. Oper. Res. 320(2):271–289.
  • Sandbothe RA, Thompson GL (1990) A forward algorithm for the capacitated lot size model with stockouts. Oper. Res. 38(3):474–486.
  • Snyder LV, Shen ZJM (2019) Fundamentals of Supply Chain Theory (Wiley Online Library, Hoboken, NJ).
  • Sun W, McFaddin S, Tran LH, Subramanian S, Greenewald K, Tenzin Y, Xue Z, Drissi Y, Ettl M (2024) PresAIse, a prescriptive AI solution for enterprise. INFOR Inform. Systems Oper. Res. 62(4):629–645.
  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv. Neural Inform. Processing Systems 31:6000–6010.
  • Wagner HM, Whitin TM (1958) Dynamic version of the economic lot size model. Management Sci. 5(1):89–96.
  • Wang H, Pan Y, Sun F, Liu S, Talluri K, Chen G, Li X (2024) Understanding the training and generalization of pretrained transformer for sequential decision making. Preprint, submitted May 23, https://arxiv.org/abs/2405.14219.
  • Xie Y, Ma W, Xin L (2024) VC theory for inventory policies. Preprint, submitted April 17, https://arxiv.org/abs/2404.11509.
  • Xie Y, Hao X, Liu J, Ma W, Xin L, Cao L, Zhang Y (2025) Deepstock: Reinforcement learning with policy regularizations for inventory management. Preprint, submitted November 21, https://doi.org/10.2139/ssrn.5784782.
  • Zheng Y-S (1991) A simple proof for optimality of (s, S) policies in infinite-horizon inventory systems. J. Appl. Probab. 28(4):802–810.