Control of Dual-Sourcing Inventory Systems Using Recurrent Neural Networks
Published Online: 6 Jul 2023
https://doi.org/10.1287/ijoc.2022.0136