Marrying Stochastic Gradient Descent with Bandits: Learning Algorithms for Inventory Systems with Fixed Costs

Published Online: https://doi.org/10.1287/mnsc.2020.3799
