Data-Driven Compositional Optimization in Misspecified Regimes

Published Online:https://doi.org/10.1287/opre.2021.0295

References

  • Ahmadi H, Shanbhag UV (2020) On the resolution of misspecified convex optimization and monotone variational inequality problems. Comput. Optim. Appl. 77(1):125–161.CrossrefGoogle Scholar
  • Ahmadi H, Aybat NS, Shanbhag UV (2016) On the rate analysis of inexact augmented Lagrangian schemes for convex optimization problems with misspecified constraints. 2016 Amer. Control Conf. (IEEE, New York), 4841–4846.Google Scholar
  • Ahmed S, Çakmak U, Shapiro A (2007) Coherent risk measures in inventory problems. Eur. J. Oper. Res. 182(1):226–238.CrossrefGoogle Scholar
  • Arimoto S, Kawamura S, Miyazaki F (1984) Bettering operation of robots by learning. J. Robot. Systems 1:123–140.CrossrefGoogle Scholar
  • Arrow KJ (2002) The genesis of “optimal inventory policy.” Oper. Res. 50(1):1–2.LinkGoogle Scholar
  • Arrow KJ, Harris T, Marschak J (1951) Optimal inventory policy. Econometrica 19(3):250–272.CrossrefGoogle Scholar
  • Aybat NS, Ahmadi H, Shanbhag UV (2022) On the analysis of inexact augmented Lagrangian schemes for misspecified conic convex programs. IEEE Trans. Automat. Control 67(8):3981–3996.CrossrefGoogle Scholar
  • Balasubramanian K, Ghadimi S, Nguyen A (2022) Stochastic multilevel composition optimization algorithms with level-independent convergence rates. SIAM J. Optim. 32(2):519–544.CrossrefGoogle Scholar
  • Beck A (2017) First-Order Methods in Optimization (SIAM, Philadelphia).CrossrefGoogle Scholar
  • Ben-Tal A, El Ghaoui L, Nemirovski A (2009) Robust Optimization. Princeton Series in Applied Mathematics (Princeton University Press, Princeton, NJ).CrossrefGoogle Scholar
  • Bertsimas D, Sim M (2003) Robust discrete optimization and network flows. Math. Programming 98(1):49–71.CrossrefGoogle Scholar
  • Bertsimas D, Sim M (2004) The price of robustness. Oper. Res. 52(1):35–53.LinkGoogle Scholar
  • Besbes O, Zeevi AJ (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407–1420.LinkGoogle Scholar
  • Besbes O, Zeevi AJ (2012) Blind network revenue management. Oper. Res. 60(6):1537–1550.LinkGoogle Scholar
  • Borkar VS (2008) Stochastic Approximation: A Dynamical Systems Viewpoint (Hindustan Book Agency, New Delhi, India; Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Chao GH (2013) Production and availability policies through the Markov decision process and myopic methods for contractual and selective orders. Eur. J. Oper. Res. 225(3):383–392.CrossrefGoogle Scholar
  • Cheevaprawatdomrong T, Smith RL (2004) Infinite horizon production scheduling in time-varying systems under stochastic demand. Oper. Res. 52(1):105–115.LinkGoogle Scholar
  • Chen T, Sun Y, Yin W (2021a) Closing the gap: Tighter analysis of alternating stochastic gradient methods for bilevel problems. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Adv. Neural Inform. Processing Systems 34 (NeurIPS 2021) (Curran Associates, Red Hook, NY), 25294–25307.Google Scholar
  • Chen T, Sun Y, Yin W (2021b) Solving stochastic compositional optimization is nearly as easy as solving stochastic optimization. IEEE Trans. Signal Processing 69:4937–4948.CrossrefGoogle Scholar
  • Dai W, Gai Y, Krishnamachari B (2014) Online learning for multi-channel opportunistic access over unknown Markovian channels. 2014 Eleventh Annual IEEE Internat. Conf. Sensing Comm. Networking (SECON) (IEEE, New York), 64–71.Google Scholar
  • Ermoliev Y (1976) Methods of Stochastic Programming. Monographs in Optimization and OR (Nauka, Moscow).Google Scholar
  • Ghadimi S, Ruszczynski A, Wang M (2020) A single timescale stochastic approximation method for nested stochastic optimization. SIAM J. Optim. 30(1):960–979.CrossrefGoogle Scholar
  • Hadley G, Whitin T (1963) Analysis of Inventory Systems. Prentice-Hall International Series in Management (Prentice-Hall, Upper Saddle River, NJ).Google Scholar
  • Ho CP, Petrik M, Wiesemann W (2018) Fast Bellman updates for robust MDPs. Proc. Machine Learn. Res. 80:1979–1988.Google Scholar
  • Ho-Nguyen N, Kilinç-Karzan F (2019) Exploiting problem structure in optimization under uncertainty via online convex optimization. Math. Programming 177(1–2):113–147.CrossrefGoogle Scholar
  • Ho-Nguyen N, Kilinç-Karzan F (2021) Dynamic data-driven estimation of nonparametric choice models. Oper. Res. 69(4):1228–1239.LinkGoogle Scholar
  • Hovakimyan N, Cao C (2010) L1 Adaptive Control Theory: Guaranteed Robustness with Fast Adaptation. Advances in Design and Control, vol. 21 (SIAM, Philadelphia).CrossrefGoogle Scholar
  • Jiang H, Shanbhag UV (2013) On the solution of stochastic optimization problems in imperfect information regimes. Proc. 2013 Winter Simulation Conf. (IEEE Press, Piscataway, NJ), 821–832.Google Scholar
  • Jiang H, Shanbhag UV (2015) Data-driven schemes for resolving misspecified MDPs: Asymptotics and error analysis. Proc. 2015 Winter Simulation Conf. (IEEE Press, Piscataway, NJ), 3801–3812.Google Scholar
  • Jiang H, Shanbhag UV (2016) On the solution of stochastic optimization and variational problems in imperfect information regimes. SIAM J. Optim. 26(4):2394–2429.CrossrefGoogle Scholar
  • Jiang H, Shanbhag UV, Meyn SP (2018) Distributed computation of equilibria in misspecified convex stochastic Nash games. IEEE Trans. Automat. Control 63(2):360–371.CrossrefGoogle Scholar
  • Kirman AP (1975) Learning by firms about demand conditions. Day RH, Groves T, eds. Adaptive Economic Models (Academic Press, New York), 137–156.CrossrefGoogle Scholar
  • Kushner H, Yin G (2003) Stochastic Approximation and Recursive Algorithms and Applications, Stochastic Modeling and Applied Probability, 2nd ed. (Springer, New York).Google Scholar
  • Lei J, Shanbhag UV (2020) Asynchronous schemes for stochastic and misspecified potential games and nonconvex optimization. Oper. Res. 68(6):1742–1766.LinkGoogle Scholar
  • Li X (2013) Managing dynamic inventory systems with product returns: A Markov decision process. J. Optim. Theory Appl. 157(2):577–592.CrossrefGoogle Scholar
  • Lian X, Wang M, Liu J (2017) Finite-sum composition optimization via variance reduced gradient descent. Proc. Machine Learn. Res. 54:1159–1167.Google Scholar
  • Mankowitz DJ, Levine N, Jeong R, Shi Y, Kay J, Abdolmaleki A, Springenberg JT, Mann T, Hester T, Riedmiller M (2020) Robust reinforcement learning for continuous control with model misspecification. 8th Internat. Conf. Learn. Representations ICLR 2020 (International Conference on Learning Representations, Appleton, WI), 1–28.Google Scholar
  • Mannor S, Mebel O, Xu H (2012) Lightning does not strike twice: Robust MDPs with coupled uncertainty. Proc. 29th Internat. Conf. Machine Learn. (International Conference on Machine Learning, San Diego), 451–458.Google Scholar
  • Mannor S, Mebel O, Xu H (2016) Robust MDPs with k-rectangular uncertainty. Math. Oper. Res. 41(4):1484–1509.LinkGoogle Scholar
  • Miyaguchi K (2021) Asymptotically exact error characterization of offline policy evaluation with misspecified linear models. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Adv. Neural Inform. Processing Systems 34 (NeurIPS 2021) (Curran Associates, Red Hook, NY), 28573–28584.Google Scholar
  • Okuguchi K (1976) Expectations and Stability in Oligopoly Models, Lecture Notes in Economics and Mathematical Systems, vol. 138 (Springer-Verlag, Berlin).CrossrefGoogle Scholar
  • Okuguchi K, Szidarovszky F (1990) The Theory of Oligopoly with Multi-Product Firms, Lecture Notes in Economics and Mathematical Systems, vol. 342 (Springer-Verlag, Berlin).CrossrefGoogle Scholar
  • Porteus EL (1990) Stochastic inventory theory. Heyman DP, Sobel MJ, eds. Stochastic Models, Handbooks in Operations Research and Management Science, vol. 2 (Elsevier, Amsterdam), 605–652.CrossrefGoogle Scholar
  • Puranam KS, Katehakis MN (2014) On optimal bidding and inventory control in sequential procurement auctions: The multi period case. Ann. Oper. Res. 217(1):447–462.CrossrefGoogle Scholar
  • Puterman ML (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Statistics (Wiley, Hoboken, NJ).CrossrefGoogle Scholar
  • Qi M, Grigas P, Shen M (2021) Integrated conditional estimation-optimization. Preprint, submitted October 24, https://arxiv.org/abs/2110.12351.Google Scholar
  • Rigamonti A, Lučivjanská K (2022) Mean-semivariance portfolio optimization using minimum average partial. Ann. Oper. Res. 1–19.Google Scholar
  • Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Statist. 22(3):400–407.CrossrefGoogle Scholar
  • Ross S (1983) Introduction to Stochastic Dynamic Programming, Probability and Mathematical Statistics (Academic Press, Inc., New York).Google Scholar
  • Ruszczynski A (2021) A stochastic subgradient method for nonsmooth nonconvex multilevel composition optimization. SIAM J. Control Optim. 59(3):2301–2320.CrossrefGoogle Scholar
  • Ruszczyński A, Shapiro A (2006) Optimization of convex risk functions. Math. Oper. Res. 31(3):433–452.LinkGoogle Scholar
  • Satheesh Kumar R, Elango C (2012) Markov decision processes in service facilities holding perishable inventory. Opsearch 49(4):348–365.CrossrefGoogle Scholar
  • Sato M, Abe K, Takeda H (1982) Learning control of finite Markov chains with unknown transition probabilities. IEEE Trans. Automat. Control 27(2):502–505.CrossrefGoogle Scholar
  • Shapiro A, Dentcheva D, Ruszczyński A (2009) Lectures on Stochastic Programming, MPS/SIAM Series on Optimization, vol. 9 (SIAM, Philadelphia).CrossrefGoogle Scholar
  • Uchiyama M (1978) Formation of high-speed motion pattern of a mechanical arm by trial. Trans. Soc. Instrument Control Engineers 14(6):706–712.CrossrefGoogle Scholar
  • Wang M, Fang EX, Liu H (2017a) Stochastic compositional gradient descent: Algorithms for minimizing compositions of expected-value functions. Math. Programming 161(1–2):419–449.CrossrefGoogle Scholar
  • Wang M, Liu J, Fang EX (2017b) Accelerating stochastic composition optimization. J. Machine Learn. Res. 18(1):3721–3743.Google Scholar
  • Wu M, Zhu SX, Teunter RH (2013) Newsvendor problem with random shortage cost under a risk criterion. Internat. J. Production Econom. 145(2):790–798.CrossrefGoogle Scholar
  • Yang S, Wang M, Fang EX (2019) Multilevel stochastic gradient methods for nested composition optimization. SIAM J. Optim. 29(1):616–659.CrossrefGoogle Scholar
  • Yang S, Zhang X, Wang M (2022) Decentralized gossip-based stochastic bilevel optimization over communication networks. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K Oh A, eds. Adv. Neural Inform. Processing Systems 35 (NeurIPS 2022) (Neural Information Processing Systems, San Diego), 238–252.Google Scholar
  • Zhang Z, Lan G (2020) Optimal algorithms for convex nested stochastic composite optimization. Preprint, submitted November 19, https://arxiv.org/abs/2011.10076.Google Scholar
  • Zhu H, Xu T, Paschalidis IC (2019) Learning parameterized prescription policies and disease progression dynamics using Markov decision processes. 2019 Amer. Control Conf. (ACC) (IEEE Press, Piscataway, NJ), 3438–3443.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.