Data-Driven Compositional Optimization in Misspecified Regimes

