Data-Driven Compositional Optimization in Misspecified Regimes

Shuoguang Yang
Shuoguang Yang
[email protected]
https://orcid.org/0000-0002-0915-196X
Department of Industrial Engineering & Decision Analytics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR, China
Search for more papers by this author
,
Ethan X. Fang
Ethan X. Fang
[email protected]
https://orcid.org/0000-0003-3762-9155
Department of Biostatistics & Bioinformatics, Duke University, Durham, North Carolina 27705
Search for more papers by this author
,
Uday V. Shanbhag
Corresponding Author
Uday V. Shanbhag
[email protected]
https://orcid.org/0000-0001-5869-9561
Department of Industrial & Manufacturing Engineering, Pennsylvania State University, University Park, Pennsylvania 16803
Search for more papers by this author

Department of Industrial Engineering & Decision Analytics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR, China

Search for more papers by this author

Ethan X. Fang

[email protected]

https://orcid.org/0000-0003-3762-9155

Department of Biostatistics & Bioinformatics, Duke University, Durham, North Carolina 27705

Search for more papers by this author

Uday V. Shanbhag

Corresponding Author

Uday V. Shanbhag

[email protected]

https://orcid.org/0000-0001-5869-9561

Department of Industrial & Manufacturing Engineering, Pennsylvania State University, University Park, Pennsylvania 16803

Search for more papers by this author

Published Online:16 Feb 2024https://doi.org/10.1287/opre.2021.0295

References

Ahmadi H, Shanbhag UV (2020) On the resolution of misspecified convex optimization and monotone variational inequality problems. Comput. Optim. Appl. 77(1):125–161.Crossref, Google Scholar
Ahmadi H, Aybat NS, Shanbhag UV (2016) On the rate analysis of inexact augmented Lagrangian schemes for convex optimization problems with misspecified constraints. 2016 Amer. Control Conf. (IEEE, New York), 4841–4846.Google Scholar
Ahmed S, Çakmak U, Shapiro A (2007) Coherent risk measures in inventory problems. Eur. J. Oper. Res. 182(1):226–238.Crossref, Google Scholar
Arimoto S, Kawamura S, Miyazaki F (1984) Bettering operation of robots by learning. J. Robot. Systems 1:123–140.Crossref, Google Scholar
Arrow KJ (2002) The genesis of “optimal inventory policy.” Oper. Res. 50(1):1–2.Link, Google Scholar
Arrow KJ, Harris T, Marschak J (1951) Optimal inventory policy. Econometrica 19(3):250–272.Crossref, Google Scholar
Aybat NS, Ahmadi H, Shanbhag UV (2022) On the analysis of inexact augmented Lagrangian schemes for misspecified conic convex programs. IEEE Trans. Automat. Control 67(8):3981–3996.Crossref, Google Scholar
Balasubramanian K, Ghadimi S, Nguyen A (2022) Stochastic multilevel composition optimization algorithms with level-independent convergence rates. SIAM J. Optim. 32(2):519–544.Crossref, Google Scholar
Beck A (2017) First-Order Methods in Optimization (SIAM, Philadelphia).Crossref, Google Scholar
Ben-Tal A, El Ghaoui L, Nemirovski A (2009) Robust Optimization. Princeton Series in Applied Mathematics (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
Bertsimas D, Sim M (2003) Robust discrete optimization and network flows. Math. Programming 98(1):49–71.Crossref, Google Scholar
Bertsimas D, Sim M (2004) The price of robustness. Oper. Res. 52(1):35–53.Link, Google Scholar
Besbes O, Zeevi AJ (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407–1420.Link, Google Scholar
Besbes O, Zeevi AJ (2012) Blind network revenue management. Oper. Res. 60(6):1537–1550.Link, Google Scholar
Borkar VS (2008) Stochastic Approximation: A Dynamical Systems Viewpoint (Hindustan Book Agency, New Delhi, India; Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
Chao GH (2013) Production and availability policies through the Markov decision process and myopic methods for contractual and selective orders. Eur. J. Oper. Res. 225(3):383–392.Crossref, Google Scholar
Cheevaprawatdomrong T, Smith RL (2004) Infinite horizon production scheduling in time-varying systems under stochastic demand. Oper. Res. 52(1):105–115.Link, Google Scholar
Chen T, Sun Y, Yin W (2021a) Closing the gap: Tighter analysis of alternating stochastic gradient methods for bilevel problems. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Adv. Neural Inform. Processing Systems 34 (NeurIPS 2021) (Curran Associates, Red Hook, NY), 25294–25307.Google Scholar
Chen T, Sun Y, Yin W (2021b) Solving stochastic compositional optimization is nearly as easy as solving stochastic optimization. IEEE Trans. Signal Processing 69:4937–4948.Crossref, Google Scholar
Dai W, Gai Y, Krishnamachari B (2014) Online learning for multi-channel opportunistic access over unknown Markovian channels. 2014 Eleventh Annual IEEE Internat. Conf. Sensing Comm. Networking (SECON) (IEEE, New York), 64–71.Google Scholar
Ermoliev Y (1976) Methods of Stochastic Programming. Monographs in Optimization and OR (Nauka, Moscow).Google Scholar
Ghadimi S, Ruszczynski A, Wang M (2020) A single timescale stochastic approximation method for nested stochastic optimization. SIAM J. Optim. 30(1):960–979.Crossref, Google Scholar
Hadley G, Whitin T (1963) Analysis of Inventory Systems. Prentice-Hall International Series in Management (Prentice-Hall, Upper Saddle River, NJ).Google Scholar
Ho CP, Petrik M, Wiesemann W (2018) Fast Bellman updates for robust MDPs. Proc. Machine Learn. Res. 80:1979–1988.Google Scholar
Ho-Nguyen N, Kilinç-Karzan F (2019) Exploiting problem structure in optimization under uncertainty via online convex optimization. Math. Programming 177(1–2):113–147.Crossref, Google Scholar
Ho-Nguyen N, Kilinç-Karzan F (2021) Dynamic data-driven estimation of nonparametric choice models. Oper. Res. 69(4):1228–1239.Link, Google Scholar
Hovakimyan N, Cao C (2010) L1 Adaptive Control Theory: Guaranteed Robustness with Fast Adaptation. Advances in Design and Control, vol. 21 (SIAM, Philadelphia).Crossref, Google Scholar
Jiang H, Shanbhag UV (2013) On the solution of stochastic optimization problems in imperfect information regimes. Proc. 2013 Winter Simulation Conf. (IEEE Press, Piscataway, NJ), 821–832.Google Scholar
Jiang H, Shanbhag UV (2015) Data-driven schemes for resolving misspecified MDPs: Asymptotics and error analysis. Proc. 2015 Winter Simulation Conf. (IEEE Press, Piscataway, NJ), 3801–3812.Google Scholar
Jiang H, Shanbhag UV (2016) On the solution of stochastic optimization and variational problems in imperfect information regimes. SIAM J. Optim. 26(4):2394–2429.Crossref, Google Scholar
Jiang H, Shanbhag UV, Meyn SP (2018) Distributed computation of equilibria in misspecified convex stochastic Nash games. IEEE Trans. Automat. Control 63(2):360–371.Crossref, Google Scholar
Kirman AP (1975) Learning by firms about demand conditions. Day RH, Groves T, eds. Adaptive Economic Models (Academic Press, New York), 137–156.Crossref, Google Scholar
Kushner H, Yin G (2003) Stochastic Approximation and Recursive Algorithms and Applications, Stochastic Modeling and Applied Probability, 2nd ed. (Springer, New York).Google Scholar
Lei J, Shanbhag UV (2020) Asynchronous schemes for stochastic and misspecified potential games and nonconvex optimization. Oper. Res. 68(6):1742–1766.Link, Google Scholar
Li X (2013) Managing dynamic inventory systems with product returns: A Markov decision process. J. Optim. Theory Appl. 157(2):577–592.Crossref, Google Scholar
Lian X, Wang M, Liu J (2017) Finite-sum composition optimization via variance reduced gradient descent. Proc. Machine Learn. Res. 54:1159–1167.Google Scholar
Mankowitz DJ, Levine N, Jeong R, Shi Y, Kay J, Abdolmaleki A, Springenberg JT, Mann T, Hester T, Riedmiller M (2020) Robust reinforcement learning for continuous control with model misspecification. 8th Internat. Conf. Learn. Representations ICLR 2020 (International Conference on Learning Representations, Appleton, WI), 1–28.Google Scholar
Mannor S, Mebel O, Xu H (2012) Lightning does not strike twice: Robust MDPs with coupled uncertainty. Proc. 29th Internat. Conf. Machine Learn. (International Conference on Machine Learning, San Diego), 451–458.Google Scholar
Mannor S, Mebel O, Xu H (2016) Robust MDPs with k-rectangular uncertainty. Math. Oper. Res. 41(4):1484–1509.Link, Google Scholar
Miyaguchi K (2021) Asymptotically exact error characterization of offline policy evaluation with misspecified linear models. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Adv. Neural Inform. Processing Systems 34 (NeurIPS 2021) (Curran Associates, Red Hook, NY), 28573–28584.Google Scholar
Okuguchi K (1976) Expectations and Stability in Oligopoly Models, Lecture Notes in Economics and Mathematical Systems, vol. 138 (Springer-Verlag, Berlin).Crossref, Google Scholar
Okuguchi K, Szidarovszky F (1990) The Theory of Oligopoly with Multi-Product Firms, Lecture Notes in Economics and Mathematical Systems, vol. 342 (Springer-Verlag, Berlin).Crossref, Google Scholar
Porteus EL (1990) Stochastic inventory theory. Heyman DP, Sobel MJ, eds. Stochastic Models, Handbooks in Operations Research and Management Science, vol. 2 (Elsevier, Amsterdam), 605–652.Crossref, Google Scholar
Puranam KS, Katehakis MN (2014) On optimal bidding and inventory control in sequential procurement auctions: The multi period case. Ann. Oper. Res. 217(1):447–462.Crossref, Google Scholar
Puterman ML (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Statistics (Wiley, Hoboken, NJ).Crossref, Google Scholar
Qi M, Grigas P, Shen M (2021) Integrated conditional estimation-optimization. Preprint, submitted October 24, https://arxiv.org/abs/2110.12351.Google Scholar
Rigamonti A, Lučivjanská K (2022) Mean-semivariance portfolio optimization using minimum average partial. Ann. Oper. Res. 1–19.Google Scholar
Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Statist. 22(3):400–407.Crossref, Google Scholar
Ross S (1983) Introduction to Stochastic Dynamic Programming, Probability and Mathematical Statistics (Academic Press, Inc., New York).Google Scholar
Ruszczynski A (2021) A stochastic subgradient method for nonsmooth nonconvex multilevel composition optimization. SIAM J. Control Optim. 59(3):2301–2320.Crossref, Google Scholar
Ruszczyński A, Shapiro A (2006) Optimization of convex risk functions. Math. Oper. Res. 31(3):433–452.Link, Google Scholar
Satheesh Kumar R, Elango C (2012) Markov decision processes in service facilities holding perishable inventory. Opsearch 49(4):348–365.Crossref, Google Scholar
Sato M, Abe K, Takeda H (1982) Learning control of finite Markov chains with unknown transition probabilities. IEEE Trans. Automat. Control 27(2):502–505.Crossref, Google Scholar
Shapiro A, Dentcheva D, Ruszczyński A (2009) Lectures on Stochastic Programming, MPS/SIAM Series on Optimization, vol. 9 (SIAM, Philadelphia).Crossref, Google Scholar
Uchiyama M (1978) Formation of high-speed motion pattern of a mechanical arm by trial. Trans. Soc. Instrument Control Engineers 14(6):706–712.Crossref, Google Scholar
Wang M, Fang EX, Liu H (2017a) Stochastic compositional gradient descent: Algorithms for minimizing compositions of expected-value functions. Math. Programming 161(1–2):419–449.Crossref, Google Scholar
Wang M, Liu J, Fang EX (2017b) Accelerating stochastic composition optimization. J. Machine Learn. Res. 18(1):3721–3743.Google Scholar
Wu M, Zhu SX, Teunter RH (2013) Newsvendor problem with random shortage cost under a risk criterion. Internat. J. Production Econom. 145(2):790–798.Crossref, Google Scholar
Yang S, Wang M, Fang EX (2019) Multilevel stochastic gradient methods for nested composition optimization. SIAM J. Optim. 29(1):616–659.Crossref, Google Scholar
Yang S, Zhang X, Wang M (2022) Decentralized gossip-based stochastic bilevel optimization over communication networks. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K Oh A, eds. Adv. Neural Inform. Processing Systems 35 (NeurIPS 2022) (Neural Information Processing Systems, San Diego), 238–252.Google Scholar
Zhang Z, Lan G (2020) Optimal algorithms for convex nested stochastic composite optimization. Preprint, submitted November 19, https://arxiv.org/abs/2011.10076.Google Scholar
Zhu H, Xu T, Paschalidis IC (2019) Learning parameterized prescription policies and disease progression dynamics using Markov decision processes. 2019 Amer. Control Conf. (ACC) (IEEE Press, Piscataway, NJ), 3438–3443.Google Scholar

Volume 73, Issue 3

May-June 2025

Pages iii-viii, 1151-1722, C2-C3

Article Information

Supplemental Material

Metrics

Information

Received:May 13, 2021
Accepted:August 28, 2023
Published Online:February 16, 2024

Cite as

Shuoguang Yang; , Ethan X. Fang; , Uday V. Shanbhag (2024) Data-Driven Compositional Optimization in Misspecified Regimes. Operations Research 73(3):1395-1411.

https://doi.org/10.1287/opre.2021.0295

Keywords

Acknowledgments

The authors thank the area editor, the associate editor, and the reviewers for invaluable comments and feedback, which led to substantial improvements to earlier versions of this paper.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Data-Driven Compositional Optimization in Misspecified Regimes

References

Volume 73, Issue 3

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News