Contextual Inverse Optimization: Offline and Online Learning

Omar Besbes
https://orcid.org/0000-0002-2982-3794
Decision, Risk and Operations, Graduate School of Business, Columbia University, New York, New York 10027

Yuri Fonseca
https://orcid.org/0000-0002-7364-5729
Decision, Risk and Operations, Graduate School of Business, Columbia University, New York, New York 10027

Ilan Lobel (Corresponding Author)
https://orcid.org/0000-0002-5396-8117
Technology, Operations, and Statistics, Stern School of Business, New York University, New York, New York 10012

Published Online: 2 Aug 2023
https://doi.org/10.1287/opre.2021.0369

Volume 73, Issue 1

January-February 2025

Pages iii-vii, 1-582, C2-C3

Information

Received: June 09, 2021
Accepted: June 06, 2023
Published Online: August 02, 2023

Cite as

Omar Besbes, Yuri Fonseca, Ilan Lobel (2023) Contextual Inverse Optimization: Offline and Online Learning. Operations Research 73(1):424-443.

https://doi.org/10.1287/opre.2021.0369

Acknowledgments

An early version of some of the results in the present paper appeared at the 34th Annual Conference on Learning Theory (COLT '21) under the title “Online Learning from Optimal Actions.” Only the abstract appeared in the conference proceedings. The authors thank Alberto Seeger, the COLT program committee members, the area editor, the associate editor, and the referees for valuable feedback.
