Contextual Inverse Optimization: Offline and Online Learning

Published Online: https://doi.org/10.1287/opre.2021.0369

References

  • Abbeel P, Ng AY (2004) Apprenticeship learning via inverse reinforcement learning. Proc. 21st Internat. Conf. on Machine Learn., 1.
  • Ahuja RK, Orlin JB (2001) Inverse optimization. Oper. Res. 49(5):771–783.
  • Amin K, Jiang N, Singh S (2017) Repeated inverse reinforcement learning. Von Luxburg U, Guyon I, Bengio S, Wallach H, Fergus R, eds. Adv. Neural Inform. Processing Systems (NeurIPS, Curran Associates, Inc., Red Hook, NY), 1813–1822.
  • Amin K, Cummings R, Dworkin L, Kearns M, Roth A (2015) Online learning and profit maximization from revealed preferences. Proc. AAAI Conf. on Artificial Intelligence, vol. 29 (AAAI Press, Palo Alto, CA), 770–776.
  • Arora S, Doshi P (2021) A survey of inverse reinforcement learning: Challenges, methods and progress. Artificial Intelligence 297(297):1–28.
  • Aswani A, Shen Z-J, Siddiq A (2018) Inverse optimization with noisy data. Oper. Res. 66(3):870–892.
  • Balcan M-F, Daniely A, Mehta R, Urner R, Vazirani VV (2014) Learning economic parameters from revealed preferences. Proc. Internat. Conf. on Web and Internet Econom. (Springer, Berlin), 338–353.
  • Bärmann A, Pokutta S, Schneider O (2017) Emulating the expert: Inverse optimization through online learning. Precup D, Teh YW, eds. Proc. 34th Internat. Conf. on Machine Learn. (JMLR.org), 400–410.
  • Bärmann A, Martin A, Pokutta S, Schneider O (2018) An online-learning approach to inverse optimization. Preprint, submitted October 30, https://arxiv.org/abs/1810.12997.
  • Bastani H, Bastani O, Sinchaisri WP (2021) Learning best practices: Can machine learning improve human decision-making? Acad. Management Proc. 2021, vol. 1 (Academy of Management, Briarcliff Manor, NY), 14006.
  • Beigman E, Vohra R (2006) Learning from revealed preference. Feigenbaum J, General Chair; Chuang J, Pennock DM, Program Chairs, eds. Proc. 7th ACM Conf. on Electronic Commerce (Association for Computing Machinery, New York), 36–42.
  • Bertsimas D, Gupta V, Paschalidis IC (2015) Data-driven estimation in equilibrium using inverse optimization. Math. Programming 153(2):595–633.
  • Björck Å (1994) Numerics of Gram-Schmidt orthogonalization. Linear Algebra Appl. 197:297–316.
  • Chan TC, Lee T, Terekhov D (2019) Inverse optimization: Closed-form solutions, geometry, and goodness of fit. Management Sci. 65(3):1115–1135.
  • Chen N, Cire A, Hu M, Lagzi S (2021) Model-free assortment pricing with transaction data. Preprint, submitted January 6, https://arxiv.org/abs/2101.02251.
  • Cohen M, Lobel I, Paes Leme R (2020) Feature-based dynamic pricing. Management Sci. 66(11):4921–4943.
  • Dong C, Chen Y, Zeng B (2018) Generalized inverse optimization through online learning. Bengio S, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, eds. Advances in Neural Information Processing Systems (Curran Associates Inc., Red Hook, NY), 86–95.
  • Esfahani PM, Shafieezadeh-Abadeh S, Hanasusanto GA, Kuhn D (2018) Data-driven inverse optimization with imperfect information. Math. Programming 167(1):191–234.
  • Feng Y, Caldentey R, Ryan CT (2018) Learning customer preferences from personalized assortments. Preprint, submitted August 7, http://dx.doi.org/10.2139/ssrn.3215614.
  • Grötschel M, Lovász L, Schrijver A (1993) The ellipsoid method. Geometric Algorithms and Combinatorial Optimization (Springer, Berlin), 64–101.
  • Henrion R, Seeger A (2010) Inradius and circumradius of various convex cones arising in applications. Set-Valued Variational Anal. 18(3–4):483–511.
  • Jabbari S, Rogers RM, Roth A, Wu SZ (2017) Learning from rational behavior: Predicting solutions to unknown linear programs. Lee DD, von Luxburg U, Garnett R, Sugiyama M, Guyon I, eds. Adv. Neural Inform. Processing Systems, vol. 29 (Curran Associates, Inc., Red Hook, NY), 1570–1578.
  • Keshavarz A, Wang Y, Boyd S (2011) Imputing a convex objective function. Proc. IEEE Internat. Sympos. on Intelligent Control (IEEE, New York), 613–619.
  • Khachiyan LG (1979) A polynomial algorithm in linear programming. Dokl. Akad. Nauk. 244(5):1093–1096.
  • Krishnamurthy A, Lykouris T, Podimata C, Schapire R (2021) Contextual search in the presence of irrational agents. Khuller S, Vassilevska Williams V, eds. Proc. 53rd Annual ACM SIGACT Sympos. Theory Comput. (Association for Computing Machinery, New York), 910–918.
  • Liu A, Paes Leme R, Schneider J (2021) Optimal contextual pricing and extensions. Proc. ACM-SIAM Sympos. on Discrete Algorithms (SIAM, Philadelphia), 1059–1078.
  • Lobel I, Paes Leme R, Vladu A (2018) Multidimensional binary search for contextual decision-making. Oper. Res. 66(5):1346–1361.
  • Nowozin S, Lampert CH (2011) Structured Learning and Prediction in Computer Vision, vol. 6 (Now Publishers).
  • Osa T, Pajarinen J, Neumann G, Bagnell JA, Abbeel P, Peters J (2018) An algorithmic perspective on imitation learning. Foundations Trends® Robotics 7(1–2):1–79.
  • Osokin A, Bach F, Lacoste-Julien S (2017) On structured prediction theory with calibrated convex surrogate losses. Von Luxburg U, Guyon I, Bengio S, Wallach H, Fergus R, eds. Adv. Neural Inform. Processing Systems (Curran Associates, Inc., Red Hook, NY), 301–312.
  • Paes Leme R, Schneider J (2018) Contextual search via intrinsic volumes. Proc. IEEE 59th Annual Sympos. on Foundations of Computer Sci. (IEEE, New York), 268–282.
  • Ratliff ND, Bagnell JA, Zinkevich MA (2006) Maximum margin planning. Cohen W, Moore A, eds. Proc. 23rd Internat. Conf. on Machine Learn. (Association for Computing Machinery, New York), 729–736.
  • Roth A, Ullman J, Wu ZS (2016) Watch and learn: Optimizing from revealed preferences feedback. Proc. 48th Annual ACM Sympos. on Theory of Comput., 949–962.
  • Sauré D, Vielma JP (2019) Ellipsoidal methods for adaptive choice-based conjoint analysis. Oper. Res. 67(2):315–338.
  • Sutton C, McCallum A (2012) An introduction to conditional random fields. Foundations Trends® Machine Learn. 4(4):267–373.
  • Taskar B, Chatalbashev V, Koller D, Guestrin C (2005) Learning structured prediction models: A large margin approach. Dzeroski S, De Raedt L, Wrobel S, eds. Proc. 22nd Internat. Conf. on Machine Learn. (Association for Computing Machinery, New York), 896–903.
  • Thai J, Bayen AM (2018) Imputing a variational inequality function or a convex objective function: A robust approach. J. Math. Anal. Appl. 457(2):1675–1695.
  • Toubia O, Hauser J, Garcia R (2007) Probabilistic polyhedral methods for adaptive choice-based conjoint analysis: Theory and application. Marketing Sci. 26(5):596–610.
  • Ward A, Master N, Bambos N (2019) Learning to emulate an expert projective cone scheduler. Proc. American Control Conf. (IEEE, New York), 292–297.
  • Zadimoghaddam M, Roth A (2012) Efficiently learning from revealed preference. Proc. Internat. Workshop on Internet and Network Econom. (Springer, Berlin), 114–127.
  • Ziebart BD, Maas AL, Bagnell JA, Dey AK (2008) Maximum entropy inverse reinforcement learning. Proc. AAAI, vol. 8 (AAAI Press, Palo Alto, CA), 1433–1438.