Transformer Choice Net: A Transformer Neural Network for Choice Prediction

Published Online: https://doi.org/10.1287/ijds.2025.0116

References

  • Aarts S, Shmoys DB, Coy A (2023) An interpretable determinantal choice model for subset selection. Preprint, submitted February 22, https://arxiv.org/abs/2302.11477.
  • Alptekinoğlu A, Semple JH (2016) The exponomial choice model: A new alternative for assortment and price optimization. Oper. Res. 64(1):79–93.
  • An L, Li AA, Nemala V, Visotsky G (2025) Real-time personalization with simple transformers. ACM SIGMETRICS Perform. Evaluation Rev. 53(2):116–118.
  • Aouad A, Désir A (2022) Representing random utility choice models with neural networks. Management Sci., ePub ahead of print November 17, https://doi.org/10.1287/mnsc.2023.02189.
  • Arkoudi I, Krueger R, Azevedo CL, Pereira FC (2023) Combining discrete choice models and neural networks through embeddings: Formulation, interpretability and performance. Transportation Res. Part B Methodological 175:102783.
  • Bai Y, Feldman J, Segev D, Topaloglu H, Wagner L (2023) Assortment optimization under the multi-purchase multinomial logit choice model. Oper. Res. 72(6):2631–2664.
  • Batsell RR, Polking JC (1985) A new class of market share models. Marketing Sci. 4(3):177–198.
  • Belinkov Y (2022) Probing classifiers: Promises, shortcomings, and advances. Comput. Linguistics 48(1):207–219.
  • Ben-Akiva M, Lerman S (1985) Discrete-Choice Analysis: Theory and Application to Travel Demand (MIT Press, Cambridge, MA).
  • Benson AR, Kumar R, Tomkins A (2018) A discrete choice model for subset selection. Proc. 11th ACM Internat. Conf. Web Search Data Mining (Association for Computing Machinery, New York), 37–45.
  • Bentz Y, Merunka D (2000) Neural networks and the multinomial logit for brand choice modelling: A hybrid approach. J. Forecasting 19(3):177–200.
  • Blanchet J, Gallego G, Goyal V (2016) A Markov chain approximation to choice modeling. Oper. Res. 64(4):886–905.
  • Block HD, Marschak J (1959) Random orderings and stochastic theories of response. Cowles Foundation Discussion Paper No. 66, Cowles Foundation for Research in Economics, Yale University, New Haven, CT, 1–68.
  • Bodea T, Ferguson M, Garrow L (2009) Data set—Choice-based revenue management: Data from a major hotel chain. Manufacturing Service Oper. Management 11(2):356–361.
  • Bower A, Balzano L (2020) Preference modeling with context-dependent salient features. Daumé H, Singh A, eds. Proc. 37th Internat. Conf. Machine Learn. Proceedings of Machine Learning Research, vol. 119 (PMLR, New York), 1067–1077.
  • Chen YC, Mišić VV (2022) Decision forest: A nonparametric approach to modeling irrational choice. Management Sci. 68(10):7090–7111.
  • Chen N, Gallego G, Tang Z (2021) Estimating discrete choice models with random forests. Qiu R, Lyons K, Chen W, eds. INFORMS Internat. Conf. Service Sci. (Springer, Cham, Switzerland), 184–196.
  • Chen N, Gao P, Wang C, Wang Y (2023) Assortment optimization for the multinomial logit model with repeated customer interactions. Preprint, submitted August 2, https://doi.org/10.2139/ssrn.4526247.
  • Daganzo C (2014) Multinomial Probit: The Theory and Its Application to Demand Forecasting (Elsevier, Amsterdam).
  • Dong L, Xu S, Xu B (2018) Speech-transformer: A no-recurrence sequence-to-sequence model for speech recognition. 2018 IEEE Internat. Conf. Acoustics Speech Signal Processing (Institute of Electrical and Electronics Engineers, Piscataway, NJ), 5884–5888.
  • Farias VF, Jagabathula S, Shah D (2013) A nonparametric approach to modeling choice with limited data. Management Sci. 59(2):305–322.
  • Gabel S, Timoshenko A (2022) Product choice with large assortments: A scalable deep-learning model. Management Sci. 68(3):1808–1827.
  • Gneiting T, Raftery AE (2007) Strictly proper scoring rules, prediction, and estimation. J. Amer. Statist. Assoc. 102(477):359–378.
  • Golowich N, Rakhlin A, Shamir O (2018) Size-independent sample complexity of neural networks. Bubeck S, Perchet V, Rigollet P, eds. Proc. 31st Conf. Learn. Theory. Proceedings of Machine Learning Research, vol. 75 (PMLR, New York), 297–299.
  • Gupta S (1988) Impact of sales promotions on when, what, and how much to buy. J. Marketing Res. 25(4):342–355.
  • Han Y, Zegras C, Pereira FC, Ben-Akiva M (2020) A neural-embedded choice model: Tastenet-MNL modeling taste heterogeneity with flexibility and interpretability. Preprint, submitted February 3, https://arxiv.org/abs/2002.00922.
  • Han K, Wang Y, Chen H, Chen X, Guo J, Liu Z, Tang Y, et al. (2023) A survey on vision transformer. IEEE Trans. Pattern Anal. Machine Intelligence 45(1):87–110.
  • Jacobs B, Fok D, Donkers B (2021) Understanding large-scale dynamic purchase behavior. Marketing Sci. 40(5):844–870.
  • Jasin S, Lyu C, Najafi S, Zhang H (2024) Assortment optimization with multi-item basket purchase under multivariate MNL model. Manufacturing Service Oper. Management 26(1):215–232.
  • Jiang Q (2024) Choice modeling and assortment optimization on the transformer model. Unpublished PhD thesis, Massachusetts Institute of Technology, Cambridge, MA.
  • Kamishima T (2003) Nantonac collaborative filtering: Recommendation based on order responses. Proc. Ninth ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 583–588.
  • Koppelman FS, Bhat C (2006) A self instructing course in mode choice modeling: Multinomial and nested logit models. U.S. Department of Transportation, Federal Transit Administration, Washington, DC.
  • Lee J, Lee Y, Kim J, Kosiorek A, Choi S, Teh YW (2019) Set transformer: A framework for attention-based permutation-invariant neural networks. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learn. Proceedings of Machine Learning Research, vol. 97 (PMLR, New York), 3744–3753.
  • Lin H, Li X, Wu L (2022a) Multi-choice preferences learning and assortment recommendation in e-commerce. Preprint, submitted March 8, https://doi.org/10.2139/ssrn.4035033.
  • Lin T, Wang Y, Liu X, Qiu X (2022b) A survey of transformers. AI Open 3:111–132.
  • Luan S, Wang R, Xu X, Xue W (2020) Operations management under consumer choice models with multiple purchases. Preprint, submitted October 15, https://doi.org/10.2139/ssrn.3679699.
  • Manchanda P, Ansari A, Gupta S (1999) The “shopping basket”: A model for multicategory purchase incidence decisions. Marketing Sci. 18(2):95–114.
  • Manski CF (1977) The structure of random utility models. Theory Decision 8(3):229–254.
  • Maragheh R, Chen X, Davis J, Cho J, Kumar S, Achan K (2020) Choice modeling and assortment optimization in the presence of context effects. Preprint, submitted January 29, https://dx.doi.org/10.2139/ssrn.3747354.
  • McFadden D, Train K (2000) Mixed MNL models for discrete response. J. Appl. Econometrics 15(5):447–470.
  • Najafi S, Jasin S, Uichanco J, Zhao J (2023) Assortment and price optimization under a multi-attribute (contextual) choice model. Preprint, submitted July 19, https://doi.org/10.2139/ssrn.4505644.
  • OpenAI (2023) GPT-4 technical report. Accessed March 27, 2023, https://cdn.openai.com/papers/gpt-4.pdf.
  • Peng Z, Rong Y, Zhu T (2024) Transformer-based choice model: A tool for assortment optimization evaluation. Naval Res. Logist. 71(6):854–877.
  • Pfannschmidt K, Gupta P, Haddenhorst B, Hüllermeier E (2022) Learning context-dependent choice functions. Internat. J. Approximate Reasoning 140:116–155.
  • Qiu S, Qin G, Wong M, Sun J (2024) RoutesFormer: A sequence-based route choice transformer for efficient path inference from sparse trajectories. Transportation Res. Part C Emerging Tech. 162:104552.
  • Ramesh A, Pavlov M, Goh G, Gray S, Voss C, Radford A, Chen M, Sutskever I (2021) Zero-shot text-to-image generation. Meila M, Zhang T, eds. Proc. 38th Internat. Conf. Machine Learn. Proceedings of Machine Learning Research, vol. 139 (PMLR, New York), 8821–8831.
  • Rosenfeld N, Oshiba K, Singer Y (2020) Predicting choice with set-dependent aggregation. Daumé H, Singh A, eds. Proc. 37th Internat. Conf. Machine Learn. Proceedings of Machine Learning Research, vol. 119 (PMLR, New York), 8220–8229.
  • Ruiz FJR, Athey S, Blei DM (2020) SHOPPER: A probabilistic model of consumer choice with substitutes and complements. Ann. Appl. Statist. 14(1):1–27.
  • Seshadri A, Peysakhovich A, Ugander J (2019) Discovering context effects from raw choice data. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learn. Proceedings of Machine Learning Research, vol. 97 (PMLR, New York), 5660–5669.
  • Sifringer B, Lurkin V, Alahi A (2020) Enhancing discrete choice models with representation learning. Transportation Res. Part B Methodological 140:236–261.
  • Tomlinson K, Benson AR (2021) Learning interpretable feature context effects in discrete choice. Proc. 27th ACM SIGKDD Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 1582–1592.
  • Trauger J, Tewari A (2024) Sequence length independent norm-based generalization bounds for transformers. Dasgupta S, Mandt S, Li Y, eds. Proc. 27th Internat. Conf. Artificial Intelligence Statist. Proceedings of Machine Learning Research, vol. 238 (PMLR, New York), 1405–1413.
  • Tulabandhula T, Sinha D, Karra SR, Patidar P (2023) Multi-purchase behavior: Modeling, estimation, and optimization. Manufacturing Service Oper. Management 25(6):2298–2313.
  • Van Cranenburgh S, Wang S, Vij A, Pereira F, Walker J (2022) Choice modelling in the age of machine learning-discussion paper. J. Choice Model. 42:100340.
  • Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J. Machine Learn. Res. 9(11):2579–2605.
  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Guyon I, Von Luxburg U, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Red Hook, NY).
  • Wagstaff E, Fuchs F, Engelcke M, Posner I, Osborne MA (2019) On the limitations of representing functions on sets. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learn. Proceedings of Machine Learning Research, vol. 97 (PMLR, New York), 6487–6494.
  • Wang S, Mo B, Zhao J (2020) Deep neural networks for choice analysis: Architecture design with alternative-specific utility functions. Transportation Res. Part C Emerging Tech. 112:234–251.
  • Wang H, Cai Z, Li X, Talluri K (2023) A neural network based choice model for assortment optimization. Preprint, submitted August 10, https://arxiv.org/abs/2308.05617.
  • Wong M, Farooq B (2021) ResLogit: A residual neural network logit model for data-driven choice modelling. Transportation Res. Part C Emerging Tech. 126:103050.
  • Zaheer M, Kottur S, Ravanbakhsh S, Poczos B, Salakhutdinov RR, Smola AJ (2017) Deep sets. Guyon I, Von Luxburg U, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Red Hook, NY).
  • Zhang S, Wang Z, Gao R, Li S (2025) Deep context-dependent choice model. Proc. 39th Annual Conf. Neural Inform. Processing Systems (Neural Information Processing Systems Foundation, Inc., Vancouver, BC).