Doubly High-Dimensional Contextual Bandits: An Interpretable Model for Joint Assortment-Pricing

Junhui Cai
[email protected]
https://orcid.org/0009-0005-2740-3840
Department of Information Technology, Analytics, and Operations, University of Notre Dame, Notre Dame, Indiana 46556

Ran Chen
[email protected]
https://orcid.org/0009-0005-0695-3210
Department of Statistics and Data Science, Washington University in St. Louis, St. Louis, Missouri 63130

Martin J. Wainwright
[email protected]
https://orcid.org/0000-0002-8760-2236
Laboratory for Information and Decision Systems, Statistics and Data Science Center, Departments of Electrical Engineering and Computer Science and Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Linda Zhao (Corresponding Author)
[email protected]
https://orcid.org/0009-0002-2752-7294
Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104

Published Online: 9 Jun 2026
https://doi.org/10.1287/mnsc.2024.08311

Abstract

Key challenges in running a business include deciding which products or services to present to consumers (the assortment problem) and how to price products (the pricing problem) to maximize revenue or profit. Instead of considering these problems in isolation, we address assortment-pricing jointly and tackle the intrinsic doubly high dimensionality: both actions and contextual vectors can take continuous values in high-dimensional spaces. We propose a doubly high-dimensional contextual bandit model to formulate this problem. To circumvent the curse of dimensionality, we keep the model simple yet flexible, capturing the interaction effects between covariates (context) and actions on the reward via a low-rank representation matrix. The resulting class of models is reasonably expressive while remaining interpretable through latent factors, and it includes various bandit and pricing models as special cases, making it suitable for applications involving simultaneous multiple decision-making beyond joint assortment-pricing. We develop a computationally tractable procedure that combines an exploration/exploitation protocol with an efficient low-rank matrix estimator. We provide a nonasymptotic instance-dependent regret bound involving dimensions and rank in addition to the time horizon. Simulations on standard bandit and pricing models, which are special cases of our model, demonstrate that our method yields lower regret than state-of-the-art methods. Real-world assortment-pricing case studies, from an industry-leading instant noodle manufacturer to an emerging beauty start-up, underscore the gains achievable using our method, showing at least threefold gains in revenue/profit and the interpretability of the learned latent factor models.
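
To make the low-rank idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' procedure. It assumes a bilinear reward of the form x^T Theta a plus noise, with Theta a low-rank matrix linking a context vector x to an action vector a; the abstract only says that interactions are captured "via a low-rank representation matrix", so this exact form is an assumption. The estimator shown (least squares followed by SVD truncation), the purely random exploration phase, the unit-norm action constraint, and all function and parameter names are hypothetical stand-ins for the paper's exploration/exploitation protocol and low-rank estimator.

```python
import numpy as np


def simulate_and_estimate(d_x=6, d_a=5, rank=2, n_explore=2000, noise=0.1, seed=0):
    """Simulate exploration data from an assumed bilinear model and
    estimate the low-rank matrix (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical ground truth: Theta = U V^T has rank at most `rank`.
    U = rng.normal(size=(d_x, rank))
    V = rng.normal(size=(d_a, rank))
    theta = U @ V.T

    # Exploration phase (assumption): random contexts and random actions.
    X = rng.normal(size=(n_explore, d_x))
    A = rng.normal(size=(n_explore, d_a))
    rewards = np.einsum("ni,ij,nj->n", X, theta, A)
    rewards = rewards + noise * rng.normal(size=n_explore)

    # Step 1: ordinary least squares on the vectorized outer products x a^T.
    features = np.einsum("ni,nj->nij", X, A).reshape(n_explore, d_x * d_a)
    coef, *_ = np.linalg.lstsq(features, rewards, rcond=None)
    theta_ls = coef.reshape(d_x, d_a)

    # Step 2: project onto rank-`rank` matrices by truncating the SVD.
    u, s, vt = np.linalg.svd(theta_ls, full_matrices=False)
    theta_hat = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return theta, theta_hat


def greedy_action(theta_hat, x):
    """Exploitation step under the assumed unit-norm action constraint:
    the maximizer of x^T Theta a over ||a|| <= 1 is Theta^T x, normalized."""
    direction = theta_hat.T @ x
    norm = np.linalg.norm(direction)
    return direction / norm if norm > 0 else direction


if __name__ == "__main__":
    theta, theta_hat = simulate_and_estimate()
    rel_err = np.linalg.norm(theta - theta_hat) / np.linalg.norm(theta)
    print(f"relative estimation error: {rel_err:.4f}")
    x_new = np.ones(theta.shape[0])
    print("greedy action for an all-ones context:", greedy_action(theta_hat, x_new))
```

In this sketch the truncated SVD factors play the role of latent factors: the left singular vectors weight the context features and the right singular vectors weight the action features, which is one way such a model could be read as interpretable.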

This paper was accepted by J. George Shanthikumar, data science.

Funding: Financial support from The Wharton School [Wharton AI & Analytics Initiative Fund], the National Science Foundation [Grants DMS-2311072 and DMS-2515896], and the Office of Naval Research [Grant N00014-21-1-2842] is gratefully acknowledged.

Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2024.08311.

Articles In Advance

Information

Received: October 29, 2024
Accepted: October 30, 2025
Published Online: June 09, 2026

Cite as

Junhui Cai, Ran Chen, Martin J. Wainwright, Linda Zhao (2026) Doubly High-Dimensional Contextual Bandits: An Interpretable Model for Joint Assortment-Pricing. Management Science 0(0).

https://doi.org/10.1287/mnsc.2024.08311

Acknowledgments

The authors thank the department editor, associate editor, and referees for their feedback and suggestions, which significantly improved the paper. The authors also thank the industry partners. Authors are listed in alphabetical order.
