October 18, 2022 in 2022 INFORMS Annual Meeting

Online Optimization and Learning for Sequential Decision-making

Jessica Leung

SHARE: PRINT ARTICLE:

https://doi.org/10.1287/orms.2022.05.30n

Professor Patrick Jaillet took the stage on Tuesday, October 18, to deliver his plenary on the topic of online optimization and learning for sequential decision.

Patrick began his talk with examples of applications involving making sequential decisions with incomplete knowledge of what comes next. These examples brought us to the fundamental question of how we evaluate strategies operating under uncertainty. Patrick explained that in many cases, uncertainty about the future input stream can be modeled using stochastic concepts and treated using learning techniques. In online learning problems, one typically measures good algorithms using the notions of regret instead of competitive ratio. Combining online optimization and online learning techniques can lead to the design of better algorithms.

Patrick then focused his talk on four areas:

Online Resource Allocation Problems with Partial Learnability
Online Resource Allocation Problems with Samples
Online Linear Programming without Knowing the Horizon
Universal Online Learning and Regression

Online Resource Allocation Problems with Partial Learnability

Patrick first gave the problem statement of a basic simple online resource allocation problem and presented what the firm can achieve with an offline solution, the online adversarial approach solution and the online stochastic approach solution. He then compared the two online approaches by their expected value and discussed the limitations of the existing approaches. Given the limitations, he introduced a partially learnable model that has both adversarial and stochastic components, with adaptive and nonadaptive dynamic threshold policies that outperform existing algorithms.

Online Resource Allocation Problems with Samples

Next, Patrick set up a similar problem, except the expected reward in this case is unknown. Hence, more information is needed to arrive at a good algorithm design. To do so, we assume access to some sample where customers in the sample can be used to learn the expected rewards and information about the initial sequence. Patrick demonstrated that even limited stochastic information significantly improves the performance, particularly for a large number of customers.

Online Linear Programming without Knowing the Horizon

The next part of the talk was motivated by the online packing linear programming problem, where columns of the constraint matrix are randomly drawn without replacement and revealed one at a time, and decisions are made sequentially. Among all existing algorithms to solve this problem, the knowledge of the number of candidates is essential. This leads to the question: what if such number is unknown?

Patrick then presented a number of theorems showing that 1) the simple strategy of “rejecting the first s numbers, then accepting the first number that is best seen so far” is not too bad, and 2) beyond the worst case, most distributions are good.

Universal Online Learning and Regression

The final part of Patrick’s talk was a teaser for his co-author Moïse Blanchard’s talk that took place on Monday afternoon. Patrick motivated the study by comparing supervised learning and online learning and posed the question, “Can we characterize online learnability with provably minimal assumptions?” The goal is to learn the best function f* that describes the relationship between observed instance X and response Y.

However, there are three main challenges:

1. Noisy data, corrupted or even adversarial
2. “Best” f* may be complex: which function class?
3. How to construct general/robust algorithms

Given these challenges, Patrick introduced the concept of universal learning with no assumption on responses. An algorithm is universally consistent under X if it is consistent and any (X, Y). Patrick then posed more questions on minimal assumptions on X for universal learnability and whether we can construct algorithms that learn whenever possible.

Jessica Leung

Jessica Leung is a lecturer in business analytics (U.S. equivalent of an assistant professor) at the Econometrics and Business Statistics Department at Monash University, Australia. Her research focuses on convex and combinatorial optimization in business applications. Jessica teaches undergraduate and postgraduate courses in business statistics, management science, optimization, applied linear algebra and visual data analytics.

Keywords: