Beyond Regret: Decoupling Learning and Decision Making in Online Linear Programming
Abstract
Online linear programming plays an important role in both revenue management and resource allocation, and recent research has focused on developing efficient first-order online learning algorithms. Despite the empirical success of first-order methods, they typically achieve a regret no better than , which is suboptimal compared with the bound guaranteed by the state-of-the-art linear programming (LP)-based online algorithms. This paper establishes a general framework that improves on the result when the LP dual problem exhibits certain error bound conditions. For the first time, we show that first-order learning algorithms achieve regret in the continuous support setting and regret in the finite support setting beyond the nondegeneracy assumption. Our results significantly improve the state-of-the-art regret results and provide new insights for sequential decision making.
Funding: This research was supported by the National Natural Science Foundation of China [Grants 72225009, 72394360, and 72394365].
Supplemental Material: All supplemental materials, including the code, data, and files required to reproduce the results, are available at https://doi.org/10.1287/opre.2024.1575.

