Artificial Collusion: Examining Supracompetitive Pricing by Q-Learning Algorithms
Abstract
We examine concerns that reinforcement-learning pricing algorithms used by competitors would autonomously and systematically learn to collude. Recent findings of supracompetitive prices under Q-learning have raised that alarm. However, a detailed analysis of the inner workings of this algorithm type reveals that it often does not satisfy the conditions for “autonomous” algorithmic collusion that would be a cartel risk in practice. We find that Q-learning can learn collusive equilibria only on timescales irrelevant to the firm’s objective. Moreover, the collusive outcomes require competitors to be committed to using the same Q-learning algorithm, starting at the same moment, with the same hyperparameters and action spaces, even though it is outperformed by the first alternative pricing rule. This level of synchronization suggests the need for an explicit cartel agreement. Our analysis gives criteria for practically relevant, explicitly and tacitly colluding pricing algorithms that would constitute a threat to competition. Whether autonomous algorithmic collusion is a potential threat to competition remains to be seen. There is not yet reason for competition agencies to be overly suspicious of pricing algorithms, other than of “collusion by algorithm,” in which pricing software is used to implement cartel agreements or is coded with collusive intent.
This paper was accepted by Martin Bichler, market design, platform, and demand analytics.
Supplemental Material: The data files are available at https://doi.org/10.1287/mnsc.2024.08557.

