A Sampling-Based Gittins Index Approximation

Stef Baas (Corresponding Author)
https://orcid.org/0000-0002-5890-1165
Stochastic Operations Research Group, Department of Applied Mathematics, University of Twente, 7500 AE Enschede, Netherlands

Richard J. Boucherie
https://orcid.org/0000-0002-1046-2044
Stochastic Operations Research Group, Department of Applied Mathematics, University of Twente, 7500 AE Enschede, Netherlands

Aleida Braaksma
https://orcid.org/0000-0003-2296-0144
Stochastic Operations Research Group, Department of Applied Mathematics, University of Twente, 7500 AE Enschede, Netherlands

Published Online: 19 Mar 2026
https://doi.org/10.1287/moor.2023.0225

Abstract

A sampling-based method is introduced to approximate the Gittins index for a general family of alternative bandit processes. The approximation consists of a truncation of the optimization horizon and support for the immediate rewards, an optimal stopping value approximation, and a stochastic approximation procedure. Finite-time error bounds are given for the three approximations, leading to a procedure to construct a confidence interval for the Gittins index using a finite number of Monte Carlo samples as well as an epsilon-optimal policy for the family of alternative bandit processes. Proofs are given for almost sure convergence and a central limit theorem for the sampling-based Gittins index approximation. In a numerical study, the quality of the approximation is verified for the Bernoulli bandit and the Gaussian bandit with known variance, and the method is shown to significantly outperform Thompson sampling and the Bayesian upper-confidence-bound algorithms for a novel random effects multi-armed bandit.
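
To make the ideas of horizon truncation and index calibration mentioned in the abstract concrete, here is a minimal Python sketch for the Bernoulli bandit with a Beta(a, b) posterior. It is not the sampling-based procedure of the article: it swaps in exact backward induction plus bisection on a per-pull charge, a standard calibration approach that is practical here only because the Bernoulli state space is small. The function names, the discount factor gamma = 0.9, and the horizon of 100 pulls are illustrative assumptions and do not come from the article.

```python
def continuation_value(a, b, charge, gamma, horizon):
    """Value of pulling the arm once and then stopping optimally.

    Each pull costs `charge` and pays a Bernoulli reward whose success
    probability has a Beta(a, b) posterior. At most `horizon` pulls are
    allowed (truncation); stopping is worth 0.
    """
    # values[s]: optimal value after n pulls with s successes.
    # At depth n == horizon we are forced to stop, so every value is 0.
    values = [0.0] * (horizon + 1)
    for n in range(horizon - 1, 0, -1):
        new_values = []
        for s in range(n + 1):
            mean = (a + s) / (a + b + n)  # posterior mean after n pulls
            cont = mean - charge + gamma * (
                mean * values[s + 1] + (1.0 - mean) * values[s]
            )
            new_values.append(max(0.0, cont))  # stop if continuing loses
        values = new_values
    mean = a / (a + b)
    return mean - charge + gamma * (mean * values[1] + (1.0 - mean) * values[0])


def truncated_gittins_index(a, b, gamma=0.9, horizon=100, tol=1e-8):
    """Bisection for the charge at which the first pull is exactly fair."""
    lo, hi = 0.0, 1.0  # rewards lie in [0, 1], so the index does too
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if continuation_value(a, b, mid, gamma, horizon) > 0.0:
            lo = mid  # pulling is still profitable: index is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(truncated_gittins_index(1, 1))
```

With `horizon=1` the sketch returns the posterior mean a / (a + b). Larger horizons can only raise the value, since they allow more stopping rules, and under discounting the contribution of the truncated tail becomes small for long horizons.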

Articles In Advance

Article Information

Received: July 21, 2023
Accepted: February 08, 2026
Published Online: March 19, 2026

Cite as

Stef Baas, Richard J. Boucherie, Aleida Braaksma (2026) A Sampling-Based Gittins Index Approximation. Mathematics of Operations Research 0(0).

https://doi.org/10.1287/moor.2023.0225
