Relaxed Indexability and Index Policy for Partially Observable Restless Bandits

Published Online:https://doi.org/10.1287/mnsc.2022.02831

References

  • Bertsekas DP (1987) Dynamic Programming: Deterministic and Stochastic Models (Prentice Hall, Hoboken, NJ).Google Scholar
  • Bertsimas D, Niño-Mora J (1996) Conservation laws, extended polymatroids and multi-armed bandit problems. Math. Oper. Res. 21(2):257–306.LinkGoogle Scholar
  • Bertsimas D, Niño-Mora J (2000) Restless bandits, linear programming relaxations, and a primal-dual index heuristic. Oper. Res. 48(1):80–90.LinkGoogle Scholar
  • Brown DB, Smith JE (2020) Index policies and performance bounds for dynamic selection problems. Management Sci. 66(7):3029–3050.LinkGoogle Scholar
  • Elmaghraby HM, Liu K, Ding Z (2018) Femtocell scheduling as a restless multiarmed bandit problem using partial channel state observation. Proc. IEEE Internat. Conf. Comm. (ICC) (IEEE, Piscataway, NJ), 1–6.Google Scholar
  • Frostig E, Weiss G (2016) Four proofs of Gittins’ multiarmed bandit theorem. Ann. Oper. Res. 241:127–165.CrossrefGoogle Scholar
  • Gast N, Gaujal B, Yan C (2021) (Close to) Optimal policies for finite horizon restless bandits. Working paper. https://hal.inria.fr/hal-03262307/file/LP_paper.pdf.Google Scholar
  • Gittins JC (1979) Bandit processes and dynamic allocation indices. J. Roy. Statist. Soc. 41(2):148–177.CrossrefGoogle Scholar
  • Gittins JC, Glazebrook KD, Weber RR (2011) Multi-Armed Bandit Allocation Indices (Wiley, Chichester, UK).CrossrefGoogle Scholar
  • Glazebrook KD, Kirkbride C, Ouenniche J (2009) Index policies for the admission control and routing of impatient customers to heterogeneous service stations. Oper. Res. 57(4):975–989.LinkGoogle Scholar
  • Hodge DJ, Glazebrook KD (2011) Dynamic resource allocation in a multi-product make-to-stock production system. Queueing Systems 67:333–364.CrossrefGoogle Scholar
  • Hu W, Frazier PI (2017) An asymptotically optimal index policy for finite-horizon restless bandits. Preprint, submitted July 1, https://arxiv.org/abs/1707.00205.Google Scholar
  • Lapiccirella FE, Liu K, Ding Z (2011) Multi-channel opportunistic access based on primary ARQ messages overhearing. Proc. IEEE Internat. Conf. Comm. (ICC) (IEEE, Piscataway, NJ), 1–5.Google Scholar
  • Le Ny J, Dahleh M, Feron E (2008) Multi-UAV dynamic routing with partial observations using restless bandit allocation indices. Proc. Amer. Control Conf. (IEEE, Piscataway, NJ), 4220–4225.Google Scholar
  • Liu K (2020) Whittle index for restless bandits with expanding state spaces. Numer. Math. 42(4):372–384.Google Scholar
  • Liu K, Weber RR, Zhang C (2024) Low-complexity algorithm for restless bandits with imperfect observations. Math. Methods Oper. Res. 100(2):467–508.CrossrefGoogle Scholar
  • Liu K, Weber RR, Zhao Q (2011) Indexability and Whittle index for restless bandit problems involving reset processes. Proc. 50th IEEE Conf. Decision Control (IEEE, Piscataway, NJ), 7690–7696.Google Scholar
  • Liu K, Zhao Q (2008) A restless bandit formulation of opportunistic access: Indexability and index Policy. Proc. IEEE Workshop Networking Technol. Software Defined Radio (SDR) Networks (IEEE, Piscataway, NJ), 1–5.Google Scholar
  • Liu K, Zhao Q (2010) Indexability of restless bandit problems and optimality of Whittle index for dynamic multichannel access. IEEE Trans. Inform. Theory 56(11):5547–5567.CrossrefGoogle Scholar
  • Liu K, Zhao Q (2012) Dynamic intrusion detection in resource-constrained cyber networks. IEEE Internat. Symposium Inform. Theory Proc. (IEEE, Piscataway, NJ), 970–974.Google Scholar
  • Liu K, Zhao Q, Krishnamachari B (2010) Dynamic multichannel access with imperfect channel state detection. IEEE Trans. Signal Process. 58(5):2795–2808.CrossrefGoogle Scholar
  • Niño-Mora J (2001) Restless bandits, partial conservation laws and indexability. Adv. Appl. Probab. 33:76–98.CrossrefGoogle Scholar
  • Niño-Mora J (2007) Dynamic priority allocation via restless bandit marginal productivity indices. TOP 15:161–198.CrossrefGoogle Scholar
  • Papadimitriou CH, Tsitsiklis JN (1999) The complexity of optimal queueing network control. Math. Oper. Res. 24(2):293–305.LinkGoogle Scholar
  • Sondik EJ (1978) The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs. Oper. Res. 26(2):282–304.LinkGoogle Scholar
  • Thompson WR (1933) On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3/4):275–294.CrossrefGoogle Scholar
  • Verloop IM (2016) Asymptotically optimal priority policies for indexable and nonindexable restless bandits. Ann. Appl. Probab. 26(4):1947–1995.CrossrefGoogle Scholar
  • Wang K, Chen L, Liu Q (2014) On optimality of myopic policy for opportunistic access with nonidentical channels and imperfect sensing. IEEE Trans. Veh. Technol. 63(5):2478–2483.CrossrefGoogle Scholar
  • Weber RR (1992) On the Gittins index for multiarmed bandits. Ann. Probab. 2:1024–1033.Google Scholar
  • Weber RR, Weiss G (1990) On an index policy for restless bandits. J. Appl. Probab. 27(3):637–648.CrossrefGoogle Scholar
  • Weber RR, Weiss G (1991) Addendum to ‘On an index policy for restless bandits’. Adv. Appl. Probab. 23:429–430.CrossrefGoogle Scholar
  • Whittle P (1980) Multi-armed bandits and the Gittins index. J. Roy. Statist. Soc. Ser. B 42(2):143–149.CrossrefGoogle Scholar
  • Whittle P (1988) Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25(a):287–298.CrossrefGoogle Scholar
  • Zayas-Cabán G, Jasin S, Wang G (2019) An asymptotically optimal heuristic for general nonstationary finite-horizon restless multi-armed, multi-action bandits. Adv. Appl. Probab. 51(3):745–772.CrossrefGoogle Scholar
  • Zhao Q (2019) Multi-Armed Bandits: Theory and Applications to Online Learning in Networks (Morgan & Claypool, San Rafael, CA).Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.