An Approximation Approach for Response-Adaptive Clinical Trial Design

Published Online: https://doi.org/10.1287/ijoc.2020.0969

References

  • Adelman D, Mersereau AJ (2008) Relaxations of weakly coupled stochastic dynamic programs. Oper. Res. 56(3):712–727.
  • Agrawal S, Goyal N (2012) Analysis of Thompson sampling for the multi-armed bandit problem. Mannor S, Srebro N, Williamson RC, eds. Proc. 25th Annual Conf. Learn. Theory, vol. 23 (PMLR), 39.1–39.26.
  • Ahuja V, Birge JR (2016) Response-adaptive designs for clinical trials: Simultaneous learning from multiple patients. Eur. J. Oper. Res. 248(2):619–633.
  • Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time analysis of the multiarmed bandit problem. Machine Learn. 47(2):235–256.
  • Aviv Y, Pazgal A (2005) A partially observed Markov decision process for dynamic pricing. Management Sci. 51(9):1400–1416.
  • Berry DA, Fristedt B (1985) Bandit Problems: Sequential Allocation of Experiments (Chapman and Hall, London).
  • Berry SM, Carlin BP, Lee JJ, Muller P (2010) Bayesian Adaptive Methods for Clinical Trials, vol. 38 (CRC Press, Boca Raton, FL).
  • Bertsekas DP (1995) Dynamic Programming and Optimal Control, vol. 1 (Athena Scientific, Belmont, MA).
  • Bertsekas DP (2011) Dynamic Programming and Optimal Control, vol. 2, 3rd ed. (Athena Scientific, Belmont, MA).
  • Bertsimas D, Mersereau AJ (2007) A learning approach for interactive marketing to a customer segment. Oper. Res. 55(6):1120–1135.
  • Birge JR (2007) Optimization methods in dynamic portfolio management. Handbook Oper. Res. Management Sci. 15:845–865.
  • Birge JR, Zhao G (2007) Successive linear approximation solution of infinite-horizon dynamic stochastic programs. SIAM J. Optim. 18(4):1165–1186.
  • Brafman RI (1997) A heuristic variable grid solution method for POMDPs. Proc. Natl. Conf. Artificial Intelligence (John Wiley & Sons, New York), 727–733.
  • Brown DB, Smith JE (2011) Dynamic portfolio optimization with transaction costs: Heuristics and dual bounds. Management Sci. 57(10):1752–1770.
  • Caro F, Gallien J (2007) Dynamic assortment with demand learning for seasonal consumer goods. Management Sci. 53(2):276–292.
  • Chang HS, Hu J, Fu MC, Marcus SI (2013) Simulation-Based Algorithms for Markov Decision Processes (Springer Science & Business Media, New York).
  • Chick S, Forster M, Pertile P (2017) A Bayesian decision theoretic model of sequential experimentation with delayed response. J. Royal Statist. Soc. Ser. B: Statist. Methodology 79(5):1439–1462.
  • Davies S (1997) Multidimensional triangulation and interpolation for reinforcement learning. Advances in Neural Information Processing Systems, vol. 9 (Curran Associates, Red Hook, NY), 1005–1011.
  • de Farias DP, Van Roy B (2003) The linear programming approach to approximate dynamic programming. Oper. Res. 51(6):850–865.
  • DiMasi JA, Grabowski HG, Hansen RW (2016) Innovation in the pharmaceutical industry: New estimates of R&D costs. J. Health Econom. 47:20–33.
  • Egbewale BE (2015) Statistical issues in randomised controlled trials: A narrative synthesis. Asian Pacific J. Tropical Biomedicine 5(5):354–359.
  • English R, Lebovitz Y, Griffin R (2010) Transforming Clinical Research in the United States: Challenges and Opportunities: Workshop Summary (National Academies Press, Washington, DC).
  • Even-Dar E, Kearns M, Wortman J (2006) Risk-sensitive online learning. Internat. Conf. Algorithmic Learn. Theory (Springer, New York), 199–213.
  • Gittins JC (1979) Bandit processes and dynamic allocation indices. J. Royal Statist. Soc. B 41(2):148–164.
  • Hauskrecht M (1997) Incremental methods for computing bounds in partially observable Markov decision processes. Proc. 14th Natl. Conf. Artificial Intelligence (ACM, New York), 734–739.
  • Hauskrecht M (2000) Value-function approximations for partially observable Markov decision processes. J. Artificial Intelligence Res. 13:33–94.
  • Katehakis MN, Derman C (1986) Computing Optimal Sequential Allocation Rules in Clinical Trials, Lecture Notes–Monograph Series (Institute of Mathematical Statistics, Beachwood, OH), 29–39.
  • Katehakis MN, Veinott AF (1987) The multi-armed bandit problem: Decomposition and computation. Math. Oper. Res. 12(2):262–268.
  • Lai TL (1987) Adaptive treatment allocation and the multi-armed bandit problem. Ann. Statist. 15(3):1091–1114.
  • Lai TL, Robbins H (1985) Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1):4–22.
  • Lewis RJ, Berry DA (1994) Group sequential clinical trials: A classical evaluation of Bayesian decision-theoretic designs. J. Amer. Statist. Assoc. 89(428):1528–1534.
  • Lovejoy WS (1991a) Computationally feasible bounds for partially observed Markov decision processes. Oper. Res. 39(1):162–175.
  • Lovejoy WS (1991b) A survey of algorithmic methods for partially observed Markov decision processes. Ann. Oper. Res. 28(1):47–65.
  • Mara PS (1976) Triangulations for the cube. J. Combinatorial Theory Ser. A 20(2):170–177.
  • Moher D, Hopewell S, Schulz KF, Montori V, Gøtzsche PC, Devereaux PJ, Elbourne D, Egger M, Altman DG (2010) Consort 2010 explanation and elaboration: Updated guidelines for reporting parallel group randomised trials. BMJ 340:c869.
  • Monahan GE (1982) State of the art—A survey of partially observable Markov decision processes: Theory, models, and algorithms. Management Sci. 28(1):1–16.
  • Munos R, Moore A (2002) Variable resolution discretization in optimal control. Machine Learn. 49(2–3):291–323.
  • Papadimitriou CH, Tsitsiklis JN (1999) The complexity of optimal queuing network control. Math. Oper. Res. 24(2):293–305.
  • Powell WB (2007) Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley-Interscience, New York).
  • Powell WB (2010) The Knowledge Gradient for Optimal Learning (Wiley, New York).
  • Rapoport B, Chua D, Poma A, Arora S, Wang Y, Fein LE (2015a) Study of rolapitant, a novel, long-acting, nk-1 receptor antagonist, for the prevention of chemotherapy-induced nausea and vomiting (CINV) due to highly emetogenic chemotherapy (HEC). Supportive Care Cancer 23(11):3281–3288.
  • Rapoport BL, Chasen MR, Gridelli C, Urban K, Modiano MR, Schnadig ID, Poma A, et al. (2015b) Safety and efficacy of rolapitant for prevention of chemotherapy-induced nausea and vomiting after administration of cisplatin-based highly emetogenic chemotherapy in patients with cancer: Two randomised, active-controlled, double-blind, phase 3 trials. Lancet Oncology 16(9):1079–1089.
  • Rauch G, Kieser M (2015) Adaptive designs for clinical trials with multiple endpoints. Clinical Investigations (London) 5(5):433–435.
  • Ross S, Pineau J, Paquet S, Chaib-Draa B (2008) Online planning algorithms for POMDPs. J. Artificial Intelligence Res. 32:663–704.
  • Schwartzberg LS, Modiano MR, Rapoport BL, Chasen MR, Gridelli C, Urban L, Poma A, Arora S, Navari RM, Schnadig ID (2015) Safety and efficacy of rolapitant for prevention of chemotherapy-induced nausea and vomiting after administration of moderately emetogenic chemotherapy or anthracycline and cyclophosphamide regimens in patients with cancer: A randomised, active-controlled, double-blind, phase 3 trial. Lancet Oncology 16(9):1071–1078.
  • Smallwood RD, Sondik EJ (1973) The optimal control of partially observable Markov processes over a finite horizon. Oper. Res. 21(5):1071–1088.
  • Thall PF, Nguyen HQ (2012) Adaptive randomization to improve utility-based dose-finding with bivariate ordinal outcomes. J. Biopharmacology Statist. 22(4):785–801.
  • Thompson WR (1933) On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3/4):285–294.
  • U.S. Food and Drug Administration (2018) Adaptive designs for clinical trials of drugs and biologics (draft guidance). Center for Biologics Evaluation and Research (CBER). Accessed June 3, 2019, https://www.regulations.gov/contentStreamer?documentId=FDA-2018-D-3124-0002&attachmentNumber=1&contentType=pdf.
  • Vakili S, Zhao Q (2016) Risk-averse multi-armed bandit problems under mean-variance measure. IEEE J. Selected Topics Signal Processing 10(6):1093–1111.
  • Ventz S, Trippa L (2015) Bayesian designs and the control of frequentist characteristics: A practical solution. Biometrics 71(1):218–226.
  • Villar SS (2018) Bandit strategies evaluated in the context of clinical trials in rare life-threatening diseases. Probab. Engrg. Inform. Sci. 32(2):229–245.
  • Villar SS, Bowden J, Wason J (2015) Multi-armed bandit models for the optimal design of clinical trials: Benefits and challenges. Statist. Sci. 30(2):199–215.
  • Woodcock J (2005) FDA introductory comments: Clinical studies design and evaluation issues. Clinical Trials 2(4):273–275.
  • Yang F, Ramdas A, Jamieson KG, Wainwright MJ (2017) A framework for multi-A(rmed)/B(andit) testing with online FDR control. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Red Hook, NY), 5959–5968.
  • Yin G, Chen N, Lee JJ (2012) Phase II trial design with Bayesian adaptive randomization and predictive probability. J. Royal Statist. Soc. Ser. C: Appl. Statist. 61(2):219–235.
  • Zhou R, Hansen EA (2001) An improved grid-based approximation algorithm for POMDPs. Proc. Internat. Joint Conf. Artificial Intelligence (ACM, New York), 17:707–716.