An Approximation Approach for Response-Adaptive Clinical Trial Design
Published Online: 28 May 2020. https://doi.org/10.1287/ijoc.2020.0969

