An Approximation Approach for Response-Adaptive Clinical Trial Design

Vishal Ahuja (Corresponding Author)
[email protected]
https://orcid.org/0000-0001-6497-8444
Cox School of Business, Southern Methodist University, Dallas, Texas 75275

John R. Birge (Corresponding Author)
[email protected]
Booth School of Business, University of Chicago, Chicago, Illinois 60637

Published Online: 28 May 2020
https://doi.org/10.1287/ijoc.2020.0969

INFORMS Journal on Computing

Volume 32, Issue 4

Fall 2020

Pages 855-1186, C2

Received: May 08, 2018
Accepted: January 14, 2020
Published Online: May 28, 2020

Cite as

Vishal Ahuja, John R. Birge (2020) An Approximation Approach for Response-Adaptive Clinical Trial Design. INFORMS Journal on Computing 32(4):877-894.

https://doi.org/10.1287/ijoc.2020.0969
