Kernel-Based Distributed Q-Learning: A Scalable Reinforcement Learning Approach for Dynamic Treatment Regimes

Di Wang
https://orcid.org/0000-0003-0435-0609
Center for Intelligent Decision-Making and Machine Learning, School of Management, Xi’an Jiaotong University, Xi’an 710049, China

Yao Wang
https://orcid.org/0000-0003-4207-5273
Center for Intelligent Decision-Making and Machine Learning, School of Management, Xi’an Jiaotong University, Xi’an 710049, China

Shao-Bo Lin (Corresponding Author)
https://orcid.org/0000-0001-5122-9153
Center for Intelligent Decision-Making and Machine Learning, School of Management, Xi’an Jiaotong University, Xi’an 710049, China

Published Online: 2 Jun 2026
https://doi.org/10.1287/ijoc.2025.1183

Abstract

In recent years, large volumes of electronic health records (EHRs) on chronic diseases have been collected to support medical diagnosis. Dynamic treatment regimes (DTRs) offer an efficient way to model the dynamic properties of these EHRs. Although reinforcement learning (RL) is widely used to construct DTRs, developing RL algorithms that handle large amounts of data effectively remains an open research problem. In this paper, we present a scalable kernel-based distributed Q-learning algorithm for generating DTRs and assess it both theoretically and numerically. The results demonstrate that our algorithm significantly reduces the computational complexity associated with state-of-the-art deep reinforcement learning methods while maintaining comparable generalization performance in terms of accumulated rewards, such as survival time or cumulative survival probability.
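
To make the idea of kernel-based distributed Q-learning concrete, the following is a minimal illustrative sketch and not the authors' implementation, which is available from the repository listed under Supplemental Material. It assumes a finite-horizon setting with a small discrete action set, fits each stage's Q-function by kernel ridge regression on disjoint data blocks, and averages the local estimators (a divide-and-conquer scheme). The Gaussian kernel, the regularization value, the weighting by block size, and every function name here are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * bandwidth**2))

def fit_krr(X, y, lam=1e-2, bandwidth=1.0):
    # Kernel ridge regression on one data block (assumed local estimator).
    K = gaussian_kernel(X, X, bandwidth)
    alpha = np.linalg.solve(K + lam * len(X) * np.eye(len(X)), y)
    return lambda Z: gaussian_kernel(Z, X, bandwidth) @ alpha

def fit_distributed_krr(X, y, n_blocks=4, lam=1e-2, bandwidth=1.0):
    # Split the sample into disjoint blocks, fit locally, then average
    # the local predictors with weights proportional to block size.
    blocks = np.array_split(np.arange(len(X)), n_blocks)
    local = [fit_krr(X[idx], y[idx], lam, bandwidth) for idx in blocks]
    weights = np.array([len(idx) for idx in blocks]) / len(X)
    return lambda Z: sum(w * f(Z) for w, f in zip(weights, local))

def greedy_action(q, S, action_set):
    # Evaluate Q(s, a) for every candidate action; return argmax and max.
    values = np.column_stack(
        [q(np.column_stack([S, np.full(len(S), a)])) for a in action_set]
    )
    return np.asarray(action_set)[values.argmax(axis=1)], values.max(axis=1)

def distributed_q_learning(states, actions, rewards, action_set, n_blocks=4):
    # Backward induction over T stages: the regression target at stage t is
    # the stage reward plus the estimated optimal value at stage t + 1.
    T = len(states)
    q_funcs = [None] * T
    for t in reversed(range(T)):
        y = rewards[t].astype(float)
        if t < T - 1:
            _, future = greedy_action(q_funcs[t + 1], states[t + 1], action_set)
            y = y + future
        X = np.column_stack([states[t], actions[t]])
        q_funcs[t] = fit_distributed_krr(X, y, n_blocks=n_blocks)
    return q_funcs
```

In this sketch each block solves a linear system of its own size only, so with m equal blocks of n/m samples the cubic cost of the solve drops from roughly n^3 to m (n/m)^3, which is the usual motivation for divide-and-conquer kernel methods. Whether the paper uses the same kernel, weighting, or state representation is not stated in the abstract.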

History: Accepted by J. Paul Brooks, Area Editor for Applications in Biology, Medicine, & Healthcare.

Funding: This research was supported by the National Natural Science Foundation of China [Grants 12471486, 62276209, and 12371513].

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2025.1183) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2025.1183). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.


Information

Received: February 20, 2025
Accepted: April 02, 2026
Published Online: June 02, 2026

Cite as

Di Wang, Yao Wang, Shao-Bo Lin (2026) Kernel-Based Distributed Q-Learning: A Scalable Reinforcement Learning Approach for Dynamic Treatment Regimes. INFORMS Journal on Computing 0(0).

https://doi.org/10.1287/ijoc.2025.1183

Acknowledgments

The authors thank the associate editor and two anonymous referees for invaluable comments and suggestions and Dr. Shaojie Tang for insightful suggestions and significant contributions to this work.
