Efficiency of Parallel and Restart Exploration Strategies in Model-Free Stochastic Simulations

Published Online: https://doi.org/10.1287/stsy.2025.0108

We analyze the efficiency of parallelization and restart mechanisms for stochastic simulations in model-free settings, where the underlying system dynamics are unknown. Such settings are common in Reinforcement Learning (RL) and rare-event estimation, where standard variance-reduction techniques like importance sampling are inapplicable. Focusing on the challenge of reaching rare states under a finite computational budget, we model exploration via random walks and Lévy processes. Based on rigorous probability analysis, our work reveals a phase transition in the success probability as a function of the number of parallel simulations: an optimal number N* exists, balancing exploration diversity and time allocation per simulation. Beyond this threshold, performance degrades exponentially. Furthermore, we demonstrate that a restart strategy, which reallocates resources from stagnant trajectories to promising regions, can yield an exponential improvement in success probability. In the context of RL, these strategies can improve policy gradient methods by enabling more efficient state-space exploration, leading to more accurate policy gradient estimates.
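The budget trade-off described above can be illustrated with a small Monte Carlo sketch. This is not the authors' model or code: it assumes a toy setting in which each simulation is a ±1 random walk with negative drift started at 0, the rare state is "reach height `level`", and a total budget of `budget` steps is split evenly among `n_parallel` independent walkers. The names `walk_hits`, `success_probability`, `level`, `drift`, and `budget` are all hypothetical and chosen for illustration only. The restart strategy is not modeled here.

```python
import random


def walk_hits(level, steps, drift, rng):
    """Return True if a biased +/-1 walk started at 0 reaches `level` within `steps` steps.

    Assumption (illustrative): the up-step probability is 0.5 * (1 + drift),
    so a negative drift makes high levels rare.
    """
    p_up = 0.5 * (1.0 + drift)
    pos = 0
    for _ in range(steps):
        pos += 1 if rng.random() < p_up else -1
        if pos >= level:
            return True
    return False


def success_probability(n_parallel, budget, level, drift=-0.1, trials=2000, seed=0):
    """Estimate P(at least one of n_parallel walkers reaches `level`).

    The total budget is shared equally, so each walker gets
    budget // n_parallel steps. More walkers means more independent
    attempts but less time per attempt.
    """
    rng = random.Random(seed)
    steps = budget // n_parallel
    if steps < level:
        # A +/-1 walk needs at least `level` steps to reach `level`.
        return 0.0
    wins = 0
    for _ in range(trials):
        if any(walk_hits(level, steps, drift, rng) for _ in range(n_parallel)):
            wins += 1
    return wins / trials


if __name__ == "__main__":
    # Scan the number of parallel simulations for a fixed total budget.
    for n in (1, 2, 5, 10, 25, 50, 100):
        p = success_probability(n, budget=500, level=10)
        print(f"N={n:4d}  steps per walker={500 // n:4d}  estimated success={p:.4f}")
```

Scanning `n_parallel` for a fixed `budget` shows the two competing effects in this toy setting: with few walkers there are few independent attempts, and with too many walkers each one has too few steps to reach the target at all. Where the maximum falls depends entirely on the assumed drift, level, and budget.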

Funding: This research was supported by the SticAmsud LAGOON project, ANR EPLER, IRL-CNRS IFUMI-2030, and Action international CNRS.
