Sumit Kunnumkal, Huseyin Topaloglu, (2008) Exploiting the Structural Properties of the Underlying Markov Decision Problem in the Q-Learning Algorithm. INFORMS Journal on Computing 20(2):288-301.