A Sequential Stochastic Assignment Problem in a Partially Observable Markov Chain
Abstract
We consider a sequential stochastic assignment problem on a stationary Markov chain whose states are not directly observable. The problem is formulated as an optimization problem in a partially observable Markov chain, and we obtain an optimal policy together with the total expected reward under that policy. Information about the unobserved state is updated by Bayes' theorem, and the optimal policy is not always a critical number policy. As a special case, we consider an optimal selection problem and relate it to earlier results on the sequential stochastic assignment problem.
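To illustrate the kind of Bayesian learning step the abstract refers to, the following is a minimal sketch of a belief update over the hidden states of a Markov chain. It is not taken from the paper: the function name `update_belief`, the finite state space, the known transition matrix, the known observation likelihoods, and the order of the two steps (transition first, then observation) are all assumptions made for illustration.

```python
def update_belief(belief, transition, likelihood):
    """One Bayesian update of the belief over hidden states (illustrative only).

    belief[i]        : current probability that the chain is in state i
    transition[i][j] : assumed known probability of moving from state i to j
    likelihood[j]    : assumed known probability (or density) of the newly
                       observed value given that the chain is in state j
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition matrix.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Correction step: reweight by the observation likelihood (Bayes' theorem).
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    # Normalize so the posterior sums to one.
    return [u / total for u in unnormalized]
```

In a setting like the one described, the posterior returned here would play the role of the information state on which an assignment decision is based. The way the paper itself defines that state may differ from this sketch.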

