Computationally Feasible Bounds for Partially Observed Markov Decision Processes

William S. Lovejoy
Stanford University, Stanford, California

Published Online: 1 Feb 1991
https://doi.org/10.1287/opre.39.1.162

Abstract

A partially observed Markov decision process (POMDP) is a sequential decision problem where information concerning parameters of interest is incomplete, and possible actions include sampling, surveying, or otherwise collecting additional information. Such problems can theoretically be solved as dynamic programs, but the relevant state space is infinite, which inhibits algorithmic solution. This paper explains how to approximate the state space by a finite grid of points, and use that grid to construct upper and lower value function bounds, generate approximate nonstationary and stationary policies, and bound the value loss relative to optimal for using these policies in the decision problem. A numerical example illustrates the methodology.
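
The grid idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or its numerical example. It assumes a reward-maximizing, finite-horizon problem with two hidden states, so the belief reduces to one number and the grid is a set of evenly spaced points on [0, 1]. All model numbers (`T`, `Z`, `R`, `BETA`) and all function names are hypothetical. Because the optimal value function is convex in the belief in this setting, interpolating between grid values gives an upper bound, while value vectors backed up at the grid points each correspond to a feasible policy and so give a lower bound.

```python
# Toy two-state POMDP; every number below is an illustrative assumption.
BETA = 0.9                                   # discount factor
STATES = ACTIONS = OBS = range(2)
T = [[[0.9, 0.1], [0.2, 0.8]],               # T[a][s][s2]: transition probabilities
     [[1.0, 0.0], [1.0, 0.0]]]               # action 1 resets the system to state 0
Z = [[[0.8, 0.2], [0.3, 0.7]],               # Z[a][s2][o]: observation probabilities
     [[0.5, 0.5], [0.5, 0.5]]]
R = [[1.0, -1.0], [-0.5, -0.5]]              # R[a][s]: one-step rewards


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def grid(n):
    """Evenly spaced beliefs [P(state 0), P(state 1)] with n intervals."""
    return [[1 - i / n, i / n] for i in range(n + 1)]


def update(b, a, o):
    """Bayes update: returns (P(o | b, a), posterior belief)."""
    pred = [sum(b[s] * T[a][s][s2] for s in STATES) for s2 in STATES]
    joint = [pred[s2] * Z[a][s2][o] for s2 in STATES]
    p_o = sum(joint)
    return (p_o, [j / p_o for j in joint]) if p_o > 0 else (0.0, b)


def interpolate(values, p1):
    """Linear interpolation of grid values at belief P(state 1) = p1."""
    n = len(values) - 1
    x = min(max(p1, 0.0), 1.0) * n
    i = min(int(x), n - 1)
    w = x - i
    return (1 - w) * values[i] + w * values[i + 1]


def upper_bound(n_grid, horizon):
    """Value iteration on the grid, interpolating between grid points."""
    values = [0.0] * (n_grid + 1)
    for _ in range(horizon):
        new_values = []
        for b in grid(n_grid):
            best = float("-inf")
            for a in ACTIONS:
                q = dot(b, R[a])
                for o in OBS:
                    p_o, post = update(b, a, o)
                    if p_o > 0:
                        q += BETA * p_o * interpolate(values, post[1])
                best = max(best, q)
            new_values.append(best)
        values = new_values
    return values


def lower_bound(n_grid, horizon):
    """Back up one value vector per grid point; each is a feasible policy value."""
    alphas = [[0.0, 0.0]]
    for _ in range(horizon):
        new_alphas = []
        for b in grid(n_grid):
            best_vec = None
            for a in ACTIONS:
                vec = list(R[a])
                for o in OBS:
                    # Back-project every vector through (a, o); keep the best at b.
                    cands = [[sum(T[a][s][s2] * Z[a][s2][o] * al[s2] for s2 in STATES)
                              for s in STATES] for al in alphas]
                    g = max(cands, key=lambda c: dot(b, c))
                    vec = [vec[s] + BETA * g[s] for s in STATES]
                if best_vec is None or dot(b, vec) > dot(b, best_vec):
                    best_vec = vec
            new_alphas.append(best_vec)
        alphas = new_alphas
    return alphas


def lower_value(alphas, b):
    return max(dot(b, al) for al in alphas)


if __name__ == "__main__":
    up, lo = upper_bound(10, 30), lower_bound(10, 30)
    for i, b in enumerate(grid(10)):
        print(f"P(state 1)={b[1]:.1f}  lower={lower_value(lo, b):.4f}  upper={up[i]:.4f}")
```

The gap between the two columns bounds how much value is lost, at each grid belief, by following the policy implied by the lower-bound vectors instead of the optimal one. With more than two states the grid would live on a higher-dimensional simplex and the interpolation step would need a triangulation, which this sketch does not attempt.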


Volume 39, Issue 1

January-February 1991

Pages 2-177

Article Information

Published Online: February 01, 1991

Cite as

William S. Lovejoy (1991) Computationally Feasible Bounds for Partially Observed Markov Decision Processes. Operations Research 39(1):162-175.

https://doi.org/10.1287/opre.39.1.162
