Bias and Variance Approximation in Value Function Estimates

Shie Mannor
Department of Electrical and Computer Engineering, McGill University, Montreal, Quebec, Canada H3A 2A7

Duncan Simester
Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Peng Sun
Fuqua School of Business, Duke University, Durham, North Carolina 27708

John N. Tsitsiklis
Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Published Online: 1 Feb 2007
https://doi.org/10.1287/mnsc.1060.0614

Volume 53, Issue 2

February 2007

Pages iv-355

Received: July 14, 2004
Published Online: February 01, 2007

Cite as

Shie Mannor, Duncan Simester, Peng Sun, John N. Tsitsiklis (2007) Bias and Variance Approximation in Value Function Estimates. Management Science 53(2):308-322.

https://doi.org/10.1287/mnsc.1060.0614
