Skip to main content

An Approximate Dynamic Programming Algorithm for Monotone Value Functions

Author(s): Jiang, Daniel R; Powell, Warren B

Download
To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1gg5x
Full metadata record
DC FieldValueLanguage
dc.contributor.authorJiang, Daniel R-
dc.contributor.authorPowell, Warren B-
dc.date.accessioned2021-10-11T14:17:50Z-
dc.date.available2021-10-11T14:17:50Z-
dc.date.issued2015-12en_US
dc.identifier.citationJiang, Daniel R, Powell, Warren B. (2015). An Approximate Dynamic Programming Algorithm for Monotone Value Functions. Operations Research, 63 (6), 1489 - 1511. doi:10.1287/opre.2015.1425en_US
dc.identifier.issn0030-364X-
dc.identifier.urihttp://arks.princeton.edu/ark:/88435/pr1gg5x-
dc.description.abstractMany sequential decision problems can be formulated as Markov decision processes (MDPs) where the optimal value function (or cost-to-go function) can be shown to satisfy a monotone structure in some or all of its dimensions. When the state space becomes large, traditional techniques, such as the backward dynamic programming algorithm (i.e., backward induction or value iteration), may no longer be effective in finding a solution within a reasonable time frame, and thus we are forced to consider other approaches, such as approximate dynamic programming (ADP). We propose a provably convergent ADP algorithm called Monotone-ADP that exploits the monotonicity of the value functions to increase the rate of convergence. In this paper, we describe a general finite-horizon problem setting where the optimal value function is monotone, present a convergence proof for Monotone-ADP under various technical assumptions, and show numerical results for three application domains: optimal stopping, energy storage/allocation, and glycemic control for diabetes patients. The empirical results indicate that by taking advantage of monotonicity, we can attain high quality solutions within a relatively small number of iterations, using up to two orders of magnitude less computation than is needed to compute the optimal solution exactly.en_US
dc.format.extent1489 - 1511en_US
dc.language.isoen_USen_US
dc.relation.ispartofOperations Researchen_US
dc.rightsAuthor's manuscripten_US
dc.titleAn Approximate Dynamic Programming Algorithm for Monotone Value Functionsen_US
dc.typeJournal Articleen_US
dc.identifier.doidoi:10.1287/opre.2015.1425-
dc.identifier.eissn1526-5463-
pu.type.symplectichttp://www.symplectic.co.uk/publications/atom-terms/1.0/journal-articleen_US

Files in This Item:
File Description SizeFormat 
An approximate dynamic programming algorithm for monotone value functions.pdf1.28 MBAdobe PDFView/Download


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.