Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes
2016
Conference Paper
Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
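As a rough illustration of the kind of planner the abstract refers to, the following is a minimal sketch of value iteration with a single Kullback-Leibler constraint on the policy and a known transition model. It is not the generalized scheme of the paper, which additionally handles model uncertainty. The function name `free_energy_value_iteration`, the arguments `T`, `R`, `rho`, `beta`, and the two-state example are all assumptions made for illustration. The backup used is the standard free-energy form V(s) = (1/beta) log sum_a rho(a|s) exp(beta Q(s,a)), which approaches the usual max backup as beta grows and the expectation under the reference policy as beta approaches zero.

```python
import math


def free_energy_value_iteration(T, R, rho, beta, gamma=0.9, tol=1e-10, max_iter=100_000):
    """KL-regularized value iteration for a known model (illustrative sketch).

    T[s][a][s2] : transition probability to s2 from state s under action a
    R[s][a]     : immediate reward
    rho[s][a]   : reference (prior) policy that the KL term is measured against
    beta        : positive inverse temperature trading off reward against KL cost
    """
    n_states = len(T)
    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = []
        for s in range(n_states):
            # Q(s, a) = R(s, a) + gamma * sum_s2 T(s2 | s, a) * V(s2)
            q = [
                R[s][a] + gamma * sum(p * V[s2] for s2, p in enumerate(T[s][a]))
                for a in range(len(T[s]))
            ]
            # Free-energy backup, written as a shifted log-sum-exp for stability.
            m = max(q)
            z = sum(rho[s][a] * math.exp(beta * (q[a] - m)) for a in range(len(q)))
            V_new.append(m + math.log(z) / beta)
        delta = max(abs(x - y) for x, y in zip(V, V_new))
        V = V_new
        if delta < tol:
            break
    return V


# Hypothetical two-state MDP: staying in state 1 yields reward 1, everything else 0.
T = [[[1.0, 0.0], [0.0, 1.0]],   # state 0: action 0 stays, action 1 moves to state 1
     [[0.0, 1.0], [1.0, 0.0]]]   # state 1: action 0 stays, action 1 moves to state 0
R = [[0.0, 0.0], [1.0, 0.0]]
rho = [[0.5, 0.5], [0.5, 0.5]]   # uniform reference policy

V_near_greedy = free_energy_value_iteration(T, R, rho, beta=200.0)
V_near_prior = free_energy_value_iteration(T, R, rho, beta=1e-3)
```

In this sketch, a large `beta` makes the information-processing constraint nearly inactive, so the values approach those of standard value iteration, while a small `beta` keeps the policy close to `rho`, so the values approach those of simply following the reference policy.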
Author(s): | Grau-Moya, J and Leibfried, F and Genewein, T and Braun, DA |
Book Title: | Machine Learning and Knowledge Discovery in Databases |
Pages: | 475-491 |
Year: | 2016 |
Month: | September |
Series: | Lecture Notes in Computer Science; 9852 |
Publisher: | Springer |
Bibtex Type: | Conference Paper (conference) |
DOI: | 10.1007/978-3-319-46227-1_30 |
Event Name: | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2016) |
Event Place: | Riva del Garda, Italy |
Address: | Cham, Switzerland |
State: | Published |
BibTex
@conference{GrauMoyaLGB2016,
  title = {Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes},
  author = {Grau-Moya, J and Leibfried, F and Genewein, T and Braun, DA},
  booktitle = {Machine Learning and Knowledge Discovery in Databases},
  pages = {475-491},
  series = {Lecture Notes in Computer Science; 9852},
  publisher = {Springer},
  address = {Cham, Switzerland},
  month = sep,
  year = {2016},
  doi = {10.1007/978-3-319-46227-1_30},
  month_numeric = {9}
}