

2017


On the Design of LQR Kernels for Efficient Controller Learning

Marco, A., Hennig, P., Schaal, S., Trimpe, S.

Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC), pages: 5193-5200, IEEE, IEEE Conference on Decision and Control, December 2017 (conference)

Abstract
Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.

am ics pn

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]



Optimal gamification can help people procrastinate less

Lieder, F., Griffiths, T. L.

Annual Meeting of the Society for Judgment and Decision Making, November 2017 (conference)

re

Project Page [BibTex]



Coupling Adaptive Batch Sizes with Learning Rates

Balles, L., Romero, J., Hennig, P.

In Proceedings Conference on Uncertainty in Artificial Intelligence (UAI) 2017, pages: 410-419, (Editors: Gal Elidan and Kristian Kersting), Association for Uncertainty in Artificial Intelligence (AUAI), Conference on Uncertainty in Artificial Intelligence (UAI), August 2017 (inproceedings)

Abstract
Mini-batch stochastic gradient descent and variants thereof have become standard for large-scale empirical risk minimization like the training of neural networks. These methods are usually used with a constant batch size chosen by simple empirical inspection. The batch size significantly influences the behavior of the stochastic optimization algorithm, though, since it determines the variance of the gradient estimates. This variance also changes over the optimization process; when using a constant batch size, stability and convergence is thus often enforced by means of a (manually tuned) decreasing learning rate schedule. We propose a practical method for dynamic batch size adaptation. It estimates the variance of the stochastic gradients and adapts the batch size to decrease the variance proportionally to the value of the objective function, removing the need for the aforementioned learning rate decrease. In contrast to recent related work, our algorithm couples the batch size to the learning rate, directly reflecting the known relationship between the two. On three image classification benchmarks, our batch size adaptation yields faster optimization convergence, while simultaneously simplifying learning rate tuning. A TensorFlow implementation is available.

ps pn

Code link (url) Project Page [BibTex]



Dynamic Time-of-Flight

Schober, M., Adam, A., Yair, O., Mazor, S., Nowozin, S.

Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 170-179, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (conference)

ei pn

DOI [BibTex]



Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Marco, A., Berkenkamp, F., Hennig, P., Schoellig, A. P., Krause, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 1557-1563, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics pn

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]



Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets

Klein, A., Falkner, S., Bartels, S., Hennig, P., Hutter, F.

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017), 54, pages: 528-536, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Zhu, Jerry), PMLR, April 2017 (conference)

pn

pdf link (url) Project Page [BibTex]



An automatic method for discovering rational heuristics for risky choice

Lieder, F., Krueger, P. M., Griffiths, T. L.

In Proceedings of the 39th Annual Meeting of the Cognitive Science Society. Austin, TX: Cognitive Science Society, 2017, Falk Lieder and Paul M. Krueger contributed equally to this publication. (inproceedings)

Abstract
What is the optimal way to make a decision given that your time is limited and your cognitive resources are bounded? To answer this question, we formalized the bounded optimal decision process as the solution to a meta-level Markov decision process whose actions are costly computations. We approximated the optimal solution and evaluated its predictions against human choice behavior in the Mouselab paradigm, which is widely used to study decision strategies. Our computational method rediscovered well-known heuristic strategies and the conditions under which they are used, as well as novel heuristics. A Mouselab experiment confirmed our model’s main predictions. These findings are a proof-of-concept that optimal cognitive strategies can be automatically derived as the rational use of finite time and bounded cognitive resources.

re

Project Page [BibTex]



A reward shaping method for promoting metacognitive learning

Lieder, F., Krueger, P. M., Callaway, F., Griffiths, T. L.

In Proceedings of the Third Multidisciplinary Conference on Reinforcement Learning and Decision-Making, 2017 (inproceedings)

re

Project Page [BibTex]



When does bounded-optimal metareasoning favor few cognitive systems?

Milli, S., Lieder, F., Griffiths, T. L.

In AAAI Conference on Artificial Intelligence, 31, 2017 (inproceedings)

re

[BibTex]



The Structure of Goal Systems Predicts Human Performance

Bourgin, D., Lieder, F., Reichman, D., Talmon, N., Griffiths, T.

In Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017 (inproceedings)

re

[BibTex]



Learning to (mis) allocate control: maltransfer can lead to self-control failure

Bustamante, L., Lieder, F., Musslick, S., Shenhav, A., Cohen, J.

In The 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making. Ann Arbor, Michigan, 2017 (inproceedings)

re

[BibTex]



Mouselab-MDP: A new paradigm for tracing how people plan

Callaway, F., Lieder, F., Krueger, P. M., Griffiths, T. L.

In The 3rd multidisciplinary conference on reinforcement learning and decision making, 2017 (inproceedings)

re

[BibTex]



Enhancing metacognitive reinforcement learning using reward structures and feedback

Krueger, P. M., Lieder, F., Griffiths, T. L.

In Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017 (inproceedings)

re

Project Page Project Page [BibTex]



Helping people choose subgoals with sparse pseudo rewards

Callaway, F., Lieder, F., Griffiths, T. L.

In Proceedings of the Third Multidisciplinary Conference on Reinforcement Learning and Decision Making, 2017 (inproceedings)

re

[BibTex]


2014


Probabilistic Progress Bars

Kiefel, M., Schuler, C., Hennig, P.

In German Conference on Pattern Recognition (GCPR), 8753, pages: 331-341, Lecture Notes in Computer Science, (Editors: Jiang, X., Hornegger, J., and Koch, R.), Springer, GCPR, September 2014 (inproceedings)

Abstract
Predicting the time at which the integral over a stochastic process reaches a target level is a value of interest in many applications. Often, such computations have to be made at low cost, in real time. As an intuitive example that captures many features of this problem class, we choose progress bars, a ubiquitous element of computer user interfaces. These predictors are usually based on simple point estimators, with no error modelling. This leads to fluctuating behaviour confusing to the user. It also does not provide a distribution prediction (risk values), which are crucial for many other application areas. We construct and empirically evaluate a fast, constant cost algorithm using a Gauss-Markov process model which provides more information to the user.

ei ps pn

website+code pdf DOI [BibTex]



Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics

Hennig, P., Hauberg, S.

In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, 33, pages: 347-355, JMLR: Workshop and Conference Proceedings, (Editors: S Kaski and J Corander), Microtome Publishing, Brookline, MA, AISTATS, April 2014 (inproceedings)

Abstract
We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in the statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution such that statistics are less sensitive to inaccuracies. This leads to new Riemannian algorithms for mean value computations and principal geodesic analysis. Marginalisation also means results can be less precise than point estimates, enabling a noticeable speed-up over the state of the art. Our approach is an argument for a wider point that uncertainty caused by numerical calculations should be tracked throughout the pipeline of machine learning algorithms.

ei ps pn

pdf Youtube Supplements Project page link (url) [BibTex]



Probabilistic ODE Solvers with Runge-Kutta Means

Schober, M., Duvenaud, D., Hennig, P.

In Advances in Neural Information Processing Systems 27, pages: 739-747, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), Curran Associates, Inc., 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (inproceedings)

ei pn

Web link (url) [BibTex]



Active Learning of Linear Embeddings for Gaussian Processes

Garnett, R., Osborne, M., Hennig, P.

In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pages: 230-239, (Editors: NL Zhang and J Tian), AUAI Press, Corvallis, Oregon, UAI2014, 2014, another link: http://arxiv.org/abs/1310.6740 (inproceedings)

ei pn

PDF Web [BibTex]



Probabilistic Shortest Path Tractography in DTI Using Gaussian Process ODE Solvers

Schober, M., Kasenburg, N., Feragen, A., Hennig, P., Hauberg, S.

In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014, Lecture Notes in Computer Science Vol. 8675, pages: 265-272, (Editors: P. Golland, N. Hata, C. Barillot, J. Hornegger and R. Howe), Springer, Heidelberg, MICCAI, 2014 (inproceedings)

ei pn

DOI [BibTex]



Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature

Gunter, T., Osborne, M., Garnett, R., Hennig, P., Roberts, S.

In Advances in Neural Information Processing Systems 27, pages: 2789-2797, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), Curran Associates, Inc., 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (inproceedings)

ei pn

Web link (url) [BibTex]



Algorithm selection by rational metareasoning as a model of human strategy selection

Lieder, F., Plunkett, D., Hamrick, J. B., Russell, S. J., Hay, N. J., Griffiths, T. L.

In Advances in Neural Information Processing Systems 27, 2014 (inproceedings)

Abstract
Selecting the right algorithm is an important problem in computer science, because the algorithm often has to exploit the structure of the input to be efficient. The human mind faces the same challenge. Therefore, solutions to the algorithm selection problem can inspire models of human strategy selection and vice versa. Here, we view the algorithm selection problem as a special case of metareasoning and derive a solution that outperforms existing methods in sorting algorithm selection. We apply our theory to model how people choose between cognitive strategies and test its prediction in a behavioral experiment. We find that people quickly learn to adaptively choose between cognitive strategies. People's choices in our experiment are consistent with our model but inconsistent with previous theories of human strategy selection. Rational metareasoning appears to be a promising framework for reverse-engineering how people choose among cognitive strategies and translating the results into better solutions to the algorithm selection problem.

re

Project Page [BibTex]



Incremental Local Gaussian Regression

Meier, F., Hennig, P., Schaal, S.

In Advances in Neural Information Processing Systems 27, pages: 972-980, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014, clmc (inproceedings)

am ei pn

PDF link (url) [BibTex]



Efficient Bayesian Local Model Learning for Control

Meier, F., Hennig, P., Schaal, S.

In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, pages: 2244-2249, IROS, 2014, clmc (inproceedings)

Abstract
Model-based control is essential for compliant control and force control in many modern complex robots, like humanoid or disaster robots. Due to many unknown and hard to model nonlinearities, analytical models of such robots are often only very rough approximations. However, modern optimization controllers frequently depend on reasonably accurate models, and degrade greatly in robustness and performance if model errors are too large. For a long time, machine learning has been expected to provide automatic empirical model synthesis, yet so far, research has only generated feasibility studies but no learning algorithms that run reliably on complex robots. In this paper, we combine two promising worlds of regression techniques to generate a more powerful regression learning system. On the one hand, locally weighted regression techniques are computationally efficient, but hard to tune due to a variety of data dependent meta-parameters. On the other hand, Bayesian regression has rather automatic and robust methods to set learning parameters, but becomes quickly computationally infeasible for big and high-dimensional data sets. By reducing the complexity of Bayesian regression in the spirit of local model learning through variational approximations, we arrive at a novel algorithm that is computationally efficient and easy to initialize for robust learning. Evaluations on several datasets demonstrate very good learning performance and the potential for a general regression learning tool for robotics.

am ei pn

PDF link (url) DOI [BibTex]



The high availability of extreme events serves resource-rational decision-making

Lieder, F., Hsu, M., Griffiths, T. L.

In Proceedings of the 36th Annual Conference of the Cognitive Science Society, 2014 (inproceedings)

re

[BibTex]



Layers of Abstraction: (Neuro)computational models of learning local and global statistical regularities

Diaconescu, A., Lieder, F., Mathys, C., Stephan, K. E.

In 20th Annual Meeting of the Organization for Human Brain Mapping, 2014 (inproceedings)

re

[BibTex]


2013


The Randomized Dependence Coefficient

Lopez-Paz, D., Hennig, P., Schölkopf, B.

In Advances in Neural Information Processing Systems 26, pages: 1-9, (Editors: C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K.Q. Weinberger), 27th Annual Conference on Neural Information Processing Systems (NIPS), 2013 (inproceedings)

ei pn

PDF [BibTex]



Fast Probabilistic Optimization from Noisy Gradients

Hennig, P.

In Proceedings of The 30th International Conference on Machine Learning, JMLR W&CP 28(1), pages: 62–70, (Editors: S Dasgupta and D McAllester), ICML, 2013 (inproceedings)

ei pn

PDF [BibTex]



Nonparametric dynamics estimation for time periodic systems

Klenske, E., Zeilinger, M., Schölkopf, B., Hennig, P.

In Proceedings of the 51st Annual Allerton Conference on Communication, Control, and Computing, pages: 486-493, 2013 (inproceedings)

ei pn

PDF DOI [BibTex]



Analytical probabilistic proton dose calculation and range uncertainties

Bangert, M., Hennig, P., Oelfke, U.

In 17th International Conference on the Use of Computers in Radiation Therapy, pages: 6-11, (Editors: A. Haworth and T. Kron), ICCR, 2013 (inproceedings)

ei pn

[BibTex]



Animating Samples from Gaussian Distributions

Hennig, P.

(8), Max Planck Institute for Intelligent Systems, Tübingen, Germany, 2013 (techreport)

ei pn

PDF [BibTex]



Controllability and Resource-Rational Planning

Lieder, F., Goodman, N. D., Huys, Q. J.

In Computational and Systems Neuroscience (Cosyne), pages: 112, 2013 (inproceedings)

Abstract
Learned helplessness experiments involving controllable vs. uncontrollable stressors have shown that the perceived ability to control events has profound consequences for decision making. Normative models of decision making, however, do not naturally incorporate knowledge about controllability, and previous approaches to incorporating it have led to solutions with biologically implausible computational demands [1,2]. Intuitively, controllability bounds the differential rewards for choosing one strategy over another, and therefore believing that the environment is uncontrollable should reduce one’s willingness to invest time and effort into choosing between options. Here, we offer a normative, resource-rational account of the role of controllability in trading mental effort for expected gain. In this view, the brain not only faces the task of solving Markov decision problems (MDPs), but it also has to optimally allocate its finite computational resources to solve them efficiently. This joint problem can itself be cast as a MDP [3], and its optimal solution respects computational constraints by design. We start with an analytic characterisation of the influence of controllability on the use of computational resources. We then replicate previous results on the effects of controllability on the differential value of exploration vs. exploitation, showing that these are also seen in a cognitively plausible regime of computational complexity. Third, we find that controllability makes computation valuable, so that it is worth investing more mental effort the higher the subjective controllability. Fourth, we show that in this model the perceived lack of control (helplessness) replicates empirical findings [4] whereby patients with major depressive disorder are less likely to repeat a choice that led to a reward, or to avoid a choice that led to a loss. Finally, the model makes empirically testable predictions about the relationship between reaction time and helplessness.

re

[BibTex]



Learned helplessness and generalization

Lieder, F., Goodman, N. D., Huys, Q. J. M.

In 35th Annual Conference of the Cognitive Science Society, 2013 (inproceedings)

re

[BibTex]



Reverse-Engineering Resource-Efficient Algorithms

Lieder, F., Goodman, N. D., Griffiths, T. L.

In NIPS Workshop Resource-Efficient Machine Learning, 2013 (inproceedings)

re

[BibTex]
