Header logo is


2020


no image
Automatic Discovery of Interpretable Planning Strategies

Skirzyński, J., Becker, F., Lieder, F.

Machine Learning Journal, May 2020 (article) Submitted

Abstract
When making decisions, people often overlook critical information or are overly swayed by irrelevant information. A common approach to mitigate these biases is to provide decisionmakers, especially professionals such as medical doctors, with decision aids, such as decision trees and flowcharts. Designing effective decision aids is a difficult problem. We propose that recently developed reinforcement learning methods for discovering clever heuristics for good decision-making can be partially leveraged to assist human experts in this design process. One of the biggest remaining obstacles to leveraging the aforementioned methods for improving human decision-making is that the policies they learn are opaque to people. To solve this problem, we introduce AI-Interpret: a general method for transforming idiosyncratic policies into simple and interpretable descriptions. Our algorithm combines recent advances in imitation learning and program induction with a new clustering method for identifying a large subset of demonstrations that can be accurately described by a simple, high-performing decision rule. We evaluate our new AI-Interpret algorithm and employ it to translate information-acquisition policies discovered through metalevel reinforcement learning. The results of three large behavioral experiments showed that the provision of decision rules as flowcharts significantly improved people’s planning strategies and decisions across three different classes of sequential decision problems. Furthermore, a series of ablation studies confirmed that our AI-Interpret algorithm was critical to the discovery of interpretable decision rules and that it is ready to be applied to other reinforcement learning problems. We conclude that the methods and findings presented in this article are an important step towards leveraging automatic strategy discovery to improve human decision-making.

re

Automatic Discovery of Interpretable Planning Strategies The code for our algorithm and the experiments is available Project Page [BibTex]


no image
Advancing Rational Analysis to the Algorithmic Level

Lieder, F., Griffiths, T. L.

Behavioral and Brain Sciences, 43, E27, March 2020 (article)

Abstract
The commentaries raised questions about normativity, human rationality, cognitive architectures, cognitive constraints, and the scope or resource rational analysis (RRA). We respond to these questions and clarify that RRA is a methodological advance that extends the scope of rational modeling to understanding cognitive processes, why they differ between people, why they change over time, and how they could be improved.

re

Advancing rational analysis to the algorithmic level DOI [BibTex]

Advancing rational analysis to the algorithmic level DOI [BibTex]


no image
Learning to Overexert Cognitive Control in a Stroop Task

Bustamante, L., Lieder, F., Musslick, S., Shenhav, A., Cohen, J.

Febuary 2020, Laura Bustamante and Falk Lieder contributed equally to this publication. (article) In revision

Abstract
How do people learn when to allocate how much cognitive control to which task? According to the Learned Value of Control (LVOC) model, people learn to predict the value of alternative control allocations from features of a given situation. This suggests that people may generalize the value of control learned in one situation to other situations with shared features, even when the demands for cognitive control are different. This makes the intriguing prediction that what a person learned in one setting could, under some circumstances, cause them to misestimate the need for, and potentially over-exert control in another setting, even if this harms their performance. To test this prediction, we had participants perform a novel variant of the Stroop task in which, on each trial, they could choose to either name the color (more control-demanding) or read the word (more automatic). However only one of these tasks was rewarded, it changed from trial to trial, and could be predicted by one or more of the stimulus features (the color and/or the word). Participants first learned colors that predicted the rewarded task. Then they learned words that predicted the rewarded task. In the third part of the experiment, we tested how these learned feature associations transferred to novel stimuli with some overlapping features. The stimulus-task-reward associations were designed so that for certain combinations of stimuli the transfer of learned feature associations would incorrectly predict that more highly rewarded task would be color naming, which would require the exertion of control, even though the actually rewarded task was word reading and therefore did not require the engagement of control. Our results demonstrated that participants over-exerted control for these stimuli, providing support for the feature-based learning mechanism described by the LVOC model.

re

Learning to Overexert Cognitive Control in a Stroop Task DOI [BibTex]

Learning to Overexert Cognitive Control in a Stroop Task DOI [BibTex]


Toward a Formal Theory of Proactivity
Toward a Formal Theory of Proactivity

Lieder, F., Iwama, G.

January 2020 (article) Submitted

Abstract
Beyond merely reacting to their environment and impulses, people have the remarkable capacity to proactively set and pursue their own goals. But the extent to which they leverage this capacity varies widely across people and situations. The goal of this article is to make the mechanisms and variability of proactivity more amenable to rigorous experiments and computational modeling. We proceed in three steps. First, we develop and validate a mathematically precise behavioral measure of proactivity and reactivity that can be applied across a wide range of experimental paradigms. Second, we propose a formal definition of proactivity and reactivity, and develop a computational model of proactivity in the AX Continuous Performance Task (AX-CPT). Third, we develop and test a computational-level theory of meta-control over proactivity in the AX-CPT that identifies three distinct meta-decision-making problems: intention setting, resolving response conflict between intentions and automaticity, and deciding whether to recall context and intentions into working memory. People's response frequencies in the AX-CPT were remarkably well captured by a mixture between the predictions of our models of proactive and reactive control. Empirical data from an experiment varying the incentives and contextual load of an AX-CPT confirmed the predictions of our meta-control model of individual differences in proactivity. Our results suggest that proactivity can be understood in terms of computational models of meta-control. Our model makes additional empirically testable predictions. Future work will extend our models from proactive control in the AX-CPT to proactive goal creation and goal pursuit in the real world.

re

Toward a formal theory of proactivity DOI Project Page [BibTex]

2018


Probabilistic Solutions To Ordinary Differential Equations As Non-Linear Bayesian Filtering: A New Perspective
Probabilistic Solutions To Ordinary Differential Equations As Non-Linear Bayesian Filtering: A New Perspective

Tronarp, F., Kersting, H., Särkkä, S., Hennig, P.

ArXiv preprint 2018, arXiv:1810.03440 [stat.ME], October 2018 (article)

Abstract
We formulate probabilistic numerical approximations to solutions of ordinary differential equations (ODEs) as problems in Gaussian process (GP) regression with non-linear measurement functions. This is achieved by defining the measurement sequence to consists of the observations of the difference between the derivative of the GP and the vector field evaluated at the GP---which are all identically zero at the solution of the ODE. When the GP has a state-space representation, the problem can be reduced to a Bayesian state estimation problem and all widely-used approximations to the Bayesian filtering and smoothing problems become applicable. Furthermore, all previous GP-based ODE solvers, which were formulated in terms of generating synthetic measurements of the vector field, come out as specific approximations. We derive novel solvers, both Gaussian and non-Gaussian, from the Bayesian state estimation problem posed in this paper and compare them with other probabilistic solvers in illustrative experiments.

pn

link (url) Project Page [BibTex]

2018



no image
Convergence Rates of Gaussian ODE Filters

Kersting, H., Sullivan, T. J., Hennig, P.

arXiv preprint 2018, arXiv:1807.09737 [math.NA], July 2018 (article)

Abstract
A recently-introduced class of probabilistic (uncertainty-aware) solvers for ordinary differential equations (ODEs) applies Gaussian (Kalman) filtering to initial value problems. These methods model the true solution $x$ and its first $q$ derivatives a priori as a Gauss--Markov process $\boldsymbol{X}$, which is then iteratively conditioned on information about $\dot{x}$. We prove worst-case local convergence rates of order $h^{q+1}$ for a wide range of versions of this Gaussian ODE filter, as well as global convergence rates of order $h^q$ in the case of $q=1$ and an integrated Brownian motion prior, and analyse how inaccurate information on $\dot{x}$ coming from approximate evaluations of $f$ affects these rates. Moreover, we present explicit formulas for the steady states and show that the posterior confidence intervals are well calibrated in all considered cases that exhibit global convergence---in the sense that they globally contract at the same rate as the truncation error.

pn

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


no image
Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences

Kanagawa, M., Hennig, P., Sejdinovic, D., Sriperumbudur, B. K.

Arxiv e-prints, arXiv:1805.08845v1 [stat.ML], 2018 (article)

Abstract
This paper is an attempt to bridge the conceptual gaps between researchers working on the two widely used approaches based on positive definite kernels: Bayesian learning or inference using Gaussian processes on the one side, and frequentist kernel methods based on reproducing kernel Hilbert spaces on the other. It is widely known in machine learning that these two formalisms are closely related; for instance, the estimator of kernel ridge regression is identical to the posterior mean of Gaussian process regression. However, they have been studied and developed almost independently by two essentially separate communities, and this makes it difficult to seamlessly transfer results between them. Our aim is to overcome this potential difficulty. To this end, we review several old and new results and concepts from either side, and juxtapose algorithmic quantities from each framework to highlight close similarities. We also provide discussions on subtle philosophical and theoretical differences between the two approaches.

pn ei

arXiv [BibTex]

arXiv [BibTex]


no image
Counterfactual Mean Embedding: A Kernel Method for Nonparametric Causal Inference

Muandet, K., Kanagawa, M., Saengkyongam, S., Marukata, S.

Arxiv e-prints, arXiv:1805.08845v1 [stat.ML], 2018 (article)

Abstract
This paper introduces a novel Hilbert space representation of a counterfactual distribution---called counterfactual mean embedding (CME)---with applications in nonparametric causal inference. Counterfactual prediction has become an ubiquitous tool in machine learning applications, such as online advertisement, recommendation systems, and medical diagnosis, whose performance relies on certain interventions. To infer the outcomes of such interventions, we propose to embed the associated counterfactual distribution into a reproducing kernel Hilbert space (RKHS) endowed with a positive definite kernel. Under appropriate assumptions, the CME allows us to perform causal inference over the entire landscape of the counterfactual distribution. The CME can be estimated consistently from observational data without requiring any parametric assumption about the underlying distributions. We also derive a rate of convergence which depends on the smoothness of the conditional mean and the Radon-Nikodym derivative of the underlying marginal distributions. Our framework can deal with not only real-valued outcome, but potentially also more complex and structured outcomes such as images, sequences, and graphs. Lastly, our experimental results on off-policy evaluation tasks demonstrate the advantages of the proposed estimator.

ei pn

arXiv [BibTex]

arXiv [BibTex]


no image
Model-based Kernel Sum Rule: Kernel Bayesian Inference with Probabilistic Models

Nishiyama, Y., Kanagawa, M., Gretton, A., Fukumizu, K.

Arxiv e-prints, arXiv:1409.5178v2 [stat.ML], 2018 (article)

Abstract
Kernel Bayesian inference is a powerful nonparametric approach to performing Bayesian inference in reproducing kernel Hilbert spaces or feature spaces. In this approach, kernel means are estimated instead of probability distributions, and these estimates can be used for subsequent probabilistic operations (as for inference in graphical models) or in computing the expectations of smooth functions, for instance. Various algorithms for kernel Bayesian inference have been obtained by combining basic rules such as the kernel sum rule (KSR), kernel chain rule, kernel product rule and kernel Bayes' rule. However, the current framework only deals with fully nonparametric inference (i.e., all conditional relations are learned nonparametrically), and it does not allow for flexible combinations of nonparametric and parametric inference, which are practically important. Our contribution is in providing a novel technique to realize such combinations. We introduce a new KSR referred to as the model-based KSR (Mb-KSR), which employs the sum rule in feature spaces under a parametric setting. Incorporating the Mb-KSR into existing kernel Bayesian framework provides a richer framework for hybrid (nonparametric and parametric) kernel Bayesian inference. As a practical application, we propose a novel filtering algorithm for state space models based on the Mb-KSR, which combines the nonparametric learning of an observation process using kernel mean embedding and the additive Gaussian noise model for a state transition process. While we focus on additive Gaussian noise models in this study, the idea can be extended to other noise models, such as the Cauchy and alpha-stable noise models.

pn

arXiv [BibTex]

arXiv [BibTex]


A probabilistic model for the numerical solution of initial value problems
A probabilistic model for the numerical solution of initial value problems

Schober, M., Särkkä, S., Philipp Hennig,

Statistics and Computing, Springer US, 2018 (article)

Abstract
We study connections between ordinary differential equation (ODE) solvers and probabilistic regression methods in statistics. We provide a new view of probabilistic ODE solvers as active inference agents operating on stochastic differential equation models that estimate the unknown initial value problem (IVP) solution from approximate observations of the solution derivative, as provided by the ODE dynamics. Adding to this picture, we show that several multistep methods of Nordsieck form can be recast as Kalman filtering on q-times integrated Wiener processes. Doing so provides a family of IVP solvers that return a Gaussian posterior measure, rather than a point estimate. We show that some such methods have low computational overhead, nontrivial convergence order, and that the posterior has a calibrated concentration rate. Additionally, we suggest a step size adaptation algorithm which completes the proposed method to a practically useful implementation, which we experimentally evaluate using a representative set of standard codes in the DETEST benchmark set.

pn

PDF Code DOI Project Page [BibTex]

2016


Gaussian Process-Based Predictive Control for Periodic Error Correction
Gaussian Process-Based Predictive Control for Periodic Error Correction

Klenske, E. D., Zeilinger, M., Schölkopf, B., Hennig, P.

IEEE Transactions on Control Systems Technology , 24(1):110-121, 2016 (article)

ei pn

PDF DOI [BibTex]

2016


PDF DOI [BibTex]


Dual Control for Approximate Bayesian Reinforcement Learning
Dual Control for Approximate Bayesian Reinforcement Learning

Klenske, E. D., Hennig, P.

Journal of Machine Learning Research, 17(127):1-30, 2016 (article)

ei pn

PDF link (url) [BibTex]

PDF link (url) [BibTex]

2012


Entropy Search for Information-Efficient Global Optimization
Entropy Search for Information-Efficient Global Optimization

Hennig, P., Schuler, C.

Journal of Machine Learning Research, 13, pages: 1809-1837, -, June 2012 (article)

Abstract
Contemporary global optimization algorithms are based on local measures of utility, rather than a probability measure over location and value of the optimum. They thus attempt to collect low function values, not to learn about the optimum. The reason for the absence of probabilistic global optimizers is that the corresponding inference problem is intractable in several ways. This paper develops desiderata for probabilistic optimization algorithms, then presents a concrete algorithm which addresses each of the computational intractabilities with a sequence of approximations and explicitly adresses the decision problem of maximizing information gain from each evaluation.

ei pn

PDF Web Project Page [BibTex]

2012


PDF Web Project Page [BibTex]