2011


Two-locus association mapping in subquadratic time

Achlioptas, P., Schölkopf, B., Borgwardt, K.

In pages: 726-734, (Editors: C Apté and J Ghosh and P Smyth), ACM Press, New York, NY, USA, 17th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), August 2011 (inproceedings)

Abstract
Genome-wide association studies (GWAS) have not been able to discover strong associations between many complex human diseases and single genetic loci. Mapping these phenotypes to pairs of genetic loci is hindered by the huge number of candidates leading to enormous computational and statistical problems. In GWAS on single nucleotide polymorphisms (SNPs), one has to consider on the order of 10^10 to 10^14 pairs, which is infeasible in practice. In this article, we give the first algorithm for 2-locus genome-wide association studies that is subquadratic in the number, n, of SNPs. The running time of our algorithm is data-dependent, but large experiments over real genomic data suggest that it scales empirically as n^(3/2). As a result, our algorithm can easily cope with n ~ 10^7, i.e., it can efficiently search all pairs of SNPs in the human genome.

ei

Web DOI [BibTex]

A Novel Active Learning Strategy for Domain Adaptation in the Classification of Remote Sensing Images

Persello, C., Bruzzone, L.

In pages: 3720-3723, IEEE, Piscataway, NJ, USA, IEEE International Geoscience and Remote Sensing Symposium (IGARSS), July 2011 (inproceedings)

Abstract
We present a novel technique for addressing domain adaptation problems in the classification of remote sensing images with active learning. Domain adaptation is the important problem of adapting a supervised classifier trained on a given image (source domain) to the classification of another similar (but not identical) image (target domain) acquired on a different area, or on the same area at a different time. The main idea of the proposed approach is to iteratively label and add to the training set the minimum number of the most informative samples from the target domain, while removing the source-domain samples that do not fit the distributions of the classes in the target domain. In this way, the classification system exploits already available information, i.e., the labeled samples of the source domain, in order to minimize the number of target-domain samples to be labeled, thus reducing the cost associated with the definition of the training set for the classification of the target domain. Experimental results obtained in the classification of a hyperspectral image confirm the effectiveness of the proposed technique.

ei

Web DOI [BibTex]

Reinforcement Learning to adjust Robot Movements to New Situations

Kober, J., Oztop, E., Peters, J.

In pages: 2650-2655, (Editors: Walsh, T.), AAAI Press, Menlo Park, CA, USA, Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI), July 2011 (inproceedings)

Abstract
Many complex robot motor skills can be represented using elementary movements, and there exist efficient techniques for learning parametrized motor plans using demonstrations and self-improvement. However, with current techniques, the robot in many cases needs to learn a new elementary movement even if a parametrized motor plan exists that covers a related situation. A method is needed that modulates the elementary movement through the meta-parameters of its representation. In this paper, we describe how to learn such mappings from circumstances to meta-parameters using reinforcement learning. In particular, we use a kernelized version of reward-weighted regression. We show two robot applications of the presented setup: the generalization of throwing movements in darts, and of hitting movements in table tennis. We demonstrate that both tasks can be learned successfully using simulated and real robots.

ei

PDF Web [BibTex]

Online submodular minimization for combinatorial structures

Jegelka, S., Bilmes, J.

In pages: 345-352, (Editors: Getoor, L., T. Scheffer), International Machine Learning Society, Madison, WI, USA, 28th International Conference on Machine Learning (ICML), July 2011 (inproceedings)

Abstract
Most results for online decision problems with structured concepts, such as trees or cuts, assume linear costs. In many settings, however, nonlinear costs are more realistic. Owing to their non-separability, these lead to much harder optimization problems. Going beyond linearity, we address online approximation algorithms for structured concepts that allow the cost to be submodular, i.e., nonseparable. In particular, we show regret bounds for three Hannan-consistent strategies that capture different settings. Our results also tighten a regret bound for unconstrained online submodular minimization.

ei

PDF PDF Web [BibTex]

PAC-Bayesian Analysis of the Exploration-Exploitation Trade-off

Seldin, Y., Cesa-Bianchi, N., Laviolette, F., Auer, P., Shawe-Taylor, J., Peters, J.

In pages: 1-8, ICML Workshop on Online Trading of Exploration and Exploitation 2, July 2011 (inproceedings)

Abstract
We develop a coherent framework for integrative simultaneous analysis of the exploration-exploitation and model order selection trade-offs. We improve over our preceding results on the same subject (Seldin et al., 2011) by combining PAC-Bayesian analysis with a Bernstein-type inequality for martingales. Such a combination is also of independent interest for studies of multiple simultaneously evolving martingales.

ei

PDF Web [BibTex]

Detecting low-complexity unobserved causes

Janzing, D., Sgouritsa, E., Stegle, O., Peters, J., Schölkopf, B.

In pages: 383-391, (Editors: FG Cozman and A Pfeffer), AUAI Press, Corvallis, OR, USA, 27th Conference on Uncertainty in Artificial Intelligence (UAI), July 2011 (inproceedings)

Abstract
We describe a method that infers whether statistical dependences between two observed variables X and Y are due to a "direct" causal link or only due to a connecting causal path that contains an unobserved variable of low complexity, e.g., a binary variable. This problem is motivated by statistical genetics. Given a genetic marker that is correlated with a phenotype of interest, we want to detect whether this marker is causal or whether it only correlates with a causal one. Our method is based on the analysis of the location of the conditional distributions P(Y|x) in the simplex of all distributions of Y. We report encouraging results on semi-empirical data.

ei

PDF Web [BibTex]

Support Vector Machines as Probabilistic Models

Franc, V., Zien, A., Schölkopf, B.

In Proceedings of the 28th International Conference on Machine Learning, pages: 665-672, (Editors: L Getoor and T Scheffer), International Machine Learning Society, Madison, WI, USA, ICML, July 2011 (inproceedings)

Abstract
We show how the SVM can be viewed as a maximum likelihood estimate of a class of probabilistic models. This model class can be viewed as a reparametrization of the SVM in a similar vein to the ν-SVM reparametrizing the classical (C-)SVM. It is not discriminative, but has a non-uniform marginal. We illustrate the benefits of this new view by rederiving and re-investigating two established SVM-related algorithms.

ei

PDF Web [BibTex]

Identifiability of causal graphs using functional models

Peters, J., Mooij, J., Janzing, D., Schölkopf, B.

In pages: 589-598, (Editors: FG Cozman and A Pfeffer), AUAI Press, Corvallis, OR, USA, 27th Conference on Uncertainty in Artificial Intelligence (UAI), July 2011 (inproceedings)

Abstract
This work addresses the following question: Under what assumptions on the data generating process can one infer the causal graph from the joint distribution? The approach taken by conditional independence-based causal discovery methods is based on two assumptions: the Markov condition and faithfulness. It has been shown that under these assumptions the causal graph can be identified up to Markov equivalence (some arrows remain undirected) using methods like the PC algorithm. In this work we propose an alternative by Identifiable Functional Model Classes (IFMOCs). As our main theorem we prove that if the data generating process belongs to an IFMOC, one can identify the complete causal graph. To the best of our knowledge this is the first identifiability result of this kind that is not limited to linear functional relationships. We discuss how the IFMOC assumption and the Markov and faithfulness assumptions relate to each other and explain why we believe that the IFMOC assumption can be tested more easily on given data. We further provide a practical algorithm that recovers the causal graph from finitely many data; experiments on simulated data support the theoretical findings.

ei

PDF Web [BibTex]

Pruning nearest neighbor cluster trees

Kpotufe, S., von Luxburg, U.

In pages: 225-232, (Editors: Getoor, L., T. Scheffer), International Machine Learning Society, Madison, WI, USA, 28th International Conference on Machine Learning (ICML), July 2011 (inproceedings)

Abstract
Nearest neighbor ($k$-NN) graphs are widely used in machine learning and data mining applications, and our aim is to better understand what they reveal about the cluster structure of the unknown underlying distribution of points. Moreover, is it possible to identify spurious structures that might arise due to sampling variability? Our first contribution is a statistical analysis that reveals how certain subgraphs of a $k$-NN graph form a consistent estimator of the cluster tree of the underlying distribution of points. Our second and perhaps most important contribution is the following finite sample guarantee. We carefully work out the tradeoff between aggressive and conservative pruning and are able to guarantee the removal of all spurious cluster structures while at the same time guaranteeing the recovery of salient clusters. This is the first such finite sample result in the context of clustering.

ei

PDF Web [BibTex]

Testing whether linear equations are causal: A free probability theory approach

Zscheischler, J., Janzing, D., Zhang, K.

In pages: 839-847, (Editors: Cozman, F.G., A. Pfeffer), AUAI Press, Corvallis, OR, USA, 27th Conference on Uncertainty in Artificial Intelligence (UAI), July 2011 (inproceedings)

Abstract
We propose a method that infers whether linear relations between two high-dimensional variables X and Y are due to a causal influence from X to Y or from Y to X. The earlier proposed so-called Trace Method is extended to the regime where the dimension of the observed variables exceeds the sample size. Based on previous work, we postulate conditions that characterize a causal relation between X and Y. Moreover, we describe a statistical test and argue that both causal directions are typically rejected if there is a common cause. A full theoretical analysis is presented for the deterministic case but our approach seems to be valid for the noisy case, too, for which we additionally present an approach based on a sparsity constraint. The discussed method yields promising results for both simulated and real world data.

ei

PDF Web [BibTex]

On the information-theoretic structure of distributed measurements

Balduzzi, D.

In pages: 1-15, Elsevier Science, Amsterdam, Netherlands, 7th International Workshop on Developments of Computational Models (DCM), July 2011 (inproceedings)

Abstract
The internal structure of a measuring device, which depends on what its components are and how they are organized, determines how it categorizes its inputs. This paper presents a geometric approach to studying the internal structure of measurements performed by distributed systems such as probabilistic cellular automata. It constructs the quale, a family of sections of a suitably defined presheaf, whose elements correspond to the measurements performed by all subsystems of a distributed system. Using the quale we quantify (i) the information generated by a measurement; (ii) the extent to which a measurement is context-dependent; and (iii) whether a measurement is decomposable into independent submeasurements, which turns out to be equivalent to context-dependence. Finally, we show that only indecomposable measurements are more informative than the sum of their submeasurements.

ei

PDF Web [BibTex]

Towards Brain-Robot Interfaces in Stroke Rehabilitation

Gomez Rodriguez, M., Grosse-Wentrup, M., Hill, J., Gharabaghi, A., Schölkopf, B., Peters, J.

In pages: 6, IEEE, Piscataway, NJ, USA, 12th International Conference on Rehabilitation Robotics (ICORR), July 2011 (inproceedings)

Abstract
A neurorehabilitation approach that combines robot-assisted active physical therapy and Brain-Computer Interfaces (BCIs) may provide additional benefit over traditional rehabilitation methods for patients with severe motor impairment due to cerebrovascular brain damage (e.g., stroke) and other neurological conditions. In this paper, we describe the design and modes of operation of a robot-based rehabilitation framework that enables artificial support of the sensorimotor feedback loop. The aim is to increase cortical plasticity by means of Hebbian-type learning rules. A BCI-based shared-control strategy is used to drive a Barrett WAM 7-degree-of-freedom arm that guides a subject's arm. Experimental validation of our setup is carried out both with healthy subjects and stroke patients. We review the empirical results which we have obtained to date, and argue that they support the feasibility of future rehabilitative treatments employing this novel approach.

ei

PDF Web DOI [BibTex]

Uncovering the Temporal Dynamics of Diffusion Networks

Gomez Rodriguez, M., Balduzzi, D., Schölkopf, B.

In Proceedings of the 28th International Conference on Machine Learning, pages: 561-568, (Editors: L. Getoor and T. Scheffer), Omnipress, Madison, WI, USA, ICML, July 2011 (inproceedings)

Abstract
Time plays an essential role in the diffusion of information, influence and disease over networks. In many cases we only observe when a node copies information, makes a decision or becomes infected -- but the connectivity, transmission rates between nodes and transmission sources are unknown. Inferring the underlying dynamics is of outstanding interest since it enables forecasting, influencing and retarding infections, broadly construed. To this end, we model diffusion processes as discrete networks of continuous temporal processes occurring at different rates. Given cascade data -- observed infection times of nodes -- we infer the edges of the global diffusion network and estimate the transmission rates of each edge that best explain the observed data. The optimization problem is convex. The model naturally (without heuristics) imposes sparse solutions and requires no parameter tuning. The problem decouples into a collection of independent smaller problems, thus scaling easily to networks on the order of hundreds of thousands of nodes. Experiments on real and synthetic data show that our algorithm both recovers the edges of diffusion networks and accurately estimates their transmission rates from cascade data.

ei

PDF Web [BibTex]

Risk-Based Generalizations of f-divergences

García-García, D., von Luxburg, U., Santos-Rodríguez, R.

In pages: 417-424, (Editors: Getoor, L., T. Scheffer), International Machine Learning Society, Madison, WI, USA, 28th International Conference on Machine Learning (ICML), July 2011 (inproceedings)

Abstract
We derive a generalized notion of f-divergences, called (f,l)-divergences. We show that this generalization enjoys many of the nice properties of f-divergences, although it is a richer family. It also provides alternative definitions of standard divergences in terms of surrogate risks. As a first practical application of this theory, we derive a new estimator for the Kullback-Leibler divergence that we use for clustering sets of vectors.

ei

PDF Web [BibTex]

Kernel-based Conditional Independence Test and Application in Causal Discovery

Zhang, K., Peters, J., Janzing, D., Schölkopf, B.

In pages: 804-813, (Editors: FG Cozman and A Pfeffer), AUAI Press, Corvallis, OR, USA, 27th Conference on Uncertainty in Artificial Intelligence (UAI), July 2011 (inproceedings)

Abstract
Conditional independence testing is an important problem, especially in Bayesian network learning and causal discovery. Due to the curse of dimensionality, testing for conditional independence of continuous variables is particularly challenging. We propose a Kernel-based Conditional Independence test (KCI-test), by constructing an appropriate test statistic and deriving its asymptotic distribution under the null hypothesis of conditional independence. The proposed method is computationally efficient and easy to implement. Experimental results show that it outperforms other methods, especially when the conditioning set is large or the sample size is not very large, in which case other methods encounter difficulties.

ei

PDF Web [BibTex]

Approximation Bounds for Inference using Cooperative Cut

Jegelka, S., Bilmes, J.

In pages: 577-584, (Editors: Getoor, L., T. Scheffer), International Machine Learning Society, Madison, WI, USA, 28th International Conference on Machine Learning (ICML), July 2011 (inproceedings)

Abstract
We analyze a family of probability distributions that are characterized by an embedded combinatorial structure. This family includes models having arbitrary treewidth and arbitrary sized factors. Unlike general models with such freedom, where the “most probable explanation” (MPE) problem is inapproximable, the combinatorial structure within our model, in particular the indirect use of submodularity, leads to several MPE algorithms that all have approximation guarantees.

ei

PDF Web [BibTex]

Multi-label cooperative cuts

Jegelka, S., Bilmes, J.

In pages: 1-4, CVPR Workshop on Inference in Graphical Models with Structured Potentials, June 2011 (inproceedings)

Abstract
Recently, a family of global, non-submodular energy functions has been proposed that is expressed as coupling edges in a graph cut. This formulation provides a rich modelling framework and also leads to efficient approximate inference algorithms. So far, the results addressed binary random variables. Here, we extend these results to the multi-label case, and combine edge coupling with move-making algorithms.

ei

PDF Web [BibTex]

Submodularity beyond submodular energies: coupling edges in graph cuts

Jegelka, S., Bilmes, J.

In pages: 1897-1904, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2011 (inproceedings)

Abstract
We propose a new family of non-submodular global energy functions that still use submodularity internally to couple edges in a graph cut. We show it is possible to develop an efficient approximation algorithm that, thanks to the internal submodularity, can use standard graph cuts as a subroutine. We demonstrate the advantages of edge coupling in a natural setting, namely image segmentation. In particular, for fine-structured objects and objects with shading variation, our structured edge coupling leads to significant improvements over standard approaches.

ei

PDF Web DOI [BibTex]

Finding dependencies between frequencies with the kernel cross-spectral density

Besserve, M., Janzing, D., Logothetis, N., Schölkopf, B.

In pages: 2080-2083, IEEE, Piscataway, NJ, USA, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011 (inproceedings)

Abstract
Cross-spectral density (CSD) is widely used to find linear dependency between two real or complex valued time series. We define a non-linear extension of this measure by mapping the time series into two Reproducing Kernel Hilbert Spaces. The dependency is quantified by the Hilbert-Schmidt norm of a cross-spectral density operator between these two spaces. We prove that, by choosing a characteristic kernel for the mapping, this quantity detects any pairwise dependency between the time series. Then we provide a fast estimator for the Hilbert-Schmidt norm based on the Fast Fourier Transform. We demonstrate the value of this approach for quantifying non-linear dependencies between frequency bands of simulated signals and intra-cortical neural recordings.

ei

Web DOI [BibTex]

Trajectory Planning for Optimal Robot Catching in Real-Time

Lampariello, R., Nguyen-Tuong, D., Castellini, C., Hirzinger, G., Peters, J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2011), pages: 3719-3726, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2011 (inproceedings)

Abstract
Many real-world tasks require fast planning of highly dynamic movements for their execution in real-time. The success often hinges on quickly finding one of the few plans that can achieve the task at all. A further challenge is to quickly find a plan which optimizes a desired cost. In this paper, we will discuss this problem in the context of catching small flying targets efficiently. This can be formulated as a non-linear optimization problem where the desired trajectory is encoded by an adequate parametric representation. The optimizer generates an energy-optimal trajectory by efficiently using the robot kinematic redundancy while taking into account maximal joint motion, collision avoidance and local minima. To enable the resulting method to work in real-time, examples of the global planner are generalized using nearest neighbour approaches, Support Vector Machines and Gaussian process regression, which are compared in this context. Evaluations indicate that the presented method is highly efficient in complex tasks such as ball-catching.

ei

PDF Web DOI [BibTex]

Statistical Learning Theory: Models, Concepts, and Results

von Luxburg, U., Schölkopf, B.

In Handbook of the History of Logic, Vol. 10: Inductive Logic, 10, pages: 651-706, (Editors: Gabbay, D. M., Hartmann, S. and Woods, J. H.), Elsevier North Holland, Amsterdam, Netherlands, May 2011 (inbook)

Abstract
Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms and is arguably one of the most beautifully developed branches of artificial intelligence in general. It originated in Russia in the 1960s and gained wide popularity in the 1990s following the development of the so-called Support Vector Machine (SVM), which has become a standard tool for pattern recognition in a variety of domains ranging from computer vision to computational biology. Providing the basis of new learning algorithms, however, was not the only motivation for developing statistical learning theory. It was just as much a philosophical one, attempting to answer the question of what it is that allows us to draw valid conclusions from empirical data. In this article we attempt to give a gentle, non-technical overview over the key ideas and insights of statistical learning theory. We do not assume that the reader has a deep background in mathematics, statistics, or computer science. Given the nature of the subject matter, however, some familiarity with mathematical concepts and notations and some intuitive understanding of basic probability is required. There exist many excellent references to more technical surveys of the mathematics of statistical learning theory: the monographs by one of the founders of statistical learning theory ([Vapnik, 1995], [Vapnik, 1998]), a brief overview over statistical learning theory in Section 5 of [Schölkopf and Smola, 2002], more technical overview papers such as [Bousquet et al., 2003], [Mendelson, 2003], [Boucheron et al., 2005], [Herbrich and Williamson, 2002], and the monograph [Devroye et al., 1996].

ei

PDF Web DOI [BibTex]

Fronto-Parietal Gamma-Oscillations are a Cause of Performance Variation in Brain-Computer Interfacing

Grosse-Wentrup, M.

In pages: 384-387, IEEE, Piscataway, NJ, USA, 5th International IEEE/EMBS Conference on Neural Engineering (NER), May 2011 (inproceedings)

Abstract
In recent work, we have provided evidence that fronto-parietal γ-oscillations of the electromagnetic field of the brain modulate the sensorimotor-rhythm. It is unclear, however, what impact this effect may have on explaining and addressing within-subject performance variations of brain-computer interfaces (BCIs). In this paper, we provide evidence that, on a group average, classification accuracies in a two-class motor-imagery paradigm differ by up to 22.2% depending on the state of fronto-parietal γ-power. As such, this effect may have a large impact on the design of future BCI-systems. We further investigate whether adapting classification procedures to the current state of γ-power improves classification accuracy, and discuss other approaches to exploiting this effect.

ei

PDF Web DOI [BibTex]

A Flexible Hybrid Framework for Modeling Complex Manipulation Tasks

Kroemer, O., Peters, J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2011), pages: 1856-1861, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2011 (inproceedings)

Abstract
Future service robots will need to perform a wide range of tasks using various objects. In order to perform complex tasks, robots require a suitable internal representation of the task. We propose a hybrid framework for representing manipulation tasks, which combines continuous motion planning and discrete task-level planning. In addition, we use a mid-level planner to optimize individual actions according to the plan. The proposed framework incorporates biologically-inspired concepts, such as affordances and motor primitives, in order to efficiently plan for manipulation tasks. The final framework is modular, can generalize well to different situations, and is straightforward to expand. Our demonstrations also show how the use of affordances and mid-level planning can lead to improved performance.

ei

PDF Web DOI [BibTex]

Fast Convergent Algorithms for Expectation Propagation Approximate Bayesian Inference

Seeger, M., Nickisch, H.

In JMLR Workshop and Conference Proceedings Volume 15: AISTATS 2011, pages: 652-660, (Editors: Gordon, G., D. Dunson, M. Dudík), MIT Press, Cambridge, MA, USA, 14th International Conference on Artificial Intelligence and Statistics, April 2011 (inproceedings)

Abstract
We propose a novel algorithm to solve the expectation propagation relaxation of Bayesian inference for continuous-variable graphical models. In contrast to most previous algorithms, our method is provably convergent. By marrying convergent EP ideas from (Opper & Winther, 2005) with covariance decoupling techniques (Wipf & Nagarajan, 2008; Nickisch & Seeger, 2009), it runs at least an order of magnitude faster than the most common EP solver.

ei

PDF Web [BibTex]

Active Exploration for Robot Parameter Selection in Episodic Reinforcement Learning

Kroemer, O., Peters, J.

In Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2011), pages: 25-31, IEEE, Piscataway, NJ, USA, IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), April 2011 (inproceedings)

Abstract
As the complexity of robots and other autonomous systems increases, it becomes more important that these systems can adapt and optimize their settings actively. However, such optimization is rarely trivial. Sampling from the system is often expensive in terms of time and other costs, and excessive sampling should therefore be avoided. The parameter space is also usually continuous and multi-dimensional. Given the inherent exploration-exploitation dilemma of the problem, we propose treating it as an episodic reinforcement learning problem. In this reinforcement learning framework, the policy is defined by the system's parameters and the rewards are given by the system's performance. The rewards accumulate during each episode of a task. In this paper, we present a method for efficiently sampling and optimizing in continuous multidimensional spaces. The approach is based on Gaussian process regression, which can represent continuous non-linear mappings from parameters to system performance. We employ an upper confidence bound policy, which explicitly manages the trade-off between exploration and exploitation. Unlike many other policies for this kind of problem, we do not rely on a discretization of the action space. The presented method was evaluated on a real robot. The robot had to learn grasping parameters in order to adapt its grasping execution to different objects. The proposed method was also tested on a more general gain tuning problem. The results of the experiments show that the presented method can quickly determine suitable parameters and is applicable to real online learning applications.

ei

PDF Web DOI [BibTex]

Relative Entropy Inverse Reinforcement Learning

Boularias, A., Kober, J., Peters, J.

In JMLR Workshop and Conference Proceedings Volume 15: AISTATS 2011, pages: 182-189, (Editors: Gordon, G., D. Dunson, M. Dudík), MIT Press, Cambridge, MA, USA, Fourteenth International Conference on Artificial Intelligence and Statistics, April 2011 (inproceedings)

Abstract
We consider the problem of imitation learning where the examples, demonstrated by an expert, cover only a small part of a large state space. Inverse Reinforcement Learning (IRL) provides an efficient tool for generalizing the demonstration, based on the assumption that the expert is optimally acting in a Markov Decision Process (MDP). Most of the past work on IRL requires that a (near)-optimal policy can be computed for different reward functions. However, this requirement can hardly be satisfied in systems with a large, or continuous, state space. In this paper, we propose a model-free IRL algorithm, where the relative entropy between the empirical distribution of the state-action trajectories under a uniform policy and their distribution under the learned policy is minimized by stochastic gradient descent. We compare this new approach to well-known IRL algorithms using approximate MDP models. Empirical results on simulated car racing, gridworld and ball-in-a-cup problems show that our approach is able to learn good policies from a small number of demonstrations.

ei

PDF Web [BibTex]

Removing noise from astronomical images using a pixel-specific noise model

Burger, H., Schölkopf, B., Harmeling, S.

In pages: 8, (Editors: H Lensch and SL Narasimhan and ME Testorf), IEEE, Piscataway, NJ, USA, IEEE International Conference on Computational Photography (ICCP), April 2011 (inproceedings)

Abstract
For digital photographs of astronomical objects, where exposure times are usually long and ISO settings high, the so-called dark-current is a significant source of noise. Dark-current refers to thermally generated electrons and is therefore present even in the absence of light. This paper presents a novel approach for denoising astronomical images that have been corrupted by dark-current noise. Our method relies on a probabilistic description of the dark-current of each pixel of a given camera. The noise model is then combined with an image prior which is adapted to astronomical images. In a laboratory environment, we use a black and white CCD camera containing a cooling unit and show that our method is superior to existing methods in terms of root mean squared error. Furthermore, we show that our method is practically relevant by providing visually more appealing results on astronomical photographs taken with a single lens reflex CMOS camera.

ei

Web DOI [BibTex]

Towards Motor Skill Learning for Robotics

Peters, J., Mülling, K., Kober, J., Nguyen-Tuong, D., Kroemer, O.

In Robotics Research, pages: 469-482, (Editors: Pradalier, C. , R. Siegwart, G. Hirzinger), Springer, Berlin, Germany, 14th International Symposium on Robotics Research (ISRR), January 2011 (inproceedings)

Abstract
Learning robots that can acquire new motor skills and refine existing ones has been a long-standing vision of robotics, artificial intelligence, and the cognitive sciences. Early steps towards this goal in the 1980s made clear that reasoning and human insights will not suffice. Instead, new hope has been offered by the rise of modern machine learning approaches. However, it has become increasingly clear that off-the-shelf machine learning approaches will not suffice for motor skill learning, as these methods often do not scale to the high-dimensional domains of manipulator and humanoid robotics, nor do they fulfill the real-time requirement of our domain. As an alternative, we propose to break the generic skill learning problem into parts that we can understand well from a robotics point of view. After designing appropriate learning approaches for these basic components, these will serve as the ingredients of a general approach to motor skill learning. In this paper, we discuss our recent and current progress in this direction. To this end, we present our work on learning to control, on learning elementary movements, and our steps towards learning complex tasks. We show several evaluations using both real robots and physically realistic simulations.

ei

PDF Web DOI [BibTex]

Learning Visual Representations for Interactive Systems

Piater, J., Jodogne, S., Detry, R., Kraft, D., Krüger, N., Kroemer, O., Peters, J.

In Robotics Research, pages: 399-416, (Editors: Pradalier, C. , R. Siegwart, G. Hirzinger), Springer, Berlin, Germany, 14th International Symposium on Robotics Research (ISRR), January 2011 (inproceedings)

Abstract
We describe two quite different methods for associating action parameters to visual percepts. Our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension RLJC also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. To these models, the method associates grasp experiences autonomously learned by trial and error. These experiences form a non-parametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options.

ei

PDF Web DOI [BibTex]

Robot Learning

Peters, J., Tedrake, R., Roy, N., Morimoto, J.

In Encyclopedia of Machine Learning, pages: 865-869, Encyclopedia of machine learning, (Editors: Sammut, C. and Webb, G. I.), Springer, New York, NY, USA, January 2011 (inbook)

ei

PDF Web DOI [BibTex]

A Non-Parametric Approach to Dynamic Programming

Kroemer, O., Peters, J.

In Advances in Neural Information Processing Systems 24, pages: 1719-1727, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
In this paper, we consider the problem of policy evaluation for continuous-state systems. We present a non-parametric approach to policy evaluation, which uses kernel density estimation to represent the system. The true form of the value function for this model can be determined, and can be computed using Galerkin’s method. Furthermore, we also present a unified view of several well-known policy evaluation methods. In particular, we show that the same Galerkin method can be used to derive Least-Squares Temporal Difference learning, Kernelized Temporal Difference learning, and a discrete-state Dynamic Programming solution, as well as our proposed method. In a numerical evaluation of these algorithms, the proposed approach performed better than the other methods.
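
The abstract lists Least-Squares Temporal Difference (LSTD) learning among the methods that the Galerkin view recovers. As a point of reference only, a minimal sketch of textbook LSTD with linear features might look like the following. It is not the paper's non-parametric method; the function name `lstd_weights`, the optional ridge term, and the two-state toy chain are assumptions made for illustration.

```python
import numpy as np

def lstd_weights(phi, phi_next, rewards, gamma, ridge=0.0):
    """Textbook LSTD: solve A w = b with
    A = Phi^T (Phi - gamma * Phi_next) and b = Phi^T r."""
    phi = np.asarray(phi, dtype=float)            # features of visited states, shape (T, k)
    phi_next = np.asarray(phi_next, dtype=float)  # features of successor states, shape (T, k)
    rewards = np.asarray(rewards, dtype=float)    # observed rewards, shape (T,)
    A = phi.T @ (phi - gamma * phi_next)
    A += ridge * np.eye(A.shape[0])               # optional regularizer (an assumption, off by default)
    b = phi.T @ rewards
    return np.linalg.solve(A, b)

# Hypothetical toy chain: state 0 -> state 1 (reward 0), state 1 -> state 1 (reward 1).
phi = np.eye(2)                                   # one-hot features, so weights are state values
phi_next = np.array([[0.0, 1.0], [0.0, 1.0]])
rewards = np.array([0.0, 1.0])
w = lstd_weights(phi, phi_next, rewards, gamma=0.9)
```

With one-hot features the weights coincide with the state values, which for this chain are V(1) = 1/(1 - 0.9) = 10 and V(0) = 0.9 * 10 = 9.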

ei

PDF Web [BibTex]

Transfer Learning with Copulas

Lopez-Paz, D., Hernandez-Lobato, J.

In pages: 2, NIPS, Workshop on Copulas in Machine Learning, 2011 (inproceedings)

ei

PDF [BibTex]

Denoising sparse noise via online dictionary learning

Cherian, A., Sra, S., Papanikolopoulos, N.

In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, pages: 2060-2063, IEEE, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2011 (inproceedings)

ei

DOI [BibTex]

What You Expect Is What You Get? Potential Use of Contingent Negative Variation for Passive BCI Systems in Gaze-Based HCI

Ihme, K., Zander, TO.

In Affective Computing and Intelligent Interaction, 6975, pages: 447-456, Lecture Notes in Computer Science, (Editors: D’Mello, S., Graesser, A., Schuller, B. and Martin, J.-C.), Springer, Berlin, Germany, 2011 (inbook)

Abstract
When using eye movements for cursor control in human-computer interaction (HCI), it may be difficult to find an appropriate substitute for the click operation. Most approaches make use of dwell times. However, in this context the so-called Midas-Touch-Problem occurs, which means that the system wrongly interprets fixations due to long processing times or spontaneous dwellings of the user as commands. Lately it has been shown that brain-computer interface (BCI) input bears good prospects to overcome this problem using imagined hand movements to elicit a selection. The current approach tries to develop this idea further by exploring potential signals for use in a passive BCI, which would have the advantage that the brain signals used as input are generated automatically without conscious effort of the user. To explore event-related potentials (ERPs) giving information about the user’s intention to select an object, 32-channel electroencephalography (EEG) was recorded from ten participants interacting with a dwell-time-based system. Comparing ERP signals during the dwell time with those occurring during fixations on a neutral cross hair, a sustained negative slow cortical potential at central electrode sites was revealed. This negativity might be a contingent negative variation (CNV) reflecting the participants’ anticipation of the upcoming selection. Offline classification suggests that the CNV is detectable in single trials (mean accuracy 74.9 %). In the future, further research on the CNV is needed to ensure its stable occurrence in human-computer interaction and to enable its use as a potential substitute for the click operation.

ei

DOI [BibTex]

PILCO: A Model-Based and Data-Efficient Approach to Policy Search

Deisenroth, MP., Rasmussen, CE.

In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, pages: 465-472, (Editors: L Getoor and T Scheffer), Omnipress, 2011 (inproceedings)

Abstract
In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way. By learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning, PILCO can cope with very little data and facilitates learning from scratch in only a few trials. Policy evaluation is performed in closed form using state-of-the-art approximate inference. Furthermore, policy gradients are computed analytically for policy improvement. We report unprecedented learning efficiency on challenging and high-dimensional control tasks.

ei

Web [BibTex]

Kernel Bayes’ Rule

Fukumizu, K., Song, L., Gretton, A.

In Advances in Neural Information Processing Systems 24, pages: 1737-1745, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Curran Associates, Inc., Red Hook, NY, USA, Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

ei

PDF [BibTex]

Optimal Reinforcement Learning for Gaussian Systems

Hennig, P.

In Advances in Neural Information Processing Systems 24, pages: 325-333, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
The exploration-exploitation trade-off is among the central challenges of reinforcement learning. The optimal Bayesian solution is intractable in general. This paper studies to what extent analytic statements about optimal learning are possible if all beliefs are Gaussian processes. A first-order approximation of learning of both loss and dynamics, for nonlinear, time-varying systems in continuous time and space, subject to a relatively weak restriction on the dynamics, is described by an infinite-dimensional partial differential equation. An approximate finite-dimensional projection gives an impression of how this result may be helpful.

ei pn

PDF Web [BibTex]

Efficient inference in matrix-variate Gaussian models with iid observation noise

Stegle, O., Lippert, C., Mooij, J., Lawrence, N., Borgwardt, K.

In Advances in Neural Information Processing Systems 24, pages: 630-638, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
Inference in matrix-variate Gaussian models has major applications for multi-output prediction and joint learning of row and column covariances from matrix-variate data. Here, we discuss an approach for efficient inference in such models that explicitly account for iid observation noise. Computational tractability can be retained by exploiting the Kronecker product between row and column covariance matrices. Using this framework, we show how to generalize the Graphical Lasso in order to learn a sparse inverse covariance between features while accounting for a low-rank confounding covariance between samples. We show practical utility on applications to biology, where we model covariances with more than 100,000 dimensions. We find greater accuracy in recovering biological network structures and are able to better reconstruct the confounders.
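
One standard way to keep a Kronecker-structured covariance tractable when iid noise is added is to eigendecompose the row and column covariances separately, so that the full (n·d) × (n·d) matrix is never formed. The sketch below illustrates that generic linear-algebra identity under stated assumptions; it is not taken from the paper, and the function name `kron_solve` and its argument names are hypothetical.

```python
import numpy as np

def kron_solve(R, C, Y, sigma2):
    """Solve (C kron R + sigma2 * I) vec(X) = vec(Y) for X, with column-major vec.

    R: (n, n) symmetric row covariance, C: (d, d) symmetric column covariance,
    Y: (n, d) data matrix, sigma2: iid noise variance (> 0).
    Cost is dominated by two small eigendecompositions instead of one of size n*d.
    """
    s_r, U_r = np.linalg.eigh(R)                 # R = U_r diag(s_r) U_r^T
    s_c, U_c = np.linalg.eigh(C)                 # C = U_c diag(s_c) U_c^T
    Y_rot = U_r.T @ Y @ U_c                      # rotate into the joint eigenbasis
    Z = Y_rot / (np.outer(s_r, s_c) + sigma2)    # divide by eigenvalues of C kron R, plus noise
    return U_r @ Z @ U_c.T                       # rotate back
```

The identity used is (A ⊗ B) vec(X) = vec(B X Aᵀ) for column-major vec, which lets every product with the Kronecker matrix be written as two small matrix multiplications.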

ei

PDF Web [BibTex]

Kernel Methods in Bioinformatics

Borgwardt, KM.

In Handbook of Statistical Bioinformatics, pages: 317-334, Springer Handbooks of Computational Statistics ; 3, (Editors: Lu, H.H.-S., Schölkopf, B. and Zhao, H.), Springer, Berlin, Germany, 2011 (inbook)

Abstract
Kernel methods have now witnessed more than a decade of increasing popularity in the bioinformatics community. In this article, we will compactly review this development, examining the areas in which kernel methods have contributed to computational biology and describing the reasons for their success.

ei

PDF DOI [BibTex]

Expectation Propagation for the Estimation of Conditional Bivariate Copulas

Hernandez-Lobato, J., Lopez-Paz, D., Ghahramani, Z.

In pages: 2, NIPS, Workshop on Copulas in Machine Learning, 2011 (inproceedings)

ei

PDF [BibTex]

Efficient Similarity Search for Covariance Matrices via the Jensen-Bregman LogDet Divergence

Cherian, A., Sra, S., Banerjee, A., Papanikolopoulos, N.

In IEEE International Conference on Computer Vision, ICCV 2011, pages: 2399-2406, (Editors: DN Metaxas and L Quan and A Sanfeliu and LJ Van Gool), IEEE, 13th International Conference on Computer Vision (ICCV), 2011 (inproceedings)

ei

DOI [BibTex]

Introducing the detection of auditory error responses based on BCI technology for passive interaction

Zander, TO., Klippel, DM., Scherer, R.

In Proceedings of the 5th International Brain–Computer Interface Conference, pages: 252-255, (Editors: GR Müller-Putz and R Scherer and M Billinger and A Kreilinger and V Kaiser and C Neuper), Graz: Verlag der Technischen Universität, 2011 (inproceedings)

ei

[BibTex]

Cue Combination: Beyond Optimality

Rosas, P., Wichmann, F.

In Sensory Cue Integration, pages: 144-152, (Editors: Trommershäuser, J., Körding, K. and Landy, M. S.), Oxford University Press, 2011 (inbook)

ei

[BibTex]

Generalized Dictionary Learning for Symmetric Positive Definite Matrices with Application to Nearest Neighbor Retrieval

Sra, S., Cherian, A.

In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2011, LNCS vol 6913, Part III, pages: 318-332, (Editors: D Gunopulos and T Hofmann and D Malerba and M Vazirgiannis), Springer, 22nd European Conference on Machine Learning (ECML), 2011 (inproceedings)

ei

DOI [BibTex]

Restricted Boltzmann machines as useful tool for detecting oscillatory EEG components

Balderas, D., Zander, TO., Bachl, F., Neuper, C., Scherer, R.

In Proceedings of the 5th International Brain–Computer Interface Conference, pages: 68-71, (Editors: GR Müller-Putz and R Scherer and M Billinger and A Kreilinger and V Kaiser and C Neuper), Graz: Verlag der Technischen Universität, 2011 (inproceedings)

ei

[BibTex]

Hierarchical Multitask Structured Output Learning for Large-scale Sequence Segmentation

Görnitz, N., Widmer, C., Zeller, G., Kahles, A., Sonnenburg, S., Rätsch, G.

In Advances in Neural Information Processing Systems 24, pages: 2690-2698, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and FCN Pereira and KQ Weinberger), Curran Associates, Inc., Red Hook, NY, USA, Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

ei

PDF [BibTex]

Phase transition in the family of p-resistances

Alamgir, M., von Luxburg, U.

In Advances in Neural Information Processing Systems 24, pages: 379-387, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We study the family of p-resistances on graphs for p ≥ 1. This family generalizes the standard resistance distance. We prove that for any fixed graph, for p=1, the p-resistance coincides with the shortest path distance, for p=2 it coincides with the standard resistance distance, and for p → ∞ it converges to the inverse of the minimal s-t-cut in the graph. Secondly, we consider the special case of random geometric graphs (such as k-nearest neighbor graphs) when the number n of vertices in the graph tends to infinity. We prove that an interesting phase transition takes place. There exist two critical thresholds p^* and p^** such that if p < p^*, then the p-resistance depends on meaningful global properties of the graph, whereas if p > p^**, it only depends on trivial local quantities and does not convey any useful information. We can explicitly compute the critical values: p^* = 1 + 1/(d-1) and p^** = 1 + 1/(d-2), where d is the dimension of the underlying space (we believe that the small gap between p^* and p^** is an artifact of our proofs). We also relate our findings to Laplacian regularization and suggest using q-Laplacians as regularizers, where q satisfies 1/p^* + 1/q = 1.
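
The closed-form thresholds quoted in the abstract are easy to tabulate. The following is a small illustrative sketch, not code from the paper; the function names `critical_thresholds` and `conjugate_q` are hypothetical, and the restriction to d ≥ 3 is an assumption made here so that both formulas are well defined.

```python
from fractions import Fraction

def critical_thresholds(d):
    """Return (p*, p**) = (1 + 1/(d-1), 1 + 1/(d-2)) for dimension d."""
    if d < 3:
        # Assumption of this sketch: p** = 1 + 1/(d-2) is undefined for d = 2.
        raise ValueError("this sketch requires d >= 3")
    p_star = 1 + Fraction(1, d - 1)
    p_star_star = 1 + Fraction(1, d - 2)
    return p_star, p_star_star

def conjugate_q(p):
    """Return q satisfying 1/p + 1/q = 1, for p > 1."""
    p = Fraction(p)
    return p / (p - 1)

p_star, p_star_star = critical_thresholds(3)
q = conjugate_q(p_star)
```

Since p^* = d/(d-1), the conjugate exponent simplifies to q = d, so the suggested q-Laplacian regularizer uses the dimension of the underlying space as its exponent.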

ei

PDF Web [BibTex]

On Fast Approximate Submodular Minimization

Jegelka, S., Lin, H., Bilmes, J.

In Advances in Neural Information Processing Systems 24, pages: 460-468, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We are motivated by an application to extract a representative subset of machine learning training data and by the poor empirical performance we observe of the popular minimum norm algorithm. In fact, for our application, minimum norm can have a running time of about O(n^7) (O(n^5) oracle calls). We therefore propose a fast approximate method to minimize arbitrary submodular functions. For a large sub-class of submodular functions, the algorithm is exact. Other submodular functions are iteratively approximated by tight submodular upper bounds, and then repeatedly optimized. We show theoretical properties, and empirical results suggest significant speedups over minimum norm while retaining higher accuracies.

ei

PDF Web [BibTex]

PAC-Bayesian Analysis of Contextual Bandits

Seldin, Y., Auer, P., Laviolette, F., Shawe-Taylor, J., Ortner, R.

In Advances in Neural Information Processing Systems 24, pages: 1683-1691, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We derive an instantaneous (per-round) data-dependent regret bound for stochastic multiarmed bandits with side information (also known as contextual bandits). The scaling of our regret bound with the number of states (contexts) $N$ goes as $\sqrt{N I_{\rho_t}(S;A)}$, where $I_{\rho_t}(S;A)$ is the mutual information between states and actions (the side information) used by the algorithm at round $t$. If the algorithm uses all the side information, the regret bound scales as $\sqrt{N \ln K}$, where $K$ is the number of actions (arms). However, if the side information $I_{\rho_t}(S;A)$ is not fully used, the regret bound is significantly tighter. In the extreme case, when $I_{\rho_t}(S;A) = 0$, the dependence on the number of states reduces from linear to logarithmic. Our analysis makes it possible to provide the algorithm with a large amount of side information, let the algorithm decide which side information is relevant for the task, and penalize the algorithm only for the side information that it actually uses. We also present an algorithm for multiarmed bandits with side information whose computational complexity is linear in the number of actions.
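
To make the quantity $I_{\rho_t}(S;A)$ concrete, the sketch below computes the mutual information (in nats) of a joint state-action distribution and evaluates only the scaling term $\sqrt{N I}$ named in the abstract. It is an illustration under stated assumptions: constants and any additional terms of the actual bound are omitted, and the function names `mutual_information` and `scaling_term` as well as the two toy policies are hypothetical.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(S;A) in nats for a joint distribution joint[s, a]."""
    joint = np.asarray(joint, dtype=float)
    p_s = joint.sum(axis=1, keepdims=True)   # marginal over states, shape (N, 1)
    p_a = joint.sum(axis=0, keepdims=True)   # marginal over actions, shape (1, K)
    product = p_s @ p_a                      # independent reference distribution
    mask = joint > 0                         # 0 * log(0) is treated as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))

def scaling_term(n_states, mi):
    """Only the sqrt(N * I) scaling from the abstract; not the full regret bound."""
    return float(np.sqrt(n_states * mi))

N = K = 4
# Policy that uses the context fully: state s always plays action s (uniform states).
uses_context = np.eye(N) / N
# Policy that ignores the context: same uniform action distribution in every state.
ignores_context = np.full((N, K), 1.0 / (N * K))

mi_full = mutual_information(uses_context)
mi_none = mutual_information(ignores_context)
```

In this toy setting the context-dependent policy attains I(S;A) = ln K, matching the $\sqrt{N \ln K}$ regime, while the context-independent policy has I(S;A) = 0.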

ei

PDF PDF Web [BibTex]

Fast projections onto L1,q-norm balls for grouped feature selection

Sra, S.

In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2011, LNCS vol 6913, Part III, pages: 305-317, (Editors: D Gunopulos and T Hofmann and D Malerba and M Vazirgiannis), Springer, 22nd European Conference on Machine Learning (ECML), 2011 (inproceedings)

ei

DOI [BibTex]
