

2007


Predicting Structured Data

Bakir, G., Hofmann, T., Schölkopf, B., Smola, A., Taskar, B., Vishwanathan, S.

pages: 360, Advances in neural information processing systems, MIT Press, Cambridge, MA, USA, September 2007 (book)

Abstract
Machine learning develops intelligent computer systems that are able to generalize from previously seen examples. A new domain of machine learning, in which the prediction must satisfy the additional constraints found in structured data, poses one of machine learning’s greatest challenges: learning functional dependencies between arbitrary input and output domains. This volume presents and analyzes the state of the art in machine learning algorithms and theory in this novel field. The contributors discuss applications as diverse as machine translation, document markup, computational biology, and information extraction, among others, providing a timely overview of an exciting field.

ei

Web [BibTex]



Learning with Transformation Invariant Kernels

Walder, C., Chapelle, O.

(165), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, September 2007 (techreport)

Abstract
This paper considers kernels invariant to translation, rotation and dilation. We show that no non-trivial positive definite (p.d.) kernels exist which are radial and dilation invariant, only conditionally positive definite (c.p.d.) ones. Accordingly, we discuss the c.p.d. case and provide some novel analysis, including an elementary derivation of a c.p.d. representer theorem. On the practical side, we give a support vector machine (s.v.m.) algorithm for arbitrary c.p.d. kernels. For the thin-plate kernel this leads to a classifier with only one parameter (the amount of regularisation), which we demonstrate to be as effective as an s.v.m. with the Gaussian kernel, even though the Gaussian involves a second parameter (the length scale).

ei

PDF [BibTex]



Scalable Semidefinite Programming using Convex Perturbations

Kulis, B., Sra, S., Jegelka, S.

(TR-07-47), University of Texas, Austin, TX, USA, September 2007 (techreport)

Abstract
Several important machine learning problems can be modeled and solved via semidefinite programs. Often, researchers invoke off-the-shelf software for the associated optimization, which can be inappropriate for many applications due to computational and storage requirements. In this paper, we introduce the use of convex perturbations for semidefinite programs (SDPs). Using a particular perturbation function, we arrive at an algorithm for SDPs that has several advantages over existing techniques: a) it is simple, requiring only a few lines of MATLAB, b) it is a first-order method which makes it scalable, c) it can easily exploit the structure of a particular SDP to gain efficiency (e.g., when the constraint matrices are low-rank). We demonstrate on several machine learning applications that the proposed algorithm is effective in finding fast approximations to large-scale SDPs.

ei

PDF [BibTex]



Sparse Multiscale Gaussian Process Regression

Walder, C., Kim, K., Schölkopf, B.

(162), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, August 2007 (techreport)

Abstract
Most existing sparse Gaussian process (g.p.) models seek computational advantages by basing their computations on a set of m basis functions that are the covariance function of the g.p. with one of its two inputs fixed. We generalise this for the case of a Gaussian covariance function, by basing our computations on m Gaussian basis functions with arbitrary diagonal covariance matrices (or length scales). For a fixed number of basis functions and any given criterion, this additional flexibility permits approximations no worse and typically better than was previously possible. Although we focus on g.p. regression, the central idea is applicable to all kernel-based algorithms, such as the support vector machine. We perform gradient-based optimisation of the marginal likelihood, which costs O(m^2 n) time where n is the number of data points, and compare the method to various other sparse g.p. methods. Our approach outperforms the other methods, particularly for the case of very few basis functions, i.e. a very high sparsity ratio.

ei

PDF [BibTex]



Efficient Subwindow Search for Object Localization

Blaschko, M., Hofmann, T., Lampert, C.

(164), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, August 2007 (techreport)

Abstract
Recent years have seen huge advances in object recognition from images. Recognition rates beyond 95% are the rule rather than the exception on many datasets. However, most state-of-the-art methods can only decide if an object is present or not. They are not able to provide information on the object location or extent within the image. We report on a simple yet powerful scheme that extends many existing recognition methods to also perform localization of object bounding boxes. This is achieved by maximizing the classification score over all possible subrectangles in the image. Despite the impression that this would be computationally intractable, we show that in many situations efficient algorithms exist which solve a generalized maximum subrectangle problem. We show how our method is applicable to a variety of object detection frameworks and demonstrate its performance by applying it to the popular bag of visual words model, achieving competitive results on the PASCAL VOC 2006 dataset.

ei

PDF [BibTex]



Cluster Identification in Nearest-Neighbor Graphs

Maier, M., Hein, M., von Luxburg, U.

(163), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, May 2007 (techreport)

Abstract
Assume we are given a sample of points from some underlying distribution which contains several distinct clusters. Our goal is to construct a neighborhood graph on the sample points such that clusters are "identified": that is, the subgraph induced by points from the same cluster is connected, while subgraphs corresponding to different clusters are not connected to each other. We derive bounds on the probability that cluster identification is successful, and use them to predict "optimal" values of k for the mutual and symmetric k-nearest-neighbor graphs. We point out different properties of the mutual and symmetric nearest-neighbor graphs related to the cluster identification problem.

ei

PDF [BibTex]



Dirichlet Mixtures of Bayesian Linear Gaussian State-Space Models: a Variational Approach

Chiappa, S., Barber, D.

(161), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, March 2007 (techreport)

Abstract
We describe two related models to cluster multidimensional time-series under the assumption of an underlying linear Gaussian dynamical process. In the first model, time-series are assigned to the same cluster when they show global similarity in their dynamics, while in the second model time-series are assigned to the same cluster when they show simultaneous similarity. Both models are based on Dirichlet Mixtures of Bayesian Linear Gaussian State-Space Models in order to (semi-)automatically determine an appropriate number of components in the mixture, and to additionally bias the components to a parsimonious parameterization. The resulting models are formally intractable, and to deal with this we describe a deterministic approximation based on a novel implementation of Variational Bayes.

ei

PDF [BibTex]



Automatic 3D Face Reconstruction from Single Images or Video

Breuer, P., Kim, K., Kienzle, W., Blanz, V., Schölkopf, B.

(160), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, February 2007 (techreport)

Abstract
This paper presents a fully automated algorithm for reconstructing a textured 3D model of a face from a single photograph or a raw video stream. The algorithm is based on a combination of Support Vector Machines (SVMs) and a Morphable Model of 3D faces. After SVM face detection, individual facial features are detected using a novel regression- and classification-based approach, and probabilistically plausible configurations of features are selected to produce a list of candidates for several facial feature positions. In the next step, the configurations of feature points are evaluated using a novel criterion that is based on a Morphable Model and a combination of linear projections. Finally, the feature points initialize a model-fitting procedure of the Morphable Model. The result is a high-resolution 3D surface model.

ei

PDF [BibTex]



Relative Entropy Policy Search

Peters, J.

CLMC Technical Report: TR-CLMC-2007-2, Computational Learning and Motor Control Lab, Los Angeles, CA, 2007, clmc (techreport)

Abstract
This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems.

am ei

PDF link (url) [BibTex]



Learning an Outlier-Robust Kalman Filter

Ting, J., Theodorou, E., Schaal, S.

CLMC Technical Report: TR-CLMC-2007-1, Los Angeles, CA, 2007, clmc (techreport)

Abstract
We introduce a modified Kalman filter that performs robust, real-time outlier detection, without the need for manual parameter tuning by the user. Systems that rely on high quality sensory data (for instance, robotic systems) can be sensitive to data containing outliers. The standard Kalman filter is not robust to outliers, and other variations of the Kalman filter have been proposed to overcome this issue. However, these methods may require manual parameter tuning, use of heuristics or complicated parameter estimation procedures. Our Kalman filter uses a weighted least squares-like approach by introducing weights for each data sample. A data sample with a smaller weight has a weaker contribution when estimating the current time step's state. Using an incremental variational Expectation-Maximization framework, we learn the weights and system dynamics. We evaluate our Kalman filter algorithm on data from a robotic dog.

am

PDF [BibTex]


2006


Minimal Logical Constraint Covering Sets

Sinz, F., Schölkopf, B.

(155), Max Planck Institute for Biological Cybernetics, Tübingen, December 2006 (techreport)

Abstract
We propose a general framework for computing minimal set covers under a certain class of logical constraints. The underlying idea is to transform the problem into a mathematical program under linear constraints. In this sense it can be seen as a natural extension of the vector quantization algorithm proposed by Tipping and Schölkopf. We show which class of logical constraints can be cast and relaxed into linear constraints and give an algorithm for the transformation.

ei

PDF [BibTex]



New Methods for the P300 Visual Speller

Biessmann, F.

(1), (Editors: Hill, J.), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, November 2006 (techreport)

ei

PDF [BibTex]



Geometric Analysis of Hilbert-Schmidt Independence Criterion Based ICA Contrast Function

Shen, H., Jegelka, S., Gretton, A.

(PA006080), National ICT Australia, Canberra, Australia, October 2006 (techreport)

ei

Web [BibTex]



Semi-Supervised Learning

Chapelle, O., Schölkopf, B., Zien, A.

pages: 508, Adaptive computation and machine learning, MIT Press, Cambridge, MA, USA, September 2006 (book)

Abstract
In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground, between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no label data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics. This first comprehensive overview of SSL presents state-of-the-art algorithms, a taxonomy of the field, selected applications, benchmark experiments, and perspectives on ongoing and future research. Semi-Supervised Learning first presents the key assumptions and ideas underlying the field: smoothness, cluster or low-density separation, manifold structure, and transduction. The core of the book is the presentation of SSL methods, organized according to algorithmic strategies. After an examination of generative models, the book describes algorithms that implement the low-density separation assumption, graph-based methods, and algorithms that perform two-step learning. The book then discusses SSL applications and offers guidelines for SSL practitioners by analyzing the results of extensive benchmark experiments. Finally, the book looks at interesting directions for SSL research. The book closes with a discussion of the relationship between semi-supervised learning and transduction.

ei

Web [BibTex]



A tutorial on spectral clustering

von Luxburg, U.

(149), Max Planck Institute for Biological Cybernetics, Tübingen, August 2006 (techreport)

Abstract
In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. Nevertheless, at first glance spectral clustering looks a bit mysterious, and it is not obvious why it works at all and what it really does. This article is a tutorial introduction to spectral clustering. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.

ei

PDF [BibTex]



Towards the Inference of Graphs on Ordered Vertexes

Zien, A., Raetsch, G., Ong, C.

(150), Max Planck Institute for Biological Cybernetics, Tübingen, August 2006 (techreport)

Abstract
We propose novel methods for machine learning of structured output spaces. Specifically, we consider outputs which are graphs with vertices that have a natural order. We consider the usual adjacency matrix representation of graphs, as well as two other representations for such a graph: (a) decomposing the graph into a set of paths, (b) converting the graph into a single sequence of nodes with labeled edges. For each of the three representations, we propose an encoding and decoding scheme. We also propose an evaluation measure for comparing two graphs.

ei

PDF [BibTex]



An Automated Combination of Sequence Motif Kernels for Predicting Protein Subcellular Localization

Zien, A., Ong, C.

(146), Max Planck Institute for Biological Cybernetics, Tübingen, April 2006 (techreport)

Abstract
Protein subcellular localization is a crucial ingredient to many important inferences about cellular processes, including prediction of protein function and protein interactions. While many predictive computational tools have been proposed, they tend to have complicated architectures and require many design decisions from the developer. We propose an elegant and fully automated approach to building a prediction system for protein subcellular localization. We propose a new class of protein sequence kernels which considers all motifs including motifs with gaps. This class of kernels allows the inclusion of pairwise amino acid distances into their computation. We further propose a multiclass support vector machine method which directly solves protein subcellular localization without resorting to the common approach of splitting the problem into several binary classification problems. To automatically search over families of possible amino acid motifs, we generalize our method to optimize over multiple kernels at the same time. We compare our automated approach to four other predictors on three different datasets.

ei

PDF Web [BibTex]



Training a Support Vector Machine in the Primal

Chapelle, O.

(147), Max Planck Institute for Biological Cybernetics, Tübingen, April 2006, The version in the "Large Scale Kernel Machines" book is more up to date. (techreport)

Abstract
Most literature on Support Vector Machines (SVMs) concentrates on the dual optimization problem. In this paper, we would like to point out that the primal problem can also be solved efficiently, both for linear and non-linear SVMs, and there is no reason for ignoring it. Moreover, from the primal point of view, new families of algorithms for large scale SVM training can be investigated.

ei

PDF [BibTex]



Cross-Validation Optimization for Structured Hessian Kernel Methods

Seeger, M., Chapelle, O.

Max Planck Institute for Biological Cybernetics, Tübingen, Germany, February 2006 (techreport)

Abstract
We address the problem of learning hyperparameters in kernel methods for which the Hessian of the objective is structured. We propose an approximation to the cross-validation log likelihood whose gradient can be computed analytically, solving the hyperparameter learning problem efficiently through nonlinear optimization. Crucially, our learning method is based entirely on matrix-vector multiplication primitives with the kernel matrices and their derivatives, allowing straightforward specialization to new kernels or to large datasets. When applied to the problem of multi-way classification, our method scales linearly in the number of classes and gives rise to state-of-the-art results on a remote imaging task.

ei

PDF Web [BibTex]



Gaussian Processes for Machine Learning

Rasmussen, CE., Williams, CKI.

pages: 248, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, USA, January 2006 (book)

Abstract
Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics. The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed both from a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines and others. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.

ei

Web [BibTex]



Statistical Learning of LQG controllers

Theodorou, E.

Technical Report-2006-1, Computational Action and Vision Lab, University of Minnesota, 2006, clmc (techreport)

am

PDF [BibTex]
