Design of a visualization scheme for functional connectivity data of Human Brain

Bramlage, L.

Hochschule Osnabrück - University of Applied Sciences, 2017 (thesis)

sf

Bramlage_BSc_2017.pdf [BibTex]

2007


Bayesian Estimators for Robins-Ritov’s Problem

Harmeling, S., Toussaint, M.

(EDI-INF-RR-1189), School of Informatics, University of Edinburgh, October 2007 (techreport)

Abstract
Bayesian or likelihood-based approaches to data analysis have become very popular in the field of Machine Learning. However, there exist theoretical results which question the general applicability of such approaches; among those is a result by Robins and Ritov, who introduce a specific example for which they prove that a likelihood-based estimator will fail (i.e., in certain cases it does not converge to a true parameter estimate, even given infinite data). In this paper we consider various approaches to formulating likelihood-based estimators in this example, basically by considering various extensions of the presumed generative model of the data. We can derive estimators which are very similar to the classical Horvitz-Thompson estimator and which also account for a priori knowledge of an observation probability function.
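For readers unfamiliar with the classical Horvitz-Thompson estimator mentioned in the abstract, the following is a minimal sketch of its textbook form, not the estimators derived in the report. It assumes each outcome is observed only with a known probability; the function name `horvitz_thompson_mean` and the argument layout are hypothetical.

```python
from typing import Optional, Sequence


def horvitz_thompson_mean(
    outcomes: Sequence[Optional[float]],
    observation_probs: Sequence[float],
) -> float:
    """Estimate the population mean from partially observed outcomes.

    outcomes[i] is None when unit i was not observed; observation_probs[i]
    is the (assumed known) probability that unit i is observed.
    """
    if len(outcomes) != len(observation_probs):
        raise ValueError("outcomes and observation_probs must have equal length")
    total = 0.0
    for y, p in zip(outcomes, observation_probs):
        if y is not None:
            # Inverse-probability weighting: rarely observed units count more.
            total += y / p
    # Divide by the full sample size, not by the number of observed units.
    return total / len(outcomes)


estimate = horvitz_thompson_mean([1.0, None, 1.0, None], [0.5, 0.5, 0.25, 0.25])
```

Dividing by the full sample size rather than the number of observed units is what makes the weighted sum unbiased when the observation probabilities are correct.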

ei

PDF [BibTex]

Learning with Transformation Invariant Kernels

Walder, C., Chapelle, O.

(165), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, September 2007 (techreport)

Abstract
This paper considers kernels invariant to translation, rotation and dilation. We show that no non-trivial positive definite (p.d.) kernels exist which are radial and dilation invariant, only conditionally positive definite (c.p.d.) ones. Accordingly, we discuss the c.p.d. case and provide some novel analysis, including an elementary derivation of a c.p.d. representer theorem. On the practical side, we give a support vector machine (s.v.m.) algorithm for arbitrary c.p.d. kernels. For the thin-plate kernel this leads to a classifier with only one parameter (the amount of regularisation), which we demonstrate to be as effective as an s.v.m. with the Gaussian kernel, even though the Gaussian involves a second parameter (the length scale).

ei

PDF [BibTex]

Scalable Semidefinite Programming using Convex Perturbations

Kulis, B., Sra, S., Jegelka, S.

(TR-07-47), University of Texas, Austin, TX, USA, September 2007 (techreport)

Abstract
Several important machine learning problems can be modeled and solved via semidefinite programs. Often, researchers invoke off-the-shelf software for the associated optimization, which can be inappropriate for many applications due to computational and storage requirements. In this paper, we introduce the use of convex perturbations for semidefinite programs (SDPs). Using a particular perturbation function, we arrive at an algorithm for SDPs that has several advantages over existing techniques: a) it is simple, requiring only a few lines of MATLAB, b) it is a first-order method which makes it scalable, c) it can easily exploit the structure of a particular SDP to gain efficiency (e.g., when the constraint matrices are low-rank). We demonstrate on several machine learning applications that the proposed algorithm is effective in finding fast approximations to large-scale SDPs.

ei

PDF [BibTex]

Sparse Multiscale Gaussian Process Regression

Walder, C., Kim, K., Schölkopf, B.

(162), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, August 2007 (techreport)

Abstract
Most existing sparse Gaussian process (g.p.) models seek computational advantages by basing their computations on a set of m basis functions that are the covariance function of the g.p. with one of its two inputs fixed. We generalise this for the case of a Gaussian covariance function, by basing our computations on m Gaussian basis functions with arbitrary diagonal covariance matrices (or length scales). For a fixed number of basis functions and any given criterion, this additional flexibility permits approximations no worse and typically better than was previously possible. Although we focus on g.p. regression, the central idea is applicable to all kernel-based algorithms, such as the support vector machine. We perform gradient-based optimisation of the marginal likelihood, which costs O(m²n) time where n is the number of data points, and compare the method to various other sparse g.p. methods. Our approach outperforms the other methods, particularly for the case of very few basis functions, i.e. a very high sparsity ratio.

ei

PDF [BibTex]

Efficient Subwindow Search for Object Localization

Blaschko, M., Hofmann, T., Lampert, C.

(164), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, August 2007 (techreport)

Abstract
Recent years have seen huge advances in object recognition from images. Recognition rates beyond 95% are the rule rather than the exception on many datasets. However, most state-of-the-art methods can only decide if an object is present or not. They are not able to provide information on the object location or extent within the image. We report on a simple yet powerful scheme that extends many existing recognition methods to also perform localization of object bounding boxes. This is achieved by maximizing the classification score over all possible subrectangles in the image. Despite the impression that this would be computationally intractable, we show that in many situations efficient algorithms exist which solve a generalized maximum subrectangle problem. We show how our method is applicable to a variety of object detection frameworks and demonstrate its performance by applying it to the popular bag of visual words model, achieving competitive results on the PASCAL VOC 2006 dataset.
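To make the maximum subrectangle idea concrete, here is a minimal sketch of the classical maximum-sum subrectangle search (a row-pair sweep with Kadane's algorithm). It is not the report's algorithm; it assumes, purely for illustration, that the score of a box is the sum of per-cell scores on a grid, and the function name `max_sum_subrectangle` is hypothetical.

```python
from typing import List, Sequence, Tuple


def max_sum_subrectangle(
    scores: Sequence[Sequence[float]],
) -> Tuple[float, int, int, int, int]:
    """Return (best_sum, top, left, bottom, right) with inclusive indices."""
    rows, cols = len(scores), len(scores[0])
    best = (float(scores[0][0]), 0, 0, 0, 0)
    for top in range(rows):
        # col_sums[c] holds the sum of column c over rows top..bottom.
        col_sums: List[float] = [0.0] * cols
        for bottom in range(top, rows):
            for c in range(cols):
                col_sums[c] += scores[bottom][c]
            # Kadane's algorithm: best contiguous run of columns.
            run, start = 0.0, 0
            for c in range(cols):
                if run <= 0:
                    run, start = col_sums[c], c
                else:
                    run += col_sums[c]
                if run > best[0]:
                    best = (run, top, start, bottom, c)
    return best


grid = [
    [-1, -1, -1],
    [-1, 2, 3],
    [-1, 4, -5],
]
result = max_sum_subrectangle(grid)
```

The sweep runs in time proportional to rows² × cols, far fewer steps than enumerating every rectangle and summing its cells from scratch.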

ei

PDF [BibTex]

Cluster Identification in Nearest-Neighbor Graphs

Maier, M., Hein, M., von Luxburg, U.

(163), Max-Planck-Institute for Biological Cybernetics, Tübingen, Germany, May 2007 (techreport)

Abstract
Assume we are given a sample of points from some underlying distribution which contains several distinct clusters. Our goal is to construct a neighborhood graph on the sample points such that clusters are "identified": that is, the subgraph induced by points from the same cluster is connected, while subgraphs corresponding to different clusters are not connected to each other. We derive bounds on the probability that cluster identification is successful, and use them to predict "optimal" values of k for the mutual and symmetric k-nearest-neighbor graphs. We point out different properties of the mutual and symmetric nearest-neighbor graphs related to the cluster identification problem.
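The two graph types named in the abstract can be illustrated with a minimal sketch using the usual definitions: the symmetric k-nearest-neighbor graph joins two points if either is among the other's k nearest neighbors, the mutual graph only if both are. The function names and the use of Euclidean distance are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Sequence[float]
Edge = Tuple[int, int]


def knn_sets(points: Sequence[Point], k: int) -> List[Set[int]]:
    """For each point, the indices of its k nearest other points."""
    neighbours = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(others[:k]))
    return neighbours


def knn_graphs(points: Sequence[Point], k: int) -> Tuple[Set[Edge], Set[Edge]]:
    """Return (symmetric_edges, mutual_edges) as sets of index pairs (i < j)."""
    nn = knn_sets(points, k)
    symmetric: Set[Edge] = set()
    mutual: Set[Edge] = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            i_to_j, j_to_i = j in nn[i], i in nn[j]
            if i_to_j or j_to_i:
                symmetric.add((i, j))
            if i_to_j and j_to_i:
                mutual.add((i, j))
    return symmetric, mutual


symmetric, mutual = knn_graphs([(0.0,), (1.0,), (3.0,), (10.0,)], k=1)
```

The mutual graph is always a subgraph of the symmetric one, so it separates clusters more readily but is also more likely to leave a single cluster disconnected.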

ei

PDF [BibTex]

Exploring model selection techniques for nonlinear dimensionality reduction

Harmeling, S.

(EDI-INF-RR-0960), School of Informatics, University of Edinburgh, March 2007 (techreport)

Abstract
Nonlinear dimensionality reduction (NLDR) methods have become useful tools for practitioners who are faced with the analysis of high-dimensional data. Of course, not all NLDR methods are equally applicable to a particular dataset at hand. Thus it would be useful to come up with model selection criteria that help to choose among different NLDR algorithms. This paper explores various approaches to this problem and evaluates them on controlled data sets. Comprehensive experiments will show that model selection scores based on stability are not useful, while scores based on Gaussian processes are helpful for the NLDR problem.

ei

PDF Web [BibTex]

Dirichlet Mixtures of Bayesian Linear Gaussian State-Space Models: a Variational Approach

Chiappa, S., Barber, D.

(161), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, March 2007 (techreport)

Abstract
We describe two related models to cluster multidimensional time-series under the assumption of an underlying linear Gaussian dynamical process. In the first model, time-series are assigned to the same cluster when they show global similarity in their dynamics, while in the second model time-series are assigned to the same cluster when they show simultaneous similarity. Both models are based on Dirichlet Mixtures of Bayesian Linear Gaussian State-Space Models in order to (semi) automatically determine an appropriate number of components in the mixture, and to additionally bias the components to a parsimonious parameterization. The resulting models are formally intractable and to deal with this we describe a deterministic approximation based on a novel implementation of Variational Bayes.

ei

PDF [BibTex]

Modeling data using directional distributions: Part II

Sra, S., Jain, P., Dhillon, I.

(TR-07-05), University of Texas, Austin, TX, USA, February 2007 (techreport)

Abstract
High-dimensional data is central to most data mining applications, and only recently has it been modeled via directional distributions. In [Banerjee et al., 2003] the authors introduced the use of the von Mises-Fisher (vMF) distribution for modeling high-dimensional directional data, particularly for text and gene expression analysis. The vMF distribution is one of the simplest directional distributions. The Watson, Bingham, and Fisher-Bingham distributions provide distributions with an increasing number of parameters and thereby commensurately increased modeling power. This report provides a followup study to the initial development in [Banerjee et al., 2003] by presenting Expectation Maximization (EM) procedures for estimating parameters of a mixture of Watson (moW) distributions. The numerical challenges associated with parameter estimation for both of these distributions are significantly more difficult than for the vMF distribution. We develop new numerical approximations for estimating the parameters permitting us to model real-life data more accurately. Our experimental results establish that for certain data sets improved modeling power translates into better results.

ei

PDF [BibTex]

Automatic 3D Face Reconstruction from Single Images or Video

Breuer, P., Kim, K., Kienzle, W., Blanz, V., Schölkopf, B.

(160), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, February 2007 (techreport)

Abstract
This paper presents a fully automated algorithm for reconstructing a textured 3D model of a face from a single photograph or a raw video stream. The algorithm is based on a combination of Support Vector Machines (SVMs) and a Morphable Model of 3D faces. After SVM face detection, individual facial features are detected using a novel regression- and classification-based approach, and probabilistically plausible configurations of features are selected to produce a list of candidates for several facial feature positions. In the next step, the configurations of feature points are evaluated using a novel criterion that is based on a Morphable Model and a combination of linear projections. Finally, the feature points initialize a model-fitting procedure of the Morphable Model. The result is a high-resolution 3D surface model.

ei

PDF [BibTex]

Relative Entropy Policy Search

Peters, J.

CLMC Technical Report: TR-CLMC-2007-2, Computational Learning and Motor Control Lab, Los Angeles, CA, 2007, clmc (techreport)

Abstract
This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems.

am ei

PDF link (url) [BibTex]

Denoising archival films using a learned Bayesian model

Moldovan, T. M., Roth, S., Black, M. J.

(CS-07-03), Brown University, Department of Computer Science, 2007 (techreport)

ps

pdf [BibTex]

Learning an Outlier-Robust Kalman Filter

Ting, J., Theodorou, E., Schaal, S.

CLMC Technical Report: TR-CLMC-2007-1, Los Angeles, CA, 2007, clmc (techreport)

Abstract
We introduce a modified Kalman filter that performs robust, real-time outlier detection, without the need for manual parameter tuning by the user. Systems that rely on high quality sensory data (for instance, robotic systems) can be sensitive to data containing outliers. The standard Kalman filter is not robust to outliers, and other variations of the Kalman filter have been proposed to overcome this issue. However, these methods may require manual parameter tuning, use of heuristics or complicated parameter estimation procedures. Our Kalman filter uses a weighted least squares-like approach by introducing weights for each data sample. A data sample with a smaller weight has a weaker contribution when estimating the current time step's state. Using an incremental variational Expectation-Maximization framework, we learn the weights and system dynamics. We evaluate our Kalman filter algorithm on data from a robotic dog.

am

PDF [BibTex]


2005


Popper, Falsification and the VC-dimension

Corfield, D., Schölkopf, B., Vapnik, V.

(145), Max Planck Institute for Biological Cybernetics, November 2005 (techreport)

ei

PDF [BibTex]

A Combinatorial View of Graph Laplacians

Huang, J.

(144), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, August 2005 (techreport)

Abstract
Discussions about different graph Laplacians, mainly the normalized and unnormalized versions, have been ardent with respect to various methods in clustering and graph-based semi-supervised learning. Previous research on graph Laplacians investigated their convergence properties to Laplacian operators on continuous manifolds. There is still no strong proof on convergence for the normalized Laplacian. In this paper, we analyze different variants of graph Laplacians directly from the ways of solving the original graph partitioning problem. The graph partitioning problem is a well-known combinatorial NP-hard optimization problem. The spectral solutions provide evidence that the normalized Laplacian encodes more reasonable considerations for graph partitioning. We also provide some examples to show their differences.
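As a reminder of the two objects being compared, the sketch below builds the unnormalized Laplacian L = D - W and the symmetric normalized Laplacian I - D^(-1/2) W D^(-1/2) from a weight matrix, using their standard definitions. The symmetric form is only one common normalization (a random-walk variant also exists), and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def unnormalized_laplacian(weights: Sequence[Sequence[float]]) -> Matrix:
    """L = D - W, where D is the diagonal matrix of vertex degrees."""
    n = len(weights)
    degrees = [sum(row) for row in weights]
    return [
        [(degrees[i] if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def normalized_laplacian(weights: Sequence[Sequence[float]]) -> Matrix:
    """L_sym = I - D^(-1/2) W D^(-1/2); assumes every degree is positive."""
    n = len(weights)
    degrees = [sum(row) for row in weights]
    return [
        [
            (1.0 if i == j else 0.0)
            - weights[i][j] / math.sqrt(degrees[i] * degrees[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Path graph on three vertices: 0 - 1 - 2.
path = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
lap = unnormalized_laplacian(path)
lap_sym = normalized_laplacian(path)
```

In the unnormalized matrix every row sums to zero, while the normalized one rescales each edge by the degrees of its endpoints, which is where the two variants start to weigh partitions differently.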

ei

[BibTex]

Beyond Pairwise Classification and Clustering Using Hypergraphs

Zhou, D., Huang, J., Schölkopf, B.

(143), Max Planck Institute for Biological Cybernetics, August 2005 (techreport)

Abstract
In many applications, relationships among objects of interest are more complex than pairwise. Simply approximating complex relationships as pairwise ones can lead to loss of information. An alternative for these applications is to analyze complex relationships among data directly, without the need to first represent the complex relationships into pairwise ones. A natural way to describe complex relationships is to use hypergraphs. A hypergraph is a graph in which edges can connect more than two vertices. Thus we consider learning from a hypergraph, and develop a general framework which is applicable to classification and clustering for complex relational data. We have applied our framework to real-world web classification problems and obtained encouraging results.

ei

PDF [BibTex]

Generalized Nonnegative Matrix Approximations using Bregman Divergences

Sra, S., Dhillon, I.

Univ. of Texas at Austin, June 2005 (techreport)

ei

[BibTex]

Measuring Statistical Dependence with Hilbert-Schmidt Norms

Gretton, A., Bousquet, O., Smola, A., Schölkopf, B.

(140), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, June 2005 (techreport)

Abstract
We propose an independence criterion based on the eigenspectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator (we term this a Hilbert-Schmidt Independence Criterion, or HSIC). This approach has several advantages, compared with previous kernel-based independence criteria. First, the empirical estimate is simpler than any other kernel dependence test, and requires no user-defined regularisation. Second, there is a clearly defined population quantity which the empirical estimate approaches in the large sample limit, with exponential convergence guaranteed between the two: this ensures that independence tests based on HSIC do not suffer from slow learning rates. Finally, we show in the context of independent component analysis (ICA) that the performance of HSIC is competitive with that of previously published kernel-based criteria, and of other recently published ICA methods.
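A minimal sketch of the empirical estimate may help here. It uses the commonly quoted biased form tr(KHLH) / (n-1)², where K and L are Gram matrices of the two samples and H is the centering matrix. The Gaussian kernel, the shared bandwidth `sigma`, the restriction to scalar samples and all function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def gaussian_gram(xs: Sequence[float], sigma: float = 1.0) -> Matrix:
    """Gram matrix of a Gaussian kernel on scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2 * sigma**2)) for b in xs] for a in xs]


def center(gram: Matrix) -> Matrix:
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    grand_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs: Sequence[float], ys: Sequence[float], sigma: float = 1.0) -> float:
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2."""
    n = len(xs)
    k_centered = center(gaussian_gram(xs, sigma))
    l_gram = gaussian_gram(ys, sigma)
    # tr(HKH L) equals the elementwise sum because both matrices are symmetric.
    trace = sum(k_centered[i][j] * l_gram[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


xs = [0.0, 1.0, 2.0, 3.0]
dependent = hsic(xs, xs)             # y is a copy of x
constant = hsic(xs, [0.0] * len(xs))  # y carries no information about x
```

Apart from the kernel bandwidth, the estimate has no regularisation parameter to tune, which matches the simplicity claim in the abstract.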

ei

PDF [BibTex]

Consistency of Kernel Canonical Correlation Analysis

Fukumizu, K., Bach, F., Gretton, A.

(942), Institute of Statistical Mathematics, 4-6-7 Minami-azabu, Minato-ku, Tokyo 106-8569 Japan, June 2005 (techreport)

ei

PDF [BibTex]

Approximate Inference for Robust Gaussian Process Regression

Kuss, M., Pfingsten, T., Csato, L., Rasmussen, C.

(136), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, 2005 (techreport)

Abstract
Gaussian process (GP) priors have been successfully used in non-parametric Bayesian regression and classification models. Inference can be performed analytically only for the regression model with Gaussian noise. For all other likelihood models inference is intractable and various approximation techniques have been proposed. In recent years expectation-propagation (EP) has been developed as a general method for approximate inference. This article provides a general summary of how expectation-propagation can be used for approximate inference in Gaussian process models. Furthermore we present a case study describing its implementation for a new robust variant of Gaussian process regression. To gain further insights into the quality of the EP approximation we present experiments in which we compare to results obtained by Markov chain Monte Carlo (MCMC) sampling.

ei

PDF [BibTex]

Maximum-Margin Feature Combination for Detection and Categorization

Bakır, G., Wu, M., Eichhorn, J.

Max Planck Institute for Biological Cybernetics, Tübingen, Germany, 2005 (techreport)

Abstract
In this paper we are concerned with the optimal combination of features of possibly different types for detection and estimation tasks in machine vision. We propose to combine features such that the resulting classifier maximizes the margin between classes. In contrast to existing approaches which are non-convex and/or generative, we propose to use a discriminative model leading to a convex problem formulation and complexity control. Furthermore we assert that decision functions should not compare apples and oranges by comparing features of different types directly. Instead we propose to combine different similarity measures for each different feature type. Furthermore we argue that the question “Which feature type is more discriminative for task X?” is ill-posed and show empirically that the answer to this question might depend on the complexity of the decision function.

ei

PDF [BibTex]

Towards a Statistical Theory of Clustering

von Luxburg, U., Ben-David, S.

Presented at the PASCAL workshop on clustering, London, 2005 (techreport)

Abstract
The goal of this paper is to discuss statistical aspects of clustering in a framework where the data to be clustered has been sampled from some unknown probability distribution. Firstly, the clustering of the data set should reveal some structure of the underlying data rather than model artifacts due to the random sampling process. Secondly, the more sample points we have, the more reliable the clustering should be. We discuss which methods can and cannot be used to tackle those problems. In particular we argue that generalization bounds as they are used in statistical learning theory of classification are unsuitable in a general clustering framework. We suggest that the main replacements of generalization bounds should be convergence proofs and stability considerations. This paper should be considered as a road map paper which identifies important questions and potentially fruitful directions for future research about statistical clustering. We do not attempt to present a complete statistical theory of clustering.

ei

PDF [BibTex]

Approximate Bayesian Inference for Psychometric Functions using MCMC Sampling

Kuss, M., Jäkel, F., Wichmann, F.

(135), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, 2005 (techreport)

Abstract
In psychophysical studies the psychometric function is used to model the relation between the physical stimulus intensity and the observer's ability to detect or discriminate between stimuli of different intensities. In this report we propose the use of Bayesian inference to extract the information contained in experimental data and estimate the parameters of psychometric functions. Since Bayesian inference cannot be performed analytically, we describe how a Markov chain Monte Carlo method can be used to generate samples from the posterior distribution over parameters. These samples are used to estimate Bayesian confidence intervals and other characteristics of the posterior distribution. In addition we discuss the parameterisation of psychometric functions and the role of prior distributions in the analysis. The proposed approach is exemplified using artificially generated data and in a case study for real experimental data. Furthermore, we compare our approach with traditional methods based on maximum-likelihood parameter estimation combined with bootstrap techniques for confidence interval estimation. The appendix provides a description of an implementation for the R environment for statistical computing and provides the code for reproducing the results discussed in the experiment section.

ei

PDF [BibTex]

Linear and Nonlinear Estimation models applied to Hemodynamic Model

Theodorou, E.

Technical Report-2005-1, Computational Action and Vision Lab University of Minnesota, 2005, clmc (techreport)

Abstract
The relation between the BOLD signal and neural activity is still poorly understood. The Gaussian Linear Model, known as GLM, is broadly used in many fMRI data analyses for recovering the underlying neural activity. Although GLM has proved to be a really useful tool for analyzing fMRI data, it cannot be used for describing the complex biophysical process of neural metabolism. In this technical report we make use of a system of Stochastic Differential Equations (SDEs) that is based on the Buxton model [1] for describing the underlying computational principles of the hemodynamic process. Based on this SDE system we built a Kalman Filter estimator so as to estimate the induced neural signal as well as the blood inflow under physiologic and sensor noise. The performance of the Kalman Filter estimator is investigated under different physiologic noise characteristics and measurement frequencies.

am

PDF [BibTex]


1998


Generalization bounds and learning rates for Regularized principal manifolds

Smola, A., Williamson, R., Schölkopf, B.

NeuroCOLT, 1998, NeuroColt2-TR 1998-027 (techreport)

ei

[BibTex]

Generalization Bounds for Convex Combinations of Kernel Functions

Smola, A., Williamson, R., Schölkopf, B.

Royal Holloway College, 1998 (techreport)

ei

[BibTex]

Generalization Performance of Regularization Networks and Support Vector Machines via Entropy Numbers of Compact Operators

Williamson, R., Smola, A., Schölkopf, B.

(19), NeuroCOLT, 1998, Accepted for publication in IEEE Transactions on Information Theory (techreport)

ei

[BibTex]

Quantization Functionals and Regularized Principal Manifolds

Smola, A., Mika, S., Schölkopf, B.

NeuroCOLT, 1998, NC2-TR-1998-028 (techreport)

ei

[BibTex]

Support Vector Machine Reference Manual

Saunders, C., Stitson, M., Weston, J., Bottou, L., Schölkopf, B., Smola, A.

(CSD-TR-98-03), Department of Computer Science, Royal Holloway, University of London, 1998 (techreport)

ei

PostScript [BibTex]


1997


Homing by parameterized scene matching

Franz, M., Schölkopf, B., Bülthoff, H.

(46), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, February 1997 (techreport)

Abstract
In visual homing tasks, animals as well as robots can compute their movements from the current view and a snapshot taken at a home position. Solving this problem exactly would require knowledge about the distances to visible landmarks, information that is not directly available to passive vision systems. We propose a homing scheme that dispenses with accurate distance information by using parameterized disparity fields. These are obtained from an approximation that incorporates prior knowledge about perspective distortions of the visual environment. A mathematical analysis proves that the approximation does not prevent the scheme from approaching the goal with arbitrary accuracy. Mobile robot experiments are used to demonstrate the practical feasibility of the approach.

ei

[BibTex]
