2020


A Gamified App that Helps People Overcome Self-Limiting Beliefs by Promoting Metacognition

Amo, V., Lieder, F.

SIG 8 Meets SIG 16, September 2020 (conference) Accepted

Abstract
Previous research has shown that approaching learning with a growth mindset is key for maintaining motivation and overcoming setbacks. Mindsets are systems of beliefs that people hold to be true. They influence a person's attitudes, thoughts, and emotions when they learn something new or encounter challenges. In clinical psychology, metareasoning (reflecting on one's mental processes) and meta-awareness (recognizing thoughts as mental events instead of equating them to reality) have proven effective for overcoming maladaptive thinking styles. Hence, they are potentially an effective method for overcoming self-limiting beliefs in other domains as well. However, the potential of integrating assisted metacognition into mindset interventions has not been explored yet. Here, we propose that guiding and training people on how to leverage metareasoning and meta-awareness for overcoming self-limiting beliefs can significantly enhance the effectiveness of mindset interventions. To test this hypothesis, we develop a gamified mobile application that guides and trains people to use metacognitive strategies based on Cognitive Restructuring (CR) and Acceptance and Commitment Therapy (ACT) techniques. The application helps users identify and overcome self-limiting beliefs by working with the aversive emotions that fixed mindsets trigger in real-life situations. Our app aims to help people sustain their motivation to learn when they face inner obstacles (e.g., anxiety, frustration, and demotivation). We expect the application to be an effective tool for helping people better understand and develop the metacognitive skills of emotion regulation and self-regulation that are needed to overcome self-limiting beliefs and develop growth mindsets.

re

A gamified app that helps people overcome self-limiting beliefs by promoting metacognition [BibTex]


Kernel Conditional Moment Test via Maximum Moment Restriction

Muandet, K., Jitkrittum, W., Kübler, J. M.

Proceedings of the 36th International Conference on Uncertainty in Artificial Intelligence (UAI), August 2020 (conference) Accepted

ei

[BibTex]



Bayesian Online Prediction of Change Points

Agudelo-España, D., Gomez-Gonzalez, S., Bauer, S., Schölkopf, B., Peters, J.

Proceedings of the 36th International Conference on Uncertainty in Artificial Intelligence (UAI), August 2020 (conference) Accepted

ei

[BibTex]



How to navigate everyday distractions: Leveraging optimal feedback to train attention control

Wirzberger, M., Lado, A., Eckerstorfer, L., Oreshnikov, I., Passy, J., Stock, A., Shenhav, A., Lieder, F.

Annual Meeting of the Cognitive Science Society, July 2020 (conference) Accepted

Abstract
To stay focused on their chosen tasks, people have to inhibit distractions. The underlying attention control skills can improve through reinforcement learning, which can be accelerated by giving feedback. We applied the theory of metacognitive reinforcement learning to develop a training app that gives people optimal feedback on their attention control while they are working or studying. In an eight-day field experiment with 99 participants, we investigated the effect of this training on people's productivity, sustained attention, and self-control. Compared to a control condition without feedback, we found that participants receiving optimal feedback learned to focus increasingly better (f = .08, p < .01) and achieved higher productivity scores (f = .19, p < .01) during the training. In addition, they evaluated their productivity more accurately (r = .12, p < .01). However, due to asymmetric attrition problems, these findings need to be taken with a grain of salt.

re sf

How to navigate everyday distractions: Leveraging optimal feedback to train attention control DOI Project Page [BibTex]


Algorithmic Recourse: from Counterfactual Explanations to Interventions

Karimi, A., Schölkopf, B., Valera, I.

37th International Conference on Machine Learning (ICML), July 2020 (conference) Submitted

ei plg

[BibTex]



Leveraging Machine Learning to Automatically Derive Robust Planning Strategies from Biased Models of the Environment

Kemtur, A., Jain, Y. R., Mehta, A., Callaway, F., Consul, S., Stojcheski, J., Lieder, F.

CogSci 2020, July 2020, Anirudha Kemtur and Yash Raj Jain contributed equally to this publication. (conference)

Abstract
Teaching clever heuristics is a promising approach to improving decision-making. We can leverage machine learning to discover clever strategies automatically. Current methods require an accurate model of the decision problems people face in real life, but most models are misspecified because of limited information and cognitive biases. To address this problem, we develop strategy discovery methods that are robust to model misspecification. Robustness is achieved by modeling model misspecification and handling uncertainty about the real world according to Bayesian inference. We translate our methods into an intelligent tutor that automatically discovers and teaches robust planning strategies. Our robust cognitive tutor significantly improved human decision-making when the model was so biased that conventional cognitive tutors were no longer effective. These findings highlight that our robust strategy discovery methods are a significant step towards leveraging artificial intelligence to improve human decision-making in the real world.

re

Project Page [BibTex]



Model-Agnostic Counterfactual Explanations for Consequential Decisions

Karimi, A., Barthe, G., Balle, B., Valera, I.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), June 2020 (conference) Accepted

ei plg

arXiv [BibTex]



Actively Learning Gaussian Process Dynamics

Buisson-Fenet, M., Solowjow, F., Trimpe, S.

2nd Annual Conference on Learning for Dynamics and Control, June 2020 (conference) Accepted

Abstract
Despite the availability of ever more data enabled through modern sensor and computer technology, it still remains an open problem to learn dynamical systems in a sample-efficient way. We propose active learning strategies that leverage information-theoretical properties arising naturally during Gaussian process regression, while respecting constraints on the sampling process imposed by the system dynamics. Sample points are selected in regions with high uncertainty, leading to exploratory behavior and data-efficient training of the model. All results are verified in an extensive numerical benchmark.
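The sketch below illustrates the core of such an active-learning loop in its simplest form: fit a Gaussian process to the data collected so far, then query the candidate input with the highest predictive uncertainty. It is a minimal, hypothetical example (greedy variance maximization on a toy 1D system), not the paper's information-theoretic criterion, which additionally respects the sampling constraints imposed by the system dynamics.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def true_dynamics(x):
    # Hypothetical 1D system standing in for the real dynamics.
    return np.sin(3 * x) + 0.1 * np.random.randn(*x.shape)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(3, 1))            # small initial dataset
y = true_dynamics(X).ravel()

candidates = np.linspace(-2, 2, 200).reshape(-1, 1)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-2)

for _ in range(10):                            # active-learning loop
    gp.fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    x_next = candidates[[np.argmax(std)]]      # most uncertain input is queried
    X = np.vstack([X, x_next])
    y = np.append(y, true_dynamics(x_next).ravel())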

ics

arXiv [BibTex]



GENTEL: GENerating Training data Efficiently for Learning to segment medical images

Thakur, R. P., Rocamora, S. P., Goel, L., Pohmann, R., Machann, J., Black, M. J.

Congrès Reconnaissance des Formes, Image, Apprentissage et Perception (RFAIP), June 2020 (conference)

Abstract
Accurately segmenting MRI images is crucial for many clinical applications. However, manually segmenting images with accurate pixel precision is a tedious and time-consuming task. In this paper we present a simple, yet effective method to improve the efficiency of the image segmentation process. We propose to transform the image annotation task into a binary choice task. We start by using classical image processing algorithms with different parameter values to generate multiple, different segmentation masks for each input MRI image. Then, instead of segmenting the pixels of the images, the user only needs to decide whether a segmentation is acceptable or not. This method allows us to efficiently obtain high-quality segmentations with minor human intervention. With the selected segmentations, we train a state-of-the-art neural network model. For the evaluation, we use a second MRI dataset (1.5T Dataset), acquired with a different protocol and containing annotations. We show that the trained network i) is able to automatically segment cases where none of the classical methods obtain a high-quality result; ii) generalizes to the second MRI dataset, which was acquired with a different protocol and was never seen at training time; and iii) enables detection of mis-annotations in this second dataset. Quantitatively, the trained network obtains very good results: Dice score (mean 0.98, median 0.99) and Hausdorff distance in pixels (mean 4.7, median 2.0).
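As a rough illustration of the binary-choice idea, the hypothetical sketch below generates candidate masks by sweeping classical thresholding methods and a morphology parameter, and keeps whichever mask a human rater accepts. All function names and the accept/reject callback are illustrative assumptions, not the paper's implementation.

import numpy as np
from skimage import filters, morphology

def candidate_masks(image):
    # Yield binary masks from several classical segmentations with varied parameters.
    for thresh_fn in (filters.threshold_otsu, filters.threshold_li, filters.threshold_yen):
        mask = image > thresh_fn(image)
        yield mask
        for radius in (2, 4):                  # vary a morphology parameter
            yield morphology.binary_opening(mask, morphology.disk(radius))

def collect_training_data(images, is_acceptable):
    # Keep (image, mask) pairs the rater accepts, one yes/no decision per mask.
    accepted = []
    for img in images:
        for mask in candidate_masks(img):
            if is_acceptable(img, mask):       # human-in-the-loop binary choice
                accepted.append((img, mask))
                break                          # one accepted mask per image suffices
    return accepted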

ps

[BibTex]



Learning to Dress 3D People in Generative Clothing

Ma, Q., Yang, J., Ranjan, A., Pujades, S., Pons-Moll, G., Tang, S., Black, M. J.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
Three-dimensional human body models are widely used in the analysis of human pose and motion. Existing models, however, are learned from minimally-clothed 3D scans and thus do not generalize to the complexity of dressed people in common images and videos. Additionally, current models lack the expressive power needed to represent the complex non-linear geometry of pose-dependent clothing shape. To address this, we learn a generative 3D mesh model of clothed people from 3D scans with varying pose and clothing. Specifically, we train a conditional Mesh-VAE-GAN to learn the clothing deformation from the SMPL body model, making clothing an additional term on SMPL. Our model is conditioned on both pose and clothing type, giving the ability to draw samples of clothing to dress different body shapes in a variety of styles and poses. To preserve wrinkle detail, our Mesh-VAE-GAN extends patchwise discriminators to 3D meshes. Our model, named CAPE, represents global shape and fine local structure, effectively extending the SMPL body model to clothing. To our knowledge, this is the first generative model that directly dresses 3D human body meshes and generalizes to different poses.

ps

arXiv project page code [BibTex]


A Continuous-time Perspective for Modeling Acceleration in Riemannian Optimization

Alimisis, F., Orvieto, A., Becigneul, G., Lucchi, A.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), June 2020 (conference) Accepted

ei

[BibTex]



Generating 3D People in Scenes without People

Zhang, Y., Hassan, M., Neumann, H., Black, M. J., Tang, S.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
We present a fully automatic system that takes a 3D scene and generates plausible 3D human bodies that are posed naturally in that 3D scene. Given a 3D scene without people, humans can easily imagine how people could interact with the scene and the objects in it. However, this is a challenging task for a computer as solving it requires (1) the generated human bodies to be semantically plausible within the 3D environment (e.g. people sitting on the sofa or cooking near the stove), and (2) the generated human-scene interaction to be physically feasible such that the human body and scene do not interpenetrate while, at the same time, body-scene contact supports physical interactions. To that end, we make use of the surface-based 3D human model SMPL-X. We first train a conditional variational autoencoder to predict semantically plausible 3D human poses conditioned on latent scene representations, then we further refine the generated 3D bodies using scene constraints to enforce feasible physical interaction. We show that our approach is able to synthesize realistic and expressive 3D human bodies that naturally interact with the 3D environment. We perform extensive experiments demonstrating that our generative framework compares favorably with existing methods, both qualitatively and quantitatively. We believe that our scene-conditioned 3D human generation pipeline will be useful for numerous applications, e.g. to generate training data for human pose estimation, in video games, and in VR/AR. Our project page for data and code is at https://vlg.inf.ethz.ch/projects/PSI/.

ps

Code PDF [BibTex]



Kernel Conditional Density Operators

Schuster, I., Mollenhauer, M., Klus, S., Muandet, K.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), Proceedings of Machine Learning Research, June 2020 (conference) Accepted

ei

[BibTex]



Learning Constrained Dynamics with Gauss Principle adhering Gaussian Processes

Geist, A. R., Trimpe, S.

In 2nd Annual Conference on Learning for Dynamics and Control, June 2020 (inproceedings) Accepted

Abstract
The identification of the constrained dynamics of mechanical systems is often challenging. Learning methods promise to ease such analysis, but require considerable amounts of data for training. We propose to combine insights from analytical mechanics with Gaussian process regression to improve the model's data efficiency and constraint integrity. The result is a Gaussian process model that incorporates a priori constraint knowledge such that its predictions adhere to Gauss' principle of least constraint. In return, predictions of the system's acceleration naturally respect potentially non-ideal (non-)holonomic equality constraints. As corollary results, our model makes it possible to infer the acceleration of the unconstrained system from data of the constrained system, and enables knowledge transfer between differing constraint configurations.
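For context, Gauss's principle of least constraint, in its standard (Udwadia-Kalaba) textbook form, states that the realized acceleration is the feasible acceleration closest to the unconstrained one $a = M^{-1} F(q, \dot{q})$ in the mass-weighted metric. This is background for the abstract above, not necessarily the exact parameterization used in the paper's Gaussian process model. With equality constraints $A(q, \dot{q})\, \ddot{q} = b(q, \dot{q})$:

$$\ddot{q} = \arg\min_{z} \; (z - a)^\top M \, (z - a) \quad \text{s.t.} \quad A z = b,$$

whose closed-form solution is

$$\ddot{q} = a + M^{-1/2} \big( A M^{-1/2} \big)^{+} \, (b - A a).$$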

ics

arXiv preprint [BibTex]



Where Does It End? - Reasoning About Hidden Surfaces by Object Intersection Constraints

Strecke, M., Stückler, J.

In Proceedings IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR) 2020, June 2020 (inproceedings)

ev

preprint project page [BibTex]



A Kernel Mean Embedding Approach to Reducing Conservativeness in Stochastic Programming and Control

Zhu, J., Diehl, M., Schölkopf, B.

2nd Annual Conference on Learning for Dynamics and Control (L4DC), June 2020 (conference) Accepted

ei

arXiv [BibTex]



Learning Physics-guided Face Relighting under Directional Light

Nestmeyer, T., Lalonde, J., Matthews, I., Lehrmann, A. M.

In Conference on Computer Vision and Pattern Recognition, IEEE/CVF, June 2020 (inproceedings) Accepted

Abstract
Relighting is an essential step in realistically transferring objects from a captured image into another environment. For example, authentic telepresence in Augmented Reality requires faces to be displayed and relit consistent with the observer's scene lighting. We investigate end-to-end deep learning architectures that both de-light and relight an image of a human face. Our model decomposes the input image into intrinsic components according to a diffuse physics-based image formation model. We enable non-diffuse effects including cast shadows and specular highlights by predicting a residual correction to the diffuse render. To train and evaluate our model, we collected a portrait database of 21 subjects with various expressions and poses. Each sample is captured in a controlled light stage setup with 32 individual light sources. Our method creates precise and believable relighting results and generalizes to complex illumination conditions and challenging poses, including when the subject is not looking straight at the camera.

ps

Paper [BibTex]



VIBE: Video Inference for Human Body Pose and Shape Estimation

Kocabas, M., Athanasiou, N., Black, M. J.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
Human motion is fundamental to understanding behavior. Despite progress on single-image 3D pose and shape estimation, existing video-based state-of-the-art methods fail to produce accurate and natural motion sequences due to a lack of ground-truth 3D motion data for training. To address this problem, we propose “Video Inference for Body Pose and Shape Estimation” (VIBE), which makes use of an existing large-scale motion capture dataset (AMASS) together with unpaired, in-the-wild, 2D keypoint annotations. Our key novelty is an adversarial learning framework that leverages AMASS to discriminate between real human motions and those produced by our temporal pose and shape regression networks. We define a temporal network architecture and show that adversarial training, at the sequence level, produces kinematically plausible motion sequences without in-the-wild ground-truth 3D labels. We perform extensive experimentation to analyze the importance of motion and demonstrate the effectiveness of VIBE on challenging 3D pose estimation datasets, achieving state-of-the-art performance. Code and pretrained models are available at https://github.com/mkocabas/VIBE
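The sketch below shows the shape of the sequence-level adversarial objective described above: a motion discriminator scores whole pose sequences, trained to separate real mocap (e.g. from AMASS) from regressor output. Module sizes and names are illustrative assumptions; the actual implementation lives in the linked repository.

import torch
import torch.nn as nn

POSE_DIM, HIDDEN = 72, 256                     # illustrative dimensions

class MotionDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(POSE_DIM, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, 1)

    def forward(self, poses):                  # poses: (batch, time, POSE_DIM)
        _, h = self.gru(poses)
        return self.head(h[-1])                # one real/fake score per sequence

disc = MotionDiscriminator()
bce = nn.BCEWithLogitsLoss()

def discriminator_loss(real_mocap, predicted):
    real, fake = disc(real_mocap), disc(predicted.detach())
    return bce(real, torch.ones_like(real)) + bce(fake, torch.zeros_like(fake))

def adversarial_loss(predicted):               # added to the regressor's training loss
    fake = disc(predicted)
    return bce(fake, torch.ones_like(fake))    # regressor tries to fool the critic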

ps

arXiv code video supplemental video [BibTex]


Disentangling Factors of Variations Using Few Labels

Locatello, F., Tschannen, M., Bauer, S., Rätsch, G., Schölkopf, B., Bachem, O.

8th International Conference on Learning Representations (ICLR), April 2020 (conference)

ei

arXiv link (url) [BibTex]



Mixed-curvature Variational Autoencoders

Skopek, O., Ganea, O., Becigneul, G.

8th International Conference on Learning Representations (ICLR), April 2020 (conference)

ei

link (url) [BibTex]



Non-linear interlinkages and key objectives amongst the Paris Agreement and the Sustainable Development Goals

Laumann, F., von Kügelgen, J., Barahona, M.

ICLR 2020 Workshop "Tackling Climate Change with Machine Learning", April 2020 (conference)

ei

arXiv PDF [BibTex]



From Variational to Deterministic Autoencoders

Ghosh*, P., Sajjadi*, M. S. M., Vergari, A., Black, M. J., Schölkopf, B.

8th International Conference on Learning Representations (ICLR), April 2020, *equal contribution (conference) Accepted

Abstract
Variational Autoencoders (VAEs) provide a theoretically-backed framework for deep generative models. However, they often produce “blurry” images, which is linked to their training objective. Sampling in the most popular implementation, the Gaussian VAE, can be interpreted as simply injecting noise to the input of a deterministic decoder. In practice, this simply enforces a smooth latent space structure. We challenge the adoption of the full VAE framework on this specific point in favor of a simpler, deterministic one. Specifically, we investigate how substituting stochasticity with other explicit and implicit regularization schemes can lead to a meaningful latent space without having to force it to conform to an arbitrarily chosen prior. To retrieve a generative mechanism for sampling new data points, we propose to employ an efficient ex-post density estimation step that can be readily adopted both for the proposed deterministic autoencoders as well as to improve sample quality of existing VAEs. We show in a rigorous empirical study that regularized deterministic autoencoding achieves state-of-the-art sample quality on the common MNIST, CIFAR-10 and CelebA datasets.
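To make the ex-post density estimation step concrete, the hypothetical sketch below trains a plain autoencoder with a simple latent regularizer, then fits a Gaussian mixture to the latent codes and samples from it to generate. The architectures, the regularizer weight, and the choice of a 10-component GMM are illustrative assumptions, not the paper's exact setup.

import torch
from sklearn.mixture import GaussianMixture

def train_step(encoder, decoder, x, optimizer, reg_weight=1e-2):
    z = encoder(x)
    recon = decoder(z)
    # Reconstruction plus an explicit latent regularizer in place of VAE noise.
    loss = ((recon - x) ** 2).mean() + reg_weight * (z ** 2).mean()
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

@torch.no_grad()
def fit_expost_density(encoder, data_loader, n_components=10):
    # Assumes the loader yields plain input tensors.
    codes = torch.cat([encoder(x) for x in data_loader]).cpu().numpy()
    return GaussianMixture(n_components=n_components).fit(codes)

@torch.no_grad()
def sample(decoder, gmm, n=16):
    z, _ = gmm.sample(n)                       # draw latents from the fitted density
    return decoder(torch.as_tensor(z, dtype=torch.float32))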

ei ps

arXiv [BibTex]



Towards causal generative scene models via competition of experts

von Kügelgen*, J., Ustyuzhaninov*, I., Gehler, P., Bethge, M., Schölkopf, B.

ICLR 2020 Workshop "Causal Learning for Decision Making", April 2020, *equal contribution (conference)

ei

arXiv PDF [BibTex]



On Mutual Information Maximization for Representation Learning

Tschannen, M., Djolonga, J., Rubenstein, P. K., Gelly, S., Lucic, M.

8th International Conference on Learning Representations (ICLR), April 2020 (conference)

ei

arXiv link (url) [BibTex]



Chained Representation Cycling: Learning to Estimate 3D Human Pose and Shape by Cycling Between Representations

Rueegg, N., Lassner, C., Black, M. J., Schindler, K.

In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), February 2020 (inproceedings)

Abstract
The goal of many computer vision systems is to transform image pixels into 3D representations. Recent popular models use neural networks to regress directly from pixels to 3D object parameters. Such an approach works well when supervision is available, but in problems like human pose and shape estimation, it is difficult to obtain natural images with 3D ground truth. To go one step further, we propose a new architecture that facilitates unsupervised, or lightly supervised, learning. The idea is to break the problem into a series of transformations between increasingly abstract representations. Each step involves a cycle designed to be learnable without annotated training data, and the chain of cycles delivers the final solution. Specifically, we use 2D body part segments as an intermediate representation that contains enough information to be lifted to 3D, and at the same time is simple enough to be learned in an unsupervised way. We demonstrate the method by learning 3D human pose and shape from un-paired and un-annotated images. We also explore varying amounts of paired data and show that cycling greatly alleviates the need for paired data. While we present results for modeling humans, our formulation is general and can be applied to other vision problems.

ps

pdf [BibTex]



More Powerful Selective Kernel Tests for Feature Selection

Lim, J. N., Yamada, M., Jitkrittum, W., Terada, Y., Matsui, S., Shimodaira, H.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020 (conference) To be published

ei

arXiv [BibTex]



ACTrain: Ein KI-basiertes Aufmerksamkeitstraining für die Wissensarbeit [ACTrain: An AI-based attention training for knowledge work]

Wirzberger, M., Oreshnikov, I., Passy, J., Lado, A., Shenhav, A., Lieder, F.

66th Spring Conference of the German Ergonomics Society, 2020 (conference)

Abstract
Our digital age thrives on information, putting our limited processing capacity to the test every day. In knowledge work in particular, constant distractions cause substantial performance losses. Our intelligent application ACTrain tackles exactly this problem and turns computer work into a training ground for the mind. Feedback computed with machine learning methods vividly demonstrates the value of not letting oneself be distracted from a self-chosen task. This metacognitive insight is meant to motivate perseverance and to strengthen the underlying skill of attention control. In ongoing field experiments, we are investigating whether training with this optimal feedback improves attention control and self-control skills compared to a control group without feedback.

re sf

link (url) Project Page [BibTex]


Computationally Tractable Riemannian Manifolds for Graph Embeddings

Cruceru, C., Becigneul, G., Ganea, O.

37th International Conference on Machine Learning (ICML), 2020 (conference) Submitted

ei

[BibTex]



A Real-Robot Dataset for Assessing Transferability of Learned Dynamics Models

Agudelo-España, D., Zadaianchuk, A., Wenk, P., Garg, A., Akpo, J., Grimminger, F., Viereck, J., Naveau, M., Righetti, L., Martius, G., Krause, A., Schölkopf, B., Bauer, S., Wüthrich, M.

IEEE International Conference on Robotics and Automation (ICRA), 2020 (conference) Accepted

am al ei mg

Project Page PDF [BibTex]



Worst-Case Risk Quantification under Distributional Ambiguity using Kernel Mean Embedding in Moment Problem

Zhu, J., Jitkrittum, W., Diehl, M., Schölkopf, B.

In 59th IEEE Conference on Decision and Control (CDC), 2020 (inproceedings) Accepted

ei

[BibTex]



Practical Accelerated Optimization on Riemannian Manifolds

Alimisis, F., Orvieto, A., Becigneul, G., Lucchi, A.

37th International Conference on Machine Learning (ICML), 2020 (conference) Submitted

ei

[BibTex]



Fair Decisions Despite Imperfect Predictions

Kilbertus, N., Gomez Rodriguez, M., Schölkopf, B., Muandet, K., Valera, I.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020 (conference) Accepted

ei plg

[BibTex]



Planning from Images with Deep Latent Gaussian Process Dynamics

Bosch, N., Achterhold, J., Leal-Taixe, L., Stückler, J.

2nd Annual Conference on Learning for Dynamics and Control (L4DC), 2020, to appear, arXiv:2005.03770 (conference) Accepted

ev

preprint project page poster [BibTex]



Constant Curvature Graph Convolutional Networks

Bachmann*, G., Becigneul*, G., Ganea, O.

37th International Conference on Machine Learning (ICML), 2020, *equal contribution (conference) Submitted

ei

[BibTex]



DirectShape: Photometric Alignment of Shape Priors for Visual Vehicle Pose and Shape Estimation

Wang, R., Yang, N., Stückler, J., Cremers, D.

In IEEE International Conference on Robotics and Automation (ICRA), 2020, arXiv:1904.10097 (inproceedings) Accepted

ev

[BibTex]



Divide-and-Conquer Monte Carlo Tree Search for goal directed planning

Parascandolo*, G., Buesing*, L., Merel, J., Hasenclever, L., Aslanides, J., Hamrick, J. B., Heess, N., Neitz, A., Weber, T.

2020, *equal contribution (conference) Submitted

ei

arXiv [BibTex]


2014


Advanced Structured Prediction

Nowozin, S., Gehler, P. V., Jancsary, J., Lampert, C. H.

Advanced Structured Prediction, pages: 432, Neural Information Processing Series, MIT Press, November 2014 (book)

Abstract
The goal of structured prediction is to build machine learning models that predict relational information that itself has structure, such as being composed of multiple interrelated parts. These models, which reflect prior knowledge, task-specific relations, and constraints, are used in fields including computer vision, speech recognition, natural language processing, and computational biology. They can carry out such tasks as predicting a natural language sentence, or segmenting an image into meaningful components. These models are expressive and powerful, but exact computation is often intractable. A broad research effort in recent years has aimed at designing structured prediction models and approximate inference and learning procedures that are computationally efficient. This volume offers an overview of this recent research in order to make the work accessible to a broader research community. The chapters, by leading researchers in the field, cover a range of topics, including research trends, the linear programming relaxation approach, innovations in probabilistic modeling, recent theoretical progress, and resource-aware learning.

ps

publisher link (url) [BibTex]



Hough-based Object Detection with Grouped Features

Srikantha, A., Gall, J.

International Conference on Image Processing, pages: 1653-1657, Paris, France, IEEE International Conference on Image Processing, October 2014 (conference)

Abstract
Hough-based voting approaches have been successfully applied to object detection. While these methods can be efficiently implemented by random forests, they estimate the probability for an object hypothesis for each feature independently. In this work, we address this problem by grouping features in a local neighborhood to obtain a better estimate of the probability. To this end, we propose oblique classification-regression forests that combine features of different trees. We further investigate the benefit of combining independent and grouped features and evaluate the approach on RGB and RGB-D datasets.

ps

pdf poster DOI Project Page [BibTex]



Omnidirectional 3D Reconstruction in Augmented Manhattan Worlds

Schoenbein, M., Geiger, A.

International Conference on Intelligent Robots and Systems, pages: 716-723, IEEE, Chicago, IL, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems, October 2014 (conference)

Abstract
This paper proposes a method for high-quality omnidirectional 3D reconstruction of augmented Manhattan worlds from catadioptric stereo video sequences. In contrast to existing works we do not rely on constructing virtual perspective views, but instead propose to optimize depth jointly in a unified omnidirectional space. Furthermore, we show that plane-based prior models can be applied even though planes in 3D do not project to planes in the omnidirectional domain. Towards this goal, we propose an omnidirectional slanted-plane Markov random field model which relies on plane hypotheses extracted using a novel voting scheme for 3D planes in omnidirectional space. To quantitatively evaluate our method we introduce a dataset which we have captured using our autonomous driving platform AnnieWAY which we equipped with two horizontally aligned catadioptric cameras and a Velodyne HDL-64E laser scanner for precise ground truth depth measurements. As evidenced by our experiments, the proposed method clearly benefits from the unified view and significantly outperforms existing stereo matching techniques both quantitatively and qualitatively. Furthermore, our method is able to reduce noise and the obtained depth maps can be represented very compactly by a small number of image segments and plane parameters.

avg ps

pdf DOI [BibTex]



Human Pose Estimation with Fields of Parts

Kiefel, M., Gehler, P.

In Computer Vision – ECCV 2014, LNCS 8693, pages: 331-346, Lecture Notes in Computer Science, (Editors: Fleet, David and Pajdla, Tomas and Schiele, Bernt and Tuytelaars, Tinne), Springer, 13th European Conference on Computer Vision, September 2014 (inproceedings)

Abstract
This paper proposes a new formulation of the human pose estimation problem. We present the Fields of Parts model, a binary Conditional Random Field model designed to detect human body parts of articulated people in single images. The Fields of Parts model is inspired by the idea of Pictorial Structures: it models local appearance and the joint spatial configuration of the human body. However, the underlying graph structure is entirely different. The idea is simple: we model the presence and absence of a body part at every possible position, orientation, and scale in an image with a binary random variable. This results in a vast number of random variables; however, we show that approximate inference in this model is efficient. Moreover, we can encode the very same appearance and spatial structure as in Pictorial Structures models. This approach allows us to combine ideas from segmentation and pose estimation into a single model. The Fields of Parts model can use evidence from the background, include local color information, and is connected more densely than a kinematic chain structure. On the challenging Leeds Sports Poses dataset we improve over the Pictorial Structures counterpart by 5.5% in terms of Average Precision of Keypoints (APK).

ei ps

website pdf DOI Project Page [BibTex]



Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points

Tzionas, D., Srikantha, A., Aponte, P., Gall, J.

In German Conference on Pattern Recognition (GCPR), pages: 1-13, Lecture Notes in Computer Science, Springer, GCPR, September 2014 (inproceedings)

Abstract
Hand motion capture has been an active research topic in recent years, following the success of full-body pose tracking. Despite similarities, hand tracking proves to be more challenging, characterized by a higher dimensionality, severe occlusions and self-similarity between fingers. For this reason, most approaches rely on strong assumptions, like hands in isolation or expensive multi-camera systems, that limit the practical use. In this work, we propose a framework for hand tracking that can capture the motion of two interacting hands using only a single, inexpensive RGB-D camera. Our approach combines a generative model with collision detection and discriminatively learned salient points. We quantitatively evaluate our approach on 14 new sequences with challenging interactions.

ps

pdf Supplementary pdf Supplementary Material Project Page DOI Project Page [BibTex]



OpenDR: An Approximate Differentiable Renderer

Loper, M. M., Black, M. J.

In Computer Vision – ECCV 2014, 8695, pages: 154-169, Lecture Notes in Computer Science, (Editors: D. Fleet and T. Pajdla and B. Schiele and T. Tuytelaars ), Springer International Publishing, 13th European Conference on Computer Vision, September 2014 (inproceedings)

Abstract
Inverse graphics attempts to take sensor data and infer 3D geometry, illumination, materials, and motions such that a graphics renderer could realistically reproduce the observed scene. Renderers, however, are designed to solve the forward process of image synthesis. To go in the other direction, we propose an approximate differentiable renderer (DR) that explicitly models the relationship between changes in model parameters and image observations. We describe a publicly available OpenDR framework that makes it easy to express a forward graphics model and then automatically obtain derivatives with respect to the model parameters and to optimize over them. Built on a new autodifferentiation package and OpenGL, OpenDR provides a local optimization method that can be incorporated into probabilistic programming frameworks. We demonstrate the power and simplicity of programming with OpenDR by using it to solve the problem of estimating human body shape from Kinect depth and RGB data.

ps

pdf Code Chumpy Supplementary video of talk DOI Project Page [BibTex]



Discovering Object Classes from Activities

Srikantha, A., Gall, J.

In European Conference on Computer Vision, 8694, pages: 415-430, Lecture Notes in Computer Science, (Editors: D. Fleet and T. Pajdla and B. Schiele and T. Tuytelaars ), Springer International Publishing, 13th European Conference on Computer Vision, September 2014 (inproceedings)

Abstract
In order to avoid an expensive manual labeling process or to learn object classes autonomously without human intervention, object discovery techniques have been proposed that extract visually similar objects from weakly labelled videos. However, the problem of discovering small or medium sized objects is largely unexplored. We observe that videos with activities involving human-object interactions can serve as weakly labelled data for such cases. Since neither object appearance nor motion is distinct enough to discover objects in these videos, we propose a framework that samples from a space of algorithms and their parameters to extract sequences of object proposals. Furthermore, we model similarity of objects based on appearance and functionality, which is derived from human and object motion. We show that functionality is an important cue for discovering objects from activities and demonstrate the generality of the model on three challenging RGB-D and RGB datasets.

ps

pdf anno poster DOI Project Page [BibTex]



Probabilistic Progress Bars

Kiefel, M., Schuler, C., Hennig, P.

In Conference on Pattern Recognition (GCPR), 8753, pages: 331-341, Lecture Notes in Computer Science, (Editors: Jiang, X., Hornegger, J., and Koch, R.), Springer, GCPR, September 2014 (inproceedings)

Abstract
Predicting the time at which the integral over a stochastic process reaches a target level is a value of interest in many applications. Often, such computations have to be made at low cost, in real time. As an intuitive example that captures many features of this problem class, we choose progress bars, a ubiquitous element of computer user interfaces. These predictors are usually based on simple point estimators, with no error modelling. This leads to fluctuating behaviour confusing to the user. It also does not provide a distribution prediction (risk values), which are crucial for many other application areas. We construct and empirically evaluate a fast, constant cost algorithm using a Gauss-Markov process model which provides more information to the user.
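As a toy version of the idea, the sketch below replaces the bare point estimate of a progress bar with a mean and an uncertainty band, by treating the observed progress increments as noisy draws of an underlying rate. This crude Gaussian approximation is only a stand-in for the paper's Gauss-Markov process model.

import numpy as np

def completion_estimate(times, progress, target=1.0):
    # times: timestamps; progress: cumulative progress in [0, target].
    rates = np.diff(progress) / np.diff(times)       # per-interval rates
    mean_rate = rates.mean()
    sem_rate = rates.std(ddof=1) / np.sqrt(len(rates))
    remaining = target - progress[-1]
    t_mean = remaining / mean_rate
    t_std = remaining * sem_rate / mean_rate**2      # first-order error propagation
    return t_mean, (max(t_mean - 2 * t_std, 0.0), t_mean + 2 * t_std)

t_est, (lo, hi) = completion_estimate(np.array([0., 1., 2., 3.]),
                                      np.array([0., .2, .35, .55]))
print(f"about {t_est:.1f}s left (band: {lo:.1f}-{hi:.1f}s)")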

ei ps pn

website+code pdf DOI [BibTex]



Optical Flow Estimation with Channel Constancy

Sevilla-Lara, L., Sun, D., Learned-Miller, E. G., Black, M. J.

In Computer Vision – ECCV 2014, 8689, pages: 423-438, Lecture Notes in Computer Science, (Editors: D. Fleet and T. Pajdla and B. Schiele and T. Tuytelaars ), Springer International Publishing, 13th European Conference on Computer Vision, September 2014 (inproceedings)

Abstract
Large motions remain a challenge for current optical flow algorithms. Traditionally, large motions are addressed using multi-resolution representations like Gaussian pyramids. To deal with large displacements, many pyramid levels are needed and, if an object is small, it may be invisible at the highest levels. To address this, we decompose images using a channel representation (CR) and replace the standard brightness constancy assumption with a descriptor constancy assumption. CRs can be seen as an over-segmentation of the scene into layers based on some image feature. If the appearance of a foreground object differs from the background, then its descriptor will be different and they will be represented in different layers. We create a pyramid by smoothing these layers, without mixing foreground and background or losing small objects. Our method estimates more accurate flow than the baseline on the MPI-Sintel benchmark, especially for fast motions and near motion boundaries.

ps

pdf DOI [BibTex]



Modeling Blurred Video with Layers

Wulff, J., Black, M. J.

In Computer Vision – ECCV 2014, 8694, pages: 236-252, Lecture Notes in Computer Science, (Editors: D. Fleet and T. Pajdla and B. Schiele and T. Tuytelaars ), Springer International Publishing, 13th European Conference on Computer Vision, September 2014 (inproceedings)

Abstract
Videos contain complex spatially-varying motion blur due to the combination of object motion, camera motion, and depth variation with finite shutter speeds. Existing methods to estimate optical flow, deblur the images, and segment the scene fail in such cases. In particular, boundaries between differently moving objects cause problems, because here the blurred images are a combination of the blurred appearances of multiple surfaces. We address this with a novel layered model of scenes in motion. From a motion-blurred video sequence, we jointly estimate the layer segmentation and each layer's appearance and motion. Since the blur is a function of the layer motion and segmentation, it is completely determined by our generative model. Given a video, we formulate the optimization problem as minimizing the pixel error between the blurred frames and images synthesized from the model, and solve it using gradient descent. We demonstrate our approach on synthetic and real sequences.
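Since the blur is fully determined by the generative model, the fitting loop has the classic analysis-by-synthesis shape sketched below. Here render_blurred is a hypothetical placeholder for the layered generative model, and the Adam optimizer merely stands in for the paper's gradient descent.

import torch

def fit(params, observed_frames, render_blurred, steps=500, lr=1e-2):
    # params: list of tensors with requires_grad=True
    # (layer segmentation, appearance, and motion in the paper's model).
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        synthesized = render_blurred(*params)                  # forward generative model
        loss = ((synthesized - observed_frames) ** 2).mean()   # pixel error
        opt.zero_grad(); loss.backward(); opt.step()
    return params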

ps

pdf Supplemental Video Data DOI Project Page Project Page [BibTex]



Intrinsic Video

Kong, N., Gehler, P. V., Black, M. J.

In Computer Vision – ECCV 2014, 8690, pages: 360-375, Lecture Notes in Computer Science, (Editors: D. Fleet and T. Pajdla and B. Schiele and T. Tuytelaars ), Springer International Publishing, 13th European Conference on Computer Vision, September 2014 (inproceedings)

Abstract
Intrinsic images such as albedo and shading are valuable for later stages of visual processing. Previous methods for extracting albedo and shading use either single images or images together with depth data. Instead, we define intrinsic video estimation as the problem of extracting temporally coherent albedo and shading from video alone. Our approach exploits the assumption that albedo is constant over time while shading changes slowly. Optical flow aids in the accurate estimation of intrinsic video by providing temporal continuity as well as putative surface boundaries. Additionally, we find that the estimated albedo sequence can be used to improve optical flow accuracy in sequences with changing illumination. The approach makes only weak assumptions about the scene and we show that it substantially outperforms existing single-frame intrinsic image methods. We evaluate this quantitatively on synthetic sequences as well on challenging natural sequences with complex geometry, motion, and illumination.

ps

pdf Supplementary Video DOI Project Page Project Page [BibTex]



Automated Detection of New or Evolving Melanocytic Lesions Using a 3D Body Model

Bogo, F., Romero, J., Peserico, E., Black, M. J.

In Medical Image Computing and Computer-Assisted Intervention (MICCAI), 8673, pages: 593-600, Lecture Notes in Computer Science, (Editors: Golland, Polina and Hata, Nobuhiko and Barillot, Christian and Hornegger, Joachim and Howe, Robert), Springer International Publishing, Medical Image Computing and Computer-Assisted Intervention (MICCAI), September 2014 (inproceedings)

Abstract
Detection of new or rapidly evolving melanocytic lesions is crucial for early diagnosis and treatment of melanoma. We propose a fully automated pre-screening system for detecting new lesions or changes in existing ones, on the order of 2-3 mm, over almost the entire body surface. Our solution is based on a multi-camera 3D stereo system. The system captures 3D textured scans of a subject at different times and then brings these scans into correspondence by aligning them with a learned, parametric, non-rigid 3D body model. This means that captured skin textures are in accurate alignment across scans, facilitating the detection of new or changing lesions. The integration of lesion segmentation with a deformable 3D body model is a key contribution that makes our approach robust to changes in illumination and subject pose.

ps

pdf Poster DOI Project Page [BibTex]
