Header logo is


2018


no image
Event-triggered Learning for Resource-efficient Networked Control

Solowjow, F., Baumann, D., Garcke, J., Trimpe, S.

In Proceedings of the 2018 American Control Conference (ACC), June 2018 (inproceedings)

ics

arXiv PDF [BibTex]

2018


arXiv PDF [BibTex]


no image
On Time Optimization of Centroidal Momentum Dynamics

Ponton, B., Herzog, A., Prete, A. D., Schaal, S., Righetti, L.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018 (inproceedings)

Abstract
Recently, the centroidal momentum dynamics has received substantial attention to plan dynamically consistent motions for robots with arms and legs in multi-contact scenarios. However, it is also non convex which renders any optimization approach difficult and timing is usually kept fixed in most trajectory optimization techniques to not introduce additional non convexities to the problem. But this can limit the versatility of the algorithms. In our previous work, we proposed a convex relaxation of the problem that allowed to efficiently compute momentum trajectories and contact forces. However, our approach could not minimize a desired angular momentum objective which seriously limited its applicability. Noticing that the non-convexity introduced by the time variables is of similar nature as the centroidal dynamics one, we propose two convex relaxations to the problem based on trust regions and soft constraints. The resulting approaches can compute time-optimized dynamically consistent trajectories sufficiently fast to make the approach realtime capable. The performance of the algorithm is demonstrated in several multi-contact scenarios for a humanoid robot. In particular, we show that the proposed convex relaxation of the original problem finds solutions that are consistent with the original non-convex problem and illustrate how timing optimization allows to find motion plans that would be difficult to plan with fixed timing.

am

video paper [BibTex]

video paper [BibTex]


Thumb xl andrease teaser 2
Robust Dense Mapping for Large-Scale Dynamic Environments

Barsan, I. A., Liu, P., Pollefeys, M., Geiger, A.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018 (inproceedings)

Abstract
We present a stereo-based dense mapping algorithm for large-scale dynamic urban environments. In contrast to other existing methods, we simultaneously reconstruct the static background, the moving objects, and the potentially moving but currently stationary objects separately, which is desirable for high-level mobile robotic tasks such as path planning in crowded environments. We use both instance-aware semantic segmentation and sparse scene flow to classify objects as either background, moving, or potentially moving, thereby ensuring that the system is able to model objects with the potential to transition from static to dynamic, such as parked cars. Given camera poses estimated from visual odometry, both the background and the (potentially) moving objects are reconstructed separately by fusing the depth maps computed from the stereo input. In addition to visual odometry, sparse scene flow is also used to estimate the 3D motions of the detected moving objects, in order to reconstruct them accurately. A map pruning technique is further developed to improve reconstruction accuracy and reduce memory consumption, leading to increased scalability. We evaluate our system thoroughly on the well-known KITTI dataset. Our system is capable of running on a PC at approximately 2.5Hz, with the primary bottleneck being the instance-aware semantic segmentation, which is a limitation we hope to address in future work.

avg

pdf Video Project Page [BibTex]

pdf Video Project Page [BibTex]


Thumb xl cover tro paper
An Online Scalable Approach to Unified Multirobot Cooperative Localization and Object Tracking

Ahmad, A., Lawless, G., Lima, P.

In IEEE International Conference on Robotics and Automation (ICRA) 2018, Journal Track., ICRA 2018, May 2018 (inproceedings)

ps

[BibTex]

[BibTex]


Thumb xl meta learning overview
Online Learning of a Memory for Learning Rates

(nominated for best paper award)

Meier, F., Kappler, D., Schaal, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018, accepted (inproceedings)

Abstract
The promise of learning to learn for robotics rests on the hope that by extracting some information about the learning process itself we can speed up subsequent similar learning tasks. Here, we introduce a computationally efficient online meta-learning algorithm that builds and optimizes a memory model of the optimal learning rate landscape from previously observed gradient behaviors. While performing task specific optimization, this memory of learning rates predicts how to scale currently observed gradients. After applying the gradient scaling our meta-learner updates its internal memory based on the observed effect its prediction had. Our meta-learner can be combined with any gradient-based optimizer, learns on the fly and can be transferred to new optimization tasks. In our evaluations we show that our meta-learning algorithm speeds up learning of MNIST classification and a variety of learning control tasks, either in batch or online learning settings.

am

pdf video code [BibTex]

pdf video code [BibTex]


Thumb xl learning ct w asm block diagram detailed
Learning Sensor Feedback Models from Demonstrations via Phase-Modulated Neural Networks

Sutanto, G., Su, Z., Schaal, S., Meier, F.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018 (inproceedings)

am

pdf video [BibTex]

pdf video [BibTex]


Thumb xl screen shot 2018 04 19 at 14.57.08
Motion-based Object Segmentation based on Dense RGB-D Scene Flow

Shao, L., Shah, P., Dwaracherla, V., Bohg, J.

arXiv, April 2018 (conference)

Abstract
Given two consecutive RGB-D images, we propose a model that estimates a dense 3D motion field, also known as scene flow. We take advantage of the fact that in robot manipulation scenarios, scenes often consist of a set of rigidly moving objects. Our model jointly estimates (i) the segmentation of the scene into an unknown but finite number of objects, (ii) the motion trajectories of these objects and (iii) the object scene flow. We employ an hourglass, deep neural network architecture. In the encoding stage, the RGB and depth images undergo spatial compression and correlation. In the decoding stage, the model outputs three images containing a per-pixel estimate of the corresponding object center as well as object translation and rotation. This forms the basis for inferring the object segmentation and final object scene flow. To evaluate our model, we generated a new and challenging, large-scale, synthetic dataset that is specifically targeted at robotic manipulation: It contains a large number of scenes with a very diverse set of simultaneously moving 3D objects and is recorded with a commonly-used RGB-D camera. In quantitative experiments, we show that we significantly outperform state-of-the-art scene flow and motion-segmentation methods. In qualitative experiments, we show how our learned model transfers to challenging real-world scenes, visually generating significantly better results than existing methods.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Evaluating Low-Power Wireless Cyber-Physical Systems

Baumann, D., Mager, F., Singh, H., Zimmerling, M., Trimpe, S.

In Proceedings of the 1st Workshop on Benchmarking Cyber-Physical Networks and Systems (CPSBench 2018), April 2018 (inproceedings)

ics

arXiv PDF [BibTex]

arXiv PDF [BibTex]


Thumb xl teaser image
Probabilistic Recurrent State-Space Models

Doerr, A., Daniel, C., Schiegg, M., Nguyen-Tuong, D., Schaal, S., Toussaint, M., Trimpe, S.

2018, under review at a conference (conference) Submitted

Abstract
State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g., LSTMs) proved extremely successful in modeling complex time-series data. Fully probabilistic SSMs, however, unfortunately often prove hard to train, even for smaller problems. To overcome this limitation, we propose a scalable initialization and training algorithm based on doubly stochastic variational inference and Gaussian processes. In the variational approximation we propose in contrast to related approaches to fully capture the latent state temporal correlations to allow for robust training.

am ics

arXiv pdf Project Page [BibTex]

arXiv pdf Project Page [BibTex]


no image
Group invariance principles for causal generative models

Besserve, M., Shajarisales, N., Schölkopf, B., Janzing, D.

Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018), 84, pages: 557-565, Proceedings of Machine Learning Research, (Editors: Amos Storkey and Fernando Perez-Cruz), PMLR, 2018 (conference)

ei

link (url) [BibTex]

link (url) [BibTex]


no image
Wasserstein Auto-Encoders

Tolstikhin, I., Bousquet, O., Gelly, S., Schölkopf, B.

6th International Conference on Learning Representations (ICLR 2018), 2018 (conference) Accepted

ei

link (url) [BibTex]

link (url) [BibTex]


Thumb xl despoina paper teaser
RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials

Paschalidou, D., Ulusoy, A. O., Schmitt, C., Gool, L., Geiger, A.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
In this paper, we consider the problem of reconstructing a dense 3D model using images captured from different views. Recent methods based on convolutional neural networks (CNN) allow learning the entire task from data. However, they do not incorporate the physics of image formation such as perspective geometry and occlusion. Instead, classical approaches based on Markov Random Fields (MRF) with ray-potentials explicitly model these physical processes, but they cannot cope with large surface appearance variations across different viewpoints. In this paper, we propose RayNet, which combines the strengths of both frameworks. RayNet integrates a CNN that learns view-invariant feature representations with an MRF that explicitly encodes the physics of perspective projection and occlusion. We train RayNet end-to-end using empirical risk minimization. We thoroughly evaluate our approach on challenging real-world datasets and demonstrate its benefits over a piece-wise trained baseline, hand-crafted models as well as other learning-based approaches.

avg

pdf suppmat Video Project Page code Project Page [BibTex]

pdf suppmat Video Project Page code Project Page [BibTex]


Thumb xl hmrteaser
End-to-end Recovery of Human Shape and Pose

Kanazawa, A., Black, M. J., Jacobs, D. W., Malik, J.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
We describe Human Mesh Recovery (HMR), an end-to-end framework for reconstructing a full 3D mesh of a human body from a single RGB image. In contrast to most current methods that compute 2D or 3D joint locations, we produce a richer and more useful mesh representation that is parameterized by shape and 3D joint angles. The main objective is to minimize the reprojection loss of keypoints, which allows our model to be trained using in-the-wild images that only have ground truth 2D annotations. However, the reprojection loss alone is highly underconstrained. In this work we address this problem by introducing an adversary trained to tell whether human body shape and pose parameters are real or not using a large database of 3D human meshes. We show that HMR can be trained with and without using any paired 2D-to-3D supervision. We do not rely on intermediate 2D keypoint detections and infer 3D pose and shape parameters directly from image pixels. Our model runs in real-time given a bounding box containing the person. We demonstrate our approach on various images in-the-wild and out-perform previous optimization-based methods that output 3D meshes and show competitive results on tasks such as 3D joint location estimation and part segmentation.

ps

pdf code project video [BibTex]

pdf code project video [BibTex]


no image
A Conditional Gradient Framework for Composite Convex Minimization with Applications to Semidefinite Programming

Yurtsever, A., Fercoq, O., Locatello, F., Cevher, V.

Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 2018 (conference) Accepted

ei

[BibTex]

[BibTex]


no image
Boosting Variational Inference: an Optimization Perspective

Locatello, F., Khanna, R., Ghosh, J., Rätsch, G.

Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018), 84, pages: 464-472, Proceedings of Machine Learning Research, (Editors: Amos Storkey and Fernando Perez-Cruz), PMLR, 2018 (conference)

ei

link (url) [BibTex]

link (url) [BibTex]


Thumb xl yiyi paper teaser
Deep Marching Cubes: Learning Explicit Surface Representations

Liao, Y., Donne, S., Geiger, A.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
Existing learning based solutions to 3D surface prediction cannot be trained end-to-end as they operate on intermediate representations (eg, TSDF) from which 3D surface meshes must be extracted in a post-processing step (eg, via the marching cubes algorithm). In this paper, we investigate the problem of end-to-end 3D surface prediction. We first demonstrate that the marching cubes algorithm is not differentiable and propose an alternative differentiable formulation which we insert as a final layer into a 3D convolutional neural network. We further propose a set of loss functions which allow for training our model with sparse point supervision. Our experiments demonstrate that the model allows for predicting sub-voxel accurate 3D shapes of arbitrary topology. Additionally, it learns to complete shapes and to separate an object's inside from its outside even in the presence of sparse and incomplete ground truth. We investigate the benefits of our approach on the task of inferring shapes from 3D point clouds. Our model is flexible and can be combined with a variety of shape encoder and shape inference techniques.

avg

pdf suppmat Video Project Page Project Page [BibTex]

pdf suppmat Video Project Page Project Page [BibTex]


Thumb xl teaser andreas
Semantic Visual Localization

Schönberger, J., Pollefeys, M., Geiger, A., Sattler, T.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
Robust visual localization under a wide range of viewing conditions is a fundamental problem in computer vision. Handling the difficult cases of this problem is not only very challenging but also of high practical relevance, eg, in the context of life-long localization for augmented reality or autonomous robots. In this paper, we propose a novel approach based on a joint 3D geometric and semantic understanding of the world, enabling it to succeed under conditions where previous approaches failed. Our method leverages a novel generative model for descriptor learning, trained on semantic scene completion as an auxiliary task. The resulting 3D descriptors are robust to missing observations by encoding high-level 3D geometric and semantic information. Experiments on several challenging large-scale localization datasets demonstrate reliable localization under extreme viewpoint, illumination, and geometry changes.

avg

pdf suppmat [BibTex]

pdf suppmat [BibTex]


Thumb xl tease pic
Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients

Balles, L., Hennig, P.

In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018 (inproceedings) Accepted

Abstract
The ADAM optimizer is exceedingly popular in the deep learning community. Often it works very well, sometimes it doesn't. Why? We interpret ADAM as a combination of two aspects: for each weight, the update direction is determined by the sign of stochastic gradients, whereas the update magnitude is determined by an estimate of their relative variance. We disentangle these two aspects and analyze them in isolation, gaining insight into the mechanisms underlying ADAM. This analysis also extends recent results on adverse effects of ADAM on generalization, isolating the sign aspect as the problematic one. Transferring the variance adaptation to SGD gives rise to a novel method, completing the practitioner's toolbox for problems where ADAM fails.

pn

link (url) [BibTex]

link (url) [BibTex]


no image
Detecting non-causal artifacts in multivariate linear regression models

Janzing, D., Schölkopf, B.

Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 2018 (conference) Accepted

ei

[BibTex]

[BibTex]


no image
Blind Justice: Fairness with Encrypted Sensitive Attributes

Kilbertus, N., Gascon, A., Kusner, M., Veale, M., Gummadi, K., Weller, A.

Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 2018 (conference) Accepted

ei

[BibTex]

[BibTex]


Thumb xl 2017 frvsr
Frame-Recurrent Video Super-Resolution

Sajjadi, M. S. M., Vemulapalli, R., Brown, M.

The IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018) , 2018 (conference) Accepted

ei

ArXiv [BibTex]

ArXiv [BibTex]


no image
Temporal Difference Models: Model-Free Deep RL for Model-Based Control

Pong*, V., Gu*, S., Dalal, M., Levine, S.

6th International Conference on Learning Representations (ICLR 2018), 2018, *equal contribution (conference) Accepted

ei

link (url) [BibTex]

link (url) [BibTex]


Thumb xl david paper teaser
Learning 3D Shape Completion from Laser Scan Data with Weak Supervision

Stutz, D., Geiger, A.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
3D shape completion from partial point clouds is a fundamental problem in computer vision and computer graphics. Recent approaches can be characterized as either data-driven or learning-based. Data-driven approaches rely on a shape model whose parameters are optimized to fit the observations. Learning-based approaches, in contrast, avoid the expensive optimization step and instead directly predict the complete shape from the incomplete observations using deep neural networks. However, full supervision is required which is often not available in practice. In this work, we propose a weakly-supervised learning-based approach to 3D shape completion which neither requires slow optimization nor direct supervision. While we also learn a shape prior on synthetic data, we amortize, ie, learn, maximum likelihood fitting using deep neural networks resulting in efficient shape completion without sacrificing accuracy. Tackling 3D shape completion of cars on ShapeNet and KITTI, we demonstrate that the proposed amortized maximum likelihood approach is able to compete with a fully supervised baseline and a state-of-the-art data-driven approach while being significantly faster. On ModelNet, we additionally show that the approach is able to generalize to other object categories as well.

avg

pdf suppmat Project Page Project Page [BibTex]

pdf suppmat Project Page Project Page [BibTex]


no image
Feedback Control Goes Wireless: Guaranteed Stability over Low-power Multi-hop Networks

Mager, F., Baumann, D., Jacob, R., Thiele, L., Trimpe, S., Zimmerling, M.

2018, Under review at a conference (conference) Submitted

Abstract
Closing feedback loops fast and over long distances is key to emerging applications; for example, robot motion control and swarm coordination require update intervals below 100 ms. Low-power wireless is preferred for its flexibility, low cost, and small form factor, especially if the devices support multi-hop communication. Thus far, however, closed-loop control over multi-hop low-power wireless has only been demonstrated for update intervals on the order of multiple seconds. This paper presents a wireless embedded system that tames imperfections impairing control performance such as jitter or packet loss, and a control design that exploits the essential properties of this system to provably guarantee closed-loop stability for linear dynamic systems. Using experiments on a testbed with multiple cart-pole systems, we are the first to demonstrate the feasibility and to assess the performance of closed-loop control and coordination over multi-hop low-power wireless for update intervals from 20 ms to 50 ms.

am ics

arXiv PDF [BibTex]

arXiv PDF [BibTex]


Thumb xl 2018 tgan
Tempered Adversarial Networks

Sajjadi, M. S. M., Schölkopf, B.

Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 2018 (conference) Accepted

ei

ArXiv [BibTex]

ArXiv [BibTex]


no image
PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos

Parmas, P., Doya, K., Rasmussen, C., Peters, J.

Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 2018 (conference) Accepted

ei

[BibTex]

[BibTex]


no image
Learning Independent Causal Mechanisms

Parascandolo, G., Kilbertus, N., Rojas-Carulla, M., Schölkopf, B.

Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 2018 (conference) Accepted

ei

[BibTex]

[BibTex]


no image
Leveraging the Crowd to Detect and Reduce the Spread of Fake News and Misinformation

Kim, J., Tabibian, B., Oh, A., Schölkopf, B., Gomez Rodriguez, M.

Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM 2018), 2018 (conference) Accepted

ei

[BibTex]

[BibTex]


no image
Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning

Eysenbach, B., Gu, S., Ibarz, J., Levine, S.

6th International Conference on Learning Representations (ICLR 2018), 2018 (conference) Accepted

ei

Videos link (url) [BibTex]

Videos link (url) [BibTex]


no image
Generalized Score Functions for Causal Discovery

Huang, B., Zhang, K., Lin, Y., Schölkopf, B., C., G.

Proceedings of the 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2018), 2018 (conference) Accepted

ei

[BibTex]

[BibTex]


Thumb xl benvisapp
Learning Transformation Invariant Representations with Weak Supervision

Coors, B., Condurache, A., Mertins, A., Geiger, A.

In International Conference on Computer Vision Theory and Applications, International Conference on Computer Vision Theory and Applications, 2018 (inproceedings)

Abstract
Deep convolutional neural networks are the current state-of-the-art solution to many computer vision tasks. However, their ability to handle large global and local image transformations is limited. Consequently, extensive data augmentation is often utilized to incorporate prior knowledge about desired invariances to geometric transformations such as rotations or scale changes. In this work, we combine data augmentation with an unsupervised loss which enforces similarity between the predictions of augmented copies of an input sample. Our loss acts as an effective regularizer which facilitates the learning of transformation invariant representations. We investigate the effectiveness of the proposed similarity loss on rotated MNIST and the German Traffic Sign Recognition Benchmark (GTSRB) in the context of different classification models including ladder networks. Our experiments demonstrate improvements with respect to the standard data augmentation approach for supervised and semi-supervised learning tasks, in particular in the presence of little annotated data. In addition, we analyze the performance of the proposed approach with respect to its hyperparameters, including the strength of the regularization as well as the layer where representation similarity is enforced.

avg

pdf [BibTex]

pdf [BibTex]


no image
Cause-Effect Inference by Comparing Regression Errors

Blöbaum, P., Janzing, D., Washio, T., Shimizu, S., Schölkopf, B.

Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018) , 84, pages: 900-909, Proceedings of Machine Learning Research, (Editors: Amos Storkey and Fernando Perez-Cruz), PMLR, 2018 (conference)

ei

link (url) [BibTex]

link (url) [BibTex]


no image
Automatic Estimation of Modulation Transfer Functions

Bauer, M., Volchkov, V., Hirsch, M., Schölkopf, B.

International Conference on Computational Photography (ICCP 2018), 2018 (conference) Accepted

ei sf

Project Page [BibTex]

Project Page [BibTex]


Thumb xl smalrteaser
Lions and Tigers and Bears: Capturing Non-Rigid, 3D, Articulated Shape from Images

Zuffi, S., Kanazawa, A., Black, M. J.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
Animals are widespread in nature and the analysis of their shape and motion is important in many fields and industries. Modeling 3D animal shape, however, is difficult because the 3D scanning methods used to capture human shape are not applicable to wild animals or natural settings. Consequently, we propose a method to capture the detailed 3D shape of animals from images alone. The articulated and deformable nature of animals makes this problem extremely challenging, particularly in unconstrained environments with moving and uncalibrated cameras. To make this possible, we use a strong prior model of articulated animal shape that we fit to the image data. We then deform the animal shape in a canonical reference pose such that it matches image evidence when articulated and projected into multiple images. Our method extracts significantly more 3D shape detail than previous methods and is able to model new species, including the shape of an extinct animal, using only a few video frames. Additionally, the projected 3D shapes are accurate enough to facilitate the extraction of a realistic texture map from multiple frames.

ps

pdf [BibTex]

pdf [BibTex]


Thumb xl screen shot 2018 04 18 at 11.01.27 am
Learning from Outside the Viability Kernel: Why we Should Build Robots that can Fail with Grace

Heim, S., Spröwitz, A.

Accepted for SIMPAR 2018, 2018 (conference) Accepted

dlg

[BibTex]

[BibTex]


no image
Differentially Private Database Release via Kernel Mean Embeddings

Balog, M., Tolstikhin, I., Schölkopf, B.

Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 2018 (conference) Accepted

ei

[BibTex]

[BibTex]


Thumb xl selection 002
PoTion: Pose MoTion Representation for Action Recognition

Choutas, V., Weinzaepfel, P., Revaud, J., Schmid, C.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
Most state-of-the-art methods for action recognition rely on a two-stream architecture that processes appearance and motion independently. In this paper, we claim that consider- ing them jointly offers rich information for action recogni- tion. We introduce a novel representation that gracefully en- codes the movement of some semantic keypoints. We use the human joints as these keypoints and term our Pose moTion representation PoTion. Specifically, we first run a state- of-the-art human pose estimator [4] and extract heatmaps for the human joints in each frame. We obtain our PoTion representation by temporally aggregating these probability maps. This is achieved by ‘colorizing’ each of them de- pending on the relative time of the frames in the video clip and summing them. This fixed-size representation for an en- tire video clip is suitable to classify actions using a shallow convolutional neural network. Our experimental evaluation shows that PoTion outper- forms other state-of-the-art pose representations [6, 48]. Furthermore, it is complementary to standard appearance and motion streams. When combining PoTion with the recent two-stream I3D approach [5], we obtain state-of- the-art performance on the JHMDB, HMDB and UCF101 datasets.

ps

PDF [BibTex]

PDF [BibTex]


no image
Revisiting First-Order Convex Optimization Over Linear Spaces

Locatello, F., Raj, A., Praneeth Karimireddy, S., Rätsch, G., Schölkopf, B., Stich, S. U., Jaggi, M.

Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 2018 (conference) Accepted

ei

[BibTex]

[BibTex]


no image
PoTion: Pose MoTion Representation for Action Recognition

Choutas, Vasileios, Weinzaepfel, Philippe, Revaud, Jérôme, Schmid, Cordelia

In CVPR 2018 - IEEE Conference on Computer Vision and Pattern Recognition, pages: 1-10, IEEE, Salt Lake City, United States, June 2018 (inproceedings)

link (url) [BibTex]

link (url) [BibTex]

2017


no image
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

Gu, S., Lillicrap, T., Turner, R. E., Ghahramani, Z., Schölkopf, B., Levine, S.

Proceedings from the conference "Neural Information Processing Systems 2017., pages: 3849-3858, (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (conference)

ei

link (url) [BibTex]

2017


link (url) [BibTex]


no image
Boosting Variational Inference: an Optimization Perspective

Locatello, F., Khanna, R., Ghosh, J., Rätsch, G.

Proceedings from the conference "Neural Information Processing Systems 2017., (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (conference)

ei

link (url) [BibTex]

link (url) [BibTex]


no image
Learning Independent Causal Mechanisms

Parascandolo, G., Rojas-Carulla, M., Kilbertus, N., Schölkopf, B.

Proceedings from the conference "Neural Information Processing Systems 2017., (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (conference)

ei

link (url) [BibTex]

link (url) [BibTex]


no image
Avoiding Discrimination through Causal Reasoning

Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., Schölkopf, B.

Proceedings from the conference "Neural Information Processing Systems 2017., pages: 656-666, (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (conference)

ei

link (url) [BibTex]

link (url) [BibTex]


no image
Greedy Algorithms for Cone Constrained Optimization with Convergence Guarantees

Locatello, F., Tschannen, M., Rätsch, G., Jaggi, M.

Proceedings from the conference "Neural Information Processing Systems 2017., pages: 773-784, (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (conference)

ei

link (url) [BibTex]

link (url) [BibTex]


no image
AdaGAN: Boosting Generative Models

Tolstikhin, I., Gelly, S., Bousquet, O., Simon-Gabriel, C. J., Schölkopf, B.

Proceedings from the conference "Neural Information Processing Systems 2017., pages: 5430-5439, (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (conference)

ei

arXiv link (url) [BibTex]

arXiv link (url) [BibTex]


Thumb xl larsnips
The Numerics of GANs

Mescheder, L., Nowozin, S., Geiger, A.

In Proceedings from the conference "Neural Information Processing Systems 2017., (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (inproceedings)

Abstract
In this paper, we analyze the numerics of common algorithms for training Generative Adversarial Networks (GANs). Using the formalism of smooth two-player games we analyze the associated gradient vector field of GAN training objectives. Our findings suggest that the convergence of current algorithms suffers due to two factors: i) presence of eigenvalues of the Jacobian of the gradient vector field with zero real-part, and ii) eigenvalues with big imaginary part. Using these findings, we design a new algorithm that overcomes some of these limitations and has better convergence properties. Experimentally, we demonstrate its superiority on training common GAN architectures and show convergence on GAN architectures that are known to be notoriously hard to train.

avg

pdf [BibTex]

pdf [BibTex]


no image
Safe Adaptive Importance Sampling

Stich, S. U., Raj, A., Jaggi, M.

Proceedings from the conference "Neural Information Processing Systems 2017., pages: 4384-4394, (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (conference)

ei

link (url) [BibTex]

link (url) [BibTex]