Header logo is


2020


Self-supervised motion deblurring
Self-supervised motion deblurring

Liu, P., Janai, J., Pollefeys, M., Sattler, T., Geiger, A.

IEEE Robotics and Automation Letters, 2020 (article)

Abstract
Motion blurry images challenge many computer vision algorithms, e.g., feature detection, motion estimation, or object recognition. Deep convolutional neural networks are state-of-the-art for image deblurring. However, obtaining training data with corresponding sharp and blurry image pairs can be difficult. In this paper, we present a differentiable reblur model for self-supervised motion deblurring, which enables the network to learn from real-world blurry image sequences without relying on sharp images for supervision. Our key insight is that motion cues obtained from consecutive images yield sufficient information to inform the deblurring task. We therefore formulate deblurring as an inverse rendering problem, taking into account the physical image formation process: we first predict two deblurred images from which we estimate the corresponding optical flow. Using these predictions, we re-render the blurred images and minimize the difference with respect to the original blurry inputs. We use both synthetic and real dataset for experimental evaluations. Our experiments demonstrate that self-supervised single image deblurring is really feasible and leads to visually compelling results.

avg

pdf Project Page Blog [BibTex]

2020


pdf Project Page Blog [BibTex]


no image
Semi-Supervised Learning of Multi-Object 3D Scene Representations

Elich, C., Oswald, M. R., Pollefeys, M., Stueckler, J.

CoRR, abs/2010.04030, 2020 (article)

Abstract
Representing scenes at the granularity of objects is a prerequisite for scene understanding and decision making. We propose a novel approach for learning multi-object 3D scene representations from images. A recurrent encoder regresses a latent representation of 3D shapes, poses and texture of each object from an input RGB image. The 3D shapes are represented continuously in function-space as signed distance functions (SDF) which we efficiently pre-train from example shapes in a supervised way. By differentiable rendering we then train our model to decompose scenes self-supervised from RGB-D images. Our approach learns to decompose images into the constituent objects of the scene and to infer their shape, pose and texture from a single view. We evaluate the accuracy of our model in inferring the 3D scene layout and demonstrate its generative capabilities.

ev

link (url) [BibTex]

link (url) [BibTex]


Wearable and Stretchable Strain Sensors: Materials, Sensing Mechanisms, and Applications
Wearable and Stretchable Strain Sensors: Materials, Sensing Mechanisms, and Applications

Souri, H., Banerjee, H., Jusufi, A., Radacsi, N., Stokes, A. A., Park, I., Sitti, M., Amjadi, M.

Advanced Intelligent Systems, 2020 (article)

bio pi

link (url) DOI [BibTex]

link (url) DOI [BibTex]


Learning Neural Light Transport
Learning Neural Light Transport

Sanzenbacher, P., Mescheder, L., Geiger, A.

Arxiv, 2020 (article)

Abstract
In recent years, deep generative models have gained significance due to their ability to synthesize natural-looking images with applications ranging from virtual reality to data augmentation for training computer vision models. While existing models are able to faithfully learn the image distribution of the training set, they often lack controllability as they operate in 2D pixel space and do not model the physical image formation process. In this work, we investigate the importance of 3D reasoning for photorealistic rendering. We present an approach for learning light transport in static and dynamic 3D scenes using a neural network with the goal of predicting photorealistic images. In contrast to existing approaches that operate in the 2D image domain, our approach reasons in both 3D and 2D space, thus enabling global illumination effects and manipulation of 3D scene geometry. Experimentally, we find that our model is able to produce photorealistic renderings of static and dynamic scenes. Moreover, it compares favorably to baselines which combine path tracing and image denoising at the same computational budget.

avg

arxiv [BibTex]


HOTA: A Higher Order Metric for Evaluating Multi-Object Tracking
HOTA: A Higher Order Metric for Evaluating Multi-Object Tracking

Luiten, J., Osep, A., Dendorfer, P., Torr, P., Geiger, A., Leal-Taixe, L., Leibe, B.

International Journal of Computer Vision (IJCV), 2020 (article)

Abstract
Multi-Object Tracking (MOT) has been notoriously difficult to evaluate. Previous metrics overemphasize the importance of either detection or association. To address this, we present a novel MOT evaluation metric, HOTA (Higher Order Tracking Accuracy), which explicitly balances the effect of performing accurate detection, association and localization into a single unified metric for comparing trackers. HOTA decomposes into a family of sub-metrics which are able to evaluate each of five basic error types separately, which enables clear analysis of tracking performance. We evaluate the effectiveness of HOTA on the MOTChallenge benchmark, and show that it is able to capture important aspects of MOT performance not previously taken into account by established metrics. Furthermore, we show HOTA scores better align with human visual evaluation of tracking performance.

avg

pdf [BibTex]

pdf [BibTex]


no image
Visual-Inertial Mapping with Non-Linear Factor Recovery

Usenko, V., Demmel, N., Schubert, D., Stückler, J., Cremers, D.

IEEE Robotics and Automation Letters (RA-L), 5, 2020, accepted for presentation at IEEE International Conference on Robotics and Automation (ICRA) 2020, to appear, arXiv:1904.06504 (article)

Abstract
Cameras and inertial measurement units are complementary sensors for ego-motion estimation and environment mapping. Their combination makes visual-inertial odometry (VIO) systems more accurate and robust. For globally consistent mapping, however, combining visual and inertial information is not straightforward. To estimate the motion and geometry with a set of images large baselines are required. Because of that, most systems operate on keyframes that have large time intervals between each other. Inertial data on the other hand quickly degrades with the duration of the intervals and after several seconds of integration, it typically contains only little useful information. In this paper, we propose to extract relevant information for visual-inertial mapping from visual-inertial odometry using non-linear factor recovery. We reconstruct a set of non-linear factors that make an optimal approximation of the information on the trajectory accumulated by VIO. To obtain a globally consistent map we combine these factors with loop-closing constraints using bundle adjustment. The VIO factors make the roll and pitch angles of the global map observable, and improve the robustness and the accuracy of the mapping. In experiments on a public benchmark, we demonstrate superior performance of our method over the state-of-the-art approaches.

ev

[BibTex]

[BibTex]


no image
Fish-like aquatic propulsion studied using a pneumatically-actuated soft-robotic model

Wolf, Z., Jusufi, A., Vogt, D. M., Lauder, G. V.

Bioinspiration & Biomimetics, 15(4):046008, Inst. of Physics, London, 2020 (article)

bio

DOI [BibTex]

DOI [BibTex]

2016


Probabilistic Duality for Parallel Gibbs Sampling without Graph Coloring
Probabilistic Duality for Parallel Gibbs Sampling without Graph Coloring

Mescheder, L., Nowozin, S., Geiger, A.

Arxiv, 2016 (article)

Abstract
We present a new notion of probabilistic duality for random variables involving mixture distributions. Using this notion, we show how to implement a highly-parallelizable Gibbs sampler for weakly coupled discrete pairwise graphical models with strictly positive factors that requires almost no preprocessing and is easy to implement. Moreover, we show how our method can be combined with blocking to improve mixing. Even though our method leads to inferior mixing times compared to a sequential Gibbs sampler, we argue that our method is still very useful for large dynamic networks, where factors are added and removed on a continuous basis, as it is hard to maintain a graph coloring in this setup. Similarly, our method is useful for parallelizing Gibbs sampling in graphical models that do not allow for graph colorings with a small number of colors such as densely connected graphs.

avg

pdf [BibTex]


Map-Based Probabilistic Visual Self-Localization
Map-Based Probabilistic Visual Self-Localization

Brubaker, M. A., Geiger, A., Urtasun, R.

IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), 2016 (article)

Abstract
Accurate and efficient self-localization is a critical problem for autonomous systems. This paper describes an affordable solution to vehicle self-localization which uses odometry computed from two video cameras and road maps as the sole inputs. The core of the method is a probabilistic model for which an efficient approximate inference algorithm is derived. The inference algorithm is able to utilize distributed computation in order to meet the real-time requirements of autonomous systems in some instances. Because of the probabilistic nature of the model the method is capable of coping with various sources of uncertainty including noise in the visual odometry and inherent ambiguities in the map (e.g., in a Manhattan world). By exploiting freely available, community developed maps and visual odometry measurements, the proposed method is able to localize a vehicle to 4m on average after 52 seconds of driving on maps which contain more than 2,150km of drivable roads.

avg ps

pdf Project Page [BibTex]

pdf Project Page [BibTex]