

2019


Taking a Deeper Look at the Inverse Compositional Algorithm

Lv, Z., Dellaert, F., Rehg, J. M., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment. We first discuss the assumptions made by this well-established technique, and subsequently propose to relax these assumptions by incorporating data-driven priors into this model. More specifically, we unroll a robust version of the inverse compositional algorithm and replace multiple components of this algorithm using more expressive models whose parameters we train in an end-to-end fashion from data. Our experiments on several challenging 3D rigid motion estimation tasks demonstrate the advantages of combining optimization with learning-based techniques, outperforming the classic inverse compositional algorithm as well as data-driven image-to-pose regression approaches.

avg

pdf suppmat Video Project Page Poster [BibTex]



MOTS: Multi-Object Tracking and Segmentation

Voigtlaender, P., Krause, M., Osep, A., Luiten, J., Sekar, B. B. G., Geiger, A., Leibe, B.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
This paper extends the popular task of multi-object tracking to multi-object tracking and segmentation (MOTS). Towards this goal, we create dense pixel-level annotations for two existing tracking datasets using a semi-automatic annotation procedure. Our new annotations comprise 65,213 pixel masks for 977 distinct objects (cars and pedestrians) in 10,870 video frames. For evaluation, we extend existing multi-object tracking metrics to this new task. Moreover, we propose a new baseline method which jointly addresses detection, tracking, and segmentation with a single convolutional network. We demonstrate the value of our datasets by achieving improvements in performance when training on MOTS annotations. We believe that our datasets, metrics and baseline will become a valuable resource towards developing multi-object tracking approaches that go beyond 2D bounding boxes.

avg

pdf suppmat Project Page Poster Video Project Page [BibTex]



PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds

Behl, A., Paschalidou, D., Donne, S., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
Despite significant progress in image-based 3D scene flow estimation, the performance of such approaches has not yet reached the fidelity required by many applications. Simultaneously, these applications are often not restricted to image-based estimation: laser scanners provide a popular alternative to traditional cameras, for example in the context of self-driving cars, as they directly yield a 3D point cloud. In this paper, we propose to estimate 3D motion from such unstructured point clouds using a deep neural network. In a single forward pass, our model jointly predicts 3D scene flow as well as the 3D bounding box and rigid body motion of objects in the scene. While the prospect of estimating 3D scene flow from unstructured point clouds is promising, it is also a challenging task. We show that the traditional global representation of rigid body motion prohibits inference by CNNs, and propose a translation equivariant representation to circumvent this problem. For training our deep network, a large dataset is required. Because of this, we augment real scans from KITTI with virtual objects, realistically modeling occlusions and simulating sensor noise. A thorough comparison with classic and learning-based techniques highlights the robustness of the proposed approach.

avg

pdf suppmat Project Page Poster Video [BibTex]



Learning Non-volumetric Depth Fusion using Successive Reprojections

Donne, S., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
Given a set of input views, multi-view stereopsis techniques estimate depth maps to represent the 3D reconstruction of the scene; these are fused into a single, consistent, reconstruction -- most often a point cloud. In this work we propose to learn an auto-regressive depth refinement directly from data. While deep learning has improved the accuracy and speed of depth estimation significantly, learned MVS techniques remain limited to the planesweeping paradigm. We refine a set of input depth maps by successively reprojecting information from neighbouring views to leverage multi-view constraints. Compared to learning-based volumetric fusion techniques, an image-based representation allows significantly more detailed reconstructions; compared to traditional point-based techniques, our method learns noise suppression and surface completion in a data-driven fashion. Due to the limited availability of high-quality reconstruction datasets with ground truth, we introduce two novel synthetic datasets to (pre-)train our network. Our approach is able to improve both the output depth maps and the reconstructed point cloud, for both learned and traditional depth estimation front-ends, on both synthetic and real data.

avg

pdf suppmat Project Page Video Poster [BibTex]



Connecting the Dots: Learning Representations for Active Monocular Depth Estimation

Riegler, G., Liao, Y., Donne, S., Koltun, V., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
We propose a technique for depth estimation with a monocular structured-light camera, i.e., a calibrated stereo set-up with one camera and one laser projector. Instead of formulating the depth estimation via a correspondence search problem, we show that a simple convolutional architecture is sufficient for high-quality disparity estimates in this setting. As accurate ground-truth is hard to obtain, we train our model in a self-supervised fashion with a combination of photometric and geometric losses. Further, we demonstrate that the projected pattern of the structured light sensor can be reliably separated from the ambient information. This can then be used to improve depth boundaries in a weakly supervised fashion by modeling the joint statistics of image and depth edges. The model trained in this fashion compares favorably to the state-of-the-art on challenging synthetic and real-world datasets. In addition, we contribute a novel simulator, which allows benchmarking active depth prediction algorithms under controlled conditions.

avg

pdf suppmat Poster Project Page [BibTex]



Variational Autoencoders Recover PCA Directions (by Accident)

Rolinek, M., Zietlow, D., Martius, G.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
The Variational Autoencoder (VAE) is a powerful architecture capable of representation learning and generative modeling. When it comes to learning interpretable (disentangled) representations, VAE and its variants show unparalleled performance. However, the reasons for this are unclear, since a very particular alignment of the latent embedding is needed but the design of the VAE does not encourage it in any explicit way. We address this matter and offer the following explanation: the diagonal approximation in the encoder together with the inherent stochasticity force local orthogonality of the decoder. The local behavior of promoting both reconstruction and orthogonality matches closely how the PCA embedding is chosen. Alongside providing an intuitive understanding, we justify the statement with full theoretical analysis as well as with experiments.

al

arXiv [BibTex]



Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids

Paschalidou, D., Ulusoy, A. O., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
Abstracting complex 3D shapes with parsimonious part-based representations has been a long-standing goal in computer vision. This paper presents a learning-based solution to this problem which goes beyond the traditional 3D cuboid representation by exploiting superquadrics as atomic elements. We demonstrate that superquadrics lead to more expressive 3D scene parses while being easier to learn than 3D cuboid representations. Moreover, we provide an analytical solution to the Chamfer loss which avoids the need for computationally expensive reinforcement learning or iterative prediction. Our model learns to parse 3D objects into consistent superquadric representations without supervision. Results on various ShapeNet categories as well as the SURREAL human body dataset demonstrate the flexibility of our model in capturing fine details and complex poses that could not have been modelled using cuboids.

avg

Project Page Poster suppmat pdf Video handout [BibTex]



Self-Assembled Phage-Based Colloids for High Localized Enzymatic Activity

Alarcon-Correa, M., Guenther, J., Troll, J., Kadiri, V. M., Bill, J., Fischer, P., Rothenstein, D.

ACS Nano, March 2019 (article)

Abstract
Catalytically active colloids are model systems for chemical motors and active matter. It is desirable to replace the inorganic catalysts and the toxic fuels that are often used, with biocompatible enzymatic reactions. However, compared to inorganic catalysts, enzyme-coated colloids tend to exhibit less activity. Here, we show that the self-assembly of genetically engineered M13 bacteriophages that bind enzymes to magnetic beads ensures high and localized enzymatic activity. These phage-decorated colloids provide a proteinaceous environment for directed enzyme immobilization. The magnetic properties of the colloidal carrier particle permit repeated enzyme recovery from a reaction solution, while the enzymatic activity is retained. Moreover, localizing the phage-based construct with a magnetic field in a microcontainer allows the enzyme-phage-colloids to function as an enzymatic micropump, where the enzymatic reaction generates a fluid flow. This system shows the fastest fluid flow reported to date by a biocompatible enzymatic micropump. In addition, it is functional in complex media including blood where the enzyme driven micropump can be powered at the physiological blood-urea concentration.

pf

link (url) DOI [BibTex]


Absolute diffusion measurements of active enzyme solutions by NMR

Guenther, J., Majer, G., Fischer, P.

J. Chem. Phys., 150(124201), March 2019 (article)

Abstract
The diffusion of enzymes is of fundamental importance for many biochemical processes. Enhanced or directed enzyme diffusion can alter the accessibility of substrates and the organization of enzymes within cells. Several studies based on fluorescence correlation spectroscopy (FCS) report enhanced diffusion of enzymes upon interaction with their substrate or inhibitor. In this context, major importance is given to the enzyme fructose-bisphosphate aldolase, for which enhanced diffusion has been reported even though the catalysed reaction is endothermic. Additionally, enhanced diffusion of tracer particles surrounding the active aldolase enzymes has been reported. These studies suggest that active enzymes can act as chemical motors that self-propel and give rise to enhanced diffusion. However, fluorescence studies of enzymes can, despite several advantages, suffer from artefacts. Here we show that the absolute diffusion coefficients of active enzyme solutions can be determined with Pulsed Field Gradient Nuclear Magnetic Resonance (PFG-NMR). The advantage of PFG-NMR is that the motion of the molecule of interest is directly observed in its native state without the need for any labelling. Further, PFG-NMR is model-free and thus yields absolute diffusion constants. Our PFG-NMR experiments of solutions containing active fructose-bisphosphate aldolase from rabbit muscle do not show any diffusion enhancement for the active enzymes nor the surrounding molecules. Additionally, we do not observe any diffusion enhancement of aldolase in the presence of its inhibitor pyrophosphate.

pf

link (url) DOI [BibTex]


Chemical Nanomotors at the Gram Scale Form a Dense Active Optorheological Medium

Choudhury, U., Singh, D. P., Qiu, T., Fischer, P.

Adv. Mat., (1807382), February 2019 (article)

Abstract
The rheological properties of a colloidal suspension are a function of the concentration of the colloids and their interactions. While suspensions of passive colloids are well studied and have been shown to form crystals, gels, and glasses, examples of energy‐consuming “active” colloidal suspensions are still largely unexplored. Active suspensions of biological matter, such as motile bacteria or dense active actin–motor–protein mixtures, have revealed superfluid‐like and gel‐like states, respectively. Attractive inanimate systems for active matter are chemically self‐propelled particles. It has so far been challenging to use these swimming particles at high enough densities to affect the bulk material properties of the suspension. Here, it is shown that light‐triggered asymmetric titanium dioxide particles that self‐propel can be obtained in large quantities and self‐organize to make a gram‐scale active medium. The suspension shows an activity‐dependent tenfold reversible change in its bulk viscosity.

pf

link (url) DOI [BibTex]


First Observation of Optical Activity in Hyper-Rayleigh Scattering

Collins, J., Rusimova, K., Hooper, D., Jeong, H. H., Ohnoutek, L., Pradaux-Caggiano, F., Verbiest, T., Carbery, D., Fischer, P., Valev, V.

Phys. Rev. X, 9(011024), January 2019 (article)

Abstract
Chiral nano- or metamaterials and surfaces enable striking photonic properties, such as negative refractive index and superchiral light, driving promising applications in novel optical components, nanorobotics, and enhanced chiral molecular interactions with light. In characterizing chirality, although nonlinear chiroptical techniques are typically much more sensitive than their linear optical counterparts, separating true chirality from anisotropy is a major challenge. Here, we report the first observation of optical activity in second-harmonic hyper-Rayleigh scattering (HRS). We demonstrate the effect in a 3D isotropic suspension of Ag nanohelices in water. The effect is 5 orders of magnitude stronger than linear optical activity and is well pronounced above the multiphoton luminescence background. Because of its sensitivity, isotropic environment, and straightforward experimental geometry, HRS optical activity constitutes a fundamental experimental breakthrough in chiral photonics for media including nanomaterials, metamaterials, and chemical molecules.

pf

link (url) DOI [BibTex]



Autonomous Identification and Goal-Directed Invocation of Event-Predictive Behavioral Primitives

Gumbsch, C., Butz, M. V., Martius, G.

IEEE Transactions on Cognitive and Developmental Systems, 2019 (article)

Abstract
Voluntary behavior of humans appears to be composed of small, elementary building blocks or behavioral primitives. While this modular organization seems crucial for the learning of complex motor skills and the flexible adaption of behavior to new circumstances, the problem of learning meaningful, compositional abstractions from sensorimotor experiences remains an open challenge. Here, we introduce a computational learning architecture, termed surprise-based behavioral modularization into event-predictive structures (SUBMODES), that explores behavior and identifies the underlying behavioral units completely from scratch. The SUBMODES architecture bootstraps sensorimotor exploration using a self-organizing neural controller. While exploring the behavioral capabilities of its own body, the system learns modular structures that predict the sensorimotor dynamics and generate the associated behavior. In line with recent theories of event perception, the system uses unexpected prediction error signals, i.e., surprise, to detect transitions between successive behavioral primitives. We show that, when applied to two robotic systems with completely different body kinematics, the system manages to learn a variety of complex behavioral primitives. Moreover, after initial self-exploration the system can use its learned predictive models progressively more effectively for invoking model predictive planning and goal-directed control in different tasks and environments.

al

arXiv PDF video link (url) DOI Project Page [BibTex]


Machine Learning for Haptics: Inferring Multi-Contact Stimulation From Sparse Sensor Configuration

Sun, H., Martius, G.

Frontiers in Neurorobotics, 13, pages: 51, 2019 (article)

Abstract
Robust haptic sensation systems are essential for obtaining dexterous robots. Currently, we have solutions for small surface areas such as fingers, but affordable and robust techniques for covering large areas of an arbitrary 3D surface are still missing. Here, we introduce a general machine learning framework to infer multi-contact haptic forces on a 3D robot’s limb surface from internal deformation measured by only a few physical sensors. The general idea of this framework is to first predict the whole surface deformation pattern from the sparsely placed sensors and then to infer the number, locations and force magnitudes of unknown contact points. Using a modified limb of the Poppy robot as an example, we show how this can be done with transfer learning even if training data can only be obtained for single-contact points. With only 10 strain-gauge sensors, we obtain high accuracy for multiple-contact points as well. The method can be applied to arbitrarily shaped surfaces and physical sensor types, as long as training data can be obtained.

al

link (url) DOI [BibTex]


Occupancy Networks: Learning 3D Reconstruction in Function Space

Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, 2019 (inproceedings)

Abstract
With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose Occupancy Networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.

avg

Code Video pdf suppmat Project Page [BibTex]


2018


Role of symmetry in driven propulsion at low Reynolds number

Sachs, J., Morozov, K. I., Kenneth, O., Qiu, T., Segreto, N., Fischer, P., Leshansky, A. M.

Phys. Rev. E, 98(6):063105, American Physical Society, December 2018 (article)

Abstract
We theoretically and experimentally investigate low-Reynolds-number propulsion of geometrically achiral planar objects that possess a dipole moment and that are driven by a rotating magnetic field. Symmetry considerations (involving parity, $\widehat{P}$, and charge conjugation, $\widehat{C}$) establish correspondence between propulsive states depending on orientation of the dipolar moment. Although basic symmetry arguments do not forbid individual symmetric objects to efficiently propel due to spontaneous symmetry breaking, they suggest that the average ensemble velocity vanishes. Some additional arguments show, however, that highly symmetrical ($\widehat{P}$-even) objects exhibit no net propulsion while individual less symmetrical ($\widehat{C}\widehat{P}$-even) propellers do propel. Particular magnetization orientation, rendering the shape $\widehat{C}\widehat{P}$-odd, yields unidirectional motion typically associated with chiral structures, such as helices. If instead of a structure with a permanent dipole we consider a polarizable object, some of the arguments have to be modified. For instance, we demonstrate a truly achiral ($\widehat{P}$- and $\widehat{C}\widehat{P}$-even) planar shape with an induced electric dipole that can propel by electro-rotation. We thereby show that chirality is not essential for propulsion due to rotation-translation coupling at low Reynolds number.

pf

link (url) DOI Project Page [BibTex]



Deep Reinforcement Learning for Event-Triggered Control

Baumann, D., Zhu, J., Martius, G., Trimpe, S.

In Proceedings of the 57th IEEE International Conference on Decision and Control (CDC), pages: 943-950, 57th IEEE International Conference on Decision and Control (CDC), December 2018 (inproceedings)

al ics

arXiv PDF DOI Project Page Project Page [BibTex]



Optical and Thermophoretic Control of Janus Nanopen Injection into Living Cells

Maier, C. M., Huergo, M. A., Milosevic, S., Pernpeintner, C., Li, M., Singh, D. P., Walker, D., Fischer, P., Feldmann, J., Lohmüller, T.

Nano Letters, 18, pages: 7935–7941, November 2018 (article) Accepted

Abstract
Devising strategies for the controlled injection of functional nanoparticles and reagents into living cells paves the way for novel applications in nanosurgery, sensing, and drug delivery. Here, we demonstrate the light-controlled guiding and injection of plasmonic Janus nanopens into living cells. The pens are made of a gold nanoparticle attached to a dielectric alumina shaft. Balancing optical and thermophoretic forces in an optical tweezer allows single Janus nanopens to be trapped and positioned on the surface of living cells. While the optical injection process involves strong heating of the plasmonic side, the temperature of the alumina stays significantly lower, thus allowing the functionalization with fluorescently labeled, single-stranded DNA and, hence, the spatially controlled injection of genetic material with an untethered nanocarrier.

pf

link (url) DOI [BibTex]



A swarm of slippery micropropellers penetrates the vitreous body of the eye

Wu, Z., Troll, J., Jeong, H. H., Wei, Q., Stang, M., Ziemssen, F., Wang, Z., Dong, M., Schnichels, S., Qiu, T., Fischer, P.

Science Advances, 4(11):eaat4388, November 2018 (article)

Abstract
The intravitreal delivery of therapeutic agents promises major benefits in the field of ocular medicine. Traditional delivery methods rely on the random, passive diffusion of molecules, which do not allow for the rapid delivery of a concentrated cargo to a defined region at the posterior pole of the eye. The use of particles promises targeted delivery but faces the challenge that most tissues including the vitreous have a tight macromolecular matrix that acts as a barrier and prevents its penetration. Here, we demonstrate novel intravitreal delivery microvehicles, slippery micropropellers, that can be actively propelled through the vitreous humor to reach the retina. The propulsion is achieved by helical magnetic micropropellers that have a liquid layer coating to minimize adhesion to the surrounding biopolymeric network. The submicrometer diameter of the propellers enables the penetration of the biopolymeric network and the propulsion through the porcine vitreous body of the eye over centimeter distances. Clinical optical coherence tomography is used to monitor the movement of the propellers and confirm their arrival on the retina near the optic disc. Overcoming the adhesion forces and actively navigating a swarm of micropropellers in the dense vitreous humor promise practical applications in ophthalmology.

pf

Video: Nanorobots propel through the eye link (url) DOI [BibTex]



Gait learning for soft microrobots controlled by light fields

Rohr, A. V., Trimpe, S., Marco, A., Fischer, P., Palagi, S.

In International Conference on Intelligent Robots and Systems (IROS) 2018, pages: 6199-6206, International Conference on Intelligent Robots and Systems 2018, October 2018 (inproceedings)

Abstract
Soft microrobots based on photoresponsive materials and controlled by light fields can generate a variety of different gaits. This inherent flexibility can be exploited to maximize their locomotion performance in a given environment and used to adapt them to changing environments. However, because of the lack of accurate locomotion models, and given the intrinsic variability among microrobots, analytical control design is not possible. Common data-driven approaches, on the other hand, require running prohibitive numbers of experiments and lead to very sample-specific results. Here we propose a probabilistic learning approach for light-controlled soft microrobots based on Bayesian Optimization (BO) and Gaussian Processes (GPs). The proposed approach results in a learning scheme that is highly data-efficient, enabling gait optimization with a limited experimental budget, and robust against differences among microrobot samples. These features are obtained by designing the learning scheme through the comparison of different GP priors and BO settings on a semisynthetic data set. The developed learning scheme is validated in microrobot experiments, resulting in a 115% improvement in a microrobot’s locomotion performance with an experimental budget of only 20 tests. These encouraging results lead the way toward self-adaptive microrobotic systems based on light-controlled soft microrobots and probabilistic learning control.

ics pf

arXiv IEEE Xplore DOI Project Page [BibTex]



On the Integration of Optical Flow and Action Recognition

Sevilla-Lara, L., Liao, Y., Güney, F., Jampani, V., Geiger, A., Black, M. J.

In German Conference on Pattern Recognition (GCPR), LNCS 11269, pages: 281-297, Springer, Cham, October 2018 (inproceedings)

Abstract
Most of the top-performing action recognition methods use optical flow as a "black box" input. Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better. In particular, we investigate the impact of different flow algorithms and input transformations to better understand how these affect a state-of-the-art action recognition method. Furthermore, we fine-tune two neural-network flow methods end-to-end on the most widely used action recognition dataset (UCF101). Based on these experiments, we make the following five observations: 1) optical flow is useful for action recognition because it is invariant to appearance, 2) optical flow methods are optimized to minimize end-point-error (EPE), but the EPE of current methods is not well correlated with action recognition performance, 3) for the flow methods tested, accuracy at boundaries and at small displacements is most correlated with action recognition performance, 4) training optical flow to minimize classification error instead of minimizing EPE improves recognition performance, and 5) optical flow learned for the task of action recognition differs from traditional optical flow especially inside the human body and at the boundary of the body. These observations may encourage optical flow researchers to look beyond EPE as a goal and guide action recognition researchers to seek better motion cues, leading to a tighter integration of the optical flow and action recognition communities.

avg ps

arXiv DOI [BibTex]



Towards Robust Visual Odometry with a Multi-Camera System

Liu, P., Geppert, M., Heng, L., Sattler, T., Geiger, A., Pollefeys, M.

In International Conference on Intelligent Robots and Systems (IROS) 2018, International Conference on Intelligent Robots and Systems, October 2018 (inproceedings)

Abstract
We present a visual odometry (VO) algorithm for a multi-camera system and robust operation in challenging environments. Our algorithm consists of a pose tracker and a local mapper. The tracker estimates the current pose by minimizing photometric errors between the most recent keyframe and the current frame. The mapper initializes the depths of all sampled feature points using plane-sweeping stereo. To reduce pose drift, a sliding window optimizer is used to refine poses and structure jointly. Our formulation is flexible enough to support an arbitrary number of stereo cameras. We evaluate our algorithm thoroughly on five datasets. The datasets were captured in different conditions: daytime, night-time with near-infrared (NIR) illumination and night-time without NIR illumination. Experimental results show that a multi-camera setup makes the VO more robust to challenging environments, especially night-time conditions, in which a single stereo configuration fails easily due to the lack of features.

avg

pdf Project Page [BibTex]



Nanoscale robotic agents in biological fluids and tissues

Palagi, S., Walker, D. Q. T., Fischer, P.

In The Encyclopedia of Medical Robotics, 2, pages: 19-42, (Editors: Desai, J. P. and Ferreira, A.), World Scientific, October 2018 (inbook)

Abstract
Nanorobots are untethered structures of sub-micron size that can be controlled in a non-trivial way. Such nanoscale robotic agents are envisioned to revolutionize medicine by enabling minimally invasive diagnostic and therapeutic procedures. To be useful, nanorobots must be operated in complex biological fluids and tissues, which are often difficult to penetrate. In this chapter, we first discuss potential medical applications of motile nanorobots. We briefly present the challenges related to swimming at such small scales and we survey the rheological properties of some biological fluids and tissues. We then review recent experimental results in the development of nanorobots and in particular their design, fabrication, actuation, and propulsion in complex biological fluids and tissues. Recent work shows that their nanoscale dimension is a clear asset for operation in biological tissues, since many biological tissues consist of networks of macromolecules that prevent the passage of larger micron-scale structures, but contain dynamic pores through which nanorobots can move.

pf

link (url) DOI [BibTex]

Fast spatial scanning of 3D ultrasound fields via thermography

Melde, K., Qiu, T., Fischer, P.

Applied Physics Letters, 113(13):133503, September 2018 (article)

Abstract
We propose and demonstrate a thermographic method that allows rapid scanning of ultrasound fields in a volume to yield 3D maps of the sound intensity. A thin sound-absorbing membrane is continuously translated through a volume of interest while a thermal camera records the evolution of its surface temperature. The temperature rise is a function of the absorbed sound intensity, such that the thermal image sequence can be combined to reveal the sound intensity distribution in the traversed volume. We demonstrate the mapping of ultrasound fields, which is several orders of magnitude faster than scanning with a hydrophone. Our results are in very good agreement with theoretical simulations.

pf

link (url) DOI Project Page [BibTex]


Learning Priors for Semantic 3D Reconstruction

Cherabier, I., Schönberger, J., Oswald, M., Pollefeys, M., Geiger, A.

In Computer Vision – ECCV 2018, Springer International Publishing, Cham, September 2018 (inproceedings)

Abstract
We present a novel semantic 3D reconstruction framework which embeds variational regularization into a neural network. Our network performs a fixed number of unrolled multi-scale optimization iterations with shared interaction weights. In contrast to existing variational methods for semantic 3D reconstruction, our model is end-to-end trainable and captures more complex dependencies between the semantic labels and the 3D geometry. Compared to previous learning-based approaches to 3D reconstruction, we integrate powerful long-range dependencies using variational coarse-to-fine optimization. As a result, our network architecture requires only a moderate number of parameters while keeping a high level of expressiveness which enables learning from very little data. Experiments on real and synthetic datasets demonstrate that our network achieves higher accuracy compared to a purely variational approach while at the same time requiring two orders of magnitude fewer iterations to converge. Moreover, our approach handles ten times more semantic class labels using the same computational resources.

avg

pdf suppmat Project Page Video DOI [BibTex]

Unsupervised Learning of Multi-Frame Optical Flow with Occlusions

Janai, J., Güney, F., Ranjan, A., Black, M. J., Geiger, A.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11220, pages: 713-731, Springer, Cham, September 2018 (inproceedings)

avg ps

pdf suppmat Video Project Page DOI [BibTex]

Leveraging Contact Forces for Learning to Grasp

Merzic, H., Bogdanovic, M., Kappler, D., Righetti, L., Bohg, J.

arXiv, September 2018, Submitted to ICRA'19 (article) Submitted

Abstract
Grasping objects under uncertainty remains an open problem in robotics research. This uncertainty is often due to noisy or partial observations of the object pose or shape. To enable a robot to react appropriately to unforeseen effects, it is crucial that it continuously takes sensor feedback into account. While visual feedback is important for inferring a grasp pose and reaching for an object, contact feedback offers valuable information during manipulation and grasp acquisition. In this paper, we use model-free deep reinforcement learning to synthesize control policies that exploit contact sensing to generate robust grasping under uncertainty. We demonstrate our approach on a multi-fingered hand that exhibits more complex finger coordination than the commonly used two-fingered grippers. We conduct extensive experiments in order to assess the performance of the learned policies, with and without contact sensing. While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape.

am mg

video arXiv [BibTex]

SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images

Coors, B., Condurache, A. P., Geiger, A.

European Conference on Computer Vision (ECCV), September 2018 (conference)

Abstract
Omnidirectional cameras offer great benefits over classical cameras wherever a wide field of view is essential, such as in virtual reality applications or in autonomous robots. Unfortunately, standard convolutional neural networks are not well suited for this scenario as the natural projection surface is a sphere which cannot be unwrapped to a plane without introducing significant distortions, particularly in the polar regions. In this work, we present SphereNet, a novel deep learning framework which encodes invariance against such distortions explicitly into convolutional neural networks. Towards this goal, SphereNet adapts the sampling locations of the convolutional filters, effectively reversing distortions, and wraps the filters around the sphere. By building on regular convolutions, SphereNet enables the transfer of existing perspective convolutional neural network models to the omnidirectional case. We demonstrate the effectiveness of our method on the tasks of image classification and object detection, exploiting two newly created semi-synthetic and real-world omnidirectional datasets.

avg

pdf suppmat Project Page [BibTex]


Diffusion Measurements of Swimming Enzymes with Fluorescence Correlation Spectroscopy

Günther, J., Börsch, M., Fischer, P.

Accounts of Chemical Research, 51(9):1911-1920, August 2018 (article)

Abstract
Self-propelled chemical motors are chemically powered micro- or nanosized swimmers. The energy required for these motors’ active motion derives from catalytic chemical reactions and the transformation of a fuel dissolved in the solution. While self-propulsion is now well established for larger particles, it is still unclear if enzymes, nature’s nanometer-sized catalysts, are potentially also self-powered nanomotors. Because of its small size, any increase in an enzyme’s diffusion due to active self-propulsion must be observed on top of the enzyme’s passive Brownian motion, which dominates at this scale. Fluorescence correlation spectroscopy (FCS) is a sensitive method to quantify the diffusion properties of single fluorescently labeled molecules in solution. FCS experiments have shown a general increase in the diffusion constant of a number of enzymes when the enzyme is catalytically active. Diffusion enhancements after addition of the enzyme’s substrate (and sometimes its inhibitor) of up to 80% have been reported, which is at least one order of magnitude higher than what theory would predict. However, many factors contribute to the FCS signal and in particular the shape of the autocorrelation function, which underlies diffusion measurements by fluorescence correlation spectroscopy. These effects need to be considered to establish if and by how much the catalytic activity changes an enzyme’s diffusion. We carefully review phenomena that can play a role in FCS experiments and the determination of enzyme diffusion, including the dissociation of enzyme oligomers upon interaction with the substrate, surface binding of the enzyme to glass during the experiment, conformational changes upon binding, and quenching of the fluorophore. We show that these effects can cause changes in the FCS signal that behave similarly to an increase in diffusion.
However, in the case of the enzymes F1-ATPase and alkaline phosphatase, we demonstrate that there is no measurable increase in enzyme diffusion. Rather, dissociation and conformational changes account for the changes in the FCS signal in the former and fluorophore quenching in the latter. Within the experimental accuracy of our FCS measurements, we do not observe any change in diffusion due to activity for the enzymes we have investigated. We suggest useful control experiments and additional tests for future FCS experiments that should help establish if the observed diffusion enhancement is real or if it is due to an experimental or data analysis artifact. We show that fluorescence lifetime and mean intensity measurements are essential in order to identify the nature of the observed changes in the autocorrelation function. While it is clear from theory that chemically active enzymes should also act as self-propelled nanomotors, our FCS measurements show that the associated increase in diffusion is much smaller than previously reported. Further experiments are needed to quantify the contribution of the enzymes’ catalytic activity to their self-propulsion. We hope that our findings help to establish a useful protocol for future FCS studies in this field and help establish by how much the diffusion of an enzyme is enhanced through catalytic activity.

pf

link (url) DOI [BibTex]

Uphill production of dihydrogen by enzymatic oxidation of glucose without an external energy source

Suraniti, E., Merzeau, P., Roche, J., Gounel, S., Mark, A. G., Fischer, P., Mano, N., Kuhn, A.

Nature Communications, 9(1):3229, August 2018 (article)

Abstract
Chemical systems do not allow the coupling of energy from several simple reactions to drive a subsequent reaction, which takes place in the same medium and leads to a product with a higher energy than the one released during the first reaction. Gibbs energy considerations thus are not favorable to drive e.g., water splitting by the direct oxidation of glucose as a model reaction. Here, we show that it is nevertheless possible to carry out such an energetically uphill reaction, if the electrons released in the oxidation reaction are temporarily stored in an electromagnetic system, which is then used to raise the electrons' potential energy so that they can power the electrolysis of water in a second step. We thereby demonstrate the general concept that lower energy delivering chemical reactions can be used to enable the formation of higher energy consuming reaction products in a closed system.

pf

link (url) DOI [BibTex]

Chemical micromotors self-assemble and self-propel by spontaneous symmetry breaking

Yu, T., Chuphal, P., Thakur, S., Reigh, S. Y., Singh, D. P., Fischer, P.

Chem. Comm., 54, pages: 11933-11936, August 2018 (article)

Abstract
Self-propelling chemical motors have thus far required the fabrication of Janus particles with an asymmetric catalyst distribution. Here, we demonstrate that simple, isotropic colloids can spontaneously assemble to yield dimer motors that self-propel. In a mixture of isotropic titanium dioxide colloids with photo-chemical catalytic activity and passive silica colloids, light illumination causes diffusiophoretic attractions between the active and passive particles and leads to the formation of dimers. The dimers constitute a symmetry-broken motor, whose dynamics can be fully controlled by the illumination conditions. Computer simulations reproduce the dynamics of the colloids and are in good agreement with experiments. The current work presents a simple route to obtain large numbers of self-propelling chemical motors from a dispersion of spherically symmetric colloids through spontaneous symmetry breaking.

pf

link (url) DOI [BibTex]

A machine from machines

Fischer, P.

Nature Physics, 14, pages: 1072–1073, July 2018 (misc)

Abstract
Building spinning microrotors that self-assemble and synchronize to form a gear sounds like an impossible feat. However, it has now been achieved using only a single type of building block -- a colloid that self-propels.

pf

link (url) DOI [BibTex]

Chemotaxis of Active Janus Nanoparticles

Popescu, M. N., Uspal, W. E., Bechinger, C., Fischer, P.

Nano Letters, 18(9):5345–5349, July 2018 (article)

Abstract
While colloids and molecules in solution exhibit passive Brownian motion, particles that are partially covered with a catalyst, which promotes the transformation of a fuel dissolved in the solution, can actively move. These active Janus particles are known as “chemical nanomotors” or self-propelling “swimmers” and have been realized with a range of catalysts, sizes, and particle geometries. Because their active translation depends on the fuel concentration, one expects that active colloidal particles should also be able to swim toward a fuel source. Synthesizing and engineering nanoparticles with distinct chemotactic properties may enable important developments, such as particles that can autonomously swim along a pH gradient toward a tumor. Chemotaxis requires that the particles possess an active coupling of their orientation to a chemical gradient. In this Perspective we provide a simple, intuitive description of the underlying mechanisms for chemotaxis, as well as the means to analyze and classify active particles that can show positive or negative chemotaxis. The classification provides guidance for engineering a specific response and is a useful organizing framework for the quantitative analysis and modeling of chemotactic behaviors. Chemotaxis is emerging as an important focus area in the field of active colloids and promises a number of fascinating applications for nanoparticles and particle-based delivery.

pf

link (url) DOI [BibTex]

Robust Physics-based Motion Retargeting with Realistic Body Shapes

Borno, M. A., Righetti, L., Black, M. J., Delp, S. L., Fiume, E., Romero, J.

Computer Graphics Forum, 37, pages: 6:1-12, July 2018 (article)

Abstract
Motion capture is often retargeted to new, and sometimes drastically different, characters. When the characters take on realistic human shapes, however, we become more sensitive to the motion looking right. This means adapting it to be consistent with the physical constraints imposed by different body shapes. We show how to take realistic 3D human shapes, approximate them using a simplified representation, and animate them so that they move realistically using physically-based retargeting. We develop a novel spacetime optimization approach that learns and robustly adapts physical controllers to new bodies and constraints. The approach automatically adapts the motion of the mocap subject to the body shape of a target subject. This motion respects the physical properties of the new body and every body shape results in a different and appropriate movement. This makes it easy to create a varied set of motions from a single mocap sequence by simply varying the characters. In an interactive environment, successful retargeting requires adapting the motion to unexpected external forces. We achieve robustness to such forces using a novel LQR-tree formulation. We show that the simulated motions look appropriate to each character’s anatomy and their actions are robust to perturbations.

mg ps

pdf video Project Page [BibTex]

Colloidal Chemical Nanomotors

Alarcon-Correa, M.

Colloidal Chemical Nanomotors, pages: 150, Cuvillier Verlag, MPI-IS, June 2018 (phdthesis)

Abstract
Synthetic sophisticated nanostructures represent a fundamental building block for the development of nanotechnology. The fabrication of nanoparticles complex in structure and material composition is key to build nanomachines that can operate as man-made nanoscale motors, which autonomously convert external energy into motion. To achieve this, asymmetric nanoparticles were fabricated combining a physical vapor deposition technique known as NanoGLAD and wet chemical synthesis. This thesis primarily concerns three complex colloidal systems that have been developed: i) Hollow nanocup inclusion complexes that have a single Au nanoparticle in their pocket. The Au particle can be released with an external trigger. ii) The smallest self-propelling nanocolloids that have been made to date, which give rise to a local concentration gradient that causes enhanced diffusion of the particles. iii) Enzyme-powered pumps that have been assembled using bacteriophages as biological nanoscaffolds. This construct also can be used for enzyme recovery after heterogeneous catalysis.

pf

[BibTex]

Bioinspired microrobots

Palagi, S., Fischer, P.

Nature Reviews Materials, 3, pages: 113–124, May 2018 (article)

Abstract
Microorganisms can move in complex media, respond to the environment and self-organize. The field of microrobotics strives to achieve these functions in mobile robotic systems of sub-millimetre size. However, miniaturization of traditional robots and their control systems to the microscale is not a viable approach. A promising alternative strategy in developing microrobots is to implement sensing, actuation and control directly in the materials, thereby mimicking biological matter. In this Review, we discuss design principles and materials for the implementation of robotic functionalities in microrobots. We examine different biological locomotion strategies, and we discuss how they can be artificially recreated in magnetic microrobots and how soft materials improve control and performance. We show that smart, stimuli-responsive materials can act as on-board sensors and actuators and that ‘active matter’ enables autonomous motion, navigation and collective behaviours. Finally, we provide a critical outlook for the field of microrobotics and highlight the challenges that need to be overcome to realize sophisticated microrobots, which one day might rival biological machines.

pf

link (url) DOI [BibTex]

Soft Miniaturized Linear Actuators Wirelessly Powered by Rotating Permanent Magnets

Qiu, T., Palagi, S., Sachs, J., Fischer, P.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 3595-3600, May 2018 (inproceedings)

Abstract
Wireless actuation by magnetic fields allows for the operation of untethered miniaturized devices, e.g. in biomedical applications. Nevertheless, generating large controlled forces over relatively large distances is challenging. Magnetic torques are easier to generate and control, but they are not always suitable for the tasks at hand. Moreover, strong magnetic fields are required to generate a sufficient torque, which are difficult to achieve with electromagnets. Here, we demonstrate a soft miniaturized actuator that transforms an externally applied magnetic torque into a controlled linear force. We report the design, fabrication and characterization of both the actuator and the magnetic field generator. We show that the magnet assembly, which is based on a set of rotating permanent magnets, can generate strong controlled oscillating fields over a relatively large workspace. The actuator, which is 3D-printed, can lift a load of more than 40 times its weight. Finally, we show that the actuator can be further miniaturized, paving the way towards strong, wirelessly powered microactuators.

pf

link (url) DOI [BibTex]

Robust Dense Mapping for Large-Scale Dynamic Environments

Barsan, I. A., Liu, P., Pollefeys, M., Geiger, A.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018 (inproceedings)

Abstract
We present a stereo-based dense mapping algorithm for large-scale dynamic urban environments. In contrast to other existing methods, we simultaneously reconstruct the static background, the moving objects, and the potentially moving but currently stationary objects separately, which is desirable for high-level mobile robotic tasks such as path planning in crowded environments. We use both instance-aware semantic segmentation and sparse scene flow to classify objects as either background, moving, or potentially moving, thereby ensuring that the system is able to model objects with the potential to transition from static to dynamic, such as parked cars. Given camera poses estimated from visual odometry, both the background and the (potentially) moving objects are reconstructed separately by fusing the depth maps computed from the stereo input. In addition to visual odometry, sparse scene flow is also used to estimate the 3D motions of the detected moving objects, in order to reconstruct them accurately. A map pruning technique is further developed to improve reconstruction accuracy and reduce memory consumption, leading to increased scalability. We evaluate our system thoroughly on the well-known KITTI dataset. Our system is capable of running on a PC at approximately 2.5 Hz, with the primary bottleneck being the instance-aware semantic segmentation, which is a limitation we hope to address in future work.

avg

pdf Video Project Page [BibTex]

Learning 3D Shape Completion under Weak Supervision

Stutz, D., Geiger, A.

arXiv, May 2018 (article)

Abstract
We address the problem of 3D shape completion from sparse and noisy point clouds, a fundamental problem in computer vision and robotics. Recent approaches are either data-driven or learning-based: Data-driven approaches rely on a shape model whose parameters are optimized to fit the observations; Learning-based approaches, in contrast, avoid the expensive optimization step by learning to directly predict complete shapes from incomplete observations in a fully-supervised setting. However, full supervision is often not available in practice. In this work, we propose a weakly-supervised learning-based approach to 3D shape completion which neither requires slow optimization nor direct supervision. While we also learn a shape prior on synthetic data, we amortize, i.e., learn, maximum likelihood fitting using deep neural networks resulting in efficient shape completion without sacrificing accuracy. On synthetic benchmarks based on ShapeNet and ModelNet as well as on real robotics data from KITTI and Kinect, we demonstrate that the proposed amortized maximum likelihood approach is able to compete with fully supervised baselines and outperforms data-driven approaches, while requiring less supervision and being significantly faster.

avg

PDF Project Page [BibTex]


Nonlinear decoding of a complex movie from the mammalian retina

Botella-Soler, V., Deny, S., Martius, G., Marre, O., Tkačik, G.

PLOS Computational Biology, 14(5):1-27, Public Library of Science, May 2018 (article)

Abstract
Author summary: Neurons in the retina transform patterns of incoming light into sequences of neural spikes. We recorded from ∼100 neurons in the rat retina while it was stimulated with a complex movie. Using machine learning regression methods, we fit decoders to reconstruct the movie shown from the retinal output. We demonstrated that the retinal code can only be read out with a low error if decoders make use of correlations between successive spikes emitted by individual neurons. These correlations can be used to ignore spontaneous spiking that would, otherwise, cause even the best linear decoders to “hallucinate” nonexistent stimuli. This work represents the first high resolution single-trial full movie reconstruction and suggests a new paradigm for separating spontaneous from stimulus-driven neural activity.

al

DOI [BibTex]

Graphene-silver hybrid devices for sensitive photodetection in the ultraviolet

Paria, D., Jeong, H. H., Vadakkumbatt, V., Deshpande, P., Fischer, P., Ghosh, A., Ghosh, A.

Nanoscale, 10, pages: 7685-7693, April 2018 (article)

Abstract
The weak light-matter interaction in graphene can be enhanced with a number of strategies, among which sensitization with plasmonic nanostructures is particularly attractive. This has resulted in the development of graphene-plasmonic hybrid systems with strongly enhanced photodetection efficiencies in the visible and the IR, but none in the UV. Here, we describe a silver nanoparticle-graphene stacked optoelectronic device that shows strong enhancement of its photoresponse across the entire UV spectrum. The device fabrication strategy is scalable and modular. Self-assembly techniques are combined with physical shadow growth techniques to fabricate a regular large-area array of 50 nm silver nanoparticles onto which CVD graphene is transferred. The presence of the silver nanoparticles resulted in a plasmonically enhanced photoresponse as high as 3.2 A W⁻¹ in the wavelength range from 330 nm to 450 nm. At lower wavelengths, close to the Van Hove singularity of the density of states in graphene, we measured an even higher responsivity of 14.5 A W⁻¹ at 280 nm, which corresponds to a more than 10,000-fold enhancement over the photoresponse of native graphene.

pf

link (url) DOI [BibTex]

Nanoparticles on the move for medicine

Fischer, P.

Physics World Focus on Nanotechnology, pages: 26028, (Editors: Margaret Harris), IOP Publishing Ltd and individual contributors, April 2018 (article)

Abstract
Peer Fischer outlines the prospects for creating “nanoswimmers” that can be steered through the body to deliver drugs directly to their targets. Molecules don’t move very fast on their own. If they had to rely solely on diffusion – a slow and inefficient process linked to the Brownian motion of small particles and molecules in solution – then a protein molecule, for instance, would take around three weeks to travel a single centimetre down a nerve fibre. This is why active transport mechanisms exist in cells and in the human body: without them, all the processes of life would happen at a pace that would make snails look speedy.
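The three-week figure can be checked to order of magnitude with the one-dimensional diffusion relation t = x² / (2D). The sketch below is only a plausibility check: the diffusion coefficient `D_PROTEIN` is an assumed round value for a protein and does not come from the article, and the function name is hypothetical.

```python
def diffusion_time_seconds(distance_m, diffusion_coeff_m2_s):
    """Time for 1D diffusion to reach an RMS displacement x: t = x^2 / (2 D)."""
    return distance_m ** 2 / (2.0 * diffusion_coeff_m2_s)

# Assumed order-of-magnitude diffusion coefficient for a protein (about 30 um^2/s);
# this value is an illustration, not a number taken from the article.
D_PROTEIN = 3e-11  # m^2/s

seconds = diffusion_time_seconds(1e-2, D_PROTEIN)  # one centimetre
weeks = seconds / (7 * 24 * 3600)
print(f"{weeks:.1f} weeks")
```

With this assumed coefficient the estimate comes out at a little under three weeks, in line with the article's figure. Because the time scales with the square of the distance, halving the distance cuts the time by a factor of four.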

pf

link (url) [BibTex]

Photogravitactic Microswimmers

Singh, D. P., Uspal, W. E., Popescu, M. N., Wilson, L. G., Fischer, P.

Adv. Func. Mat., 28, pages: 1706660, February 2018 (article)

Abstract
Phototactic microorganisms are commonly observed to respond to natural sunlight by swimming upward against gravity. This study demonstrates that synthetic photochemically active microswimmers can also swim against gravity. The particles initially sediment and, when illuminated at low light intensities, exhibit wall-bound states of motion near the bottom surface. Upon increasing the intensity of light, the artificial swimmers lift off from the wall and swim against gravity and away from the light source. This motion in the bulk has been further confirmed using holographic microscopy. A theoretical model is presented within the framework of self-diffusiophoresis, which allows one to unequivocally identify the photochemical activity and the phototactic response as key mechanisms in the observed phenomenology. Since the lift-off threshold intensity depends on the particle size, it can be exploited to selectively address particles with the same density from a polydisperse mixture of active particles and move them in or out of the boundary region. This study provides a simple design strategy to fabricate artificial microswimmers whose two- or three-dimensional swimming behavior can be controlled with light.

pf

link (url) DOI [BibTex]

Chiral Plasmonic Hydrogen Sensors

Matuschek, M., Singh, D. P., Jeong, H. H., Nesterov, M., Weiss, T., Fischer, P., Neubrech, F., Liu, N.

Small, 14(7):1702990, February 2018 (article)

Abstract
In this article, a chiral plasmonic hydrogen‐sensing platform using palladium‐based nanohelices is demonstrated. Such 3D chiral nanostructures fabricated by nanoglancing angle deposition exhibit strong circular dichroism both experimentally and theoretically. The chiroptical properties of the palladium nanohelices are altered upon hydrogen uptake and sensitively depend on the hydrogen concentration. Such properties are well suited for remote and spark‐free hydrogen sensing in the flammable range. Hysteresis is reduced, when an increasing amount of gold is utilized in the palladium‐gold hybrid helices. As a result, the linearity of the circular dichroism in response to hydrogen is significantly improved. The chiral plasmonic sensor scheme is of potential interest for hydrogen‐sensing applications, where good linearity and high sensitivity are required.

pf

link (url) DOI [BibTex]

Acoustic Fabrication via the Assembly and Fusion of Particles

Melde, K., Choi, E., Wu, Z., Palagi, S., Qiu, T., Fischer, P.

Advanced Materials, 30(3):1704507, January 2018 (article)

Abstract
Acoustic assembly promises a route toward rapid parallel fabrication of whole objects directly from solution. This study reports the contact-free and maskless assembly, and fixing of silicone particles into arbitrary 2D shapes using ultrasound fields. Ultrasound passes through an acoustic hologram to form a target image. The particles assemble from a suspension along lines of high pressure in the image due to acoustic radiation forces and are then fixed (crosslinked) in a UV-triggered reaction. For this, the particles are loaded with a photoinitiator by solvent-induced swelling. This localizes the reaction and allows the bulk suspension to be reused. The final fabricated parts are mechanically stable and self-supporting.

pf

link (url) DOI Project Page [BibTex]


RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials

Paschalidou, D., Ulusoy, A. O., Schmitt, C., Gool, L., Geiger, A.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
In this paper, we consider the problem of reconstructing a dense 3D model using images captured from different views. Recent methods based on convolutional neural networks (CNN) allow learning the entire task from data. However, they do not incorporate the physics of image formation such as perspective geometry and occlusion. Instead, classical approaches based on Markov Random Fields (MRF) with ray-potentials explicitly model these physical processes, but they cannot cope with large surface appearance variations across different viewpoints. In this paper, we propose RayNet, which combines the strengths of both frameworks. RayNet integrates a CNN that learns view-invariant feature representations with an MRF that explicitly encodes the physics of perspective projection and occlusion. We train RayNet end-to-end using empirical risk minimization. We thoroughly evaluate our approach on challenging real-world datasets and demonstrate its benefits over a piece-wise trained baseline, hand-crafted models as well as other learning-based approaches.

avg

pdf suppmat Video Project Page code Poster [BibTex]

Deep Marching Cubes: Learning Explicit Surface Representations

Liao, Y., Donne, S., Geiger, A.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
Existing learning-based solutions to 3D surface prediction cannot be trained end-to-end as they operate on intermediate representations (e.g., TSDF) from which 3D surface meshes must be extracted in a post-processing step (e.g., via the marching cubes algorithm). In this paper, we investigate the problem of end-to-end 3D surface prediction. We first demonstrate that the marching cubes algorithm is not differentiable and propose an alternative differentiable formulation which we insert as a final layer into a 3D convolutional neural network. We further propose a set of loss functions which allow for training our model with sparse point supervision. Our experiments demonstrate that the model allows for predicting sub-voxel accurate 3D shapes of arbitrary topology. Additionally, it learns to complete shapes and to separate an object's inside from its outside even in the presence of sparse and incomplete ground truth. We investigate the benefits of our approach on the task of inferring shapes from 3D point clouds. Our model is flexible and can be combined with a variety of shape encoder and shape inference techniques.

avg

pdf suppmat Video Project Page Poster Project Page [BibTex]


Semantic Visual Localization

Schönberger, J., Pollefeys, M., Geiger, A., Sattler, T.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
Robust visual localization under a wide range of viewing conditions is a fundamental problem in computer vision. Handling the difficult cases of this problem is not only very challenging but also of high practical relevance, e.g., in the context of life-long localization for augmented reality or autonomous robots. In this paper, we propose a novel approach based on a joint 3D geometric and semantic understanding of the world, enabling it to succeed under conditions where previous approaches failed. Our method leverages a novel generative model for descriptor learning, trained on semantic scene completion as an auxiliary task. The resulting 3D descriptors are robust to missing observations by encoding high-level 3D geometric and semantic information. Experiments on several challenging large-scale localization datasets demonstrate reliable localization under extreme viewpoint, illumination, and geometry changes.

avg

pdf suppmat Poster Project Page [BibTex]


Augmented Reality Meets Computer Vision: Efficient Data Generation for Urban Driving Scenes

Alhaija, H., Mustikovela, S., Mescheder, L., Geiger, A., Rother, C.

International Journal of Computer Vision (IJCV), 2018 (article)

Abstract
The success of deep learning in computer vision is based on the availability of large annotated datasets. To reduce the need for hand-labeled images, virtually rendered 3D worlds have recently gained popularity. Unfortunately, creating realistic 3D content is challenging on its own and requires significant human effort. In this work, we propose an alternative paradigm which combines real and synthetic data for learning semantic instance segmentation and object detection models. Exploiting the fact that not all aspects of the scene are equally important for this task, we propose to augment real-world imagery with virtual objects of the target category. Capturing real-world images at large scale is easy and cheap, and directly provides real background appearances without the need for creating complex 3D models of the environment. We present an efficient procedure to augment these images with virtual objects. In contrast to modeling complete 3D environments, our data augmentation approach requires only a few user interactions in combination with 3D models of the target object category. Leveraging our approach, we introduce a novel dataset of augmented urban driving scenes with 360-degree images that are used as environment maps to create realistic lighting and reflections on rendered objects. We analyze the significance of realistic object placement by comparing manual placement by humans to automatic methods based on semantic scene analysis. This allows us to create composite images which exhibit both realistic background appearance and a large number of complex object arrangements. Through an extensive set of experiments, we identify the set of parameters that produces augmented data which maximally enhances the performance of instance segmentation models. Further, we demonstrate the utility of the proposed approach on training standard deep models for semantic instance segmentation and object detection of cars in outdoor driving scenarios.
We test the models trained on our augmented data on the KITTI 2015 dataset, which we have annotated with pixel-accurate ground truth, and on the Cityscapes dataset. Our experiments demonstrate that the models trained on augmented imagery generalize better than those trained on fully synthetic data or models trained on limited amounts of annotated real data.

avg

pdf Project Page [BibTex]


Which Training Methods for GANs do actually Converge?

Mescheder, L., Geiger, A., Nowozin, S.

International Conference on Machine Learning (ICML), 2018 (conference)

Abstract
Recent work has shown local convergence of GAN training for absolutely continuous data and generator distributions. In this paper, we show that the requirement of absolute continuity is necessary: we describe a simple yet prototypical counterexample showing that in the more realistic case of distributions that are not absolutely continuous, unregularized GAN training is not always convergent. Furthermore, we discuss regularization strategies that were recently proposed to stabilize GAN training. Our analysis shows that GAN training with instance noise or zero-centered gradient penalties converges. On the other hand, we show that Wasserstein-GANs and WGAN-GP with a finite number of discriminator updates per generator update do not always converge to the equilibrium point. We discuss these results, leading us to a new explanation for the stability problems of GAN training. Based on our analysis, we extend our convergence results to more general GANs and prove local convergence for simplified gradient penalties even if the generator and data distributions lie on lower dimensional manifolds. We find these penalties to work well in practice and use them to learn high-resolution generative image models for a variety of datasets with little hyperparameter tuning.
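The contrast between unregularized training and a zero-centered gradient penalty can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's code: a one-dimensional toy game in which the data distribution is a point mass at 0, the generator is a point mass at `theta`, the discriminator is linear, `D(x) = psi * x`, and the objective is `f(psi * theta) + f(0)` with `f(t) = -log(1 + exp(-t))`. For this linear discriminator the penalty on the squared gradient of `D` reduces to `(gamma / 2) * psi**2`. The names `f_prime` and `simulate`, the step size, and the starting point are hypothetical.

```python
import math


def f_prime(t):
    """Derivative of f(t) = -log(1 + exp(-t)), i.e. sigmoid(-t), computed stably."""
    if t >= 0:
        e = math.exp(-t)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(t))


def simulate(gamma, theta=1.0, psi=1.0, step=0.1, n_steps=2000):
    """Simultaneous gradient descent/ascent on the assumed toy game.

    The generator parameter theta descends on L = f(psi * theta) + f(0);
    the discriminator parameter psi ascends on L - (gamma / 2) * psi**2.
    Returns the final distance of (theta, psi) from the equilibrium (0, 0).
    """
    for _ in range(n_steps):
        c = f_prime(psi * theta)
        grad_theta = c * psi                # dL/dtheta
        grad_psi = c * theta - gamma * psi  # d(L - penalty)/dpsi
        # Both players update at once from the same (theta, psi).
        theta, psi = theta - step * grad_theta, psi + step * grad_psi
    return math.hypot(theta, psi)


if __name__ == "__main__":
    print("unregularized distance:", simulate(gamma=0.0))
    print("with gradient penalty: ", simulate(gamma=1.0))
```

In this sketch the unregularized iterates drift away from the equilibrium, since each simultaneous step scales the squared distance by `1 + (step * c)**2`, while the penalized run contracts toward `(0, 0)`. The toy only illustrates the qualitative claim and says nothing about behavior on real image models.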

avg

code video paper supplement slides poster Project Page [BibTex]