

2019


Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics

Niemeyer, M., Mescheder, L., Oechsle, M., Geiger, A.

International Conference on Computer Vision, October 2019 (conference)

Abstract
Deep learning based 3D reconstruction techniques have recently achieved impressive results. However, while state-of-the-art methods are able to output complex 3D geometry, it is not clear how to extend these results to time-varying topologies. Approaches treating each time step individually lack continuity and exhibit slow inference, while traditional 4D reconstruction methods often utilize a template model or discretize the 4D space at fixed resolution. In this work, we present Occupancy Flow, a novel spatio-temporal representation of time-varying 3D geometry with implicit correspondences. Towards this goal, we learn a temporally and spatially continuous vector field which assigns a motion vector to every point in space and time. In order to perform dense 4D reconstruction from images or sparse point clouds, we combine our method with a continuous 3D representation. Implicitly, our model yields correspondences over time, thus enabling fast inference while providing a sound physical description of the temporal dynamics. We show that our method can be used for interpolation and reconstruction tasks, and demonstrate the accuracy of the learned correspondences. We believe that Occupancy Flow is a promising new 4D representation which will be useful for a variety of spatio-temporal reconstruction tasks.
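
The core idea is compact enough to sketch: a network assigns a velocity to every point in space and time, and correspondences across time follow by integrating that field. Below is a minimal PyTorch sketch with a hypothetical, unconditioned VelocityField and plain forward-Euler integration; the actual model conditions the field on the input observation and uses a proper ODE solver.

```python
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    """Maps a 3D point and a time to a motion vector (hypothetical,
    unconditioned stand-in for the paper's velocity network)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, points, t):
        # points: (N, 3); t: scalar time in [0, 1]
        t_col = torch.full_like(points[:, :1], t)
        return self.net(torch.cat([points, t_col], dim=-1))

def transport(points, velocity_field, steps=32):
    """Forward-Euler integration of the velocity field, moving points
    from t=0 to t=1 along the learned flow."""
    dt = 1.0 / steps
    for i in range(steps):
        points = points + dt * velocity_field(points, i * dt)
    return points

pts_t0 = torch.rand(1024, 3)                  # points on the shape at t=0
pts_t1 = transport(pts_t0, VelocityField())   # implicit correspondences at t=1
```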

pdf poster suppmat code Project page video [BibTex]


Texture Fields: Learning Texture Representations in Function Space

Oechsle, M., Mescheder, L., Niemeyer, M., Strauss, T., Geiger, A.

International Conference on Computer Vision, October 2019 (conference)

Abstract
In recent years, substantial progress has been achieved in learning-based reconstruction of 3D objects. At the same time, generative models were proposed that can generate highly realistic images. However, despite this success in these closely related tasks, texture reconstruction of 3D objects has received little attention from the research community and state-of-the-art methods are either limited to comparably low resolution or constrained experimental setups. A major reason for these limitations is that common representations of texture are inefficient or hard to interface for modern deep learning techniques. In this paper, we propose Texture Fields, a novel texture representation which is based on regressing a continuous 3D function parameterized with a neural network. Our approach circumvents limiting factors like shape discretization and parameterization, as the proposed texture representation is independent of the shape representation of the 3D object. We show that Texture Fields are able to represent high frequency texture and naturally blend with modern deep learning techniques. Experimentally, we find that Texture Fields compare favorably to state-of-the-art methods for conditional texture reconstruction of 3D objects and enable learning of probabilistic generative models for texturing unseen 3D models. We believe that Texture Fields will become an important building block for the next generation of generative 3D models.
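
Because the texture is just a function from 3D location to color, the representation reduces to a network that can be queried at arbitrary surface points, independently of how the shape itself is represented. A minimal PyTorch sketch, with a hypothetical conditioning code z standing in for the paper's image and shape encoders:

```python
import torch
import torch.nn as nn

class TextureField(nn.Module):
    """Continuous texture function t: R^3 x R^z -> RGB."""
    def __init__(self, z_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),   # RGB in [0, 1]
        )

    def forward(self, points, z):
        # points: (N, 3) surface points; z: (z_dim,) condition code
        z_rep = z.expand(points.shape[0], -1)
        return self.net(torch.cat([points, z_rep], dim=-1))

field = TextureField()
surface_pts = torch.rand(4096, 3)              # e.g. sampled from a mesh
colors = field(surface_pts, torch.randn(128))  # per-point RGB, any resolution
```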

pdf suppmat video [BibTex]


Taking a Deeper Look at the Inverse Compositional Algorithm

Lv, Z., Dellaert, F., Rehg, J. M., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment. We first discuss the assumptions made by this well-established technique, and subsequently propose to relax these assumptions by incorporating data-driven priors into this model. More specifically, we unroll a robust version of the inverse compositional algorithm and replace multiple components of this algorithm using more expressive models whose parameters we train in an end-to-end fashion from data. Our experiments on several challenging 3D rigid motion estimation tasks demonstrate the advantages of combining optimization with learning-based techniques, outperforming the classic inverse compositional algorithm as well as data-driven image-to-pose regression approaches.
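
For reference, the building block that gets unrolled is the robustly weighted Gauss-Newton step below; the paper replaces pieces of it (e.g. the robust weights and the damping) with learned, more expressive modules. A simplified NumPy sketch, not the authors' implementation:

```python
import numpy as np

def ic_step(J, r, weights, damping=1e-6):
    """One robustly weighted Gauss-Newton update for dense alignment.

    J: (N, 6) Jacobian of residuals w.r.t. the 6-DoF pose increment; in the
       inverse compositional setting it is precomputed once on the template.
    r: (N,) photometric residuals at the current warp.
    weights: (N,) robust per-pixel weights (classically from an M-estimator).
    The returned increment is composed *inversely* with the current warp,
    which is what makes the method 'inverse compositional'.
    """
    JW = J * weights[:, None]
    H = JW.T @ J + damping * np.eye(6)   # damped normal equations
    g = JW.T @ r
    return np.linalg.solve(H, g)

rng = np.random.default_rng(0)
J = rng.standard_normal((1000, 6))
r = rng.standard_normal(1000)
w = 1.0 / (1.0 + np.abs(r))              # Huber-like robust weights
dxi = ic_step(J, r, w)                   # 6-DoF pose update
```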

pdf suppmat Video Project Page Poster [BibTex]


MOTS: Multi-Object Tracking and Segmentation

Voigtlaender, P., Krause, M., Osep, A., Luiten, J., Sekar, B. B. G., Geiger, A., Leibe, B.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
This paper extends the popular task of multi-object tracking to multi-object tracking and segmentation (MOTS). Towards this goal, we create dense pixel-level annotations for two existing tracking datasets using a semi-automatic annotation procedure. Our new annotations comprise 65,213 pixel masks for 977 distinct objects (cars and pedestrians) in 10,870 video frames. For evaluation, we extend existing multi-object tracking metrics to this new task. Moreover, we propose a new baseline method which jointly addresses detection, tracking, and segmentation with a single convolutional network. We demonstrate the value of our datasets by achieving improvements in performance when training on MOTS annotations. We believe that our datasets, metrics and baseline will become a valuable resource towards developing multi-object tracking approaches that go beyond 2D bounding boxes.
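
The key change in the evaluation is that predictions are matched to ground truth by pixel-mask overlap rather than bounding-box overlap. An illustrative sketch of that matching test (not the authors' evaluation code):

```python
import numpy as np

def mask_iou(a, b):
    """IoU between two boolean masks: the overlap measure used to match
    predictions to ground truth at the pixel level."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0

# A prediction counts as matched when its mask IoU with a ground-truth mask
# exceeds 0.5; since instance masks cannot overlap, this match is unique.
pred = np.zeros((64, 64), bool); pred[10:30, 10:30] = True
gt   = np.zeros((64, 64), bool); gt[12:32, 12:32] = True
print(mask_iou(pred, gt) > 0.5)   # True -> counted as a true positive
```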

pdf suppmat Project Page Poster Video [BibTex]


PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds

Behl, A., Paschalidou, D., Donne, S., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
Despite significant progress in image-based 3D scene flow estimation, the performance of such approaches has not yet reached the fidelity required by many applications. Simultaneously, these applications are often not restricted to image-based estimation: laser scanners provide a popular alternative to traditional cameras, for example in the context of self-driving cars, as they directly yield a 3D point cloud. In this paper, we propose to estimate 3D motion from such unstructured point clouds using a deep neural network. In a single forward pass, our model jointly predicts 3D scene flow as well as the 3D bounding box and rigid body motion of objects in the scene. While the prospect of estimating 3D scene flow from unstructured point clouds is promising, it is also a challenging task. We show that the traditional global representation of rigid body motion prohibits inference by CNNs, and propose a translation equivariant representation to circumvent this problem. For training our deep network, a large dataset is required. Because of this, we augment real scans from KITTI with virtual objects, realistically modeling occlusions and simulating sensor noise. A thorough comparison with classic and learning-based techniques highlights the robustness of the proposed approach.
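
The representation argument can be made concrete: a global (R, t) changes when the whole scene is translated, whereas the per-point displacement field it induces transforms consistently, which is what a translation-equivariant CNN can predict. A toy NumPy sketch of the conversion (illustrative only):

```python
import numpy as np

def rigid_motion_to_flow(points, R, t):
    """Per-point displacement induced by a global rigid motion (R, t).
    This local field is translation equivariant, unlike the global pair."""
    return points @ R.T + t - points   # (N, 3) rigid scene flow

theta = np.deg2rad(5.0)                # small yaw rotation
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.5, 0.0, 0.0])
pts = np.random.rand(100, 3)
flow = rigid_motion_to_flow(pts, R, t)  # what the network predicts per point
```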

pdf suppmat Project Page Poster Video [BibTex]


Learning Non-volumetric Depth Fusion using Successive Reprojections

Donne, S., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
Given a set of input views, multi-view stereopsis techniques estimate depth maps to represent the 3D reconstruction of the scene; these are fused into a single, consistent reconstruction -- most often a point cloud. In this work we propose to learn an auto-regressive depth refinement directly from data. While deep learning has improved the accuracy and speed of depth estimation significantly, learned MVS techniques remain limited to the plane-sweeping paradigm. We refine a set of input depth maps by successively reprojecting information from neighbouring views to leverage multi-view constraints. Compared to learning-based volumetric fusion techniques, an image-based representation allows significantly more detailed reconstructions; compared to traditional point-based techniques, our method learns noise suppression and surface completion in a data-driven fashion. Due to the limited availability of high-quality reconstruction datasets with ground truth, we introduce two novel synthetic datasets to (pre-)train our network. Our approach improves both the output depth maps and the reconstructed point cloud, for both learned and traditional depth estimation front-ends, on both synthetic and real data.
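
The reprojection step at the heart of the refinement is standard multi-view geometry: lift a neighbouring depth map to 3D, transform it into the reference camera, and project it back with z-buffering. A self-contained NumPy sketch (interface and details are assumptions; the paper feeds such reprojections to a learned refinement network):

```python
import numpy as np

def reproject_depth(depth_src, K, R, t, shape_ref):
    """Reproject a source-view depth map into a reference view.

    depth_src: (H, W) depths in the source camera; K: (3, 3) shared
    intrinsics; (R, t) maps source to reference coordinates. Returns a
    sparse reference-view depth map keeping the nearest sample per pixel.
    """
    H, W = depth_src.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    pts_src = (np.linalg.inv(K) @ pix) * depth_src.reshape(1, -1)  # lift to 3D
    pts_ref = R @ pts_src + t[:, None]        # transform into reference frame
    proj = K @ pts_ref                        # project back
    z = proj[2]
    front = z > 1e-6
    x = np.round(proj[0, front] / z[front]).astype(int)
    y = np.round(proj[1, front] / z[front]).astype(int)
    zf = z[front]
    out = np.full(shape_ref, np.inf)
    ok = (x >= 0) & (x < shape_ref[1]) & (y >= 0) & (y < shape_ref[0])
    for xi, yi, zi in zip(x[ok], y[ok], zf[ok]):
        out[yi, xi] = min(out[yi, xi], zi)    # z-buffer: keep nearest
    out[np.isinf(out)] = 0.0                  # 0 marks missing data
    return out
```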

pdf suppmat Project Page Video Poster [BibTex]


Connecting the Dots: Learning Representations for Active Monocular Depth Estimation

Riegler, G., Liao, Y., Donne, S., Koltun, V., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
We propose a technique for depth estimation with a monocular structured-light camera, i.e., a calibrated stereo set-up with one camera and one laser projector. Instead of formulating the depth estimation as a correspondence search problem, we show that a simple convolutional architecture is sufficient for high-quality disparity estimates in this setting. As accurate ground truth is hard to obtain, we train our model in a self-supervised fashion with a combination of photometric and geometric losses. Further, we demonstrate that the projected pattern of the structured-light sensor can be reliably separated from the ambient information. This can then be used to improve depth boundaries in a weakly supervised fashion by modeling the joint statistics of image and depth edges. The model trained in this fashion compares favorably to the state of the art on challenging synthetic and real-world datasets. In addition, we contribute a novel simulator, which allows benchmarking active depth prediction algorithms in controlled conditions.
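
The photometric part of the self-supervision is easy to sketch for a rectified camera-projector pair: warp the projector pattern into the camera view using the predicted disparity and compare intensities. A toy NumPy version with per-row linear interpolation (the loss combination and pattern separation in the paper are more involved):

```python
import numpy as np

def photometric_loss(img_cam, pattern, disparity):
    """L1 photometric loss between the camera image and the projector
    pattern warped by the predicted disparity (rectified setup, so the
    warp is a horizontal shift per pixel)."""
    H, W = img_cam.shape
    xs = np.arange(W, dtype=float)
    loss = 0.0
    for r in range(H):
        warped = np.interp(xs - disparity[r], xs, pattern[r])
        loss += np.abs(img_cam[r] - warped).mean()
    return loss / H

cam = np.random.rand(4, 32)
proj = np.random.rand(4, 32)
disp = np.full((4, 32), 2.0)          # predicted disparities (toy)
print(photometric_loss(cam, proj, disp))
```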

pdf suppmat Poster Project Page [BibTex]


Variational Autoencoders Recover PCA Directions (by Accident)

Rolinek, M., Zietlow, D., Martius, G.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
The Variational Autoencoder (VAE) is a powerful architecture capable of representation learning and generative modeling. When it comes to learning interpretable (disentangled) representations, VAE and its variants show unparalleled performance. However, the reasons for this are unclear, since a very particular alignment of the latent embedding is needed but the design of the VAE does not encourage it in any explicit way. We address this matter and offer the following explanation: the diagonal approximation in the encoder together with the inherent stochasticity force local orthogonality of the decoder. The local behavior of promoting both reconstruction and orthogonality matches closely how the PCA embedding is chosen. Alongside providing an intuitive understanding, we justify the statement with full theoretical analysis as well as with experiments.
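
The ingredient the explanation hinges on is easy to point at in code: the encoder outputs a mean and a diagonal log-variance, and noise is injected through the reparameterization trick. A minimal standard VAE sketch (not the authors' experimental setup):

```python
import torch
import torch.nn as nn

class DiagonalGaussianVAE(nn.Module):
    """Minimal VAE with a *diagonal* Gaussian posterior: the diagonal
    approximation plus the injected noise are the ingredients the paper
    links to the PCA-like local orthogonality of the decoder."""
    def __init__(self, x_dim=784, z_dim=8, hidden=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, x_dim))

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)   # diagonal covariance
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        rec = ((self.dec(z) - x) ** 2).sum(-1)      # reconstruction term
        kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(-1)
        return (rec + kl).mean()                    # negative ELBO

loss = DiagonalGaussianVAE()(torch.rand(32, 784))
loss.backward()
```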

arXiv [BibTex]


Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids

Paschalidou, D., Ulusoy, A. O., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
Abstracting complex 3D shapes with parsimonious part-based representations has been a long-standing goal in computer vision. This paper presents a learning-based solution to this problem which goes beyond the traditional 3D cuboid representation by exploiting superquadrics as atomic elements. We demonstrate that superquadrics lead to more expressive 3D scene parses while being easier to learn than 3D cuboid representations. Moreover, we provide an analytical solution to the Chamfer loss which avoids the need for computationally expensive reinforcement learning or iterative prediction. Our model learns to parse 3D objects into consistent superquadric representations without supervision. Results on various ShapeNet categories as well as the SURREAL human body dataset demonstrate the flexibility of our model in capturing fine details and complex poses that could not have been modelled using cuboids.
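
Part of what makes superquadrics attractive as primitives is that membership is a closed-form test. The standard inside-outside function in a short NumPy sketch (parameterization as commonly written; names are illustrative):

```python
import numpy as np

def superquadric_f(points, size, shape):
    """Inside-outside function of a superquadric centered at the origin.

    points: (N, 3); size: scales (a1, a2, a3); shape: exponents (e1, e2).
    Returns f with f < 1 inside, f = 1 on the surface, f > 1 outside.
    e1 = e2 = 1 gives an ellipsoid; small exponents approach a cuboid,
    which is why superquadrics strictly generalize cuboid primitives.
    """
    e1, e2 = shape
    x, y, z = np.abs(points / np.asarray(size, dtype=float)).T
    return (x ** (2 / e2) + y ** (2 / e2)) ** (e2 / e1) + z ** (2 / e1)

pts = np.random.uniform(-1, 1, (5, 3))
inside = superquadric_f(pts, size=(0.8, 0.5, 0.3), shape=(0.2, 1.0)) < 1
print(inside)
```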

Project Page Poster suppmat pdf Video handout [BibTex]


Elastic modulus affects adhesive strength of gecko-inspired synthetics in variable temperature and humidity

Mitchell, C. T., Drotlef, D., Dayan, C. B., Sitti, M., Stark, A. Y.

In Integrative and Comparative Biology, pages: E372-E372, Oxford University Press, March 2019 (inproceedings)

[BibTex]


X-ray Optics Fabrication Using Unorthodox Approaches

Sanli, U., Baluktsian, M., Ceylan, H., Sitti, M., Weigand, M., Schütz, G., Keskinbora, K.

Bulletin of the American Physical Society, APS, 2019 (article)

[BibTex]


Microrobotics and Microorganisms: Biohybrid Autonomous Cellular Robots

Alapan, Y., Yasa, O., Yigit, B., Yasa, I. C., Erkoc, P., Sitti, M.

Annual Review of Control, Robotics, and Autonomous Systems, 2019 (article)

[BibTex]


Tailored Magnetic Springs for Shape-Memory Alloy Actuated Mechanisms in Miniature Robots

Woodward, M. A., Sitti, M.

IEEE Transactions on Robotics, 35, 2019 (article)

Abstract
Animals can incorporate large numbers of actuators because of the characteristics of muscles, whereas robots cannot, as typical motors tend to be large, heavy, and inefficient. However, shape-memory alloys (SMA), materials that contract during heating because of a change in their crystal structure, provide another option. SMA, though, is unidirectional and therefore requires an additional force to reset (extend) the actuator, which is typically provided by springs or antagonistic actuation. These strategies, however, tend to limit the actuator's work output and functionality, as their force-displacement relationships typically produce increasing resistive force with limited variability. In contrast, magnetic springs, composed of permanent magnets where the interaction force between the magnets mimics a spring force, have much more variable force-displacement relationships and scale well with SMA. However, as of yet, no method for designing magnetic springs for SMA actuators has been demonstrated. Therefore, in this paper, we present a new methodology to tailor magnetic springs to the characteristics of these actuators, with experimental results both for the device and for robot-integrated SMA actuators. We found magnetic building blocks, based on sets of permanent magnets, which are well suited to SMAs and have the potential to incorporate features such as holding force, state transitioning, friction minimization, auto-alignment, and self-mounting. We show magnetic springs that vary by more than 3 N over 750 µm and two SMA-actuated devices that allow the MultiMo-Bat to reach heights of up to 4.5 m without, and 3.6 m with, integrated gliding airfoils. Our results demonstrate the potential of this methodology to add previously impossible functionality to smart material actuators. We anticipate this methodology will inspire broader consideration of the use of magnetic springs in miniature robots and further study of the potential of tailored magnetic springs throughout mechanical systems.

DOI [BibTex]


Autonomous Identification and Goal-Directed Invocation of Event-Predictive Behavioral Primitives

Gumbsch, C., Butz, M. V., Martius, G.

IEEE Transactions on Cognitive and Developmental Systems, 2019 (article)

Abstract
Voluntary behavior of humans appears to be composed of small, elementary building blocks or behavioral primitives. While this modular organization seems crucial for the learning of complex motor skills and the flexible adaptation of behavior to new circumstances, the problem of learning meaningful, compositional abstractions from sensorimotor experiences remains an open challenge. Here, we introduce a computational learning architecture, termed surprise-based behavioral modularization into event-predictive structures (SUBMODES), that explores behavior and identifies the underlying behavioral units completely from scratch. The SUBMODES architecture bootstraps sensorimotor exploration using a self-organizing neural controller. While exploring the behavioral capabilities of its own body, the system learns modular structures that predict the sensorimotor dynamics and generate the associated behavior. In line with recent theories of event perception, the system uses unexpected prediction error signals, i.e., surprise, to detect transitions between successive behavioral primitives. We show that, when applied to two robotic systems with completely different body kinematics, the system manages to learn a variety of complex behavioral primitives. Moreover, after initial self-exploration the system can use its learned predictive models progressively more effectively for invoking model predictive planning and goal-directed control in different tasks and environments.
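
The transition detection can be sketched compactly: a time step counts as "surprising" when its prediction error lies far outside the recent error statistics, and such steps are treated as boundaries between behavioral primitives. A toy NumPy illustration (simple z-score rule assumed, not the paper's exact criterion):

```python
import numpy as np

def detect_event_boundaries(pred_errors, window=50, z_thresh=4.0):
    """Flag time steps whose prediction error deviates strongly from the
    running error statistics; spikes are read as transitions between
    behavioral primitives."""
    boundaries = []
    for t in range(window, len(pred_errors)):
        recent = pred_errors[t - window:t]
        mu, sigma = recent.mean(), recent.std() + 1e-8
        if (pred_errors[t] - mu) / sigma > z_thresh:
            boundaries.append(t)
    return boundaries

rng = np.random.default_rng(0)
errors = np.abs(rng.standard_normal(500)) * 0.1
errors[200] = 5.0                          # sudden change in the dynamics
print(detect_event_boundaries(errors))     # the spike at t=200 is flagged
```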

arXiv PDF video link (url) DOI Project Page [BibTex]


The near and far of a pair of magnetic capillary disks

Koens, L., Wang, W., Sitti, M., Lauga, E.

Soft Matter, 2019 (article)

[BibTex]


Multifarious Transit Gates for Programmable Delivery of Bio‐functionalized Matters

Hu, X., Torati, S. R., Kim, H., Yoon, J., Lim, B., Kim, K., Sitti, M., Kim, C.

Small, Wiley Online Library, 2019 (article)

[BibTex]


Machine Learning for Haptics: Inferring Multi-Contact Stimulation From Sparse Sensor Configuration

Sun, H., Martius, G.

Frontiers in Neurorobotics, 13, pages: 51, 2019 (article)

Abstract
Robust haptic sensation systems are essential for obtaining dexterous robots. Currently, we have solutions for small surface areas such as fingers, but affordable and robust techniques for covering large areas of an arbitrary 3D surface are still missing. Here, we introduce a general machine learning framework to infer multi-contact haptic forces on a robot's 3D limb surface from internal deformation measured by only a few physical sensors. The general idea of this framework is to first predict the whole surface deformation pattern from the sparsely placed sensors and then to infer the number, locations, and force magnitudes of the unknown contact points. We show how this can be done, even when training data can only be obtained for single contact points, using transfer learning on the example of a modified limb of the Poppy robot. With only 10 strain-gauge sensors we obtain high accuracy even for multiple contact points. The method can be applied to arbitrarily shaped surfaces and physical sensor types, as long as training data can be obtained.
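
The two-stage structure (first map the sparse sensor readings to the dense deformation pattern, then read contacts off that pattern) can be sketched with a linear stand-in for the learned models; everything below is synthetic toy data, whereas the paper trains nonlinear predictors on measured deformations:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_surface, n_train = 10, 400, 1000

# Toy 'physics': sensor readings are an unknown linear function of the
# dense surface deformation.
A = rng.standard_normal((n_surface, n_sensors))
deform_train = rng.standard_normal((n_train, n_surface))
sensors_train = deform_train @ A

# Stage 1: ridge regression from 10 sensor channels to the full pattern.
lam = 1e-2
W = np.linalg.solve(sensors_train.T @ sensors_train + lam * np.eye(n_sensors),
                    sensors_train.T @ deform_train)

# Stage 2: contact sites are inferred from the predicted pattern; here we
# simply take the strongest responses.
deform_pred = (deform_train[0] @ A) @ W
contacts = np.argsort(-np.abs(deform_pred))[:3]
print(contacts)
```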

link (url) DOI [BibTex]


Mechanics of a pressure-controlled adhesive membrane for soft robotic gripping on curved surfaces

Song, S., Drotlef, D., Paik, J., Majidi, C., Sitti, M.

Extreme Mechanics Letters, Elsevier, 2019 (article)

[BibTex]


Graphene oxide synergistically enhances antibiotic efficacy in vancomycin-resistant Staphylococcus aureus

Singh, V., Kumar, V., Kashyap, S., Singh, A. V., Kishore, V., Sitti, M., Saxena, P. S., Srivastava, A.

ACS Applied Bio Materials, ACS Publications, 2019 (article)

[BibTex]


Review of emerging concepts in nanotoxicology: opportunities and challenges for safer nanomaterial design

Singh, A. V., Laux, P., Luch, A., Sudrik, C., Wiehr, S., Wild, A., Santamauro, G., Bill, J., Sitti, M.

Toxicology Mechanisms and Methods, 2019 (article)

[BibTex]


Mobile microrobots for active therapeutic delivery

Erkoc, P., Yasa, I. C., Ceylan, H., Yasa, O., Alapan, Y., Sitti, M.

Advanced Therapeutics, Wiley Online Library, 2019 (article)

[BibTex]


Microfluidics Integrated Lithography‐Free Nanophotonic Biosensor for the Detection of Small Molecules

Sreekanth, K. V., Sreejith, S., Alapan, Y., Sitti, M., Lim, C. T., Singh, R.

Advanced Optical Materials, 2019 (article)

[BibTex]


Gecko-inspired composite microfibers for reversible adhesion on smooth and rough surfaces

Drotlef, D., Dayan, C., Sitti, M.

In Integrative and Comparative Biology, pages: E58-E58, Oxford University Press, 2019 (inproceedings)

[BibTex]


Bio-inspired robotic collectives

Sitti, M.

Nature, 567, pages: 314-315, Macmillan Publishers Ltd., 2019 (article)

[BibTex]


Electromechanical actuation of dielectric liquid crystal elastomers for soft robotics

Davidson, Z., Shahsavan, H., Guo, Y., Hines, L., Xia, Y., Yang, S., Sitti, M.

Bulletin of the American Physical Society, APS, 2019 (article)

[BibTex]


Occupancy Networks: Learning 3D Reconstruction in Function Space

Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (inproceedings)

Abstract
With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose Occupancy Networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.
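
The representation itself fits in a few lines: a network maps a 3D point to an occupancy probability, the surface is the 0.5 decision boundary, and the field can be sampled at any resolution. A minimal unconditioned PyTorch sketch (the full model additionally conditions on an image, point cloud, or voxel input):

```python
import torch
import torch.nn as nn

class OccupancyNetwork(nn.Module):
    """Occupancy function o: R^3 -> [0, 1]; the surface is the decision
    boundary {p : o(p) = 0.5} of this classifier."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, points):                 # points: (N, 3)
        return torch.sigmoid(self.net(points)).squeeze(-1)

# Because occupancy is a function, it can be queried at arbitrary resolution:
net = OccupancyNetwork()
res = 64
axis = torch.linspace(-1, 1, res)
grid = torch.stack(torch.meshgrid(axis, axis, axis, indexing="ij"),
                   dim=-1).reshape(-1, 3)
occ = net(grid).reshape(res, res, res)
inside = occ > 0.5    # a mesh would be extracted from the 0.5 level set,
                      # e.g. with marching cubes
```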

Code Video pdf suppmat Project Page [BibTex]


NoVA: Learning to See in Novel Viewpoints and Domains

Coors, B., Condurache, A. P., Geiger, A.

In 2019 International Conference on 3D Vision (3DV), 2019 (inproceedings)

Abstract
Domain adaptation techniques enable the re-use and transfer of existing labeled datasets from a source to a target domain in which little or no labeled data exists. Recently, image-level domain adaptation approaches have demonstrated impressive results in adapting from synthetic to real-world environments by translating source images to the style of a target domain. However, the domain gap between source and target may not only be caused by a different style but also by a change in viewpoint. This case necessitates a semantically consistent translation of source images and labels to the style and viewpoint of the target domain. In this work, we propose the Novel Viewpoint Adaptation (NoVA) model, which enables unsupervised adaptation to a novel viewpoint in a target domain for which no labeled data is available. NoVA utilizes an explicit representation of the 3D scene geometry to translate source view images and labels to the target view. Experiments on adaptation to synthetic and real-world datasets show the benefit of NoVA compared to state-of-the-art domain adaptation approaches on the task of semantic segmentation.

pdf suppmat poster video [BibTex]

2018


Swimming Back and Forth Using Planar Flagellar Propulsion at Low Reynolds Numbers

Khalil, I. S. M., Tabak, A. F., Hamed, Y., Mitwally, M. E., Tawakol, M., Klingner, A., Sitti, M.

Advanced Science, 5(2):1700461, 2018 (article)

Abstract
Peritrichously flagellated Escherichia coli swim back and forth by wrapping their flagella together in a helical bundle. However, other monotrichous bacteria cannot swim back and forth with a single flagellum and planar wave propagation. Building on this observation, a magnetically driven soft two-tailed microrobot capable of reversing its swimming direction without making a U-turn trajectory or actively modifying the direction of wave propagation is designed and developed. The microrobot contains magnetic microparticles within the polymer matrix of its head and consists of two collinear, unequal, and opposite ultrathin tails. It is driven and steered using a uniform magnetic field along the direction of motion with a sinusoidally varying orthogonal component. Distinct reversal frequencies that enable selective and independent excitation of the first or the second tail of the microrobot, based on their tail length ratio, are found. While the first tail provides a propulsive force below the reversal frequency, the second is almost passive, and the net propulsive force yields flagellated motion along one direction. Above the reversal frequency, on the other hand, the second tail achieves flagellated propulsion along the opposite direction.

link (url) DOI [BibTex]


Deep Reinforcement Learning for Event-Triggered Control

Baumann, D., Zhu, J., Martius, G., Trimpe, S.

In Proceedings of the 57th IEEE International Conference on Decision and Control (CDC), pages: 943-950, 57th IEEE International Conference on Decision and Control (CDC), December 2018 (inproceedings)

arXiv PDF DOI Project Page [BibTex]


Universal Custom Complex Magnetic Spring Design Methodology

Woodward, M. A., Sitti, M.

IEEE Transactions on Magnetics, 54(1):1-13, October 2018 (article)

Abstract
A design methodology is presented for creating custom complex magnetic springs through the design of force-displacement curves. This methodology results in a magnet configuration which will produce a desired force-displacement relationship. Initially, the problem is formulated and solved as a system of linear equations. Then, given the limited likelihood of a single solution being feasibly manufactured, key parameters of the solution are extracted and varied to create a family of solutions. Finally, these solutions are refined using numerical optimization. Given the properties of magnets, this methodology can create any well-defined function of force versus displacement and is model-independent. To demonstrate this flexibility, a number of example magnetic springs are designed, one of which, designed for use in a jumping-gliding robot's shape-memory-alloy-actuated clutch, is manufactured and experimentally characterized. Due to the scaling of magnetic forces, the displacement range in which these magnetic springs are most applicable is millimeters and below. However, this range is well suited for miniature robots and smart material actuators, where a tailored magnetic spring, designed to complement a component, can enhance its performance while adding new functionality. The methodology also extends to variable interactions and multi-dimensional magnetic field design.
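
The linear-equations step can be sketched as a least-squares fit: each candidate magnet (pair) contributes a known basis force-displacement curve, and their strengths are solved to reproduce the target curve. The dipole-like basis and geometry below are toy assumptions, not the paper's magnet models:

```python
import numpy as np

x = np.linspace(0.5e-3, 3e-3, 60)                  # displacement [m]
offsets = np.array([0.0, 0.5e-3, 1.0e-3, 1.5e-3])  # candidate magnet offsets
# Toy dipole-like basis curves, one per candidate magnet placement.
basis = np.stack([1.0 / (x + d + 1e-4) ** 4 for d in offsets], axis=1)

target = 2.0 - 500.0 * x                           # desired F(x) [N]
coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
fit = basis @ coeffs                               # achieved F(x)
# Negative coefficients would correspond to flipping a magnet from an
# attractive to a repulsive orientation.
print("max fit error [N]:", np.abs(fit - target).max())
```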

DOI [BibTex]


On the Integration of Optical Flow and Action Recognition

Sevilla-Lara, L., Liao, Y., Güney, F., Jampani, V., Geiger, A., Black, M. J.

In German Conference on Pattern Recognition (GCPR), LNCS 11269, pages: 281-297, Springer, Cham, October 2018 (inproceedings)

Abstract
Most of the top performing action recognition methods use optical flow as a "black box" input. Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better. In particular, we investigate the impact of different flow algorithms and input transformations to better understand how these affect a state-of-the-art action recognition method. Furthermore, we fine-tune two neural-network flow methods end-to-end on the most widely used action recognition dataset (UCF101). Based on these experiments, we make the following five observations: 1) optical flow is useful for action recognition because it is invariant to appearance, 2) optical flow methods are optimized to minimize end-point-error (EPE), but the EPE of current methods is not well correlated with action recognition performance, 3) for the flow methods tested, accuracy at boundaries and at small displacements is most correlated with action recognition performance, 4) training optical flow to minimize classification error instead of minimizing EPE improves recognition performance, and 5) optical flow learned for the task of action recognition differs from traditional optical flow especially inside the human body and at the boundary of the body. These observations may encourage optical flow researchers to look beyond EPE as a goal and guide action recognition researchers to seek better motion cues, leading to a tighter integration of the optical flow and action recognition communities.
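
For reference, the metric at the center of observations 2 to 4 is the average end-point error, computed as below; the paper's point is that driving this number down is not the same as improving action recognition.

```python
import numpy as np

def epe(flow_pred, flow_gt):
    """Average end-point error: the mean Euclidean distance between
    predicted and ground-truth flow vectors."""
    return np.linalg.norm(flow_pred - flow_gt, axis=-1).mean()

print(epe(np.ones((4, 4, 2)), np.zeros((4, 4, 2))))   # sqrt(2) ~ 1.414
```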

arXiv DOI [BibTex]


Towards Robust Visual Odometry with a Multi-Camera System

Liu, P., Geppert, M., Heng, L., Sattler, T., Geiger, A., Pollefeys, M.

In International Conference on Intelligent Robots and Systems (IROS), October 2018 (inproceedings)

Abstract
We present a visual odometry (VO) algorithm for a multi-camera system and robust operation in challenging environments. Our algorithm consists of a pose tracker and a local mapper. The tracker estimates the current pose by minimizing photometric errors between the most recent keyframe and the current frame. The mapper initializes the depths of all sampled feature points using plane-sweeping stereo. To reduce pose drift, a sliding window optimizer is used to refine poses and structure jointly. Our formulation is flexible enough to support an arbitrary number of stereo cameras. We evaluate our algorithm thoroughly on five datasets. The datasets were captured in different conditions: daytime, night-time with near-infrared (NIR) illumination and night-time without NIR illumination. Experimental results show that a multi-camera setup makes the VO more robust to challenging environments, especially night-time conditions, in which a single stereo configuration fails easily due to the lack of features.

pdf Project Page [BibTex]


Learning Priors for Semantic 3D Reconstruction

Cherabier, I., Schönberger, J., Oswald, M., Pollefeys, M., Geiger, A.

In Computer Vision – ECCV 2018, Springer International Publishing, Cham, September 2018 (inproceedings)

Abstract
We present a novel semantic 3D reconstruction framework which embeds variational regularization into a neural network. Our network performs a fixed number of unrolled multi-scale optimization iterations with shared interaction weights. In contrast to existing variational methods for semantic 3D reconstruction, our model is end-to-end trainable and captures more complex dependencies between the semantic labels and the 3D geometry. Compared to previous learning-based approaches to 3D reconstruction, we integrate powerful long-range dependencies using variational coarse-to-fine optimization. As a result, our network architecture requires only a moderate number of parameters while keeping a high level of expressiveness which enables learning from very little data. Experiments on real and synthetic datasets demonstrate that our network achieves higher accuracy compared to a purely variational approach while at the same time requiring two orders of magnitude less iterations to converge. Moreover, our approach handles ten times more semantic class labels using the same computational resources.

pdf suppmat Project Page Video DOI [BibTex]


Unsupervised Learning of Multi-Frame Optical Flow with Occlusions

Janai, J., Güney, F., Ranjan, A., Black, M. J., Geiger, A.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11220, pages: 713-731, Springer, Cham, September 2018 (inproceedings)

pdf suppmat Video Project Page DOI [BibTex]


SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images

Coors, B., Condurache, A. P., Geiger, A.

European Conference on Computer Vision (ECCV), September 2018 (conference)

Abstract
Omnidirectional cameras offer great benefits over classical cameras wherever a wide field of view is essential, such as in virtual reality applications or in autonomous robots. Unfortunately, standard convolutional neural networks are not well suited for this scenario as the natural projection surface is a sphere which cannot be unwrapped to a plane without introducing significant distortions, particularly in the polar regions. In this work, we present SphereNet, a novel deep learning framework which encodes invariance against such distortions explicitly into convolutional neural networks. Towards this goal, SphereNet adapts the sampling locations of the convolutional filters, effectively reversing distortions, and wraps the filters around the sphere. By building on regular convolutions, SphereNet enables the transfer of existing perspective convolutional neural network models to the omnidirectional case. We demonstrate the effectiveness of our method on the tasks of image classification and object detection, exploiting two newly created semi-synthetic and real-world omnidirectional datasets.
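
One way to picture the adapted sampling: place the regular 3x3 kernel on the plane tangent to the sphere at the filter location and map the sample points back through the inverse gnomonic projection. A NumPy sketch under this reading (the tangent-plane step size and the projection choice are assumptions for illustration):

```python
import numpy as np

def sphere_kernel_locations(lat, lon, step):
    """Spherical coordinates of a 3x3 kernel placed on the tangent plane
    at (lat, lon) and mapped back via the inverse gnomonic projection
    (angles in radians)."""
    xs, ys = np.meshgrid([-step, 0.0, step], [-step, 0.0, step])
    rho = np.sqrt(xs ** 2 + ys ** 2) + 1e-12
    c = np.arctan(rho)
    lat_s = np.arcsin(np.cos(c) * np.sin(lat) +
                      ys * np.sin(c) * np.cos(lat) / rho)
    lon_s = lon + np.arctan2(
        xs * np.sin(c),
        rho * np.cos(lat) * np.cos(c) - ys * np.sin(lat) * np.sin(c))
    return lat_s, lon_s

# Near the pole the same kernel spans a much wider range of longitudes;
# this is exactly the distortion the adapted sampling compensates for.
print(sphere_kernel_locations(np.deg2rad(80.0), 0.0, 0.01)[1])
```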

pdf suppmat Project Page [BibTex]


Programmable collective behavior in dynamically self-assembled mobile microrobotic swarms

Yigit, B., Alapan, Y., Sitti, M.

Advanced Science, July 2018 (article)

Abstract
Collective control of mobile microrobotic swarms is indispensable for their potential high-impact applications in targeted drug delivery, medical diagnostics, parallel micromanipulation, and environmental sensing and remediation. The lack of on-board computational and sensing capabilities in current microrobotic systems necessitates the use of physical interactions among individual microrobots for local physical communication and cooperation. Here, we show that mobile microrobotic swarms with well-defined collective behavior can be designed by engineering magnetic interactions among individual units. Microrobots, consisting of a linear chain of self-assembled magnetic microparticles, locomote on surfaces in response to a precessing magnetic field. Control over the direction of the precessing magnetic field allows engineering attractive and repulsive interactions among microrobots and, thus, collective order with well-defined spatial organization and parallel operation over macroscale distances (~1 cm). These microrobotic swarms can be guided through confined spaces while preserving microrobot morphology and function. The swarms can further achieve directional transport of large cargoes on surfaces and small cargoes in bulk fluids. The described design approach, exploiting physical interactions among individual robots, enables facile and rapid formation of self-organized and reconfigurable microrobotic swarms with programmable collective order.

link (url) [BibTex]


3D-Printed Biodegradable Microswimmer for Drug Delivery and Targeted Cell Labeling

Ceylan, H., Yasa, I. C., Yasa, O., Tabak, A. F., Giltinan, J., Sitti, M.

bioRxiv, pages: 379024, July 2018 (article)

Abstract
Miniaturization of interventional medical devices can leverage minimally invasive technologies by enabling operational resolution at cellular length scales with high precision and repeatability. Untethered micron-scale mobile robots can realize this by navigating and performing in hard-to-reach, confined and delicate inner body sites. However, such a complex task requires an integrated design and engineering strategy, where powering, control, environmental sensing, medical functionality and biodegradability need to be considered altogether. The present study reports a hydrogel-based, biodegradable microrobotic swimmer, which is responsive to the changes in its microenvironment for theranostic cargo delivery and release tasks. We design a double-helical magnetic microswimmer of 20 micrometers length, which is 3D-printed with complex geometrical and compositional features. At normal physiological concentrations, matrix metalloproteinase-2 (MMP-2) enzyme can entirely degrade the microswimmer body in 118 h to solubilized non-toxic products. The microswimmer can respond to the pathological concentrations of MMP-2 by swelling and thereby accelerating the release kinetics of the drug payload. Anti-ErbB 2 antibody-tagged magnetic nanoparticles released from the degraded microswimmers serve for targeted labeling of SKBR3 breast cancer cells to realize the potential of medical imaging of local tissue sites following the therapeutic intervention. These results represent a leap forward toward clinical medical microrobots that are capable of sensing, responding to the local pathological information, and performing specific therapeutic and diagnostic tasks as orderly executed operations using their smart composite material architectures.

DOI Project Page [BibTex]


Innate turning preference of leaf-cutting ants in the absence of external orientation cues

Endlein, T., Sitti, M.

Journal of Experimental Biology, The Company of Biologists Ltd, June 2018 (article)

Abstract
Many ants use a combination of cues for orientation, but how do ants find their way when all external cues are suppressed? Do they walk in a random way, or are their movements spatially oriented? Here we show for the first time that leaf-cutting ants (Acromyrmex lundii) have an innate preference for turning counter-clockwise (left) when external cues are precluded. We demonstrated this by allowing individual ants to run freely on the water surface of a newly developed treadmill. The surface tension supported medium-sized workers but effectively prevented ants from reaching the wall of the vessel, which is important to avoid wall-following behaviour (thigmotaxis). Most ants ran for minutes on the spot but also slowly turned counter-clockwise in the absence of visual cues. Reconstructing the effectively walked path revealed a looping pattern which could be interpreted as a search strategy. A similar turning bias was shown for groups of ants in a symmetrical Y-maze, where twice as many ants chose the left branch in the absence of optical cues. Wall-following behaviour was tested by inserting a coiled tube before the Y-fork. When ants traversed a left-coiled tube, more ants chose the left box, and vice versa. Adding visual cues in the form of vertical black strips, either outside the treadmill or on one branch of the Y-maze, led to oriented walks towards the strips. It is suggested that both the turning bias and wall-following are employed as search strategies for an unknown environment, and that they can be overridden by visual cues.

link (url) DOI [BibTex]


Motility and chemotaxis of bacteria-driven microswimmers fabricated using antigen 43-mediated biotin display

Schauer, O., Mostaghaci, B., Colin, R., Hürtgen, D., Kraus, D., Sitti, M., Sourjik, V.

Scientific Reports, 8(1):9801, Nature Publishing Group, June 2018 (article)

Abstract
Bacteria-driven biohybrid microswimmers (bacteriabots) combine synthetic cargo with motile living bacteria that enable propulsion and steering. Although fabrication and potential use of such bacteriabots have attracted much attention, existing methods of fabrication require an extensive sample preparation that can drastically decrease the viability and motility of bacteria. Moreover, chemotactic behavior of bacteriabots in a liquid medium with chemical gradients has remained largely unclear. To overcome these shortcomings, we designed Escherichia coli to autonomously display biotin on its cell surface via the engineered autotransporter antigen 43 and thus to bind streptavidin-coated cargo. We show that the cargo attachment to these bacteria is greatly enhanced by motility and occurs predominantly at the cell poles, which is greatly beneficial for the fabrication of motile bacteriabots. We further performed a systematic study to understand and optimize the ability of these bacteriabots to follow chemical gradients. We demonstrate that the chemotaxis of bacteriabots is primarily limited by the cargo-dependent reduction of swimming speed and show that the fabrication of bacteriabots using elongated E. coli cells can be used to overcome this limitation.

link (url) DOI [BibTex]