Header logo is


2017


Thumb xl larsnips
The Numerics of GANs

Mescheder, L., Nowozin, S., Geiger, A.

In Proceedings from the conference "Neural Information Processing Systems 2017., (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (inproceedings)

Abstract
In this paper, we analyze the numerics of common algorithms for training Generative Adversarial Networks (GANs). Using the formalism of smooth two-player games we analyze the associated gradient vector field of GAN training objectives. Our findings suggest that the convergence of current algorithms suffers due to two factors: i) presence of eigenvalues of the Jacobian of the gradient vector field with zero real-part, and ii) eigenvalues with big imaginary part. Using these findings, we design a new algorithm that overcomes some of these limitations and has better convergence properties. Experimentally, we demonstrate its superiority on training common GAN architectures and show convergence on GAN architectures that are known to be notoriously hard to train.

avg

pdf Project Page [BibTex]

2017


pdf Project Page [BibTex]


Thumb xl teaser iccv2017
Bounding Boxes, Segmentations and Object Coordinates: How Important is Recognition for 3D Scene Flow Estimation in Autonomous Driving Scenarios?

Behl, A., Jafari, O. H., Mustikovela, S. K., Alhaija, H. A., Rother, C., Geiger, A.

In Proceedings IEEE International Conference on Computer Vision (ICCV), IEEE, Piscataway, NJ, USA, IEEE International Conference on Computer Vision (ICCV), October 2017 (inproceedings)

Abstract
Existing methods for 3D scene flow estimation often fail in the presence of large displacement or local ambiguities, e.g., at texture-less or reflective surfaces. However, these challenges are omnipresent in dynamic road scenes, which is the focus of this work. Our main contribution is to overcome these 3D motion estimation problems by exploiting recognition. In particular, we investigate the importance of recognition granularity, from coarse 2D bounding box estimates over 2D instance segmentations to fine-grained 3D object part predictions. We compute these cues using CNNs trained on a newly annotated dataset of stereo images and integrate them into a CRF-based model for robust 3D scene flow estimation - an approach we term Instance Scene Flow. We analyze the importance of each recognition cue in an ablation study and observe that the instance segmentation cue is by far strongest, in our setting. We demonstrate the effectiveness of our method on the challenging KITTI 2015 scene flow benchmark where we achieve state-of-the-art performance at the time of submission.

avg

pdf suppmat Poster Project Page [BibTex]

pdf suppmat Poster Project Page [BibTex]


Thumb xl jonas teaser
Sparsity Invariant CNNs

Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., Geiger, A.

International Conference on 3D Vision (3DV) 2017, International Conference on 3D Vision (3DV), October 2017 (conference)

Abstract
In this paper, we consider convolutional neural networks operating on sparse inputs with an application to depth upsampling from sparse laser scan data. First, we show that traditional convolutional networks perform poorly when applied to sparse data even when the location of missing data is provided to the network. To overcome this problem, we propose a simple yet effective sparse convolution layer which explicitly considers the location of missing data during the convolution operation. We demonstrate the benefits of the proposed network architecture in synthetic and real experiments \wrt various baseline approaches. Compared to dense baselines, the proposed sparse convolution network generalizes well to novel datasets and is invariant to the level of sparsity in the data. For our evaluation, we derive a novel dataset from the KITTI benchmark, comprising 93k depth annotated RGB images. Our dataset allows for training and evaluating depth upsampling and depth prediction techniques in challenging real-world settings.

avg

pdf suppmat Project Page Project Page [BibTex]

pdf suppmat Project Page Project Page [BibTex]


Thumb xl gernot teaser
OctNetFusion: Learning Depth Fusion from Data

Riegler, G., Ulusoy, A. O., Bischof, H., Geiger, A.

International Conference on 3D Vision (3DV) 2017, International Conference on 3D Vision (3DV), October 2017 (conference)

Abstract
In this paper, we present a learning based approach to depth fusion, i.e., dense 3D reconstruction from multiple depth images. The most common approach to depth fusion is based on averaging truncated signed distance functions, which was originally proposed by Curless and Levoy in 1996. While this method is simple and provides great results, it is not able to reconstruct (partially) occluded surfaces and requires a large number frames to filter out sensor noise and outliers. Motivated by the availability of large 3D model repositories and recent advances in deep learning, we present a novel 3D CNN architecture that learns to predict an implicit surface representation from the input depth maps. Our learning based method significantly outperforms the traditional volumetric fusion approach in terms of noise reduction and outlier suppression. By learning the structure of real world 3D objects and scenes, our approach is further able to reconstruct occluded regions and to fill in gaps in the reconstruction. We demonstrate that our learning based approach outperforms both vanilla TSDF fusion as well as TV-L1 fusion on the task of volumetric fusion. Further, we demonstrate state-of-the-art 3D shape completion results.

avg

pdf Video 1 Video 2 Project Page Project Page [BibTex]

pdf Video 1 Video 2 Project Page Project Page [BibTex]


Thumb xl andreas teaser
Direct Visual Odometry for a Fisheye-Stereo Camera

Liu, P., Heng, L., Sattler, T., Geiger, A., Pollefeys, M.

In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, Piscataway, NJ, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017 (inproceedings)

Abstract
We present a direct visual odometry algorithm for a fisheye-stereo camera. Our algorithm performs simultaneous camera motion estimation and semi-dense reconstruction. The pipeline consists of two threads: a tracking thread and a mapping thread. In the tracking thread, we estimate the camera pose via semi-dense direct image alignment. To have a wider field of view (FoV) which is important for robotic perception, we use fisheye images directly without converting them to conventional pinhole images which come with a limited FoV. To address the epipolar curve problem, plane-sweeping stereo is used for stereo matching and depth initialization. Multiple depth hypotheses are tracked for selected pixels to better capture the uncertainty characteristics of stereo matching. Temporal motion stereo is then used to refine the depth and remove false positive depth hypotheses. Our implementation runs at an average of 20 Hz on a low-end PC. We run experiments in outdoor environments to validate our algorithm, and discuss the experimental results. We experimentally show that we are able to estimate 6D poses with low drift, and at the same time, do semi-dense 3D reconstruction with high accuracy.

avg

pdf Project Page [BibTex]

pdf Project Page [BibTex]


Thumb xl hassan paper teasere
Augmented Reality Meets Deep Learning for Car Instance Segmentation in Urban Scenes

Alhaija, H. A., Mustikovela, S. K., Mescheder, L., Geiger, A., Rother, C.

In Proceedings of the British Machine Vision Conference 2017, Proceedings of the British Machine Vision Conference, September 2017 (inproceedings)

Abstract
The success of deep learning in computer vision is based on the availability of large annotated datasets. To lower the need for hand labeled images, virtually rendered 3D worlds have recently gained popularity. Unfortunately, creating realistic 3D content is challenging on its own and requires significant human effort. In this work, we propose an alternative paradigm which combines real and synthetic data for learning semantic instance segmentation models. Exploiting the fact that not all aspects of the scene are equally important for this task, we propose to augment real-world imagery with virtual objects of the target category. Capturing real-world images at large scale is easy and cheap, and directly provides real background appearances without the need for creating complex 3D models of the environment. We present an efficient procedure to augment these images with virtual objects. This allows us to create realistic composite images which exhibit both realistic background appearance as well as a large number of complex object arrangements. In contrast to modeling complete 3D environments, our data augmentation approach requires only a few user interactions in combination with 3D shapes of the target object category. We demonstrate the utility of the proposed approach for training a state-of-the-art high-capacity deep model for semantic instance segmentation. In particular, we consider the task of segmenting car instances on the KITTI dataset which we have annotated with pixel-accurate ground truth. Our experiments demonstrate that models trained on augmented imagery generalize better than those trained on synthetic data or models trained on limited amounts of annotated real data.

avg

pdf Project Page [BibTex]

pdf Project Page [BibTex]


no image
Swimming in low reynolds numbers using planar and helical flagellar waves

Khalil, I. S. M., Tabak, A. F., Seif, M. A., Klingner, A., Adel, B., Sitti, M.

In International Conference on Intelligent Robots and Systems (IROS) 2017, pages: 1907-1912, International Conference on Intelligent Robots and Systems, September 2017 (inproceedings)

Abstract
In travelling towards the oviducts, sperm cells undergo transitions between planar to helical flagellar propulsion by a beating tail based on the viscosity of the environment. In this work, we aim to model and mimic this behaviour in low Reynolds number fluids using externally actuated soft robotic sperms. We numerically investigate the effects of transition between planar to helical flagellar propulsion on the swimming characteristics of the robotic sperm using a model based on resistive-force theory to study the role of viscous forces on its flexible tail. Experimental results are obtained using robots that contain magnetic particles within the polymer matrix of its head and an ultra-thin flexible tail. The planar and helical flagellar propulsion are achieved using in-plane and out-of-plane uniform fields with sinusoidally varying components, respectively. We experimentally show that the swimming speed of the robotic sperm increases by a factor of 1.4 (fluid viscosity 5 Pa.s) when it undergoes a controlled transition between planar to helical flagellar propulsion, at relatively low actuation frequencies.

pi

DOI [BibTex]

DOI [BibTex]


Thumb xl img01
Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks

Mescheder, L., Nowozin, S., Geiger, A.

In Proceedings of the 34th International Conference on Machine Learning, 70, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (inproceedings)

Abstract
Variational Autoencoders (VAEs) are expressive latent variable models that can be used to learn complex probability distributions from training data. However, the quality of the resulting model crucially relies on the expressiveness of the inference model. We introduce Adversarial Variational Bayes (AVB), a technique for training Variational Autoencoders with arbitrarily expressive inference models. We achieve this by introducing an auxiliary discriminative network that allows to rephrase the maximum-likelihood-problem as a two-player game, hence establishing a principled connection between VAEs and Generative Adversarial Networks (GANs). We show that in the nonparametric limit our method yields an exact maximum-likelihood assignment for the parameters of the generative model, as well as the exact posterior distribution over the latent variables given an observation. Contrary to competing approaches which combine VAEs with GANs, our approach has a clear theoretical justification, retains most advantages of standard Variational Autoencoders and is easy to implement.

avg

pdf suppmat Project Page arxiv-version Project Page [BibTex]

pdf suppmat Project Page arxiv-version Project Page [BibTex]


Thumb xl publications toc
An XY ϴz flexure mechanism with optimal stiffness properties

Lum, G. Z., Pham, M. T., Teo, T. J., Yang, G., Yeo, S. H., Sitti, M.

In 2017 IEEE International Conference on Advanced Intelligent Mechatronics (AIM), pages: 1103-1110, July 2017 (inproceedings)

Abstract
The development of optimal XY θz flexure mechanisms, which can deliver high precision motion about the z-axis, and along the x- and y-axes is highly desirable for a wide range of micro/nano-positioning tasks pertaining to biomedical research, microscopy technologies and various industrial applications. Although maximizing the stiffness ratios is a very critical design requirement, the achievable translational and rotational stiffness ratios of existing XY θz flexure mechanisms are still restricted between 0.5 and 130. As a result, these XY θz flexure mechanisms are unable to fully optimize their workspace and capabilities to reject disturbances. Here, we present an optimal XY θz flexure mechanism, which is designed to have maximum stiffness ratios. Based on finite element analysis (FEA), it has translational stiffness ratio of 248, rotational stiffness ratio of 238 and a large workspace of 2.50 mm × 2.50 mm × 10°. Despite having such a large workspace, FEA also predicts that the proposed mechanism can still achieve a high bandwidth of 70 Hz. In comparison, the bandwidth of similar existing flexure mechanisms that can deflect more than 0.5 mm or 0.5° is typically less than 45 Hz. Hence, the high stiffness ratios of the proposed mechanism are achieved without compromising its dynamic performance. Preliminary experimental results pertaining to the mechanism's translational actuating stiffness and bandwidth were in agreement with the FEA predictions as the deviation was within 10%. In conclusion, the proposed flexure mechanism exhibits superior performance and can be used across a wide range of applications.

pi

DOI [BibTex]

DOI [BibTex]


Thumb xl publications toc
Positioning of drug carriers using permanent magnet-based robotic system in three-dimensional space

Khalil, I. S. M., Alfar, A., Tabak, A. F., Klingner, A., Stramigioli, S., Sitti, M.

In 2017 IEEE International Conference on Advanced Intelligent Mechatronics (AIM), pages: 1117-1122, July 2017 (inproceedings)

Abstract
Magnetic control of drug carriers using systems with open-configurations is essential to enable scaling to the size of in vivo applications. In this study, we demonstrate motion control of paramagnetic microparticles in a low Reynolds number fluid, using a permanent magnet-based robotic system with an open-configuration. The microparticles are controlled in three-dimensional (3D) space using a cylindrical NdFeB magnet that is fixed to the end-effector of a robotic arm. We develop a kinematic map between the position of the microparticles and the configuration of the robotic arm, and use this map as a basis of a closed-loop control system based on the position of the microparticles. Our experimental results show the ability of the robot configuration to control the exerted field gradient on the dipole of the microparticles, and achieve positioning in 3D space with maximum error of 300 µm and 600 µm in the steady-state during setpoint and trajectory tracking, respectively.

pi

DOI [BibTex]

DOI [BibTex]


no image
Self-assembly of micro/nanosystems across scales and interfaces

Mastrangeli, M.

In 2017 19th International Conference on Solid-State Sensors, Actuators and Microsystems (TRANSDUCERS), pages: 676 - 681, IEEE, July 2017 (inproceedings)

Abstract
Steady progress in understanding and implementation are establishing self-assembly as a versatile, parallel and scalable approach to the fabrication of transducers. In this contribution, I illustrate the principles and reach of self-assembly with three applications at different scales - namely, the capillary self-alignment of millimetric components, the sealing of liquid-filled polymeric microcapsules, and the accurate capillary assembly of single nanoparticles - and propose foreseeable directions for further developments.

pi

link (url) DOI [BibTex]

link (url) DOI [BibTex]


Thumb xl joel slow flow crop
Slow Flow: Exploiting High-Speed Cameras for Accurate and Diverse Optical Flow Reference Data

Janai, J., Güney, F., Wulff, J., Black, M., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 1406-1416, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Existing optical flow datasets are limited in size and variability due to the difficulty of capturing dense ground truth. In this paper, we tackle this problem by tracking pixels through densely sampled space-time volumes recorded with a high-speed video camera. Our model exploits the linearity of small motions and reasons about occlusions from multiple frames. Using our technique, we are able to establish accurate reference flow fields outside the laboratory in natural environments. Besides, we show how our predictions can be used to augment the input images with realistic motion blur. We demonstrate the quality of the produced flow fields on synthetic and real-world datasets. Finally, we collect a novel challenging optical flow dataset by applying our technique on data from a high-speed camera and analyze the performance of the state-of-the-art in optical flow under various levels of motion blur.

avg ps

pdf suppmat Project page Video DOI Project Page [BibTex]

pdf suppmat Project page Video DOI Project Page [BibTex]


Thumb xl img03
OctNet: Learning Deep 3D Representations at High Resolutions

Riegler, G., Ulusoy, O., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
We present OctNet, a representation for deep learning with sparse 3D data. In contrast to existing models, our representation enables 3D convolutional networks which are both deep and high resolution. Towards this goal, we exploit the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees where each leaf node stores a pooled feature representation. This allows to focus memory allocation and computation to the relevant dense regions and enables deeper networks without compromising resolution. We demonstrate the utility of our OctNet representation by analyzing the impact of resolution on several 3D tasks including 3D object classification, orientation estimation and point cloud labeling.

avg ps

pdf suppmat Project Page Video Project Page [BibTex]

pdf suppmat Project Page Video Project Page [BibTex]


Thumb xl schoeps2017cvpr
A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos

Schöps, T., Schönberger, J. L., Galliani, S., Sattler, T., Schindler, K., Pollefeys, M., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Motivated by the limitations of existing multi-view stereo benchmarks, we present a novel dataset for this task. Towards this goal, we recorded a variety of indoor and outdoor scenes using a high-precision laser scanner and captured both high-resolution DSLR imagery as well as synchronized low-resolution stereo videos with varying fields-of-view. To align the images with the laser scans, we propose a robust technique which minimizes photometric errors conditioned on the geometry. In contrast to previous datasets, our benchmark provides novel challenges and covers a diverse set of viewpoints and scene types, ranging from natural scenes to man-made indoor and outdoor environments. Furthermore, we provide data at significantly higher temporal and spatial resolution. Our benchmark is the first to cover the important use case of hand-held mobile devices while also providing high-resolution DSLR camera images. We make our datasets and an online evaluation server available at http://www.eth3d.net.

avg

pdf suppmat Project Page Project Page [BibTex]

pdf suppmat Project Page Project Page [BibTex]


Thumb xl camposeco2017cvpr
Toroidal Constraints for Two Point Localization Under High Outlier Ratios

Camposeco, F., Sattler, T., Cohen, A., Geiger, A., Pollefeys, M.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Localizing a query image against a 3D model at large scale is a hard problem, since 2D-3D matches become more and more ambiguous as the model size increases. This creates a need for pose estimation strategies that can handle very low inlier ratios. In this paper, we draw new insights on the geometric information available from the 2D-3D matching process. As modern descriptors are not invariant against large variations in viewpoint, we are able to find the rays in space used to triangulate a given point that are closest to a query descriptor. It is well known that two correspondences constrain the camera to lie on the surface of a torus. Adding the knowledge of direction of triangulation, we are able to approximate the position of the camera from \emphtwo matches alone. We derive a geometric solver that can compute this position in under 1 microsecond. Using this solver, we propose a simple yet powerful outlier filter which scales quadratically in the number of matches. We validate the accuracy of our solver and demonstrate the usefulness of our method in real world settings.

avg

pdf suppmat Project Page Project Page [BibTex]

pdf suppmat Project Page pdf Project Page [BibTex]


Thumb xl cvpr2017 landpsace
Semantic Multi-view Stereo: Jointly Estimating Objects and Voxels

Ulusoy, A. O., Black, M. J., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Dense 3D reconstruction from RGB images is a highly ill-posed problem due to occlusions, textureless or reflective surfaces, as well as other challenges. We propose object-level shape priors to address these ambiguities. Towards this goal, we formulate a probabilistic model that integrates multi-view image evidence with 3D shape information from multiple objects. Inference in this model yields a dense 3D reconstruction of the scene as well as the existence and precise 3D pose of the objects in it. Our approach is able to recover fine details not captured in the input shapes while defaulting to the input models in occluded regions where image evidence is weak. Due to its probabilistic nature, the approach is able to cope with the approximate geometry of the 3D models as well as input shapes that are not present in the scene. We evaluate the approach quantitatively on several challenging indoor and outdoor datasets.

avg ps

YouTube pdf suppmat Project Page [BibTex]

YouTube pdf suppmat Project Page [BibTex]


Thumb xl publications toc
Dynamic analysis on hexapedal water-running robot with compliant joints

Kim, H., Liu, Y., Jeong, K., Sitti, M., Seo, T.

In 2017 14th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), pages: 250-251, June 2017 (inproceedings)

Abstract
The dynamic analysis has been considered as one of the important design methods to design robots. In this research, we derive dynamic equation of hexapedal water-running robot to design compliant joints. The compliant joints that connect three bodies will be used to improve mobility and stability of water-running motion's pitch behavior. We considered all of parts as rigid body including links of six Klann mechanisms and three main frames. And then, we derived dynamic equation by using the Lagrangian method with external force of the water. We are expecting that the dynamic analysis is going to be used to design parts of the water running robot.

pi

DOI [BibTex]

DOI [BibTex]


Thumb xl publications toc
Design and actuation of a magnetic millirobot under a constant unidirectional magnetic field

Erin, O., Giltinan, J., Tsai, L., Sitti, M.

In Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), pages: 3404-3410, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

Abstract
Magnetic untethered millirobots, which are actuated and controlled by remote magnetic fields, have been proposed for medical applications due to their ability to safely pass through tissues at long ranges. For example, magnetic resonance imaging (MRI) systems with a 3-7 T constant unidirectional magnetic field and 3D gradient coils have been used to actuate magnetic robots. Such magnetically constrained systems place limits on the degrees of freedom that can be actuated for untethered devices. This paper presents a design and actuation methodology for a magnetic millirobot that exhibits both position and orientation control in 2D under a magnetic field, dominated by a constant unidirectional magnetic field as found in MRI systems. Placing a spherical permanent magnet, which is free to rotate inside the millirobot and located away from the center of mass, allows the generation of net forces and torques with applied 3D magnetic field gradients. We model this system in a 3D planar case and experimentally demonstrate open-loop control of both position and orientation by the applied 2D field gradients. The actuation performance is characterized across the most important design variables, and we experimentally demonstrate that the proposed approach is feasible.

pi

DOI [BibTex]

DOI [BibTex]


Thumb xl publications toc
Magnetically actuated soft capsule endoscope for fine-needle aspiration biopsy

Son, D., Dogan, M. D., Sitti, M.

In Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), pages: 1132-1139, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

Abstract
This paper presents a magnetically actuated soft capsule endoscope for fine-needle aspiration biopsy (B-MASCE) in the upper gastrointestinal tract. A thin and hollow needle is attached to the capsule, which can penetrate deeply into tissues to obtain subsurface biopsy sample. The design utilizes a soft elastomer body as a compliant mechanism to guide the needle. An internal permanent magnet provides a means for both actuation and tracking. The capsule is designed to roll towards its target and then deploy the biopsy needle in a precise location selected as the target area. B-MASCE is controlled by multiple custom-designed electromagnets while its position and orientation are tracked by a magnetic sensor array. In in vitro trials, B-MASCE demonstrated rolling locomotion and biopsy of a swine tissue model positioned inside an anatomical human stomach model. It was confirmed after the experiment that a tissue sample was retained inside the needle.

pi

DOI Project Page [BibTex]

DOI Project Page [BibTex]


Thumb xl image toc
The use of clamping grips and friction pads by tree frogs for climbing curved surfaces

Endlein, T., Ji, A., Yuan, S., Hill, I., Wang, H., Barnes, W. J. P., Dai, Z., Sitti, M.

In Proc. R. Soc. B, 284(1849):20162867, Febuary 2017 (inproceedings)

Abstract
Most studies on the adhesive mechanisms of climbing animals have addressed attachment against flat surfaces, yet many animals can climb highly curved surfaces, like twigs and small branches. Here we investigated whether tree frogs use a clamping grip by recording the ground reaction forces on a cylindrical object with either a smooth or anti-adhesive, rough surface. Furthermore, we measured the contact area of fore and hindlimbs against differently sized transparent cylinders and the forces of individual pads and subarticular tubercles in restrained animals. Our study revealed that frogs use friction and normal forces of roughly a similar magnitude for holding on to cylindrical objects. When challenged with climbing a non-adhesive surface, the compressive forces between opposite legs nearly doubled, indicating a stronger clamping grip. In contrast to climbing flat surfaces, frogs increased the contact area on all limbs by engaging not just adhesive pads but also subarticular tubercles on curved surfaces. Our force measurements showed that tubercles can withstand larger shear stresses than pads. SEM images of tubercles revealed a similar structure to that of toe pads including the presence of nanopillars, though channels surrounding epithelial cells were less pronounced. The tubercles' smaller size, proximal location on the toes and shallow cells make them probably less prone to buckling and thus ideal for gripping curved surfaces.

pi

DOI [BibTex]

DOI [BibTex]


Thumb xl publications toc
Planning spin-walking locomotion for automatic grasping of microobjects by an untethered magnetic microgripper

Dong, X., Sitti, M.

In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages: 6612-6618, 2017 (inproceedings)

Abstract
Most demonstrated mobile microrobot tasks so far have been achieved via pick-and-placing and dynamic trapping with teleoperation or simple path following algorithms. In our previous work, an untethered magnetic microgripper has been developed which has advanced functions, such as gripping objects. Both teleoperated manipulation in 2D and 3D have been demonstrated. However, it is challenging to control the magnetic microgripper to carry out manipulation tasks, because the grasping of objects so far in the literature relies heavily on teleoperation, which takes several minutes with even a skilled human expert. Here, we propose a new spin-walking locomotion and an automated 2D grasping motion planner for the microgripper, which enables time-efficient automatic grasping of microobjects that has not been achieved yet for untethered microrobots. In its locomotion, the microgripper repeatedly rotates about two principal axes to regulate its pose and move precisely on a surface. The motion planner could plan different motion primitives for grasping and compensate the uncertainties in the motion by learning the uncertainties and planning accordingly. We experimentally demonstrated that, using the proposed method, the microgripper could align to the target pose with error less than 0.1 body length and grip the objects within 40 seconds. Our method could significantly improve the time efficiency of micro-scale manipulation and have potential applications in microassembly and biomedical engineering.

pi

DOI Project Page [BibTex]

DOI Project Page [BibTex]

2015


Thumb xl zhou
Exploiting Object Similarity in 3D Reconstruction

Zhou, C., Güney, F., Wang, Y., Geiger, A.

In International Conference on Computer Vision (ICCV), December 2015 (inproceedings)

Abstract
Despite recent progress, reconstructing outdoor scenes in 3D from movable platforms remains a highly difficult endeavor. Challenges include low frame rates, occlusions, large distortions and difficult lighting conditions. In this paper, we leverage the fact that the larger the reconstructed area, the more likely objects of similar type and shape will occur in the scene. This is particularly true for outdoor scenes where buildings and vehicles often suffer from missing texture or reflections, but share similarity in 3D shape. We take advantage of this shape similarity by locating objects using detectors and jointly reconstructing them while learning a volumetric model of their shape. This allows us to reduce noise while completing missing surfaces as objects of similar shape benefit from all observations for the respective category. We evaluate our approach with respect to LIDAR ground truth on a novel challenging suburban dataset and show its advantages over the state-of-the-art.

avg ps

pdf suppmat [BibTex]

2015


pdf suppmat [BibTex]


Thumb xl philip
FollowMe: Efficient Online Min-Cost Flow Tracking with Bounded Memory and Computation

Lenz, P., Geiger, A., Urtasun, R.

In International Conference on Computer Vision (ICCV), International Conference on Computer Vision (ICCV), December 2015 (inproceedings)

Abstract
One of the most popular approaches to multi-target tracking is tracking-by-detection. Current min-cost flow algorithms which solve the data association problem optimally have three main drawbacks: they are computationally expensive, they assume that the whole video is given as a batch, and they scale badly in memory and computation with the length of the video sequence. In this paper, we address each of these issues, resulting in a computationally and memory-bounded solution. First, we introduce a dynamic version of the successive shortest-path algorithm which solves the data association problem optimally while reusing computation, resulting in faster inference than standard solvers. Second, we address the optimal solution to the data association problem when dealing with an incoming stream of data (i.e., online setting). Finally, we present our main contribution which is an approximate online solution with bounded memory and computation which is capable of handling videos of arbitrary length while performing tracking in real time. We demonstrate the effectiveness of our algorithms on the KITTI and PETS2009 benchmarks and show state-of-the-art performance, while being significantly faster than existing solvers.

avg ps

pdf suppmat video project [BibTex]

pdf suppmat video project [BibTex]


Thumb xl teaser
Towards Probabilistic Volumetric Reconstruction using Ray Potentials

(Best Paper Award)

Ulusoy, A. O., Geiger, A., Black, M. J.

In 3D Vision (3DV), 2015 3rd International Conference on, pages: 10-18, Lyon, October 2015 (inproceedings)

Abstract
This paper presents a novel probabilistic foundation for volumetric 3-d reconstruction. We formulate the problem as inference in a Markov random field, which accurately captures the dependencies between the occupancy and appearance of each voxel, given all input images. Our main contribution is an approximate highly parallelized discrete-continuous inference algorithm to compute the marginal distributions of each voxel's occupancy and appearance. In contrast to the MAP solution, marginals encode the underlying uncertainty and ambiguity in the reconstruction. Moreover, the proposed algorithm allows for a Bayes optimal prediction with respect to a natural reconstruction loss. We compare our method to two state-of-the-art volumetric reconstruction algorithms on three challenging aerial datasets with LIDAR ground truth. Our experiments demonstrate that the proposed algorithm compares favorably in terms of reconstruction accuracy and the ability to expose reconstruction uncertainty.

avg ps

code YouTube pdf suppmat DOI Project Page [BibTex]

code YouTube pdf suppmat DOI Project Page [BibTex]


Thumb xl 07353111
Compliant wing design for a flapping wing micro air vehicle

Colmenares, D., Kania, R., Zhang, W., Sitti, M.

In Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, pages: 32-39, September 2015 (inproceedings)

Abstract
In this work, we examine several wing designs for a motor-driven, flapping-wing micro air vehicle capable of liftoff. The full system consists of two wings independently driven by geared pager motors that include a spring in parallel with the output shaft. The linear transmission allows for resonant operation, while control is achieved by direct drive of the wing angle. Wings used in previous work were chosen to be fully rigid for simplicity of modeling and fabrication. However, biological wings are highly flexible and other micro air vehicles have successfully utilized flexible wing structures for specialized tasks. The goal of our study is to determine if wing flexibility can be generally used to increase wing performance. Two approaches to lift improvement using flexible wings are explored, resonance of the wing cantilever structure and dynamic wing twisting. We design and test several wings that are compared using different figures of merit. A twisted design improved lift per power by 73.6% and maximum lift production by 53.2% compared to the original rigid design. Wing twist is then modeled in order to propose optimal wing twist profiles that can maximize either wing efficiency or lift production.

pi

DOI [BibTex]

DOI [BibTex]


no image
Millimeter-scale magnetic swimmers using elastomeric undulations

Zhang, J., Diller, E.

In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages: 1706-1711, September 2015 (inproceedings)

Abstract
This paper presents a new soft-bodied millimeterscale swimmer actuated by rotating uniform magnetic fields. The proposed swimmer moves through internal undulatory deformations, resulting from a magnetization profile programmed into its body. To understand the motion of the swimmer, a mathematical model is developed to describe the general relationship between the deflection of a flexible strip and its magnetization profile. As a special case, the situation of the swimmer on the water surface is analyzed and predictions made by the model are experimentally verified. Experimental results show the controllability of the proposed swimmer under a computer vision-based closed-loop controller. The swimmers have nominal dimensions of 1.5×4.9×0.06 mm and a top speed of 50 mm/s (10 body lengths per second). Waypoint following and multiagent control are demonstrated for swimmers constrained at the air-water interface and underwater swimming is also shown, suggesting the promising potential of this type of swimmer in biomedical and microfluidic applications.

pi

link (url) DOI [BibTex]

link (url) DOI [BibTex]


Thumb xl img displet
Displets: Resolving Stereo Ambiguities using Object Knowledge

Güney, F., Geiger, A.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2015, pages: 4165-4175, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2015 (inproceedings)

Abstract
Stereo techniques have witnessed tremendous progress over the last decades, yet some aspects of the problem still remain challenging today. Striking examples are reflecting and textureless surfaces which cannot easily be recovered using traditional local regularizers. In this paper, we therefore propose to regularize over larger distances using object-category specific disparity proposals (displets) which we sample using inverse graphics techniques based on a sparse disparity estimate and a semantic segmentation of the image. The proposed displets encode the fact that objects of certain categories are not arbitrarily shaped but typically exhibit regular structures. We integrate them as non-local regularizer for the challenging object class 'car' into a superpixel based CRF framework and demonstrate its benefits on the KITTI stereo evaluation.

avg ps

pdf abstract suppmat [BibTex]

pdf abstract suppmat [BibTex]


Thumb xl img sceneflow
Object Scene Flow for Autonomous Vehicles

Menze, M., Geiger, A.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2015, pages: 3061-3070, IEEE, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2015 (inproceedings)

Abstract
This paper proposes a novel model and dataset for 3D scene flow estimation with an application to autonomous driving. Taking advantage of the fact that outdoor scenes often decompose into a small number of independently moving objects, we represent each element in the scene by its rigid motion parameters and each superpixel by a 3D plane as well as an index to the corresponding object. This minimal representation increases robustness and leads to a discrete-continuous CRF where the data term decomposes into pairwise potentials between superpixels and objects. Moreover, our model intrinsically segments the scene into its constituting dynamic components. We demonstrate the performance of our model on existing benchmarks as well as a novel realistic dataset with scene flow ground truth. We obtain this dataset by annotating 400 dynamic scenes from the KITTI raw data collection using detailed 3D CAD models for all vehicles in motion. Our experiments also reveal novel challenges which can't be handled by existing methods.

avg ps

pdf abstract suppmat DOI [BibTex]

pdf abstract suppmat DOI [BibTex]


Thumb xl publications toc
Fiberbot: A miniature crawling robot using a directional fibrillar pad

Han, Y., Marvi, H., Sitti, M.

In Robotics and Automation (ICRA), 2015 IEEE International Conference on, pages: 3122-3127, May 2015 (inproceedings)

Abstract
Vibration-driven locomotion has been widely used for crawling robot studies. Such robots usually have a vibration motor as the actuator and a fibrillar structure for providing directional friction on the substrate. However, there has not been any studies about the effect of fiber structure on robot crawling performance. In this paper, we develop Fiberbot, a custom made mini vibration robot, for studying the effect of fiber angle on robot velocity, steering, and climbing performance. It is known that the friction force with and against fibers depends on the fiber angle. Thus, we first present a new fabrication method for making millimeter scale fibers at a wide range of angles. We then show that using 30° angle fibers that have the highest friction anisotropy (ratio of backward to forward friction force) among the other fibers we fabricated in this study, Fiberbot speed on glass increases to 13.8±0.4 cm/s (compared to ν = 0.6±0.1 cm/s using vertical fibers). We also demonstrate that the locomotion direction of Fiberbot depends on the tilting direction of fibers and we can steer the robot by rotating the fiber pad. Fiberbot could also climb on glass at inclinations of up to 10° when equipped with fibers of high friction anisotropy. We show that adding a rigid tail to the robot it can climb on glass at 25° inclines. Moreover, the robot is able to crawl on rough surfaces such as wood (ν = 10.0±0.2 cm/s using 30° fiber pad). Fiberbot, a low-cost vibration robot equipped with a custom-designed fiber pad with steering and climbing capabilities could be used for studies on collective behavior on a wide range of topographies as well as search and exploratory missions.

pi

DOI [BibTex]

DOI [BibTex]


Thumb xl publications toc
Platform design and tethered flight of a motor-driven flapping-wing system

Hines, L., Colmenares, D., Sitti, M.

In Robotics and Automation (ICRA), 2015 IEEE International Conference on, pages: 5838-5845, May 2015 (inproceedings)

Abstract
In this work, we examine two design modifications to a tethered motor-driven flapping-wing system. Previously, we had demonstrated a simple mechanism utilizing a linear transmission for resonant operation and direct drive of the wing flapping angle for control. The initial two-wing system had a weight of 2.7 grams and a maximum lift-to-weight ratio of 1.4. While capable of vertical takeoff, in open-loop flight it demonstrated instability and pitch oscillations at the wing flapping frequency, leading to flight times of only a few wing strokes. Here the effect of vertical wing offset as well as an alternative multi-wing layout is investigated and experimentally tested with newly constructed prototypes. With only a change in vertical wing offset, stable open-loop flight of the two-wing flapping system is shown to be theoretically possible, but difficult to achieve with our current design and operating parameters. Both of the new two and four-wing systems, however, prove capable of flying to the end of the tether, with the four-wing system prototype eliminating disruptive wing beat oscillations.

pi

DOI [BibTex]

DOI [BibTex]


Thumb xl geiger
Joint 3D Object and Layout Inference from a single RGB-D Image

(Best Paper Award)

Geiger, A., Wang, C.

In German Conference on Pattern Recognition (GCPR), 9358, pages: 183-195, Lecture Notes in Computer Science, Springer International Publishing, 2015 (inproceedings)

Abstract
Inferring 3D objects and the layout of indoor scenes from a single RGB-D image captured with a Kinect camera is a challenging task. Towards this goal, we propose a high-order graphical model and jointly reason about the layout, objects and superpixels in the image. In contrast to existing holistic approaches, our model leverages detailed 3D geometry using inverse graphics and explicitly enforces occlusion and visibility constraints for respecting scene properties and projective geometry. We cast the task as MAP inference in a factor graph and solve it efficiently using message passing. We evaluate our method with respect to several baselines on the challenging NYUv2 indoor dataset using 21 object categories. Our experiments demonstrate that the proposed method is able to infer scenes with a large degree of clutter and occlusions.

avg ps

pdf suppmat video project DOI [BibTex]

pdf suppmat video project DOI [BibTex]


Thumb xl menze
Discrete Optimization for Optical Flow

Menze, M., Heipke, C., Geiger, A.

In German Conference on Pattern Recognition (GCPR), 9358, pages: 16-28, Springer International Publishing, 2015 (inproceedings)

Abstract
We propose to look at large-displacement optical flow from a discrete point of view. Motivated by the observation that sub-pixel accuracy is easily obtained given pixel-accurate optical flow, we conjecture that computing the integral part is the hardest piece of the problem. Consequently, we formulate optical flow estimation as a discrete inference problem in a conditional random field, followed by sub-pixel refinement. Naive discretization of the 2D flow space, however, is intractable due to the resulting size of the label set. In this paper, we therefore investigate three different strategies, each able to reduce computation and memory demands by several orders of magnitude. Their combination allows us to estimate large-displacement optical flow both accurately and efficiently and demonstrates the potential of discrete optimization for optical flow. We obtain state-of-the-art performance on MPI Sintel and KITTI.

avg ps

pdf suppmat project DOI [BibTex]

pdf suppmat project DOI [BibTex]


Thumb xl isa
Joint 3D Estimation of Vehicles and Scene Flow

Menze, M., Heipke, C., Geiger, A.

In Proc. of the ISPRS Workshop on Image Sequence Analysis (ISA), 2015 (inproceedings)

Abstract
Three-dimensional reconstruction of dynamic scenes is an important prerequisite for applications like mobile robotics or autonomous driving. While much progress has been made in recent years, imaging conditions in natural outdoor environments are still very challenging for current reconstruction and recognition methods. In this paper, we propose a novel unified approach which reasons jointly about 3D scene flow as well as the pose, shape and motion of vehicles in the scene. Towards this goal, we incorporate a deformable CAD model into a slanted-plane conditional random field for scene flow estimation and enforce shape consistency between the rendered 3D models and the parameters of all superpixels in the image. The association of superpixels to objects is established by an index variable which implicitly enables model selection. We evaluate our approach on the challenging KITTI scene flow dataset in terms of object and scene flow estimation. Our results provide a prove of concept and demonstrate the usefulness of our method.

avg ps

PDF [BibTex]

PDF [BibTex]

2006


no image
Miniature endoscopic capsule robot using biomimetic micro-patterned adhesives

Karagozler, M. E., Cheung, E., Kwon, J., Sitti, M.

In Biomedical Robotics and Biomechatronics, 2006. BioRob 2006. The First IEEE/RAS-EMBS International Conference on, pages: 105-111, 2006 (inproceedings)

pi

[BibTex]

2006


[BibTex]


no image
Toward micro wall-climbing robots using biomimetic fibrillar adhesives

Greuter, M., Shah, G., Caprari, G., Tâche, F., Siegwart, R., Sitti, M.

In Proceedings of the 3rd International Symposium on Autonomous Minirobots for Research and Edutainment (AMiRE 2005), pages: 39-46, 2006 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Geckobot: A gecko inspired climbing robot using elastomer adhesives

Unver, O., Uneri, A., Aydemir, A., Sitti, M.

In Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on, pages: 2329-2335, 2006 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Towards hybrid swimming microrobots: bacteria assisted propulsion of polystyrene beads

Behkam, B., Sitti, M.

In Engineering in Medicine and Biology Society, 2006. EMBS’06. 28th Annual International Conference of the IEEE, pages: 2421-2424, 2006 (inproceedings)

pi

Project Page [BibTex]

Project Page [BibTex]


no image
Soft microcontact printing with force control using microrobotic assembly based templates

Tafazzoli, A., Sitti, M.

In Advanced Motion Control, 2006. 9th IEEE International Workshop on, pages: 500-505, 2006 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Modeling of the supporting legs for designing biomimetic water strider robots

Song, Y. S., Suhr, S. H., Sitti, M.

In Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on, pages: 2303-2310, 2006 (inproceedings)

pi

[BibTex]

[BibTex]


no image
A novel water running robot inspired by basilisk lizards

Floyd, S., Keegan, T., Palmisano, J., Sitti, M.

In Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on, pages: 5430-5436, 2006 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Force-controlled microcontact printing using microassembled particle templates

Tafazzoli, A., Pawashe, C., Sitti, M.

In Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on, pages: 263-268, 2006 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Waalbot: An agile small-scale wall climbing robot utilizing pressure sensitive adhesives

Murphy, M. P., Tso, W., Tanzini, M., Sitti, M.

In Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on, pages: 3411-3416, 2006 (inproceedings)

pi

[BibTex]

[BibTex]

2005


no image
Modeling and testing of a biomimetic flagellar propulsion method for microscale biomedical swimming robots

Behkam, B., Sitti, M.

In Proceedings of Advanced Intelligent Mechatronics Conference, pages: 37-42, 2005 (inproceedings)

pi

Project Page [BibTex]

2005


Project Page [BibTex]


no image
Biologically inspired adhesion based surface climbing robots

Menon, C., Sitti, M.

In Robotics and Automation, 2005. ICRA 2005. Proceedings of the 2005 IEEE International Conference on, pages: 2715-2720, 2005 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Claytronics: highly scalable communications, sensing, and actuation networks

Aksak, Burak, Bhat, Preethi Srinivas, Campbell, Jason, DeRosa, Michael, Funiak, Stanislav, Gibbons, Phillip B, Goldstein, Seth Copen, Guestrin, Carlos, Gupta, Ashish, Helfrich, Casey, others

In Proceedings of the 3rd international conference on Embedded networked sensor systems, pages: 299-299, 2005 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Biologically Inspired Miniature Water Strider Robot.

Suhr, S. H., Song, Y. S., Lee, S. J., Sitti, M.

In Robotics: Science and Systems, pages: 319-326, 2005 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Polymer micro/nanofiber fabrication using micro/nanopipettes

Nain, A. S., Amon, C., Sitti, M.

In Nanotechnology, 2005. 5th IEEE Conference on, pages: 366-369, 2005 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Fusion of biomedical microcapsule endoscope and microsystem technology

Kim, Tae Song, Kim, Byungkyu, Cho, Dongil Dan, Song, Si Young, Dario, P, Sitti, M

In Solid-State Sensors, Actuators and Microsystems, 2005. Digest of Technical Papers. TRANSDUCERS’05. The 13th International Conference on, 1, pages: 9-14, 2005 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Atomic force microscope based two-dimensional assembly of micro/nanoparticles

Tafazzoli, A., Pawashe, C., Sitti, M.

In Assembly and Task Planning: From Nano to Macro Assembly and Manufacturing, 2005.(ISATP 2005). The 6th IEEE International Symposium on, pages: 230-235, 2005 (inproceedings)

pi

[BibTex]

[BibTex]


no image
A new endoscopic microcapsule robot using beetle inspired microfibrillar adhesives

Cheung, E., Karagozler, M. E., Park, S., Kim, B., Sitti, M.

In Advanced Intelligent Mechatronics. Proceedings, 2005 IEEE/ASME International Conference on, pages: 551-557, 2005 (inproceedings)

pi

Project Page [BibTex]

Project Page [BibTex]