Institute Talks

Multi-contact locomotion control for legged robots

Talk
  • 25 April 2017 • 11:00 - 12:30
  • Dr. Andrea Del Prete
  • N2.025 (AMD seminar room - 2nd floor)

This talk will survey recent work to achieve multi-contact locomotion control of humanoid and legged robots. I will start by presenting some results on robust optimization-based control. We exploited robust optimization techniques, either stochastic or worst-case, to improve the robustness of Task-Space Inverse Dynamics (TSID), a well-known control framework for legged robots. We modeled uncertainties in the joint torques, and we immunized the constraints of the system to any of the realizations of these uncertainties. We also applied the same methodology to ensure the balance of the robot despite bounded errors in its inertial parameters. Extensive simulations in a realistic environment show that the proposed robust controllers greatly outperform the classic one. Then I will present preliminary results on a new capturability criterion for legged robots in multi-contact. "N-step capturability" is the ability of a system to come to a stop by taking N or fewer steps. Simplified models to compute N-step capturability already exist and are widely used, but they are limited to locomotion on flat terrains. We propose a new efficient algorithm to compute 0-step capturability for a robot in arbitrary contact scenarios. Finally, I will present our recent efforts to transfer the above-mentioned techniques to the real humanoid robot HRP-2, on which we recently implemented joint torque control.

Organizers: Ludovic Righetti

Learning from Synthetic Humans

Talk
  • 04 May 2017 • 15:00 - 16:00
  • Gul Varol
  • N3.022 (Greenhouse)

Estimating human pose, shape, and motion from images and video are fundamental challenges with many applications. Recent advances in 2D human pose estimation use large amounts of manually labeled training data for learning convolutional neural networks (CNNs). Such data is time-consuming to acquire and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion is impractical. In this work we present SURREAL: a new large-scale dataset with synthetically generated but realistic images of people rendered from 3D sequences of human motion capture data. We generate more than 6 million frames together with ground truth pose, depth maps, and segmentation masks. We show that CNNs trained on our synthetic dataset allow for accurate human depth estimation and human part segmentation in real RGB images. Our results and the new dataset open up new possibilities for advancing person analysis using cheap and large-scale synthetic data.

Organizers: Dimitris Tzionas

Frederick Eberhardt - TBA

IS Colloquium
  • 03 July 2017 • 11:15 - 12:15
  • Frederick Eberhardt
  • Max Planck House Lecture Hall

Organizers: Sebastian Weichwald

  • Trevor Darrell
  • MPH Lecture Hall, Tübingen

Learning of layered or "deep" representations has provided significant advances in computer vision in recent years, but has traditionally been limited to fully supervised settings with very large amounts of training data. New results show that such methods can also excel when learning in sparse/weakly labeled settings across modalities and domains. I'll present our recent long-term recurrent network model which can learn cross-modal translation and can provide open-domain video to text transcription. I'll also describe state-of-the-art models for fully convolutional pixel-dense segmentation from weakly labeled input, and finally will discuss new methods for adapting deep recognition models to new domains with few or no target labels for categories of interest.

Organizers: Jonas Wulff


  • Andre Seyfarth
  • MRZ Seminar Room

In this presentation a series of conceptual models for describing human and animal locomotion will be presented, ranging from standing to walking and running. By successively increasing the complexity of the models, we show that basic properties of the underlying spring-mass model can be inherited by the more detailed models. Model extensions include the consideration of a rigid trunk (instead of a point mass), non-elastic leg properties (instead of a mass-less leg spring), additional legs (two and four legs), leg masses, leg segments (e.g. a compliantly attached foot) and energy management protocols. Furthermore, we propose a methodology to evaluate and refine conceptual models based on the test trilogy. This approach consists of a simulation test, a hardware test and a behavioral comparison of biological experiments with model predictions and hardware models.


Organizers: Ludovic Righetti


Learning Rich and Fair Representations from Images and Text

Talk
  • 10 June 2015 • 03:00 pm - 04:00 pm
  • Rich Zemel
  • MPH Lecture Hall, Tübingen

I will talk about two types of machine learning problems that are important but have received little attention. The first are problems naturally formulated as learning a one-to-many mapping, which can handle the inherent ambiguity in tasks such as generating segmentations or captions for images. The second involves learning representations that are invariant to certain nuisance or sensitive factors of variation in the data while retaining as much of the remaining information as possible. The primary approach we formulate for both problems is a constrained form of joint embedding in a deep generative model that can develop informative representations of sentences and images. Applications discussed will include image captioning, question-answering, segmentation, classification without discrimination, and domain adaptation.

Organizers: Gerard Pons-Moll


  • Hans-Peter Seidel
  • MPH Hall

During the last three decades computer graphics established itself as a core discipline within computer science and information technology. Two decades ago, most digital content was textual. Today it has expanded to include audio, images, video, and a variety of graphical representations. New and emerging technologies such as multimedia, social networks, digital television, digital photography and the rapid development of new sensing devices, telecommunication and telepresence, virtual reality, or 3D-internet further indicate the potential of computer graphics in the years to come. Typical for the field is the coincidence of very large data sets with the demand for fast, and possibly interactive, high quality visual feedback. Furthermore, the user should be able to interact with the environment in a natural and intuitive way. In order to address the challenges mentioned above, a new and more integrated scientific view of computer graphics is required. In contrast to the classical approach to computer graphics which takes as input a scene model -- consisting of a set of light sources, a set of objects (specified by their shape and material properties), and a camera -- and uses simulation to compute an image, we prefer to take the more integrated view of '3D Image Analysis and Synthesis' for our research. We consider the whole pipeline from data acquisition, through data processing, to rendering in our work. In our opinion, this point of view is necessary in order to exploit the capabilities and perspectives of modern hardware, both on the input (sensors, scanners, digital photography, digital video) and output (graphics hardware, multiple platforms) side.
Our vision and long term goal is the development of methods and tools to efficiently handle the huge amount of data during the acquisition process, to extract structure and meaning from the abundance of digital data, and to turn this into graphical representations that facilitate further processing, rendering, and interaction. In this presentation I will highlight some of our ongoing research by means of examples. Topics covered include 3D reconstruction and digital geometry processing, shape analysis and shape design, motion and performance capture, and 3D video processing.


  • Andrea Vedaldi
  • MPH Hall

Learnable representations, and deep convolutional neural networks (CNNs) in particular, have become the preferred way of extracting visual features for image understanding tasks, from object recognition to semantic segmentation. In this talk I will discuss several recent advances in deep representations for computer vision. After reviewing modern CNN architectures, I will give an example of a state-of-the-art network in text spotting; in particular, I will show that, by using only synthetic data and a sufficiently large deep model, it is possible to directly map image regions to English words, a classification problem with 90K classes, obtaining in this manner state-of-the-art performance in text spotting. I will also briefly touch on other applications of deep learning to object recognition and discuss feature universality and transfer learning. In the last part of the talk I will move to the problem of understanding deep networks, which remain largely black boxes, presenting two possible approaches to their analysis. The first is a set of visualisation techniques that can investigate the information retained and learned by a visual representation. The second is a method that allows exploring how representations capture geometric notions such as image transformations, and finding whether and how different representations are related.


  • Cristian Sminchisescu
  • MRZ Seminar room

Recent progress in computer-based visual recognition heavily relies on machine learning methods trained using large scale annotated datasets. While such data has made advances in model design and evaluation possible, it does not necessarily provide insights or constraints into those intermediate levels of computation, or deep structure, perceived as ultimately necessary in order to design reliable computer vision systems. This is noticeable in the accuracy of state of the art systems trained with such annotations, which still lag behind human performance in similar tasks. Nor does the existing data make it immediately possible to exploit insights from a working system - the human eye - to derive potentially better features, models or algorithms. In this talk I will present a mix of perceptual and computational insights resulting from the analysis of large-scale human eye movement and 3D body motion capture datasets, collected in the context of visual recognition tasks (Human3.6M available at http://vision.imar.ro/human3.6m/, and Actions in the Eye available at http://vision.imar.ro/eyetracking/). I will show that attention models (fixation detectors, scan-path estimators, weakly supervised object detector response functions and search strategies) can be learned from human eye movement data, and can produce state of the art results when used in end-to-end automatic visual recognition systems. I will also describe recent work in large-scale human pose estimation, showing the feasibility of pixel-level body part labeling from RGB, and promising 2D and 3D human pose estimation results in monocular images. In this context, I will discuss recent, perhaps surprising, quantitative perceptual experiments revealing that humans may not be significantly better than computers at perceiving 3D articulated poses from monocular images. Such findings may challenge established definitions of computer vision 'tasks' and their expected levels of performance.


  • Auke Ijspeert
  • Max Planck Lecture Hall

Organizers: Ludovic Righetti


  • Benedetta Gennaro
  • MRC seminar room (0.A.03)

The breast is not just a protruding gland situated on the front of the thorax in female bodies: behind biology lies an intricate symbolism that has taken various and often contradictory meanings. We begin our journey looking at pre-historic artifacts that revered the breast as the ultimate symbol of life; we then transition to the rich iconographical tradition centering on the so-called Virgo Lactans, when the breast became a metaphor of nourishment for the entire Christian community. Next, we look at how artists have eroticized the breast in portraits of fifteenth-century French courtesans and how Enlightenment philosophers and revolutionary events have transformed it into a symbol of the national community. Lastly, we analyze how contemporary society has medicalized the breast through cosmetic surgery and discourses around breast cancer, and has objectified it by making the breast a constant presence in advertisements and on magazine covers. Through twenty-five centuries of representations, I will talk about how the breast has been coded as both "good" and "bad," sacred and erotic, life-giving and life-destroying.


Quadrupedal locomotion: a planning & control framework for HyQ

Talk
  • 09 March 2015 • 11:00 am - 12:00 pm
  • Ioannis Havoutis
  • AMD seminar room (TTR building first floor)

Quadrupedal animals move with skill, grace and agility. Quadrupedal robots have made tremendous progress in the last few years. In this talk I will give an overview of our work with the Hydraulic Quadruped (HyQ) and present our latest framework for perception, planning and control of quadrupedal locomotion in challenging environments. In addition, I will give a short preview of our work on optimization of dynamic motions, and our future goals.

Organizers: Ludovic Righetti