Institute Talks

Self-Supervised Representation Learning for Visual Behavior Analysis and Synthesis

Talk
  • 14 December 2018 • 12:00-13:00
  • Prof. Dr. Björn Ommer
  • PS Aquarium

Understanding objects and their behavior from images and videos is a difficult inverse problem. It requires learning a metric in image space that reflects object relations in the real world. This metric learning problem calls for large volumes of training data. While images and videos are easily available, labels are not, which motivates self-supervised metric and representation learning. Furthermore, I will present a widely applicable strategy based on deep reinforcement learning to improve the surrogate tasks underlying self-supervision. Thereafter, the talk will cover the learning of disentangled representations that explicitly separate different object characteristics. Our approach is based on an analysis-by-synthesis paradigm and can generate novel object instances with flexible changes to individual characteristics such as their appearance and pose. It addresses diverse applications in human and animal behavior analysis, a topic on which we collaborate intensively with neuroscientists. Time permitting, I will discuss the disentangling of representations from a wider perspective, including novel strategies for image stylization and new strategies for regularizing the latent space of generator networks.

Organizers: Joel Janai

Generating Faces & Heads: Texture, Shape and Beyond.

Talk
  • 17 December 2018 • 11:00-12:00
  • Stefanos Zafeiriou
  • PS Aquarium

In the past few years, with the advent of Deep Convolutional Neural Networks (DCNNs) and the availability of visual data, it has been shown that excellent results can be achieved in very challenging tasks such as visual object recognition, detection, and tracking. Nevertheless, in certain tasks such as fine-grained object recognition (e.g., face recognition), it is very difficult to collect the amount of data that is needed. In this talk, I will show how, using DCNNs, we can generate highly realistic faces and heads and use them to train algorithms for tasks such as face and facial expression recognition. Next, I will reverse the problem and demonstrate how a very powerful, already trained face recognition network can be used to perform very accurate 3D shape and texture reconstruction of faces from a single image. Finally, I will demonstrate how to create very lightweight networks for representing 3D face texture and shape structure by capitalising upon intrinsic mesh convolutions.

Organizers: Dimitris Tzionas

Deep learning on 3D face reconstruction, modelling and applications

Talk
  • 19 December 2018 • 11:00-12:00
  • Yao Feng
  • PS Aquarium

In this talk, I will present my understanding of 3D face reconstruction, modelling and applications from a deep learning perspective. In the first part of my talk, I will discuss the relationship between representations (point clouds, meshes, etc.) and network layers (CNN, GCN, etc.) in the face reconstruction task, and then present my ECCV work PRN, which proposes a new representation that helps achieve state-of-the-art performance on face reconstruction and dense alignment tasks. I will also introduce my open source project face3d, which provides examples for generating different 3D face representations. In the second part of the talk, I will discuss some publications on integrating 3D techniques into deep networks, and then introduce my upcoming work that implements this. In the third part, I will present how related tasks can promote each other in deep learning, including face recognition for the face reconstruction task and face reconstruction for the face anti-spoofing task. Finally, building on these three parts, I will present my plans for 3D face modelling and applications.

Organizers: Timo Bolkart

Mind Games

IS Colloquium
  • 21 December 2018 • 11:00-12:00
  • Peter Dayan
  • IS Lecture Hall

Much existing work in reinforcement learning involves environments that are either intentionally neutral, lacking a role for cooperation and competition, or intentionally simple, where agents need imagine nothing more than that they are playing versions of themselves. Richer game-theoretic notions become important as these constraints are relaxed. For humans, this encompasses issues that concern utility, such as envy and guilt, and issues that concern inference, such as recursive modeling of other players. I will discuss studies treating a paradigmatic game of trust as an interactive partially-observable Markov decision process, and will illustrate the solution concepts with evidence from interactions between various groups of subjects, including those diagnosed with borderline and anti-social personality disorders.

TBA

IS Colloquium
  • 28 January 2019 • 11:15-12:15
  • Florian Marquardt

Organizers: Matthias Bauer

A fine-grained perspective onto object interactions

Talk
  • 30 October 2018 • 10:30-11:30
  • Dima Damen
  • N0.002

This talk argues for a fine-grained perspective onto human-object interactions from video sequences. I will present approaches for understanding ‘what’ objects one interacts with during daily activities, ‘when’ we should label the temporal boundaries of interactions, ‘which’ semantic labels one can use to describe such interactions and ‘who’ is better when contrasting people performing the same interaction. I will detail my group’s latest works on sub-topics related to: (1) assessing action ‘completion’, that is, when an interaction is attempted but not completed [BMVC 2018], (2) determining skill or expertise from video sequences [CVPR 2018] and (3) finding unequivocal semantic representations for object interactions [ongoing work]. I will also introduce EPIC-KITCHENS 2018, the recently released largest dataset of object interactions in people’s homes, recorded using wearable cameras. The dataset includes 11.5M frames fully annotated with objects and actions, based on unique annotations from the participants narrating their own videos, thus reflecting true intention. Three open challenges are now available on object detection, action recognition and action anticipation [http://epic-kitchens.github.io].

Organizers: Mohamed Hassan


Artificial haptic intelligence for human-machine systems

IS Colloquium
  • 25 October 2018 • 11:00-11:00
  • Veronica J. Santos
  • N2.025 at MPI-IS in Tübingen

The functionality of artificial manipulators could be enhanced by artificial “haptic intelligence” that enables the identification of object features via touch for semi-autonomous decision-making and/or display to a human operator. This could be especially useful when complementary sensory modalities, such as vision, are unavailable. I will highlight past and present work to enhance the functionality of artificial hands in human-machine systems. I will describe efforts to develop multimodal tactile sensor skins, and to teach robots how to haptically perceive salient geometric features such as edges and fingertip-sized bumps and pits using machine learning techniques. I will describe the use of reinforcement learning to teach robots goal-based policies for a functional contour-following task: the closure of a ziplock bag. Our Contextual Multi-Armed Bandits approach tightly couples robot actions to the tactile and proprioceptive consequences of the actions, and selects future actions based on prior experiences, the current context, and a functional task goal. Finally, I will describe current efforts to develop real-time capabilities for the perception of tactile directionality, and to develop models for haptically locating objects buried in granular media. Real-time haptic perception and decision-making capabilities could be used to advance semi-autonomous robot systems and reduce the cognitive burden on human teleoperators of devices ranging from wheelchair-mounted robots to explosive ordnance disposal robots.
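
To make the contextual bandit idea concrete, here is a minimal sketch of an epsilon-greedy contextual bandit on a toy contour-following problem. It is not the speaker's implementation: the three tactile contexts, the three steering actions, the reward probabilities and the names `train` and `greedy_policy` are all hypothetical, chosen only to show how action selection can depend on prior experience and the current context.

```python
import random
from collections import defaultdict

# Hypothetical discretised tactile contexts: where the contour lies under the fingertip.
CONTEXTS = ["left", "centered", "right"]
# Hypothetical robot actions.
ACTIONS = ["steer_left", "straight", "steer_right"]
# Hypothetical ground truth used only by the simulator: the action that keeps the contour centred.
BEST_ACTION = {"left": "steer_left", "centered": "straight", "right": "steer_right"}


def train(steps=5000, epsilon=0.1, seed=0):
    """Learn a running mean reward for every (context, action) pair."""
    rng = random.Random(seed)
    counts = defaultdict(int)
    values = defaultdict(float)

    for _ in range(steps):
        context = rng.choice(CONTEXTS)  # observe the current context
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            # exploit prior experience for this context
            action = max(ACTIONS, key=lambda a: values[(context, a)])

        # Simulated noisy outcome: the right action succeeds 90% of the time, others 20%.
        p_success = 0.9 if action == BEST_ACTION[context] else 0.2
        reward = 1.0 if rng.random() < p_success else 0.0

        key = (context, action)
        counts[key] += 1
        values[key] += (reward - values[key]) / counts[key]  # incremental mean
    return values


def greedy_policy(values):
    """Return the highest-valued action for each context."""
    return {c: max(ACTIONS, key=lambda a: values[(c, a)]) for c in CONTEXTS}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

A real system would replace the simulated reward with measured tactile and proprioceptive consequences of each action, and would likely use a richer context representation than three discrete states.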

Organizers: Katherine Kuchenbecker, Adam Spiers


Artificial haptic intelligence for human-machine systems

IS Colloquium
  • 24 October 2018 • 11:00-12:00
  • Veronica J. Santos
  • 5H7 at MPI-IS in Stuttgart

The functionality of artificial manipulators could be enhanced by artificial “haptic intelligence” that enables the identification of object features via touch for semi-autonomous decision-making and/or display to a human operator. This could be especially useful when complementary sensory modalities, such as vision, are unavailable. I will highlight past and present work to enhance the functionality of artificial hands in human-machine systems. I will describe efforts to develop multimodal tactile sensor skins, and to teach robots how to haptically perceive salient geometric features such as edges and fingertip-sized bumps and pits using machine learning techniques. I will describe the use of reinforcement learning to teach robots goal-based policies for a functional contour-following task: the closure of a ziplock bag. Our Contextual Multi-Armed Bandits approach tightly couples robot actions to the tactile and proprioceptive consequences of the actions, and selects future actions based on prior experiences, the current context, and a functional task goal. Finally, I will describe current efforts to develop real-time capabilities for the perception of tactile directionality, and to develop models for haptically locating objects buried in granular media. Real-time haptic perception and decision-making capabilities could be used to advance semi-autonomous robot systems and reduce the cognitive burden on human teleoperators of devices ranging from wheelchair-mounted robots to explosive ordnance disposal robots.

Organizers: Katherine Kuchenbecker


Learning to Act with Confidence

Talk
  • 23 October 2018 • 12:00-13:00
  • Andreas Krause
  • MPI-IS Tübingen, N0.002

Actively acquiring decision-relevant information is a key capability of intelligent systems, and plays a central role in the scientific process. In this talk I will present research from my group on this topic at the intersection of statistical learning, optimization and decision making. In particular, I will discuss how statistical confidence bounds can guide data acquisition in a principled way to make effective and reliable decisions in a variety of complex domains. I will also discuss several applications, ranging from autonomously guiding wetlab experiments in protein function optimization to safe exploration in robotics.
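
As a rough illustration of how statistical confidence bounds can guide data acquisition, the sketch below runs the classic UCB1 rule on a toy multi-armed bandit. It is not taken from the talk: the function name `ucb1`, the Bernoulli reward model, the arm means and the horizon are assumptions made purely for illustration.

```python
import math
import random


def ucb1(true_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms and return the pull count per arm."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms   # number of times each arm was sampled
    sums = [0.0] * n_arms   # total reward collected from each arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1     # sample every arm once to initialise the estimates
        else:
            # Choose the arm with the largest upper confidence bound:
            # empirical mean plus a bonus that shrinks as the arm is sampled more.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    # Hypothetical experiment with three candidate options of unknown quality.
    print(ucb1(true_means=[0.2, 0.5, 0.8], horizon=5000))
```

The exploration bonus keeps poorly sampled options in play until the data rule them out, so most samples end up on the best option while every option is still tried.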


Control Systems for a Surgical Robot on the Space Station

IS Colloquium
  • 23 October 2018 • 16:30-17:30
  • Chris Macnab
  • MPI-IS Stuttgart, Heisenbergstr. 3, Room 2P4

As part of a proposed design for a surgical robot on the space station, my research group has been asked to look at controls that can provide literally surgical precision. Due to excessive time delay, we envision a system with a local model being controlled by a surgeon while the remote system on the space station follows along in a safe manner. Two of the major design considerations that come into play for the low-level feedback loops on the remote side are 1) the harmonic drives in a robot will cause excessive vibrations in a micro-gravity environment unless active damping strategies are employed and 2) when interacting with a human tissue environment the robot must apply smooth control signals that result in precise positions and forces. Thus, we envision intelligent strategies that utilize nonlinear, adaptive, neural-network, and/or fuzzy control theory as the most suitable. However, space agencies, or their engineering sub-contractors, typically provide gain and phase margin characteristics as requirements to the engineers involved in a control system design, which are normally associated with PID or other traditional linear control schemes. We are currently endeavouring to create intelligent controls that have guaranteed gain and phase margins using the Cerebellar Model Articulation Controller.

Organizers: Katherine Kuchenbecker


  • Ravi Haksar
  • MPI-IS Stuttgart, seminar room 2P4

What do forest fires, disease outbreaks, robot swarms, and social networks have in common? How can we develop a common set of tools for these applications? In this talk, I will first introduce a modeling framework that describes large-scale phenomena and which is based on the idea of "local interactions." I will then describe my work on creating estimation and control methods for a single agent and for a cooperative team of autonomous agents. In particular, these algorithms are scalable as the solution does not change if the number of agents or environment size changes. Forest fires and the 2013 Ebola outbreak in West Africa are presented as examples.

Organizers: Sebastian Trimpe


  • Charlotte Le Mouel
  • 2P4, Heisenbergstr. 3, 70188 Stuttgart

Theories of motor control in neuroscience usually focus on the role of the nervous system in the coordination of movement. However, the literature in sports science as well as in embodied robotics suggests that improvements in motor performance can be achieved by improving the body's mechanical properties themselves, rather than only the control. I therefore developed the thesis that efficient motor coordination in animals and humans relies on the postural system adjusting the body's mechanical properties to the task at hand.

Organizers: Charlotte Le Mouel, Alexander Sproewitz


  • Mario Herger
  • Kupferbau Universität Tübingen, Hörsaal 22

More than 1,000 self-driving test vehicles from a total of 57 companies are already driving around Silicon Valley, and Google's sister company Waymo is now about to put 82,000 robotaxis on the roads. And not at some point in the future, but this very year. Meanwhile, Tesla is gearing up with its all-electric Model 3 for a frontal attack on the German manufacturers. In the USA, sales of German mid-size cars have slumped by 29 percent compared with the previous year.


Still, In Motion

Talk
  • 12 October 2018 • 11:00-12:00
  • Michael Cohen

In this talk, I will take an autobiographical approach both to explain where we have come from in computer graphics since the early days of rendering, and to point towards where we are going in this new world of smartphones and social media. We are at a point in history where the ability to express oneself with media is unparalleled. The ubiquity and power of mobile devices, coupled with new algorithmic paradigms, is opening new expressive possibilities weekly. At the same time, these new creative media (composite imagery, augmented imagery, short form video, 3D photos) also offer unprecedented abilities to move freely between what is real and unreal. I will focus on the spaces in between images and video, and in between objective and subjective reality. Finally, I will close with some lessons learned along the way.


  • Mariacarla Memeo
  • MPI-IS Stuttgart, Heisenbergstr. 3, Room 2P4

The increasing availability of online resources and the widespread practice of storing data on the internet raise the problem of their accessibility for visually impaired people. A translation from the visual domain to the available modalities is therefore necessary to study whether such access is possible. However, the translation of information from vision to touch is necessarily impaired due to the superiority of vision during the acquisition process. Yet compromises exist, as visual information can be simplified or sketched. A picture can become a map. An object can become a geometrical shape. Under some circumstances, and with a reasonable loss of generality, touch can substitute for vision. In particular, when touch substitutes for vision, data can be differentiated by adding a further dimension to the tactile feedback, i.e. extending tactile feedback to three dimensions instead of two. This mode has been chosen because it mimics our natural way of following object profiles with our fingers. Specifically, regardless of whether a hand lying on an object is moving or not, our tactile and proprioceptive systems are both stimulated and tell us something about which object we are manipulating and what its shape and size may be. The goal of this talk is to describe how to exploit tactile stimulation to render digital information non-visually, so that cognitive maps associated with this information can be efficiently elicited from visually impaired persons. In particular, the focus is on delivering geometrical information in a learning scenario. Moreover, completely blind interaction with a virtual environment in a learning scenario has been little investigated, because visually impaired subjects are often passive agents of exercises with fixed environment constraints. For this reason, during the talk I will provide my personal answer to the question: can visually impaired people manipulate dynamic virtual content through touch?
This process is much more challenging than only exploring and learning virtual content, but at the same time it leads to a more conscious and dynamic creation of the spatial understanding of an environment during tactile exploration.

Organizers: Katherine Kuchenbecker