Writing and maintaining programs for robots poses some interesting challenges. It is hard to generalize them, as their targets are more than computing platforms. It can be deceptive to see them as input-to-output mappings, as interesting environments result in unpredictable inputs, and mixing reactive and deliberative behavior makes intended outputs hard to define. Given the wide and fragmented landscape of components, from hardware to software, and the parties involved in providing and using them, integration is also a non-trivial aspect. The talk will illustrate the ongoing work at Fraunhofer IPA to tackle these challenges, how Open Source is its common trait, and how this translates into the industrial field thanks to the ROS-Industrial initiative.
Organizers: Vincent Berenz
Performance metrics are a key component of machine learning systems, and are ideally constructed to reflect real-world tradeoffs. In contrast, much of the literature simply focuses on algorithms for maximizing accuracy. With the increasing integration of machine learning into real systems, it is clear that accuracy is an insufficient measure of performance for many problems of interest. Unfortunately, unlike accuracy, many real-world performance metrics are non-decomposable, i.e., they cannot be computed as a sum of losses for each instance. Thus, known algorithms and associated analysis are not trivially extended, and direct approaches require expensive combinatorial optimization. I will outline recent results characterizing population optimal classifiers for large families of binary and multilabel classification metrics, including such nonlinear metrics as F-measure and Jaccard measure. Perhaps surprisingly, the prediction which maximizes the utility for a range of such metrics takes a simple form. This results in simple and scalable procedures for optimizing complex metrics in practice. I will also outline how the same analysis gives optimal procedures for selecting point estimates from complex posterior distributions for structured objects such as graphs. Joint work with Nagarajan Natarajan, Bowei Yan, Kai Zhong, Pradeep Ravikumar and Inderjit Dhillon.
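To make the notion of a non-decomposable metric concrete, the following is a minimal sketch, not taken from the talk. It computes accuracy, F-measure and Jaccard measure from confusion counts and checks that accuracy averages across two equal-sized halves of a dataset while F-measure does not. The function names and the toy labels are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    # Decomposable: a plain average of per-instance 0/1 scores.
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, beta=1.0):
    # Non-decomposable: a ratio of counts over the whole dataset.
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denom if denom else 0.0


def jaccard(y_true, y_pred):
    # Non-decomposable: TP / (TP + FP + FN).
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


# Toy data (hypothetical), split into two halves of equal size.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0]
halves = [(y_true[:3], y_pred[:3]), (y_true[3:], y_pred[3:])]

acc_full = accuracy(y_true, y_pred)
acc_mean = sum(accuracy(t, p) for t, p in halves) / len(halves)
f1_full = f_measure(y_true, y_pred)
f1_mean = sum(f_measure(t, p) for t, p in halves) / len(halves)

print(f"accuracy: full={acc_full:.3f}, mean of halves={acc_mean:.3f}")
print(f"F1:       full={f1_full:.3f}, mean of halves={f1_mean:.3f}")
print(f"Jaccard:  full={jaccard(y_true, y_pred):.3f}")
```

In this toy case the accuracy of the full set equals the mean over the halves, whereas the F-measure of the full set differs from the mean over the halves, which is why per-instance loss minimization does not carry over directly.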
Organizers: Mijung Park
Colloquium on haptics: Two guests of the department "Haptic Intelligence" (Dept. Kuchenbecker) will each give a short talk this Friday (May 5) in Tübingen. The talks will be broadcast to Stuttgart, room 2 P4.
Estimating human pose, shape, and motion from images and video are fundamental challenges with many applications. Recent advances in 2D human pose estimation use large amounts of manually-labeled training data for learning convolutional neural networks (CNNs). Such data is time consuming to acquire and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion is impractical. In this work we present SURREAL: a new large-scale dataset with synthetically-generated but realistic images of people rendered from 3D sequences of human motion capture data. We generate more than 6 million frames together with ground truth pose, depth maps, and segmentation masks. We show that CNNs trained on our synthetic dataset allow for accurate human depth estimation and human part segmentation in real RGB images. Our results and the new dataset open up new possibilities for advancing person analysis using cheap and large-scale synthetic data.
Organizers: Dimitris Tzionas
Human-centric robotic applications often require the robots to learn new skills by interacting with the end-users. From a machine learning perspective, the challenge is to acquire skills from only a few interactions, with strong generalization demands. It requires: 1) the development of intuitive active learning interfaces to acquire meaningful demonstrations; 2) the development of models that can exploit the structure and geometry of the acquired data in an efficient way; 3) the development of adaptive control techniques that can exploit the learned task variations and coordination patterns. The developed models often need to serve several purposes (recognition, prediction, online synthesis), and be compatible with different learning strategies (imitation, emulation, exploration). For the reproduction of skills, these models need to be enriched with force and impedance information to enable human-robot collaboration and to generate safe and natural movements. I will present an approach combining model predictive control and statistical learning of movement primitives in multiple coordinate systems. The proposed approach will be illustrated in various applications, with robots either close to us (robot for dressing assistance), part of us (prosthetic hand with EMG and tactile sensing), or far from us (teleoperation of bimanual robot in deep water).
Organizers: Ludovic Righetti
This talk will survey recent work to achieve multi-contact locomotion control of humanoid and legged robots. I will start by presenting some results on robust optimization-based control. We exploited robust optimization techniques, either stochastic or worst-case, to improve the robustness of Task-Space Inverse Dynamics (TSID), a well-known control framework for legged robots. We modeled uncertainties in the joint torques, and we immunized the constraints of the system to any of the realizations of these uncertainties. We also applied the same methodology to ensure the balance of the robot despite bounded errors in its inertial parameters. Extensive simulations in a realistic environment show that the proposed robust controllers greatly outperform the classic one. Then I will present preliminary results on a new capturability criterion for legged robots in multi-contact. "N-step capturability" is the ability of a system to come to a stop by taking N or fewer steps. Simplified models to compute N-step capturability already exist and are widely used, but they are limited to locomotion on flat terrains. We propose a new efficient algorithm to compute 0-step capturability for a robot in arbitrary contact scenarios. Finally, I will present our recent efforts to transfer the above-mentioned techniques to the real humanoid robot HRP-2, on which we recently implemented joint torque control.
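As an illustration of the kind of simplified flat-terrain model the abstract refers to, the sketch below checks 0-step capturability for a one-dimensional linear inverted pendulum. It is not the multi-contact algorithm proposed in the talk. It assumes a constant center-of-mass height and a support region given as an interval, and all function names and numbers are hypothetical.

```python
import math


def capture_point(com_pos, com_vel, com_height, gravity=9.81):
    """Capture point of a 1D linear inverted pendulum (flat ground).

    Assumes constant CoM height; omega is the pendulum's natural frequency.
    """
    omega = math.sqrt(gravity / com_height)
    return com_pos + com_vel / omega


def is_zero_step_capturable(com_pos, com_vel, com_height,
                            support_min, support_max, gravity=9.81):
    """True if the robot can stop without stepping, i.e. the capture
    point lies inside the current support interval."""
    xi = capture_point(com_pos, com_vel, com_height, gravity)
    return support_min <= xi <= support_max


# Hypothetical numbers: CoM 0.8 m high, foot spanning [-0.10 m, 0.15 m].
slow = is_zero_step_capturable(0.0, 0.3, 0.8, -0.10, 0.15)
fast = is_zero_step_capturable(0.0, 0.8, 0.8, -0.10, 0.15)
print("slow push capturable without a step:", slow)
print("fast push capturable without a step:", fast)
```

The flat-ground and constant-height assumptions are exactly what limits this model, which is the gap the abstract says the new algorithm addresses for arbitrary contact scenarios.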
Organizers: Ludovic Righetti
The retina in the eye performs complex computations, to transmit only behaviourally relevant information about our visual environment to the brain. These computations are implemented by numerous different cell types that form complex circuits. New experimental and computational methods make it possible to study the cellular diversity of the retina in detail – the goal of obtaining a complete list of all the cell types in the retina and, thus, its “building blocks”, is within reach. I will review our recent contributions in this area, showing how analyzing multimodal datasets from electron microscopy and functional imaging can yield insights into the cellular organization of retinal circuits.
Organizers: Philipp Hennig
From gait and dance to martial arts, human movements provide rich, complex yet coherent spatiotemporal patterns reflecting characteristics of a group or an individual. We develop computer algorithms to automatically learn such quality-discriminative features from multimodal data. In this talk, I present a trilogy on learning from human movements: (1) Gait analysis from video data: based on frieze patterns (7 frieze groups), a video sequence of silhouettes is mapped into a pair of spatiotemporal patterns that are near-periodic along the time axis. A group theoretical analysis of periodic patterns allows us to determine the dynamic time warping and affine scaling that aligns two gait sequences from similar viewpoints for human identification. (2) Dance analysis and synthesis (mocap, music, ratings from Mechanical Turk workers): we explore the complex relationship between dance movements and, respectively, perceived dance quality and the dancer's gender. As a feasibility study, we construct a computational framework for an analysis-synthesis-feedback loop using a novel multimedia dance-texture representation for joint angular displacement, velocity and acceleration. Furthermore, we integrate crowdsourcing, music and motion-capture data, and machine learning-based methods for dance segmentation, analysis and synthesis of new dancers. A quantitative validation of this framework on a motion-capture dataset of 172 dancers evaluated by more than 400 independent online raters demonstrates significant correlation between human perception and the algorithmically intended dance quality or gender of the synthesized dancers. (3) Tai Chi performance evaluation (mocap + video): I shall also discuss the feasibility of utilizing spatiotemporal synchronization and, ultimately, machine learning to evaluate Tai Chi routines performed by different subjects in our current project of “Tai Chi + Advanced Technology for Smart Health”.
There has been significant prior work on learning realistic, articulated, 3D statistical shape models of the human body. In contrast, there are few such models for animals, despite their many applications in biology, neuroscience, agriculture, and entertainment. The main challenge is that animals are much less cooperative subjects than humans: the best human body models are learned from thousands of 3D scans of people in specific poses, which is infeasible with live animals. In the talk I will illustrate how we extend a state-of-the-art articulated 3D human body model (SMPL) to animals, learning from toys a multi-family shape space that can represent lions, cats, dogs, horses, cows and hippos. The generalization of the model is illustrated by fitting it to images of real animals, where it captures realistic animal shapes, even for new species not seen in training.
Moritz Hardt will review some progress and challenges towards preventing discrimination based on sensitive attributes in supervised learning.
In this talk we will give an overview of research efforts within autonomous manipulation at the AASS Research Center, Örebro University, Sweden. We intend to give a holistic view on the historically separated subjects of robot motion planning and control. In particular, viewing motion behavior generation as an optimal control problem allows for a unified formulation that is uncluttered by a priori domain assumptions and simplified solution strategies. Furthermore, we will discuss the problems of workspace modeling and perception and how to integrate them in the overarching problem of autonomous manipulation.
Organizers: Ludovic Righetti
As large tensor-variate data increasingly become the norm in applied machine learning and statistics, complex analysis methods similarly increase in prevalence. Such a trend offers the opportunity to understand more intricate features of the data that, ostensibly, could not be studied with simpler datasets or simpler methodologies. While promising, these advances are also perilous: these novel analysis techniques do not always consider the possibility that their results are in fact an expected consequence of some simpler, already-known feature of simpler data (for example, treating the tensor like a matrix or a univariate quantity) or simpler statistic (for example, the mean and covariance of one of the tensor modes). I will present two works that address this growing problem, the first of which uses Kronecker algebra to derive a tensor-variate maximum entropy distribution that shares modal moments with the real data. This distribution of surrogate data forms the basis of a statistical hypothesis test, and I use this method to answer a question of epiphenomenal tensor structure in populations of neural recordings in the motor and prefrontal cortex. In the second part, I will discuss how to extend this maximum entropy formulation to arbitrary constraints using deep neural network architectures in the flavor of implicit generative modeling, and I will use this method in a texture synthesis application.
Organizers: Philipp Hennig