The ability to predict how an environment changes based on forces applied to it is fundamental for a robot to achieve specific goals. Traditionally in robotics, this problem is addressed through the use of pre-specified models or physics simulators, taking advantage of prior knowledge of the problem structure. While these models are general and have broad applicability, they depend on accurate estimation of model parameters such as object shape, mass, friction, etc. On the other hand, learning-based methods such as Predictive State Representations or more recent deep learning approaches have looked at learning these models directly from raw perceptual information in a model-free manner. These methods operate on raw data without any intermediate parameter estimation, but lack the structure and generality of model-based techniques. In this talk, I will present some work that tries to bridge the gap between these two paradigms by proposing a specific class of deep visual dynamics models (SE3-Nets) that explicitly encode strong physical and 3D geometric priors (specifically, rigid body dynamics) in their structure. As opposed to traditional deep models that reason about dynamics/motion at the pixel level, we show that the physical priors implicit in our network architectures enable them to reason about dynamics at the object level: our network learns to identify objects in the scene and to predict rigid body rotation and translation per object. I will present results on applying our deep architectures to two specific problems: 1) modeling scene dynamics, where the task is to predict future depth observations given the current observation and an applied action, and 2) real-time visuomotor control of a Baxter manipulator based only on raw depth data.
We show that: 1) our proposed architectures significantly outperform baseline deep models on dynamics modelling and 2) our architectures perform comparably to or better than baseline models for visuomotor control while operating at camera rates (30Hz) and relying on far less information.
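The object-level reasoning described above can be illustrated with a small sketch. It assumes, purely for illustration, that a model outputs one soft mask and one rigid transform (rotation plus translation) per object, and that each 3D point is moved by the mask-weighted sum of those transforms. The function names, the toy points, and the hand-picked masks are all hypothetical; this is not the SE3-Nets implementation.

```python
import math

def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(R, t, p):
    """Apply the rigid transform p -> R p + t to a single 3D point."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def blend_transforms(points, masks, transforms):
    """Move each point by the mask-weighted sum of per-object rigid transforms.

    points:     list of [x, y, z]
    masks:      masks[k][j] is the weight of point j for object k
                (assumed to sum to 1 over k for every point)
    transforms: list of (R, t) pairs, one per object
    """
    moved_points = []
    for j, p in enumerate(points):
        q = [0.0, 0.0, 0.0]
        for k, (R, t) in enumerate(transforms):
            moved = apply_rigid(R, t, p)
            for i in range(3):
                q[i] += masks[k][j] * moved[i]
        moved_points.append(q)
    return moved_points

# Toy scene (hypothetical): point 0 is static background, point 1 sits on a
# moving object that rotates 90 degrees about z and lifts by one unit.
points = [[0.5, 0.0, 0.0], [1.0, 0.0, 0.0]]
masks = [[1.0, 0.0],   # object 0 (background) owns point 0
         [0.0, 1.0]]   # object 1 (moving part) owns point 1
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
transforms = [(identity, [0.0, 0.0, 0.0]),
              (rotation_z(math.pi / 2), [0.0, 0.0, 1.0])]
predicted = blend_transforms(points, masks, transforms)
```

With hard masks, as here, every point follows exactly one object's motion; soft masks would blend the motions near object boundaries.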
Organizers: Franzi Meier
Machine learning has become a popular application domain for modern optimization techniques, pushing its algorithmic frontier. The need for large-scale optimization algorithms that can handle millions of dimensions or data points, typical for the big data era, has brought a resurgence of interest in first-order algorithms, making us revisit the venerable stochastic gradient method [Robbins-Monro 1951] as well as the Frank-Wolfe algorithm [Frank-Wolfe 1956]. In this talk, I will review recent improvements on these algorithms which can exploit the structure of modern machine learning approaches. I will explain why the Frank-Wolfe algorithm has become so popular lately, and present a surprising tweak on the stochastic gradient method which yields a fast linear convergence rate. Motivating applications will include weakly supervised video analysis and structured prediction problems.
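As a reminder of what the classical Frank-Wolfe algorithm does, here is a minimal sketch on a toy problem: minimizing a squared distance over the probability simplex. The toy objective, the function name, and the standard 2/(k+2) step size are choices made for this illustration; the recent variants discussed in the talk are not shown.

```python
def frank_wolfe_simplex(grad, dim, steps=2000):
    """Classical Frank-Wolfe over the probability simplex.

    grad: function returning the gradient (list of floats) at a point x
    dim:  dimension of the simplex
    """
    x = [1.0 / dim] * dim  # start at the barycenter, a feasible point
    for k in range(steps):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the vertex with the smallest gradient coordinate.
        i = min(range(dim), key=lambda j: g[j])
        gamma = 2.0 / (k + 2.0)  # standard open-loop step size
        # Convex combination of x and the vertex e_i keeps x feasible.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x

# Toy objective (hypothetical): f(x) = ||x - target||^2 with target inside
# the simplex, so the constrained minimizer is the target itself.
target = [0.2, 0.5, 0.3]

def grad(x):
    return [2.0 * (xj - bj) for xj, bj in zip(x, target)]

solution = frank_wolfe_simplex(grad, dim=3)
```

Each iteration needs only a linear minimization over the feasible set rather than a projection, which is one commonly cited reason for the algorithm's appeal on structured problems.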
Organizers: Philipp Hennig
Creating convincing human facial animation is challenging. Face animation is often hand-crafted by artists separately from body motion. Alternatively, if the face animation is derived from motion capture, it is typically performed while the actor is relatively still. Recombining the isolated face animation with body motion is non-trivial and often looks uncanny if the body dynamics are not properly reflected on the face (e.g. cheeks wiggling when running). In this talk, I will discuss the challenges of human soft tissue simulation and control. I will then present our method for adding physical effects to facial blendshape animation. Unlike previous methods that try to add physics to face rigs, our method can combine facial animation and rigid body motion consistently while preserving the original animation as closely as possible. Our novel simulation framework uses the original animation as per-frame rest-poses without adding spurious forces. We also propose the concept of blendmaterials to give artists an intuitive means to control the changing material properties due to muscle activation.
Organizers: Timo Bolkart
Performance metrics are a key component of machine learning systems, and are ideally constructed to reflect real-world tradeoffs. In contrast, much of the literature simply focuses on algorithms for maximizing accuracy. With the increasing integration of machine learning into real systems, it is clear that accuracy is an insufficient measure of performance for many problems of interest. Unfortunately, unlike accuracy, many real-world performance metrics are non-decomposable, i.e., they cannot be computed as a sum of losses over individual instances. Thus, known algorithms and associated analysis are not trivially extended, and direct approaches require expensive combinatorial optimization. I will outline recent results characterizing population optimal classifiers for large families of binary and multilabel classification metrics, including such nonlinear metrics as F-measure and Jaccard measure. Perhaps surprisingly, the prediction which maximizes the utility for a range of such metrics takes a simple form. This results in simple and scalable procedures for optimizing complex metrics in practice. I will also outline how the same analysis gives optimal procedures for selecting point estimates from complex posterior distributions for structured objects such as graphs. Joint work with Nagarajan Natarajan, Bowei Yan, Kai Zhong, Pradeep Ravikumar and Inderjit Dhillon.
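To make the point about non-decomposable metrics concrete, the sketch below computes the F-measure from confusion counts (it is a ratio of sums, not a sum of per-instance losses) and then tunes a single decision threshold on predicted probabilities. It assumes, as an illustration, that the "simple form" is a threshold on the estimated class probability; the toy labels and probabilities are made up and the function names are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 computed from confusion counts; not a sum of per-instance losses."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def predict(probs, threshold):
    """Label an instance positive when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]

def best_threshold(y_true, probs):
    """Sweep candidate thresholds and return (threshold, f1) with the best F1."""
    best = (0.0, -1.0)
    for thr in sorted(set(probs)):
        score = f1_score(y_true, predict(probs, thr))
        if score > best[1]:
            best = (thr, score)
    return best

# Made-up validation data: true labels and estimated P(y = 1 | x).
y_true = [1, 1, 0, 1, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

threshold, tuned_f1 = best_threshold(y_true, probs)
default_f1 = f1_score(y_true, predict(probs, 0.5))
```

On this toy data the tuned threshold differs from the accuracy-oriented default of 0.5 and gives a higher F1, which is the practical consequence of optimizing the metric of interest directly.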
Organizers: Mijung Park
Writing and maintaining programs for robots poses some interesting challenges. It is hard to generalize them, as their targets are more than computing platforms. It can be deceptive to see them as input-to-output mappings, as interesting environments result in unpredictable inputs, and mixing reactive and deliberative behavior makes intended outputs hard to define. Given the wide and fragmented landscape of components, from hardware to software, and the parties involved in providing and using them, integration is also a non-trivial aspect. The talk will illustrate the work ongoing at Fraunhofer IPA to tackle these challenges, how Open Source is its common trait, and how this translates into the industrial field thanks to the ROS-Industrial initiative.
Organizers: Vincent Berenz
We present a way to set the step size of Stochastic Gradient Descent as the solution of a distance minimization problem. The obtained result has an intuitive interpretation and resembles the update rules of well-known optimization algorithms. We also discuss asymptotic results on its relation to the optimal learning rate of Gradient Descent. In addition, we talk about two different estimators, with applications in Variational inference problems, and present approximate results about their variance. Finally, we combine all of the above to present an optimization algorithm that can be used on both mini-batch optimization and Variational problems.
Organizers: Philipp Hennig
How do young children learn so much about the world, and so efficiently? This talk presents recent studies investigating, theoretically and empirically, how children actively seek information in their physical and social environments as evidence to test and dynamically revise their hypotheses and theories over time. In particular, it will focus on how children adapt their active learning strategies, such as question-asking and explorative behavior, in response to the task characteristics, to the statistical structure of the hypothesis space, and to the feedback received. Such adaptiveness and flexibility are crucial to achieve efficiency in situations of uncertainty, when testing alternative hypotheses, making decisions, drawing causal inferences and solving categorization tasks.
Neural networks have taken the world of computing in general and AI in particular by storm. But in the future, AI will need to revisit generative models. There are several reasons for this: system robustness, precision, transparency, and the high cost of labelling data. This is particularly true of perceptual AI, as needed for autonomous vehicles, where the need for simulators and the need to confront novel situations will also demand generative, probabilistic models.
Recently, deep learning has also proved successful on low-level vision tasks such as stereo matching. Another recent trend in this field is confidence measures, which become more effective when coupled with random forest classifiers or CNNs. Despite their excellent accuracy in outlier detection, few other applications rely on them. In the first part of the talk, we'll take a look at the latest proposals in terms of confidence measures for stereo matching, as well as at some novel methodologies exploiting these very accurate cues. In the second part, we'll talk about GC-net, a deep network currently representing the state of the art on the KITTI datasets, and its extension to motion stereo processing.
Organizers: Yiyi Liao
Bioelectronics applies principles of electrical engineering and materials science to biology, medicine and ultimately health. Soft bioelectronics focuses on designing and manufacturing electronic devices with mechanical properties close to those of the host biological tissue, so that long-term reliability and minimal perturbation are achieved in vivo and/or truly wearable systems become possible. We illustrate the potential of this soft technology with examples ranging from prosthetic tactile skins to soft multimodal neural implants.
Organizers: Diana Rebmann
Vaccine refusal can lead to outbreaks of previously eradicated diseases and is an increasing problem worldwide. Vaccinating decisions exemplify a complex, coupled system where vaccinating behavior and disease dynamics influence one another. Complex systems often exhibit characteristic dynamics near a tipping point to a new dynamical regime. For instance, critical slowing down -- the tendency for a system to start `wobbling' -- can increase close to a tipping point. We used a linear support vector machine to classify the sentiment of geo-located United States and California tweets concerning measles vaccination from 2011 to 2016. We also extracted data on internet searches on measles from Google Trends. We found evidence for critical slowing down in both datasets in the years before and after the 2014-15 Disneyland, California measles outbreak, suggesting that the population approached a tipping point corresponding to widespread vaccine refusal, but then receded from the tipping point in the face of the outbreak. A differential equation model of coupled behaviour-disease dynamics is shown to exhibit the same patterns. We conclude that studying critical phenomena in online social media data can help us develop analytical tools based on dynamical systems theory to identify populations at heightened risk of widespread vaccine refusal.
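One common way to look for critical slowing down in a time series is to track the lag-1 autocorrelation in a rolling window and check whether it rises over time. The sketch below shows that idea on synthetic data; it is an assumption that this indicator resembles what was applied to the tweet and search data, and the simulated series, window size, and function names are hypothetical.

```python
import random

def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0.0:
        return 0.0
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var

def rolling_indicator(series, window):
    """Lag-1 autocorrelation in each sliding window of the given length."""
    return [lag1_autocorrelation(series[i:i + window])
            for i in range(len(series) - window + 1)]

def simulate_ar1(phis, seed=0):
    """Synthetic AR(1) series x_t = phi_t * x_{t-1} + noise.

    A phi drifting toward 1 mimics a system recovering ever more slowly
    from perturbations, i.e. approaching a tipping point.
    """
    rng = random.Random(seed)
    x, out = 0.0, []
    for phi in phis:
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

# Synthetic example: the memory parameter ramps from 0.1 up to 0.95.
steps = 2000
phis = [0.1 + 0.85 * t / (steps - 1) for t in range(steps)]
series = simulate_ar1(phis)
indicator = rolling_indicator(series, window=400)
```

A sustained rise in the indicator is read as an early-warning signal; a later decline would correspond to the system receding from the tipping point.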
Organizers: Diana Rebmann