Abstract: Sequential Monte Carlo (SMC) methods (including particle filters and smoothers) allow us to compute probabilistic representations of the unknown objects in models used to represent, for example, nonlinear dynamical systems. This talk has three connected parts: 1. A (hopefully pedagogical) introduction to probabilistic modelling of dynamical systems and an explanation of the SMC method. 2. In learning unknown parameters appearing in nonlinear state-space models using maximum likelihood, it is natural to make use of SMC to compute unbiased estimates of the intractable likelihood. The challenge is that the resulting optimization problem is stochastic, which recently inspired us to construct a new solution to this problem. 3. A challenge with the above (and in fact with most uses of SMC) is that it all quickly becomes very technical. This is indeed the key challenge in spreading the use of SMC methods to a wider group of users. At the same time, there are many researchers who would benefit a lot from having access to these methods in their daily work, and for those of us already working with them it is essential to reduce the amount of time spent on new problems. We believe that the solution to this can be provided by probabilistic programming. We are currently developing a new probabilistic programming language that we call Birch. A pre-release is available from birch-lang.org/. It allows users to use SMC methods without having to implement the algorithms on their own.
Organizers: Philipp Hennig
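To make the second part of the SMC abstract above concrete, here is a minimal sketch of a bootstrap particle filter that returns an estimate of the log-likelihood of a state-space model. The toy model, the function names (`normal_logpdf`, `bootstrap_log_likelihood`) and all parameter values are assumptions chosen for illustration; they are not taken from the talk or from Birch. The product of the per-step average weights is the standard unbiased likelihood estimate; its logarithm, which the function returns, is not itself unbiased.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log-density of N(mean, std^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - 0.5 * ((x - mean) / std) ** 2


def bootstrap_log_likelihood(ys, phi, sigma_v, sigma_e, num_particles, rng):
    """Bootstrap particle filter estimate of log p(y_1:T | phi).

    Toy model (an assumption for illustration):
        x_1 ~ N(0, sigma_v^2)
        x_t = phi * x_{t-1} + v_t,  v_t ~ N(0, sigma_v^2)
        y_t = x_t + e_t,            e_t ~ N(0, sigma_e^2)
    """
    particles = [rng.gauss(0.0, sigma_v) for _ in range(num_particles)]
    log_lik = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            # Propagate each resampled particle through the state dynamics.
            particles = [phi * x + rng.gauss(0.0, sigma_v) for x in particles]
        # Weight by the observation density; subtract the max for stability.
        log_w = [normal_logpdf(y, x, sigma_e) for x in particles]
        max_log_w = max(log_w)
        w = [math.exp(lw - max_log_w) for lw in log_w]
        # The average unnormalised weight estimates p(y_t | y_1:t-1).
        log_lik += max_log_w + math.log(sum(w) / num_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=w, k=num_particles)
    return log_lik


# Simulate 50 observations from the toy model, then estimate the log-likelihood.
rng = random.Random(0)
x, ys = 0.0, []
for _ in range(50):
    x = 0.8 * x + rng.gauss(0.0, 1.0)
    ys.append(x + rng.gauss(0.0, 1.0))

estimate = bootstrap_log_likelihood(
    ys, phi=0.8, sigma_v=1.0, sigma_e=1.0, num_particles=500, rng=rng
)
```

Because the estimate changes with every run of the filter, maximising it over `phi` is a stochastic optimization problem, which is the difficulty the abstract refers to.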
Today’s advances in tactile sensing and wearable, IoT and context-aware computing are spurring new ideas about how to configure touch-centered interactions in terms of roles and utility, which in turn expose new technical and social design questions. But while haptic actuation, sensing and control are improving, incorporating them into a real-world design process is challenging and poses a major obstacle to adoption into everyday technology. Some classes of haptic devices, e.g., grounded force feedback, remain expensive and limited in range. I’ll describe some recent highlights of an ongoing effort to understand how to support haptic designers and end-users. These include a wealth of online experimental design tools, DIY open-source hardware, and accessible means of creating, for example, expressive physical robot motions and of evolving physically sensed expressive tactile languages. Elsewhere, we are establishing the value of haptic force feedback in embodied learning environments, to help kids understand physics and math concepts. This has inspired the invention of a low-cost, handheld and large motion force feedback device that can be used in online environments or collaborative scenarios, and could be suitable for K-12 school contexts; this is ongoing research with innovative education and technological elements. All our work is available online, where possible as web tools, and we plan to push our research into a broader openhaptics effort.
Organizers: Katherine Kuchenbecker
Why can't current robots act intelligently in real-world environments? A major challenge lies in the lack of adequate tactile sensing technologies. Robots need tactile sensing to understand the physical environment and to detect contact states during manipulation. Progress requires advances in the sensing hardware, but also advances in the software that can exploit the tactile signals. We developed a high-resolution tactile sensor, GelSight, which measures the geometry and traction field of the contact surface. For interpreting the high-resolution tactile signal, we utilize both traditional statistical models and deep neural networks. I will describe my research on both exploration and manipulation. For exploration, I use active touch to estimate the physical properties of the objects. The work has included learning the hardness of artificial objects, as well as estimating the general properties of natural objects via autonomous tactile exploration. For manipulation, I study the robot’s ability to detect slip or incipient slip with tactile sensing during grasping. The research helps robots to better understand and flexibly interact with the physical world.
Organizers: Katherine Kuchenbecker
Gliding evolved at least nine times in mammals. Despite the abundance and diversity of gliding mammals, little is known about their convergent morphology and mechanisms of aerodynamic control. Many gliding animals are capable of impressive and agile aerial behaviors and their flight performance depends on the aerodynamic forces resulting from airflow interacting with a flexible, membranous wing (patagium). Although the mechanisms that gliders use to control dynamic flight are poorly understood, the shape of the gliding membrane (e.g., angle of attack, camber) is likely a primary factor governing the control of the interaction between aerodynamic forces and the animal’s body. Data from field studies of gliding behavior, lab experiments examining membrane shape changes during glides and morphological and materials testing data of gliding membranes will be presented that can aid our understanding of the mechanisms gliding mammals use to control their membranous wings and potentially provide insights into the design of man-made flexible wings.
Modern technology allows us to collect, process, and share more data than ever before. This data revolution opens up new ways to design control and learning algorithms, which will form the algorithmic foundation for future intelligent systems that shall act autonomously in the physical world. Starting from a discussion of the special challenges when combining machine learning and control, I will present some of our recent research in this exciting area. Using the example of the Apollo robot learning to balance a stick in its hand, I will explain how intelligent agents can learn new behavior from just a few experimental trials. I will also discuss the need for theoretical guarantees in learning-based control, and how we can obtain them by combining learning and control theory.
In 1995 Fraunhofer IPA embarked on a mission towards designing a personal robot assistant for everyday tasks. In the following years Care-O-bot developed into a long-term experiment for exploring and demonstrating new robot technologies and future product visions. The recent fourth generation of the Care-O-bot, introduced in 2014, aimed at designing an integrated system which addressed a number of innovations such as modularity, “low-cost” by making use of new manufacturing processes, and advanced human-user interaction. Some 15 systems were built and the intellectual property (IP) generated by over 20 years of research was recently licensed to a start-up. The presentation will review the path from an experimental platform for building up expertise in various robotic disciplines to recent pilot applications based on the now commercial Care-O-bot hardware.
With the ubiquity of catalyzed reactions in manufacturing, the emergence of the device-laden internet of things, and global challenges with respect to water and energy, it has never been more important to understand atomic interactions in the functional materials that can provide solutions in these spaces.
Estimating 3D shape from monocular 2D images is a challenging and ill-posed problem. Some of these challenges can be alleviated if 3D shape priors are taken into account. In the field of human body shape estimation, research has shown that accurate 3D body estimations can be achieved through optimization, by minimizing error functions on image cues such as the silhouette. These methods, though, tend to be slow and typically require manual interactions (e.g. for pose estimation). In this talk, we present some recent works that try to overcome such limitations, achieving interactive rates, by learning mappings from 2D image to 3D shape spaces, utilizing data-driven priors generated from statistically learned parametric shape models. We demonstrate this either by extracting handcrafted features or by directly utilizing CNNs. Furthermore, we introduce the notion and application of cross-modal or multi-view learning, where an abundance of data coming from various views representing the same object at training time can be leveraged in a semi-supervised setting to boost estimations at test time. Additionally, we show similar applications of the above techniques for the task of 3D garment estimation from a single image.
Organizers: Gerard Pons-Moll
Human observers can classify photographs of real-world scenes after only a very brief exposure to the image (Potter & Levy, 1969; Thorpe, Fize, Marlot, et al., 1996; VanRullen & Thorpe, 2001). Line drawings of natural scenes have been shown to capture essential structural information required for successful scene categorization (Walther et al., 2011). Here, we investigate how the spatial relationships between lines and line segments in the line drawings affect scene classification. In one experiment, we tested the effect of removing either the junctions or the middle segments between junctions. Surprisingly, participants performed better when shown the middle segments (47.5%) than when shown the junctions (42.2%). It appeared as if the images with middle segments tended to maintain the most parallel/locally symmetric portions of the contours. In order to test this hypothesis, in a second experiment, we either removed the most symmetric half of the contour pixels or the least symmetric half of the contour pixels using a novel method of measuring the local symmetry of each contour pixel in the image. Participants were much better at categorizing images containing the most symmetric contour pixels (49.7%) than the least symmetric (38.2%). Thus, results from both experiments demonstrate that local contour symmetry is a crucial organizing principle in complex real-world scenes. Joint work with John Wilder (UofT CS, Psych), Morteza Rezanejad (McGill CS), Kaleem Siddiqi (McGill CS), Allan Jepson (UofT CS), and Dirk Bernhardt-Walther (UofT Psych), to be presented at VSS 2017.
Organizers: Ahmed Osman
Probabilistic deep learning methods have recently made great progress for generative and discriminative modeling. I will give a brief overview of recent developments and then present two contributions. The first is on a generalization of generative adversarial networks (GAN), extending their use considerably. GANs can be shown to approximately minimize the Jensen-Shannon divergence between two distributions, the true sampling distribution and the model distribution. We extend GANs to the class of f-divergences, which includes popular divergences such as the Kullback-Leibler divergence. This enables applications to variational inference and likelihood-free maximum likelihood, and it enables GAN models to become basic building blocks in larger models. The second contribution is to consider representation learning using variational autoencoder models. To make learned representations of data useful we need to ground them in semantic concepts. We propose a generative model that can decompose an observation into multiple separate latent factors, each of which represents a separate concept. Such a disentangled representation is useful for recognition and for precise control in generative modeling. We learn our representations using weak supervision in the form of groups of observations where all samples within a group share the same value in a given latent factor. To make such learning feasible we generalize recent methods for amortized probabilistic inference to the dependent case. Joint work with: Ryota Tomioka (MSR Cambridge), Botond Cseke (MSR Cambridge), Diane Bouchacourt (Oxford)
Organizers: Lars Mescheder
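For readers unfamiliar with the f-divergences mentioned in the abstract above, the standard textbook definitions can be written as follows. The notation (densities p and q, function class \mathcal{T}, convex conjugate f^*) is an assumption made here for illustration and is not taken from the talk itself.

```latex
% f-divergence between P and Q with densities p and q,
% for a convex function f with f(1) = 0:
D_f(P \,\|\, Q) = \int q(x)\, f\!\left(\frac{p(x)}{q(x)}\right) dx

% The Kullback-Leibler divergence is the special case f(u) = u \log u:
D_{\mathrm{KL}}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)}\, dx

% Variational lower bound, with f^* the convex conjugate of f and
% T ranging over a class \mathcal{T} of functions into the domain of f^*:
D_f(P \,\|\, Q) \;\ge\; \sup_{T \in \mathcal{T}}
  \Big( \mathbb{E}_{x \sim P}[T(x)] - \mathbb{E}_{x \sim Q}[f^*(T(x))] \Big)
```

The lower bound only needs samples from P and Q, which is what makes an adversarial, likelihood-free training objective possible: T plays the role of the discriminator and Q that of the generator.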
Dynamic events such as family gatherings, concerts or sports events are often photographed by a group of people. The set of still images obtained this way is rich in dynamic content. We consider the question of whether such a set of still images, rather than the traditional video sequences, can be used for analyzing the dynamic content of the scene. This talk will describe several instances of this problem, their solutions and directions for future studies. In particular, we will present a method to extend epipolar geometry to predict the location of a moving feature in CrowdCam images. The method assumes that the temporal order of the set of images, namely photo-sequencing, is given. We will briefly describe our method to compute photo-sequencing using geometric considerations and rank aggregation. We will also present a method for identifying the moving regions in a scene, which is a basic component in dynamic scene analysis. Finally, we will consider a new vision of developing collaborative CrowdCam, and a first step toward this goal.
Organizers: Jonas Wulff
Deep Learning is one of the most successful machine learning approaches to artificial intelligence. In this talk I discuss the geometry of neural networks as a way to study the success of Deep Learning at a mathematical level and to develop a theoretical basis for making further advances, especially in situations with limited amounts of data and challenging problems in reinforcement learning. I present a few recent results on the representational power of neural networks and then demonstrate how to align this with structures from perception-action problems in order to obtain more efficient learning systems.
Organizers: Jane Walters
Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD), the resulting distance between distributions, are useful tools for fully nonparametric hypothesis testing and for learning on distributional inputs. I will give an overview of this framework and present some of its recent applications within the context of approximate Bayesian inference. Further, I will discuss a recent modification of MMD which aims to encode invariance to additive symmetric noise and leads to learning on distributions robust to the distributional covariate shift, e.g. where measurement noise on the training data differs from that on the testing data.
Organizers: Philipp Hennig
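As a small illustration of the Maximum Mean Discrepancy described in the abstract above, the sketch below computes the usual unbiased estimate of the squared MMD between two one-dimensional samples. The Gaussian kernel, the bandwidth of 1.0, the sample sizes and the function names are all assumptions made for this example; the talk's noise-invariant modification of MMD is not implemented here.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel between two real numbers."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples xs and ys."""
    m, n = len(xs), len(ys)
    # Within-sample terms exclude the diagonal (i == j) to avoid bias.
    k_xx = sum(
        gaussian_kernel(xs[i], xs[j], bandwidth)
        for i in range(m) for j in range(m) if i != j
    )
    k_yy = sum(
        gaussian_kernel(ys[i], ys[j], bandwidth)
        for i in range(n) for j in range(n) if i != j
    )
    # The cross term uses every pair (x, y).
    k_xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


rng = random.Random(0)
sample_p = [rng.gauss(0.0, 1.0) for _ in range(200)]
sample_p_again = [rng.gauss(0.0, 1.0) for _ in range(200)]
sample_q = [rng.gauss(2.0, 1.0) for _ in range(200)]

mmd_same = mmd2_unbiased(sample_p, sample_p_again)  # both samples from N(0, 1)
mmd_shifted = mmd2_unbiased(sample_p, sample_q)     # N(0, 1) versus N(2, 1)
```

The estimate stays close to zero when both samples come from the same distribution and is clearly positive when the means differ, which is the basis for using it as a two-sample test statistic.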
This talk addresses the task of segmenting moving objects in unconstrained videos. We introduce a novel two-stream neural network with an explicit memory module to achieve this. The two streams of the network encode spatial and temporal features in a video sequence respectively, while the memory module captures the evolution of objects over time. The module to build a “visual memory” in video, i.e., a joint representation of all the video frames, is realized with a convolutional recurrent unit learned from a small number of training video sequences. Given video frames as input, our approach first assigns each pixel an object or background label obtained with an encoder-decoder network that takes optical flow as input and is trained on synthetic data. Next, a “visual memory” specific to the video is acquired automatically without any manually-annotated frames. The visual memory is implemented with convolutional gated recurrent units, which allow spatial information to be propagated over time. We evaluate our method extensively on two benchmarks, the DAVIS and Freiburg-Berkeley motion segmentation datasets, and show state-of-the-art results. This is joint work with K. Alahari and P. Tokmakov.
Organizers: Osman Ulusoy
Many of the existing Robotics & Automation (R&A) technologies are at a sufficient level of maturity and are widely accepted by the academic (and to a lesser extent by the industrial) community after having undergone the scientific rigor and peer reviews that accompany such works. I believe that most of the past and current research and development efforts in robotics and automation have been squarely aimed at increasing the Standard of Living (SoL) in developed economies where housing, running water, transportation, schools, access to healthcare, to name a few, are taken for granted. Humanitarian R&A, on the other hand, can be taken to mean technologies that can make a fundamental difference in people’s lives by alleviating their suffering in times of need, such as during natural or man-made disasters or in pockets of the population where the most basic needs of humanity are not met, thus improving their Quality of Life (QoL) and not just SoL. My current work focuses on the applied use of robotics and automation technologies for the benefit of under-served and under-developed communities by working closely with them to develop solutions that showcase the effectiveness of R&A solutions in domains that strike a chord with the beneficiaries. This is made possible by bringing together researchers, practitioners from industry, academia, local governments, and various entities such as the IEEE Robotics and Automation Society’s Special Interest Group on Humanitarian Technology (RAS-SIGHT), NGOs, and NPOs across the globe. I will share some of my efforts and thoughts on challenges that need to be taken into consideration, including the sustainability of developed solutions. I will also outline my recent efforts in the technology and public policy domains with emphasis on socio-economic, cultural, privacy, and security issues in developing and developed economies.
Organizers: Ludovic Righetti
I'll present my master's thesis "Biquadratic Forms and Semi-Definite Relaxations". It is about biquadratic optimization programs (which are NP-hard in general) and examines a condition under which there exists an algorithm that finds a solution to every instance of the problem in polynomial time. I'll present a counterexample showing that this is not possible in general and address the question of what happens if further knowledge about the variables over which we optimise is applied.
Organizers: Fatma Güney
A large part of image analysis is about breaking things into pieces. Decompositions of a graph are a mathematical abstraction of the possible outcomes. This talk is about optimization problems whose feasible solutions define decompositions of a graph. One example is the correlation clustering problem whose feasible solutions relate one-to-one to the decompositions of a graph, and whose objective function puts a cost or reward on neighboring nodes ending up in distinct components. This talk shows applications of this problem and proposed generalizations to diverse image analysis tasks. It sketches algorithms for finding feasible solutions for large instances in practice, solutions that are often superior in the metrics of application-specific benchmarks. It also sketches algorithms for finding lower bounds and points to new findings and open problems of polyhedral geometry in this context.
Organizers: Christoph Lassner
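The correlation clustering objective described in the abstract above can be sketched on a toy graph as follows. The graph, the edge costs and the function names are hypothetical, and the exhaustive search is only meant to make the objective concrete; it is not one of the algorithms for large instances that the talk refers to.

```python
from itertools import product


def cut_cost(edge_costs, labels):
    """Sum the costs of edges whose endpoints lie in distinct components.

    A positive cost penalises separating the two nodes, a negative cost
    rewards it. `labels[u]` is the component label assigned to node u.
    """
    return sum(cost for (u, v), cost in edge_costs.items() if labels[u] != labels[v])


def brute_force_decomposition(num_nodes, edge_costs):
    """Try every node labelling and keep the one with minimal cut cost.

    Exponential in the number of nodes, so usable for tiny graphs only.
    """
    best_labels, best_cost = None, float("inf")
    for labels in product(range(num_nodes), repeat=num_nodes):
        cost = cut_cost(edge_costs, labels)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


# Toy graph on four nodes; keys are edges, values are cut costs.
edge_costs = {
    (0, 1): 3.0,   # nodes 0 and 1 prefer to stay together
    (1, 2): -2.0,  # nodes 1 and 2 prefer to be separated
    (2, 3): 4.0,   # nodes 2 and 3 prefer to stay together
    (0, 2): -1.0,  # nodes 0 and 2 prefer to be separated
}

best_labels, best_cost = brute_force_decomposition(4, edge_costs)
```

In this example the best decomposition groups nodes 0 and 1 into one component and nodes 2 and 3 into another, collecting both rewards without paying either penalty. The number of components is not fixed in advance; it follows from the edge costs.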