Dynamic events such as family gatherings, concerts or sports events are often photographed by a group of people. The set of still images obtained this way is rich in dynamic content. We consider the question of whether such a set of still images, rather than a traditional video sequence, can be used for analyzing the dynamic content of the scene. This talk will describe several instances of this problem, their solutions and directions for future studies. In particular, we will present a method to extend epipolar geometry to predict the location of a moving feature in CrowdCam images. The method assumes that the temporal order of the set of images, namely photo-sequencing, is given. We will briefly describe our method to compute photo-sequencing using geometric considerations and rank aggregation. We will also present a method for identifying the moving regions in a scene, which is a basic component in dynamic scene analysis. Finally, we will consider a new vision of developing collaborative CrowdCam, and a first step toward this goal.
Organizers: Jonas Wulff
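For readers unfamiliar with the epipolar geometry mentioned in the CrowdCam abstract above, the following is a minimal sketch of the standard static-scene constraint only: a point x in one image maps to the line l' = F x in a second image, and its match must lie on that line. It is not the talk's extension to moving features. The function names, the identity intrinsics and the toy camera setup are assumptions made for illustration.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def epipolar_line(F, x):
    """Epipolar line l' = F x in image 2 for pixel x = (u, v) in image 1.

    The line (a, b, c) is scaled so that (a, b) has unit length, which makes
    a*u + b*v + c a signed point-to-line distance.
    """
    line = F @ np.array([x[0], x[1], 1.0])
    return line / np.linalg.norm(line[:2])

def distance_to_line(line, x):
    """Distance from pixel x = (u, v) to a normalized line (a, b, c)."""
    return abs(line @ np.array([x[0], x[1], 1.0]))

# Toy setup (assumed): identity intrinsics, camera 2 translated along x.
R = np.eye(3)
t = np.array([1.0, 0.0, 0.0])
F = skew(t) @ R                      # fundamental matrix for K = I

X = np.array([0.5, 0.2, 4.0])        # a static 3D point
x1 = X[:2] / X[2]                    # projection in image 1
X2 = R @ X + t
x2 = X2[:2] / X2[2]                  # projection in image 2

line = epipolar_line(F, x1)
print(distance_to_line(line, x2))    # close to zero for a static point
```

For a static point the distance is zero up to rounding. A feature that moved between the two exposures generally leaves its epipolar line, which is why the static constraint alone cannot predict its location.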
Deep Learning is one of the most successful machine learning approaches to artificial intelligence. In this talk I discuss the geometry of neural networks as a way to study the success of Deep Learning at a mathematical level and to develop a theoretical basis for making further advances, especially in situations with limited amounts of data and challenging problems in reinforcement learning. I present a few recent results on the representational power of neural networks and then demonstrate how to align this with structures from perception-action problems in order to obtain more efficient learning systems.
Organizers: Jane Walters
Human observers can classify photographs of real-world scenes after only a very brief exposure to the image (Potter & Levy, 1969; Thorpe, Fize, Marlot, et al., 1996; VanRullen & Thorpe, 2001). Line drawings of natural scenes have been shown to capture essential structural information required for successful scene categorization (Walther et al., 2011). Here, we investigate how the spatial relationships between lines and line segments in the line drawings affect scene classification. In one experiment, we tested the effect of removing either the junctions or the middle segments between junctions. Surprisingly, participants performed better when shown the middle segments (47.5%) than when shown the junctions (42.2%). It appeared as if the images with middle segments tended to maintain the most parallel/locally symmetric portions of the contours. In order to test this hypothesis, in a second experiment, we either removed the most symmetric half of the contour pixels or the least symmetric half of the contour pixels using a novel method of measuring the local symmetry of each contour pixel in the image. Participants were much better at categorizing images containing the most symmetric contour pixels (49.7%) than the least symmetric (38.2%). Thus, results from both experiments demonstrate that local contour symmetry is a crucial organizing principle in complex real-world scenes. Joint work with John Wilder (UofT CS, Psych), Morteza Rezanejad (McGill CS), Kaleem Siddiqi (McGill CS), Allan Jepson (UofT CS), and Dirk Bernhardt-Walther (UofT Psych), to be presented at VSS 2017.
Organizers: Ahmed Osman
From gait and dance to martial arts, human movements provide rich, complex yet coherent spatiotemporal patterns reflecting characteristics of a group or an individual. We develop computer algorithms to automatically learn high-quality discriminative features from multimodal data. In this talk, I present a trilogy on learning from human movements: (1) Gait analysis from video data: based on frieze patterns (7 frieze groups), a video sequence of silhouettes is mapped into a pair of spatiotemporal patterns that are near-periodic along the time axis. A group theoretical analysis of periodic patterns allows us to determine the dynamic time warping and affine scaling that aligns two gait sequences from similar viewpoints for human identification. (2) Dance analysis and synthesis (mocap, music, ratings from Mechanical Turk workers): we explore the complex relationships between dance movements and perceived dance quality and dancer's gender, respectively. As a feasibility study, we construct a computational framework for an analysis-synthesis-feedback loop using a novel multimedia dance-texture representation for joint angular displacement, velocity and acceleration. Furthermore, we integrate crowdsourcing, music and motion-capture data, and machine learning-based methods for dance segmentation, analysis and synthesis of new dancers. A quantitative validation of this framework on a motion-capture dataset of 172 dancers evaluated by more than 400 independent online raters demonstrates significant correlation between human perception and the algorithmically intended dance quality or gender of the synthesized dancers. (3) Tai Chi performance evaluation (mocap + video): I shall also discuss the feasibility of utilizing spatiotemporal synchronization and, ultimately, machine learning to evaluate Tai Chi routines performed by different subjects in our current project of “Tai Chi + Advanced Technology for Smart Health”.
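The dynamic time warping mentioned in part (1) of the abstract above can be illustrated with the textbook algorithm on two 1-D signals. This is a generic sketch, not the frieze-pattern-based alignment from the talk; the function name and the absolute-difference cost are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences.

    D[i, j] holds the minimal cumulative cost of aligning a[:i] with b[:j],
    where each step may advance in a, in b, or in both.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])      # local mismatch
            D[i, j] = cost + min(D[i - 1, j],    # a advances, b repeats
                                 D[i, j - 1],    # b advances, a repeats
                                 D[i - 1, j - 1])  # both advance
    return D[n, m]

# The same signal played at a different speed aligns at zero cost.
print(dtw_distance([0, 1, 2], [0, 1, 1, 2]))
```

Because repeated samples can be absorbed by the warping path, two sequences that differ only in timing receive a cost of zero, which is the property that makes the technique useful for comparing movement cycles of different duration.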
There has been significant prior work on learning realistic, articulated, 3D statistical shape models of the human body. In contrast, there are few such models for animals, despite their many applications in biology, neuroscience, agriculture, and entertainment. The main challenge is that animals are much less cooperative subjects than humans: the best human body models are learned from thousands of 3D scans of people in specific poses, which is infeasible with live animals. In the talk I will illustrate how we extend a state-of-the-art articulated 3D human body model (SMPL) to animals by learning from toys a multi-family shape space that can represent lions, cats, dogs, horses, cows and hippos. The generalization of the model is illustrated by fitting it to images of real animals, where it captures realistic animal shapes, even for new species not seen in training.
Moritz Hardt will review some progress and challenges towards preventing discrimination based on sensitive attributes in supervised learning.
In this talk we will give an overview of research efforts within autonomous manipulation at the AASS Research Center, Örebro University, Sweden. We intend to give a holistic view of the historically separated subjects of robot motion planning and control. In particular, viewing motion behavior generation as an optimal control problem allows for a unified formulation that is uncluttered by a priori domain assumptions and simplified solution strategies. Furthermore, we will discuss the problems of workspace modeling and perception and how to integrate them in the overarching problem of autonomous manipulation.
Organizers: Ludovic Righetti
As large tensor-variate data increasingly become the norm in applied machine learning and statistics, complex analysis methods similarly increase in prevalence. Such a trend offers the opportunity to understand more intricate features of the data that, ostensibly, could not be studied with simpler datasets or simpler methodologies. While promising, these advances are also perilous: these novel analysis techniques do not always consider the possibility that their results are in fact an expected consequence of some simpler, already-known feature of simpler data (for example, treating the tensor like a matrix or a univariate quantity) or simpler statistic (for example, the mean and covariance of one of the tensor modes). I will present two works that address this growing problem, the first of which uses Kronecker algebra to derive a tensor-variate maximum entropy distribution that shares modal moments with the real data. This distribution of surrogate data forms the basis of a statistical hypothesis test, and I use this method to answer a question of epiphenomenal tensor structure in populations of neural recordings in the motor and prefrontal cortex. In the second part, I will discuss how to extend this maximum entropy formulation to arbitrary constraints using deep neural network architectures in the flavor of implicit generative modeling, and I will use this method in a texture synthesis application.
Organizers: Philipp Hennig
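The Kronecker algebra mentioned in the tensor-variate abstract above rests on the identity vec(A Z B^T) = (B ⊗ A) vec(Z). The sketch below uses it to draw a matrix-valued Gaussian sample whose covariance is separable across the two modes, without ever forming the large Kronecker matrix. This is not the maximum entropy distribution from the talk, only an illustration of the identity; all names and the example covariances are assumptions.

```python
import numpy as np

def sample_separable_gaussian(cov_rows, cov_cols, rng):
    """Draw X = L_r Z L_c^T with Z i.i.d. standard normal.

    With column-major vectorization, vec(X) has covariance
    kron(cov_cols, cov_rows), but only the small factors are used.
    """
    L_r = np.linalg.cholesky(cov_rows)
    L_c = np.linalg.cholesky(cov_cols)
    Z = rng.standard_normal((cov_rows.shape[0], cov_cols.shape[0]))
    return L_r @ Z @ L_c.T, Z, L_r, L_c

# Small assumed mode covariances (3 rows, 2 columns).
cov_rows = np.array([[2.0, 0.5, 0.0],
                     [0.5, 1.0, 0.2],
                     [0.0, 0.2, 1.5]])
cov_cols = np.array([[1.0, 0.3],
                     [0.3, 2.0]])

rng = np.random.default_rng(0)
X, Z, L_r, L_c = sample_separable_gaussian(cov_rows, cov_cols, rng)

# Check the identity against the explicit (large) Kronecker factor.
big_factor = np.kron(L_c, L_r)
vec_X = X.flatten(order="F")          # column-major vec
vec_Z = Z.flatten(order="F")
print(np.allclose(vec_X, big_factor @ vec_Z))
```

For a tensor with many modes the explicit Kronecker matrix quickly becomes too large to store, so working with the per-mode factors is what keeps surrogate-data generation tractable.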
In classical reinforcement learning, agents accept arbitrary short-term loss for long-term gain when exploring their environment. This is infeasible for safety-critical applications such as robotics, where even a single unsafe action may cause system failure or harm the environment. In this work, we address the problem of safely exploring finite Markov decision processes (MDPs). We define safety in terms of an a priori unknown safety constraint that depends on states and actions and satisfies certain regularity conditions expressed via a Gaussian process prior. We develop a novel algorithm, SAFEMDP, for this task and prove that it completely explores the safely reachable part of the MDP without violating the safety constraint. Moreover, the algorithm explicitly considers reachability when exploring the MDP, ensuring that it does not get stuck in any state with no safe way out. We demonstrate our method on digital terrain models for the task of exploring an unknown map with a rover.
Organizers: Sebastian Trimpe
Organizers: Moritz Grosse-Wentrup
This is the story of the novel model predictive control (MPC) solution for ABB’s largest drive, the Megadrive LCI. LCI stands for load commutated inverter, a type of current source converter which powers large machinery in many industries such as marine, mining or oil & gas. Starting from a small software project at ABB Corporate Research, this novel control solution turned out to become the first time MPC was ever employed in a 48 MW commercial drive. Subsequently it was commissioned at Kollsnes, a key facility of the natural gas delivery chain, in order to increase the plant’s availability. In this presentation I will talk about the magic behind this success story, the so-called Embedded MPC algorithms, and my objective will be to demonstrate the possibilities when power meets computation.
Organizers: Sebastian Trimpe
Brain-Computer Interfaces (BCIs) are systems that can translate brain activity patterns of a user into messages or commands for an interactive application. Such brain activity is typically measured using Electroencephalography (EEG), before being processed and classified by the system. EEG-based BCIs have proven promising for a wide range of applications ranging from communication and control for motor impaired users, to gaming targeted at the general public, real-time mental state monitoring and stroke rehabilitation, to name a few. Despite this promising potential, BCIs are still scarcely used outside laboratories for practical applications. The main reason preventing EEG-based BCIs from being widely used is arguably their poor usability, which is notably due to their low robustness and reliability, as well as their long training times. In this talk I present some of our research aimed at addressing these points in order to make EEG-based BCIs usable, i.e., to increase their efficacy and efficiency. In particular, I will present a set of contributions towards this goal 1) at the user training level, to ensure that users can learn to control a BCI efficiently and effectively, and 2) at the usage level, to explore novel applications of BCIs for which the current reliability can already be useful, e.g., for neuroergonomics or real-time brain activity and mental state visualization.
The predictive simulation of engineering systems increasingly rests on the synthesis of physical models and experimental data. In this context, Bayesian inference establishes a framework for quantifying the encountered uncertainties and fusing the available information. A summary and discussion of some recently emerged methods for uncertainty propagation (polynomial chaos expansions) and related MCMC-free techniques for posterior computation (spectral likelihood expansions, optimal transportation theory) is presented.
Organizers: Philipp Hennig
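As a small illustration of the polynomial chaos expansions named in the uncertainty quantification abstract above, the sketch below expands a function of a single standard normal input in probabilists' Hermite polynomials and reads the output mean and variance off the coefficients. It is a one-dimensional textbook case, not the methods surveyed in the talk; the function names and the quadrature size are assumptions.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def pce_coefficients(f, order, n_quad=32):
    """Coefficients c_k of f(X) ~ sum_k c_k He_k(X) for X ~ N(0, 1).

    Uses c_k = E[f(X) He_k(X)] / k!, with the expectation computed by
    Gauss-Hermite quadrature for the weight exp(-x^2 / 2).
    """
    nodes, weights = He.hermegauss(n_quad)
    weights = weights / np.sqrt(2.0 * np.pi)   # turn weights into N(0,1) probabilities
    fx = f(nodes)
    coeffs = []
    for k in range(order + 1):
        basis = He.hermeval(nodes, [0] * k + [1])   # He_k at the nodes
        coeffs.append(np.sum(weights * fx * basis) / math.factorial(k))
    return np.array(coeffs)

def pce_mean_and_variance(coeffs):
    """Mean is c_0; variance is sum_{k>=1} c_k^2 k! by orthogonality."""
    mean = coeffs[0]
    var = sum(c ** 2 * math.factorial(k) for k, c in enumerate(coeffs) if k > 0)
    return mean, var

# Example: f(x) = x^2 equals He_2(x) + 1, so c = (1, 0, 1, 0).
coeffs = pce_coefficients(lambda x: x ** 2, order=3)
mean, var = pce_mean_and_variance(coeffs)
print(coeffs, mean, var)
```

Once the coefficients are known, the moments follow without any sampling, which is the sense in which such expansions propagate uncertainty through a model.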