
Institute Talks

Learning Non-rigid Optimization

Talk
  • 10 July 2020 • 15:00—16:00
  • Matthias Nießner
  • Remote talk on Zoom

Applying data-driven approaches to non-rigid 3D reconstruction has been difficult, which we believe can be attributed to the lack of a large-scale training corpus. One recent approach proposes self-supervision based on non-rigid reconstruction. Unfortunately, this method fails for important cases such as highly non-rigid deformations. We address this lack of training data by introducing a novel semi-supervised strategy to obtain dense interframe correspondences from a sparse set of annotations. This way, we obtain a large dataset of 400 scenes, over 390,000 RGB-D frames, and 2,537 densely aligned frame pairs; in addition, we provide a test set along with several metrics for evaluation. Based on this corpus, we introduce a data-driven non-rigid feature matching approach, which we integrate into an optimization-based reconstruction pipeline. Here, we propose a new neural network that operates on RGB-D frames, while maintaining robustness under large non-rigid deformations and producing accurate predictions. Our approach significantly outperforms both existing non-rigid reconstruction methods that do not use learned data terms and learning-based approaches that only use self-supervision.
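To make the pipeline concrete, here is a minimal sketch of how learned correspondences can feed an optimization-based non-rigid reconstruction: predicted source/target point pairs act as a data term on an embedded deformation graph, alongside a simple rigidity regularizer. The graph layout, energy weights, and function names are illustrative assumptions, not the speaker's actual implementation.

```python
# A minimal sketch of plugging learned correspondences into a non-rigid
# optimization. The deformation-graph layout and energy weights are
# illustrative assumptions, not the pipeline from the talk.
import torch

def deform(points, node_pos, node_rot, node_trans, weights):
    """Warp points with an embedded deformation graph (linear blend of nodes)."""
    # points: (P, 3), node_pos: (N, 3), node_rot: (N, 3, 3), node_trans: (N, 3)
    # weights: (P, N) skinning weights of each point w.r.t. each graph node
    local = points[:, None, :] - node_pos[None, :, :]             # (P, N, 3)
    warped = torch.einsum('nij,pnj->pni', node_rot, local) \
             + node_pos[None] + node_trans[None]                  # (P, N, 3)
    return (weights[..., None] * warped).sum(dim=1)               # (P, 3)

def energy(src_pts, tgt_pts, node_pos, node_rot, node_trans, weights, lam=1.0):
    """Data term from (learned) correspondences plus a soft rigidity regularizer."""
    data = ((deform(src_pts, node_pos, node_rot, node_trans, weights)
             - tgt_pts) ** 2).sum()
    eye = torch.eye(3).expand_as(node_rot)
    rigid = ((node_rot @ node_rot.transpose(1, 2) - eye) ** 2).sum()
    return data + lam * rigid

# Example: optimize per-node rotations/translations against predicted matches.
P, N = 200, 16
src = torch.randn(P, 3); tgt = src + 0.05 * torch.randn(P, 3)  # stand-in for network output
g = torch.randn(N, 3)
R = torch.eye(3).repeat(N, 1, 1).requires_grad_(True)
t = torch.zeros(N, 3, requires_grad=True)
w = torch.softmax(-torch.cdist(src, g), dim=1)                 # soft skinning weights
energy(src, tgt, g, R, t, w).backward()                        # one gradient step's worth
```

Running any torch optimizer on `R` and `t` then plays the role that the Gauss-Newton-style solver plays in a full reconstruction pipeline.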

Organizers: Vassilis Choutas

Towards Commodity 3D Scanning for Content Creation

Talk
  • 16 July 2020 • 16:00—17:30
  • Angela Dai

In recent years, commodity 3D sensors have become widely available, spawning significant interest in both offline and real-time 3D reconstruction. While state-of-the-art reconstruction results from commodity RGB-D sensors are visually appealing, they are far from usable in practical computer graphics applications since they do not match the high quality of artist-modeled 3D graphics content. One of the biggest challenges in this context is that obtained 3D scans suffer from occlusions, thus resulting in incomplete 3D models. In this talk, I will present a data-driven approach towards generating high quality 3D models from commodity scan data, and the use of these geometrically complete 3D models towards semantic and texture understanding of real-world environments.
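As a rough illustration of the data-driven completion idea, the following toy sketch maps a partial TSDF grid to a completed one with a small 3D encoder-decoder. The layer sizes, loss, and training setup are assumptions and do not reproduce the architectures presented in the talk.

```python
# Toy data-driven scan completion: a small 3D encoder-decoder that maps a
# partial TSDF grid to a completed one. Sizes and the L1 loss are assumptions.
import torch
import torch.nn as nn

class TSDFCompletion(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(1, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(ch, 2 * ch, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(2 * ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(ch, 1, 4, stride=2, padding=1),
        )

    def forward(self, partial_tsdf):           # (B, 1, D, H, W) truncated SDF
        return self.decoder(self.encoder(partial_tsdf))

# One training step against complete ground-truth TSDFs (random stand-in data):
model = TSDFCompletion()
partial = torch.randn(2, 1, 32, 32, 32)        # placeholder for a partial scan
target = torch.randn(2, 1, 32, 32, 32)         # placeholder for the complete scan
loss = nn.functional.l1_loss(model(partial), target)
loss.backward()
```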

Organizers: Yinghao Huang

  • Aayush Bansal
  • Zoom

Licklider and Taylor (1968) envisioned computational machinery that could enable better communication between humans than face-to-face interaction. In the last fifty years, we have used computing to develop various means of communication, such as mail, messaging, phone calls, video conversation, and virtual reality. These are, however, proxies for face-to-face communication that aim at encoding words, expressions, emotions, and body language at the source and decoding them reliably at the destination. The true revolution of personal computing has not yet begun because we have not been able to tap the real potential of computing for social communication. Computational machinery that can understand and create a four-dimensional audio-visual world can enable humans to describe their imagination and share it with others. In this talk, I will introduce the Computational Studio: an environment that allows non-specialists to construct and creatively edit the 4D audio-visual world from sparse audio and video samples. The Computational Studio aims to enable everyone to relive old memories through a form of virtual time travel, to automatically create new experiences, and to share them with others using everyday computational devices. There are three essential components of the Computational Studio: (1) how can we capture the 4D audio-visual world?; (2) how can we synthesize the audio-visual world using examples?; and (3) how can we interactively create and edit the audio-visual world? The first part of this talk introduces work on capturing and browsing the in-the-wild 4D audio-visual world in a self-supervised manner, as well as efforts on building a multi-agent capture system. This work has applications in social communication, in digitizing intangible cultural heritage, in capturing tribal dances and wildlife in their natural environment, and in understanding the social behavior of human beings. In the second part, I will talk about example-based audio-visual synthesis in an unsupervised manner. Example-based audio-visual synthesis allows us to express ourselves easily. Finally, I will talk about interactive visual synthesis, which allows us to manually create and edit visual experiences. Here I will also stress the importance of thinking about the human user and the computational devices when designing content creation applications. The Computational Studio is a first step towards unlocking the full degree of creative imagination, which is currently confined to the human mind by the limits of the individual's expressivity and skill. It has the potential to change the way we audio-visually communicate with others.

Organizers: Arjun Chandrasekaran Chun-Hao Paul Huang


Water anomalies: from ice age to carbon

Physics Colloquium
  • 12 May 2020 • 16:15—18:15
  • Marcia Barbosa
  • WebEx



The d-wave paradigm of unconventional superconductors

Physics Colloquium
  • 05 May 2020 • 16:15—18:15
  • Ronny Thomale
  • WebEx

As famously introduced in the context of copper oxide superconductors, Cooper pairing of electrons through a d-wave order parameter constitutes a central departure from the conventional microscopic picture of phonon-mediated s-wave superconductivity. For decades, copper oxide superconductors remained the predominant arena for d-wave pairing. In recent years, however, d-wave superconductivity has witnessed significant diversification in terms of materials realizations, such as Na-doped cobaltates, pnictides at strong hole doping, and, most recently, infinite-layer nickelates as well as strontium ruthenate. In this colloquium, I intend to provide an overview of recent developments and to work out a future perspective on new d-wave superconductors.
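For readers outside the field, the contrast between the two pairing symmetries can be summarized by the standard textbook gap functions on a square lattice (a generic reference form, not taken from the colloquium itself):

```latex
% Conventional s-wave pairing: an (approximately) isotropic, nodeless gap
\Delta_s(\mathbf{k}) \simeq \Delta_0
% d_{x^2-y^2} pairing (the cuprate paradigm): a sign-changing gap with nodes along |k_x| = |k_y|
\Delta_d(\mathbf{k}) = \Delta_0 \left[ \cos(k_x a) - \cos(k_y a) \right]
```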


Engineering single-atom devices in ultracold gases

Physics Colloquium
  • 28 April 2020 • 16:15—18:15
  • Artur Widera
  • WebEx (https://mpi-is.webex.com/mpi-is/onstage/g.php?MTID=e0a11930fa916a065c44064aea17abe75)

Technological advances in laser and vacuum technology have made it possible to realize a dream of the early days of quantum mechanics: controlling single, laser-cooled atoms at the quantum level. Interfacing individual atoms with ultracold gases offers new experimental approaches to unsolved problems of nonequilibrium quantum physics. Moreover, such systems allow experimentally addressing the question of whether and how quantum properties can boost the performance of atomic-scale devices. In this talk, I will discuss how single atoms can be controlled and probed in an ultracold gas. Understanding the impurity-gas interaction at the atomic level allows employing inelastic spin-exchange collisions, which are usually considered harmful, for quantum applications. First, I will show how inelastic spin exchange can map information about the gas temperature or the surrounding magnetic field to the quantum-spin distribution of single impurity atoms. Interestingly, the nonequilibrium spin dynamics before reaching the steady state increases the sensitivity of the probe while reducing the perturbation of the gas compared to the steady state. Second, I will discuss how the quantized energy transfer during inelastic collisions allows operating a single-atom quantum engine. We overcome the limitations imposed by using thermal states and run a quantum-enhanced Otto cycle operating at powers orders of magnitude larger than in the thermal case, alternating between positive and negative temperature regimes at maximum efficiency. I will discuss the properties of the engine as well as limitations originating from quantum aspects, which result in power fluctuations.
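For context, the idealized quantum Otto cycle mentioned above has a textbook work and efficiency relation when the working medium is a two-level system whose energy splitting is switched between a cold value ΔE_c and a hot value ΔE_h; this is the generic relation, not a result quoted from the talk:

```latex
% Heat from the hot bath, net work, and efficiency of an ideal two-level Otto cycle,
% with excitation probabilities p_h and p_c after contact with the hot and cold baths:
Q_h = \Delta E_h \,(p_h - p_c), \qquad
W = (\Delta E_h - \Delta E_c)\,(p_h - p_c), \qquad
\eta = \frac{W}{Q_h} = 1 - \frac{\Delta E_c}{\Delta E_h}
```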


Ultrafast Surface Dynamics and Local Spectroscopy at the Nanoscale

Physics Colloquium
  • 21 April 2020 • 16:15—18:15
  • Martin Wolf
  • WebEx

In a Born-Oppenheimer description, atomic motions evolve across a potential energy surface determined by the occupation of electronic states as a function of atom positions. Ultrafast photo-induced phase transitions provide a test case for how the forces and resulting nuclear motion along the reaction co-ordinate originate from a non-equilibrium population of excited electronic states. Here I discuss recent advances in time-resolved photoemission spectroscopy allowing for direct probing of the underlying fundamental steps and the transiently evolving band structure in the ultrafast phase transition in indium nanowires on Si(111) [1]. Furthermore, I will discuss some recent attempts to access the space-time limit in surface dynamics using local optical excitation of controlled plasmonic nano-junctions and tip-enhanced Raman scattering (TERS) [2,3].

Organizers: Joachim Gräfe


  • Chunyu Wang
  • Remote talk on Zoom

Accurate 3D human pose estimation has been a longstanding goal in computer vision. Until now, however, it has gained only limited success in easy scenarios such as studio settings with little occlusion. In this talk, I will present two of our works aiming to address the occlusion problem in realistic scenarios. In the first work, we present an approach to recover the absolute 3D human pose of a single person from multi-view images by incorporating multi-view geometric priors in our model. It consists of two separate steps: (1) estimating the 2D poses in multi-view images and (2) recovering the 3D poses from the multi-view 2D poses. First, we introduce a cross-view fusion scheme into the CNN to jointly estimate 2D poses across multiple views, so that the 2D pose estimate for each view already benefits from the other views. Second, we present a recursive Pictorial Structure Model to recover the 3D pose from the multi-view 2D poses. It gradually improves the accuracy of the 3D pose at affordable computational cost. In the second work, we present a 3D pose estimator that allows us to reliably estimate and track people in crowded scenes. In contrast to previous efforts, which require establishing cross-view correspondences based on noisy and incomplete 2D pose estimates, we present an end-to-end solution that operates directly in 3D space and therefore avoids making incorrect hard decisions in 2D. To achieve this, the features from all camera views are warped and aggregated in a common 3D space and fed to a Cuboid Proposal Network (CPN) to coarsely localize all people. We then propose a Pose Regression Network (PRN) to estimate a detailed 3D pose for each proposal. The approach is robust to the occlusions that occur frequently in practice. Without bells and whistles, it significantly outperforms the state of the art on benchmark datasets.
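For orientation, the geometric primitive underlying step (2) of the first work, lifting multi-view 2D detections to 3D, can be sketched with a standard direct linear transform (DLT) triangulation. This is a generic baseline only; it is not the recursive Pictorial Structure Model or the CPN/PRN described in the talk.

```python
# Generic DLT triangulation of a single joint from multiple calibrated views.
import numpy as np

def triangulate_joint(points_2d, proj_mats):
    """points_2d: (V, 2) pixel coordinates of one joint in V views.
    proj_mats:  (V, 3, 4) camera projection matrices.
    Returns the 3D joint position (3,) in world coordinates."""
    rows = []
    for (u, v), P in zip(points_2d, proj_mats):
        rows.append(u * P[2] - P[0])   # each view contributes two linear
        rows.append(v * P[2] - P[1])   # constraints on the homogeneous 3D point
    A = np.stack(rows)                 # (2V, 4)
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                         # null-space direction of A
    return X[:3] / X[3]
```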

Organizers: Chun-Hao Paul Huang


How to tie an optical field into a knot

Physics Colloquium
  • 14 April 2020 • 16:15—18:15
  • Mark Dennis
  • WebEx

Tying a knot in a piece of string can be a hard practical problem.


  • Marco Pasini
  • N3.022 (Aquarium)

Traditional voice conversion methods rely on parallel recordings of multiple speakers pronouncing the same sentences. For real-world applications however, parallel data is rarely available. We propose MelGAN-VC, a voice conversion method that relies on non-parallel speech data and is able to convert audio signals of arbitrary length from a source voice to a target voice. We firstly compute spectrograms from waveform data and then perform a domain translation using a Generative Adversarial Network (GAN) architecture. An additional siamese network helps preserving speech information in the translation process, without sacrificing the ability to flexibly model the style of the target speaker. We test our framework with a dataset of clean speech recordings, as well as with a collection of noisy real-world speech examples. Finally, we apply the same method to perform music style transfer, translating arbitrarily long music samples from one genre to another, and showing that our framework is flexible and can be used for audio manipulation applications different from voice conversion.
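A rough sketch of the kind of siamese "transformation vector" loss used to preserve content during spectrogram-to-spectrogram GAN translation is shown below. The toy generator `G`, siamese encoder `S`, spectrogram sizes, and loss weighting are assumptions, not the exact MelGAN-VC formulation.

```python
# Siamese content-preservation loss for spectrogram translation (sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

def travel_loss(S, G, spec_a, spec_b):
    """Encourage the siamese-embedding difference between two source segments
    to be preserved after translation by the generator G."""
    v_src = S(spec_a) - S(spec_b)          # vector between source embeddings
    v_gen = S(G(spec_a)) - S(G(spec_b))    # vector between translated embeddings
    return F.mse_loss(v_gen, v_src) \
           + (1.0 - F.cosine_similarity(v_gen, v_src, dim=-1)).mean()

# Toy stand-ins operating on flattened (freq x time) spectrogram chunks:
G = nn.Sequential(nn.Flatten(), nn.Linear(128 * 64, 128 * 64), nn.Unflatten(1, (128, 64)))
S = nn.Sequential(nn.Flatten(), nn.Linear(128 * 64, 32))
spec_a, spec_b = torch.randn(4, 128, 64), torch.randn(4, 128, 64)
loss = travel_loss(S, G, spec_a, spec_b)
```

In training, this term would be added to the usual adversarial loss on the translated spectrograms.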


Learning to Model 3D Human Face Geometry

Talk
  • 26 March 2020 • 11:00—12:00
  • Victoria Fernández Abrevaya
  • Remote talk on zoom

In this talk I will present an overview of our recent works that learn deep geometric models of the 3D face from large datasets of scans. Priors for the 3D face are crucial for many applications: to constrain ill-posed problems such as 3D reconstruction from monocular input, for efficient generation and animation of 3D virtual avatars, and even in medical domains such as the recognition of craniofacial disorders. Generative models of the face have been widely used for this task, and deep learning approaches have recently emerged as a robust alternative. Barring a few exceptions, most of these data-driven approaches were built either from a relatively limited number of samples (in the case of linear shape models) or with synthetic data augmentation (for deep-learning-based approaches), mainly due to the difficulty of obtaining large-scale and accurate 3D scans of the face. Yet there is a substantial amount of 3D information that can be gathered from publicly available datasets captured over the last decade. I will discuss our works that tackle the challenges of building rich geometric models out of these large and varied datasets, with the goal of modeling the facial shape, expression (i.e. motion), and geometric details. Concretely, I will talk about (1) an efficient and fully automatic approach for the registration of large datasets of 3D faces in motion; (2) deep learning methods for modeling the facial geometry that can disentangle the shape and expression aspects of the face; and (3) a multi-modal learning approach for capturing geometric details from images in the wild, by simultaneously encoding both facial surface normals and natural image information.
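As a minimal illustration of the shape/expression disentanglement discussed above, here is a toy linear face model with separate identity and expression bases. The basis sizes and random data are placeholders, and the talk's actual models are deep (nonlinear) rather than linear.

```python
# Toy linear face model: mean shape + identity basis + expression basis.
import numpy as np

rng = np.random.default_rng(0)
n_vertices, n_id, n_expr = 5000, 50, 20

mean_face  = rng.standard_normal((n_vertices, 3))
id_basis   = rng.standard_normal((n_id,   n_vertices, 3))   # identity directions
expr_basis = rng.standard_normal((n_expr, n_vertices, 3))   # expression directions

def synthesize(alpha, beta):
    """alpha: (n_id,) identity coefficients; beta: (n_expr,) expression coefficients."""
    return (mean_face
            + np.tensordot(alpha, id_basis, axes=1)
            + np.tensordot(beta, expr_basis, axes=1))        # (n_vertices, 3)

# Same identity with a neutral and a non-neutral expression:
alpha = rng.standard_normal(n_id)
neutral   = synthesize(alpha, np.zeros(n_expr))
expressed = synthesize(alpha, rng.standard_normal(n_expr))
```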

Organizers: Jinlong Yang


Cancelled: Electro-active Ionic Elastomers

Talk
  • 23 March 2020 • 11:00—12:00
  • Prof. Antal Jákli
  • 2P04

Motivated by the low-voltage-driven actuation of ionic Electroactive Polymers (iEAPs) [1][2], we recently began investigating ionic elastomers. In this talk I will discuss the preparation, physical characterization, and electric bending actuation properties of two novel ionic elastomers: ionic polymer electrolyte membranes (iPEM) [3] and ionic liquid crystal elastomers (iLCE) [4]. Both materials can be actuated by low-frequency AC or DC voltages of less than 1 V. The bending actuation properties of the iPEMs outperform most of the well-developed iEAPs, and the first, not yet optimized iLCEs are already comparable to them. Ionic liquid crystal elastomers also exhibit superior features, such as alignment-dependent actuation, which offers the possibility of pre-programmed actuation patterns at the level of the cross-linking process. Additionally, multiple (thermal, optical and electric) actuations are also possible. I will also discuss issues with compliant electrodes and possible soft robotic applications. [1] Y. Bar-Cohen, Electroactive Polymer Actuators as Artificial Muscles: Reality, Potential and Challenges, SPIE Press, Bellingham, 2004. [2] O. Kim, S. J. Kim, M. J. Park, Chem. Commun. 2018, 54, 4895. [3] C. P. H. Rajapaksha, C. Feng, C. Piedrahita, J. Cao, V. Kaphle, B. Lüssem, T. Kyu, A. Jákli, Macromol. Rapid Commun. 2020, in print. [4] C. Feng, C. P. H. Rajapaksha, J. M. Cedillo, C. Piedrahita, J. Cao, V. Kaphle, B. Lussem, T. Kyu, A. I. Jákli, Macromol. Rapid Commun. 2019, 1900299.