

2017


Learning from Synthetic Humans

Varol, G., Romero, J., Martin, X., Mahmood, N., Black, M. J., Laptev, I., Schmid, C.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Estimating human pose, shape, and motion from images and videos are fundamental challenges with many applications. Recent advances in 2D human pose estimation use large amounts of manually-labeled training data for learning convolutional neural networks (CNNs). Such data is time-consuming to acquire and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion is impractical. In this work we present SURREAL (Synthetic hUmans foR REAL tasks): a new large-scale dataset with synthetically-generated but realistic images of people rendered from 3D sequences of human motion capture data. We generate more than 6 million frames together with ground truth pose, depth maps, and segmentation masks. We show that CNNs trained on our synthetic dataset allow for accurate human depth estimation and human part segmentation in real RGB images. Our results and the new dataset open up new possibilities for advancing person analysis using cheap and large-scale synthetic data.

arXiv project data Project Page Project Page [BibTex]


On human motion prediction using recurrent neural networks

Martinez, J., Black, M. J., Romero, J.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Human motion modelling is a classical problem at the intersection of graphics and computer vision, with applications spanning human-computer interaction, motion synthesis, and motion prediction for virtual and augmented reality. Following the success of deep learning methods in several computer vision tasks, recent work has focused on using deep recurrent neural networks (RNNs) to model human motion, with the goal of learning time-dependent representations that perform tasks such as short-term motion prediction and long-term human motion synthesis. We examine recent work, with a focus on the evaluation methodologies commonly used in the literature, and show that, surprisingly, state-of-the-art performance can be achieved by a simple baseline that does not attempt to model motion at all. We investigate this result, and analyze recent RNN methods by looking at the architectures, loss functions, and training procedures used in state-of-the-art approaches. We propose three changes to the standard RNN models typically used for human motion, which result in a simple and scalable RNN architecture that obtains state-of-the-art performance on human motion prediction.
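
The "simple baseline" the abstract alludes to can be read as a zero-velocity predictor: repeat the last observed frame for the entire prediction horizon. Below is a minimal sketch of that idea; the function name and array shapes are our choices, not the paper's code.

```python
import numpy as np

def zero_velocity_baseline(observed, horizon):
    """Predict future poses by repeating the last observed frame.

    observed: (T, D) array -- T past frames of D pose parameters.
    horizon:  number of future frames to predict.
    """
    last = observed[-1]                 # most recent pose
    return np.tile(last, (horizon, 1))  # (horizon, D), constant over time

# Example: 50 past frames of 54-dim joint-angle vectors, predict 25 frames.
past = np.random.randn(50, 54)
pred = zero_velocity_baseline(past, horizon=25)
assert pred.shape == (25, 54)
```

Because short-term human motion is nearly constant at the usual evaluation horizons, such a constant prediction turns out to be surprisingly hard for learned models to beat.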

arXiv Project Page [BibTex]


Slow Flow: Exploiting High-Speed Cameras for Accurate and Diverse Optical Flow Reference Data

Janai, J., Güney, F., Wulff, J., Black, M., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 1406-1416, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Existing optical flow datasets are limited in size and variability due to the difficulty of capturing dense ground truth. In this paper, we tackle this problem by tracking pixels through densely sampled space-time volumes recorded with a high-speed video camera. Our model exploits the linearity of small motions and reasons about occlusions from multiple frames. Using our technique, we are able to establish accurate reference flow fields outside the laboratory in natural environments. In addition, we show how our predictions can be used to augment the input images with realistic motion blur. We demonstrate the quality of the produced flow fields on synthetic and real-world datasets. Finally, we collect a novel challenging optical flow dataset by applying our technique to data from a high-speed camera and analyze the performance of the state-of-the-art in optical flow under various levels of motion blur.
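
The key idea of exploiting the linearity of small motions can be sketched as composing many tiny inter-frame flows into a single large-displacement reference flow. The sketch below is a simplification that omits the paper's multi-frame occlusion reasoning; the helper names and the bilinear sampling choice are ours.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def accumulate_flow(small_flows):
    """Compose per-frame flows from a high-speed sequence into one
    large-displacement flow from the first to the last frame.

    small_flows: list of (H, W, 2) arrays; flow from frame t to t+1,
                 channel 0 = x displacement, channel 1 = y displacement.
    """
    H, W, _ = small_flows[0].shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    total = np.zeros((H, W, 2))
    for f in small_flows:
        # Sample the next small flow at the position each pixel has reached.
        px, py = xs + total[..., 0], ys + total[..., 1]
        du = map_coordinates(f[..., 0], [py, px], order=1, mode='nearest')
        dv = map_coordinates(f[..., 1], [py, px], order=1, mode='nearest')
        total[..., 0] += du
        total[..., 1] += dv
    return total
```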

pdf suppmat Project page Video DOI Project Page [BibTex]


Optical Flow in Mostly Rigid Scenes

Wulff, J., Sevilla-Lara, L., Black, M. J.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 6911-6920, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
The optical flow of natural scenes is a combination of the motion of the observer and the independent motion of objects. Existing algorithms typically focus on either recovering motion and structure under the assumption of a purely static world or optical flow for general unconstrained scenes. We combine these approaches in an optical flow algorithm that estimates an explicit segmentation of moving objects from appearance and physical constraints. In static regions we take advantage of strong constraints to jointly estimate the camera motion and the 3D structure of the scene over multiple frames. This allows us to also regularize the structure instead of the motion. Our formulation uses a Plane+Parallax framework, which works even under small baselines, and reduces the motion estimation to a one-dimensional search problem, resulting in more accurate estimation. In moving regions the flow is treated as unconstrained, and computed with an existing optical flow method. The resulting Mostly-Rigid Flow (MR-Flow) method achieves state-of-the-art results on both the MPI-Sintel and KITTI-2015 benchmarks.

pdf SupMat video code Project Page [BibTex]


OctNet: Learning Deep 3D Representations at High Resolutions

Riegler, G., Ulusoy, O., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
We present OctNet, a representation for deep learning with sparse 3D data. In contrast to existing models, our representation enables 3D convolutional networks which are both deep and high resolution. Towards this goal, we exploit the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees where each leaf node stores a pooled feature representation. This allows us to focus memory allocation and computation on the relevant dense regions and enables deeper networks without compromising resolution. We demonstrate the utility of our OctNet representation by analyzing the impact of resolution on several 3D tasks including 3D object classification, orientation estimation and point cloud labeling.
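
As intuition for the data structure, the toy octree below subdivides only occupied octants and stores a pooled feature in each leaf. The real OctNet uses a grid-of-shallow-octrees layout engineered for efficient GPU convolutions, so treat this purely as an illustration.

```python
import numpy as np

class OctreeNode:
    """Toy unbalanced octree: subdivide only where points exist."""

    def __init__(self, points, feats, center, size, max_depth):
        self.center, self.size = center, size
        if max_depth == 0 or len(points) <= 1:
            # Leaf: store one pooled feature for all points in this cell.
            self.feature = feats.mean(axis=0) if len(feats) else None
            self.children = None
            return
        self.feature, self.children = None, []
        half = size / 2.0
        for dx in (-1, 1):
            for dy in (-1, 1):
                for dz in (-1, 1):
                    c = center + 0.25 * size * np.array([dx, dy, dz])
                    inside = np.all(np.abs(points - c) <= half / 2 + 1e-9, axis=1)
                    if inside.any():  # skip empty octants entirely
                        self.children.append(OctreeNode(
                            points[inside], feats[inside], c, half, max_depth - 1))

# 1000 random points with 8-dim features in a cube of edge length 2:
pts = np.random.rand(1000, 3) * 2 - 1
tree = OctreeNode(pts, np.ones((1000, 8)), np.zeros(3), 2.0, max_depth=4)
```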

pdf suppmat Project Page Video Project Page [BibTex]


Flexible Spatio-Temporal Networks for Video Prediction

Lu, C., Hirsch, M., Schölkopf, B.

Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 2137-2145, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (conference)

link (url) DOI [BibTex]


Reflectance Adaptive Filtering Improves Intrinsic Image Estimation

Nestmeyer, T., Gehler, P. V.

In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages: 1771-1780, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

pre-print DOI Project Page Project Page [BibTex]


Detailed, accurate, human shape estimation from clothed 3D scan sequences

Zhang, C., Pujades, S., Black, M., Pons-Moll, G.

In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, Washington, DC, USA, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), July 2017, Spotlight (inproceedings)

Abstract
We address the problem of estimating human body shape from 3D scans over time. Reliable estimation of 3D body shape is necessary for many applications including virtual try-on, health monitoring, and avatar creation for virtual reality. Scanning bodies in minimal clothing, however, presents a practical barrier to these applications. We address this problem by estimating body shape under clothing from a sequence of 3D scans. Previous methods that have exploited statistical models of body shape produce overly smooth shapes lacking personalized details. In this paper we contribute a new approach to recover not only an approximate shape of the person, but also their detailed shape. Our approach allows the estimated shape to deviate from a parametric model to fit the 3D scans. We demonstrate the method using high quality 4D data as well as sequences of visual hulls extracted from multi-view images. We also make available a new high quality 4D dataset that enables quantitative evaluation. Our method outperforms the previous state of the art, both qualitatively and quantitatively.

arxiv_preprint video dataset pdf supplemental DOI Project Page [BibTex]


Discovering Causal Signals in Images

Lopez-Paz, D., Nishihara, R., Chintala, S., Schölkopf, B., Bottou, L.

Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 58-66, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (conference)

link (url) DOI [BibTex]


3D Menagerie: Modeling the 3D Shape and Pose of Animals

Zuffi, S., Kanazawa, A., Jacobs, D., Black, M. J.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 5524-5532, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
There has been significant work on learning realistic, articulated, 3D models of the human body. In contrast, there are few such models of animals, despite many applications. The main challenge is that animals are much less cooperative than humans. The best human body models are learned from thousands of 3D scans of people in specific poses, which is infeasible with live animals. Consequently, we learn our model from a small set of 3D scans of toy figurines in arbitrary poses. We employ a novel part-based shape model to compute an initial registration to the scans. We then normalize their pose, learn a statistical shape model, and refine the registrations and the model together. In this way, we accurately align animal scans from different quadruped families with very different shapes and poses. With the registration to a common template we learn a shape space representing animals including lions, cats, dogs, horses, cows and hippos. Animal shapes can be sampled from the model, posed, animated, and fit to data. We demonstrate generalization by fitting it to images of real animals including species not seen in training.

pdf video Project Page [BibTex]


Optical Flow Estimation using a Spatial Pyramid Network

Ranjan, A., Black, M.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
We learn to compute optical flow by combining a classical spatial-pyramid formulation with deep learning. This estimates large motions in a coarse-to-fine approach by warping one image of a pair at each pyramid level by the current flow estimate and computing an update to the flow. Instead of the standard minimization of an objective function at each pyramid level, we train one deep network per level to compute the flow update. Unlike the recent FlowNet approach, the networks do not need to deal with large motions; these are dealt with by the pyramid. This has several advantages. First, our Spatial Pyramid Network (SPyNet) is much simpler and 96% smaller than FlowNet in terms of model parameters. This makes it more efficient and appropriate for embedded applications. Second, since the flow at each pyramid level is small (< 1 pixel), a convolutional approach applied to pairs of warped images is appropriate. Third, unlike FlowNet, the learned convolution filters appear similar to classical spatio-temporal filters, giving insight into the method and how to improve it. Our results are more accurate than FlowNet on most standard benchmarks, suggesting a new direction of combining classical flow methods with deep learning.
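
The coarse-to-fine estimation loop can be sketched as follows. Here `nets[l]` stands in for the learned per-level CNN, and the resizing/warping helpers are our choices for illustration rather than SPyNet's actual implementation.

```python
import numpy as np
import cv2  # used here only for pyramids, resizing, and warping

def spatial_pyramid_flow(im1, im2, nets, levels=5):
    """Coarse-to-fine flow with one learned residual update per level."""
    # Build image pyramids, coarsest last.
    pyr1, pyr2 = [im1], [im2]
    for _ in range(levels - 1):
        pyr1.append(cv2.pyrDown(pyr1[-1]))
        pyr2.append(cv2.pyrDown(pyr2[-1]))

    h, w = pyr1[-1].shape[:2]
    flow = np.zeros((h, w, 2), np.float32)
    for l in range(levels - 1, -1, -1):
        h, w = pyr1[l].shape[:2]
        flow = 2.0 * cv2.resize(flow, (w, h))        # upsample previous estimate
        grid = np.mgrid[0:h, 0:w][::-1].transpose(1, 2, 0).astype(np.float32)
        warped = cv2.remap(pyr2[l], grid + flow, None, cv2.INTER_LINEAR)
        flow = flow + nets[l](pyr1[l], warped, flow)  # small residual update
    return flow

# With untrained placeholder networks the update is zero:
# nets = [lambda im1, im2, f: np.zeros_like(f)] * 5
```

Since each level only has to predict a residual update of roughly a pixel, the per-level networks can be kept very small, which is where the large parameter savings relative to FlowNet come from.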

pdf SupMat project/code [BibTex]


Multiple People Tracking by Lifted Multicut and Person Re-identification

Tang, S., Andriluka, M., Andres, B., Schiele, B.

In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages: 3701-3710, IEEE Computer Society, Washington, DC, USA, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

DOI Project Page [BibTex]


Video Propagation Networks

Jampani, V., Gadde, R., Gehler, P. V.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

pdf supplementary arXiv project page code Project Page [BibTex]


Dynamic Time-of-Flight

Schober, M., Adam, A., Yair, O., Mazor, S., Nowozin, S.

Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 170-179, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (conference)

DOI [BibTex]


Generating Descriptions with Grounded and Co-Referenced People

Rohrbach, A., Rohrbach, M., Tang, S., Oh, S. J., Schiele, B.

In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages: 4196-4206, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

PDF DOI Project Page [BibTex]


Semantic Multi-view Stereo: Jointly Estimating Objects and Voxels

Ulusoy, A. O., Black, M. J., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Dense 3D reconstruction from RGB images is a highly ill-posed problem due to occlusions, textureless or reflective surfaces, as well as other challenges. We propose object-level shape priors to address these ambiguities. Towards this goal, we formulate a probabilistic model that integrates multi-view image evidence with 3D shape information from multiple objects. Inference in this model yields a dense 3D reconstruction of the scene as well as the existence and precise 3D pose of the objects in it. Our approach is able to recover fine details not captured in the input shapes while defaulting to the input models in occluded regions where image evidence is weak. Due to its probabilistic nature, the approach is able to cope with the approximate geometry of the 3D models as well as input shapes that are not present in the scene. We evaluate the approach quantitatively on several challenging indoor and outdoor datasets.

YouTube pdf suppmat Project Page [BibTex]


Deep representation learning for human motion prediction and classification

Bütepage, J., Black, M., Kragic, D., Kjellström, H.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Generative models of 3D human motion are often restricted to a small number of activities and can therefore not generalize well to novel movements or applications. In this work we propose a deep learning framework for human motion capture data that learns a generic representation from a large corpus of motion capture data and generalizes well to new, unseen, motions. Using an encoding-decoding network that learns to predict future 3D poses from the most recent past, we extract a feature representation of human motion. Most work on deep learning for sequence prediction focuses on video and speech. Since skeletal data has a different structure, we present and evaluate different network architectures that make different assumptions about time dependencies and limb correlations. To quantify the learned features, we use the output of different layers for action classification and visualize the receptive fields of the network units. Our method outperforms the recent state of the art in skeletal motion prediction even though these methods use action-specific training data. Our results show that deep feedforward networks, trained from a generic mocap database, can successfully be used for feature extraction from human motion data and that this representation can be used as a foundation for classification and prediction.
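
As a rough sketch of the encoding-decoding idea: a feedforward network maps a window of recent poses through a bottleneck to a window of future poses, and the bottleneck activations then serve as motion features for tasks such as action classification. All layer sizes and window lengths below are our assumptions; the paper compares several architectures with different time and limb factorizations.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 10 past frames predict 10 future frames of 63-dim skeletons.
PAST, FUTURE, DIM, H = 10, 10, 63, 512

model = nn.Sequential(                 # flattened past -> bottleneck -> future
    nn.Linear(PAST * DIM, H), nn.ReLU(),
    nn.Linear(H, H), nn.ReLU(),        # bottleneck features reused downstream
    nn.Linear(H, FUTURE * DIM),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

past = torch.randn(32, PAST * DIM)     # a batch of mocap windows
future = torch.randn(32, FUTURE * DIM)
loss = loss_fn(model(past), future)
opt.zero_grad(); loss.backward(); opt.step()
```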

arXiv Project Page [BibTex]


Unite the People: Closing the Loop Between 3D and 2D Human Representations

Lassner, C., Romero, J., Kiefel, M., Bogo, F., Black, M. J., Gehler, P. V.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
3D models provide a common ground for different representations of human bodies. In turn, robust 2D estimation has proven to be a powerful tool to obtain 3D fits “in-the-wild”. However, depending on the level of detail, it can be hard or even impossible to acquire labeled data for training 2D estimators at large scale. We propose a hybrid approach to this problem: with an extended version of the recently introduced SMPLify method, we obtain high quality 3D body model fits for multiple human pose datasets. Human annotators solely sort good and bad fits. This procedure leads to an initial dataset, UP-3D, with rich annotations. With a comprehensive set of experiments, we show how this data can be used to train discriminative models that produce results with an unprecedented level of detail: our models predict 31 segments and 91 landmark locations on the body. Using the 91 landmark pose estimator, we present state-of-the-art results for 3D human pose and shape estimation using an order of magnitude less training data and without assumptions about gender or pose in the fitting procedure. We show that UP-3D can be enhanced with these improved fits to grow in quantity and quality, which makes the system deployable at large scale. The data, code and models are available for research purposes.

arXiv project/code/data Project Page [BibTex]


Strategic exploration in human adaptive control

Schulz, E., Klenske, E., Bramley, N., Speekenbrink, M.

Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci), (Editors: Glenn Gunzelmann, Andrew Howes, Thora Tenbrink and Eddy J. Davelaar), cognitivesciencesociety.org, July 2017 (conference)

link (url) [BibTex]


State-Regularized Policy Search for Linearized Dynamical Systems

Abdulsamad, H., Arenz, O., Peters, J., Neumann, G.

Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling, (ICAPS), pages: 419-424, (Editors: Laura Barbulescu, Jeremy Frank, Mausam and Stephen F. Smith), AAAI Press, June 2017 (conference)

link (url) Project Page [BibTex]


Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates

Gu*, S., Holly*, E., Lillicrap, T., Levine, S.

Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017, *equal contribution (conference)

Arxiv Project Page [BibTex]


Context-Driven Movement Primitive Adaptation

Wilbers, D., Lioutikov, R., Peters, J.

IEEE International Conference on Robotics and Automation (ICRA), pages: 3469-3475, IEEE, May 2017 (conference)

DOI Project Page [BibTex]


A Learning-based Shared Control Architecture for Interactive Task Execution

Farraj, F. B., Osa, T., Pedemonte, N., Peters, J., Neumann, G., Giordano, P.

IEEE International Conference on Robotics and Automation (ICRA), pages: 329-335, IEEE, May 2017 (conference)

DOI Project Page Project Page [BibTex]


Frequency Peak Features for Low-Channel Classification in Motor Imagery Paradigms

Jayaram, V., Schölkopf, B., Grosse-Wentrup, M.

Proceedings of the 8th International IEEE/EMBS Conference on Neural Engineering (NER), pages: 321-324, May 2017 (conference)

DOI [BibTex]


Empowered skills

Gabriel, A., Akrour, R., Peters, J., Neumann, G.

IEEE International Conference on Robotics and Automation (ICRA), pages: 6435-6441, IEEE, May 2017 (conference)

DOI Project Page [BibTex]


Layered direct policy search for learning hierarchical skills

End, F., Akrour, R., Peters, J., Neumann, G.

IEEE International Conference on Robotics and Automation (ICRA), pages: 6442-6448, IEEE, May 2017 (conference)

DOI Project Page [BibTex]


Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

Gu, S., Lillicrap, T., Ghahramani, Z., Turner, R. E., Levine, S.

Proceedings International Conference on Learning Representations (ICLR), OpenReviews.net, International Conference on Learning Representations, April 2017 (conference)

PDF link (url) Project Page [BibTex]


Categorical Reparameterization with Gumbel-Softmax

Jang, E., Gu, S., Poole, B.

Proceedings International Conference on Learning Representations 2017, OpenReviews.net, International Conference on Learning Representations, April 2017 (conference)

link (url) [BibTex]


DeepCoder: Learning to Write Programs

Balog, M., Gaunt, A. L., Brockschmidt, M., Nowozin, S., Tarlow, D.

Proceedings International Conference on Learning Representations 2017, OpenReviews.net, International Conference on Learning Representations, April 2017 (conference)

Arxiv link (url) Project Page [BibTex]


Distilling Information Reliability and Source Trustworthiness from Digital Traces

Tabibian, B., Valera, I., Farajtabar, M., Song, L., Schölkopf, B., Gomez Rodriguez, M.

Proceedings of the 26th International Conference on World Wide Web (WWW), pages: 847-855, (Editors: Barrett, R., Cummings, R., Agichtein, E. and Gabrilovich, E.), ACM, April 2017 (conference)

Project DOI Project Page Project Page [BibTex]


Local Group Invariant Representations via Orbit Embeddings

Raj, A., Kumar, A., Mroueh, Y., Fletcher, T., Schölkopf, B.

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), 54, pages: 1225-1235, Proceedings of Machine Learning Research, (Editors: Aarti Singh and Jerry Zhu), April 2017 (conference)

link (url) Project Page [BibTex]


Pre-Movement Contralateral EEG Low Beta Power Is Modulated with Motor Adaptation Learning

Ozdenizci, O., Yalcin, M., Erdogan, A., Patoglu, V., Grosse-Wentrup, M., Cetin, M.

International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages: 934-938, March 2017 (conference)

DOI [BibTex]


Automatic detection of motion artifacts in MR images using CNNs

Meding, K., Loktyushin, A., Hirsch, M.

42nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages: 811-815, March 2017 (conference)

DOI [BibTex]


Catching heuristics are optimal control policies

Belousov, B., Neumann, G., Rothkopf, C., Peters, J.

Proceedings of the Thirteenth Karniel Computational Motor Control Workshop, March 2017 (conference)

link (url) [BibTex]


DiSMEC – Distributed Sparse Machines for Extreme Multi-label Classification

Babbar, R., Schölkopf, B.

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM), pages: 721-729, February 2017 (conference)

DOI Project Page [BibTex]


Policy Search with High-Dimensional Context Variables

Tangkaratt, V., van Hoof, H., Parisi, S., Neumann, G., Peters, J., Sugiyama, M.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI), pages: 2632-2638, (Editors: Satinder P. Singh and Shaul Markovitch), AAAI Press, February 2017 (conference)

link (url) Project Page [BibTex]


Iterative Feedback-basierte Korrekturstrategien beim Bewegungslernen von Mensch-Roboter-Dyaden

Ewerton, M., Kollegger, G., Maeda, G., Wiemeyer, J., Peters, J.

In DVS Sportmotorik 2017, 2017 (inproceedings)

link (url) [BibTex]


BIMROB - Bidirectional Interaction between human and robot for the learning of movements - Robot trains human - Human trains robot

Kollegger, G., Wiemeyer, J., Ewerton, M., Peters, J.

In Innovation & Technologie im Sport - 23. Sportwissenschaftlicher Hochschultag der deutschen Vereinigung für Sportwissenschaft, pages: 179, (Editors: A. Schwirtz, F. Mess, Y. Demetriou & V. Senner), Czwalina-Feldhaus, 2017 (inproceedings)

[BibTex]


BIMROB – Bidirektionale Interaktion von Mensch und Roboter beim Bewegungslernen

Wiemeyer, J., Peters, J., Kollegger, G., Ewerton, M.

DVS Sportmotorik 2017, 2017 (conference)

link (url) [BibTex]


Towards Accurate Marker-less Human Shape and Pose Estimation over Time

Huang, Y., Bogo, F., Lassner, C., Kanazawa, A., Gehler, P. V., Romero, J., Akhter, I., Black, M. J.

In International Conference on 3D Vision (3DV), pages: 421-430, 2017 (inproceedings)

Abstract
Existing markerless motion capture methods often assume known backgrounds, static cameras, and sequence specific motion priors, limiting their application scenarios. Here we present a fully automatic method that, given multi-view videos, estimates 3D human pose and body shape. We take the recently proposed SMPLify method [12] as the base method and extend it in several ways. First, we fit a 3D human body model to 2D features detected in multi-view images. Second, we use a CNN method to segment the person in each image and fit the 3D body model to the contours, further improving accuracy. Third, we utilize a generic and robust DCT temporal prior to handle the left and right side swapping issue sometimes introduced by the 2D pose estimator. Validation on standard benchmarks shows our results are comparable to the state of the art and also provide a realistic 3D shape avatar. We also demonstrate accurate results on HumanEva and on challenging monocular sequences of dancing from YouTube.
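
The DCT temporal prior can be illustrated as a projection of each pose trajectory onto the leading DCT basis functions, which suppresses the high-frequency jitter that sporadic left/right swaps introduce. In the paper the prior enters the fitting objective as a soft penalty; the hard low-pass projection below is just a sketch of the idea.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct_lowpass(traj, k):
    """Keep only the first k DCT coefficients of a temporal trajectory.

    traj: (T, D) array, e.g. T frames of D pose parameters. Frame-rate
    jitter and isolated outliers cannot be represented by the retained
    low-frequency basis, so they are smoothed away.
    """
    coeffs = dct(traj, axis=0, norm='ortho')
    coeffs[k:] = 0.0
    return idct(coeffs, axis=0, norm='ortho')
```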


Code pdf DOI Project Page [BibTex]

2011


Statistical estimation for optimization problems on graphs

Langovoy, M., Sra, S.

In pages: 1-6, NIPS Workshop on Discrete Optimization in Machine Learning (DISCML): Uncertainty, Generalization and Feedback, December 2011 (inproceedings)

Abstract
Large graphs abound in machine learning, data mining, and several related areas. A useful step towards analyzing such graphs is that of obtaining certain summary statistics — e.g., the expected length of a shortest path between two nodes, or the expected weight of a minimum spanning tree of the graph. These statistics provide insight into the structure of a graph, and they can help predict global properties of a graph. Motivated thus, we propose to study statistical properties of structured subgraphs (of a given graph), in particular, to estimate the expected objective function value of a combinatorial optimization problem over these subgraphs. The general task is very difficult, if not unsolvable; so for concreteness we describe a more specific statistical estimation problem based on spanning trees. We hope that our position paper encourages others to also study other types of graphical structures for which one can prove nontrivial statistical estimates.
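
In the spirit of the spanning-tree example, the snippet below gives a toy Monte Carlo estimator of the expected MST weight of a random graph. The graph model G(n, p) and the weight distribution are our illustrative choices, not the estimator studied in the paper.

```python
import random
import networkx as nx

def expected_mst_weight(n, p, weight_dist, trials=200, seed=0):
    """Monte Carlo estimate of E[MST weight] over random weighted graphs.

    Draws G(n, p) graphs with i.i.d. edge weights from weight_dist and
    averages the MST weight of the largest connected component.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        G = nx.gnp_random_graph(n, p, seed=rng.randint(0, 10**9))
        giant = G.subgraph(max(nx.connected_components(G), key=len))
        for u, v in giant.edges():
            G[u][v]['weight'] = weight_dist(rng)
        T = nx.minimum_spanning_tree(giant, weight='weight')
        total += sum(d['weight'] for _, _, d in T.edges(data=True))
    return total / trials

# Uniform(0, 1) edge weights on G(50, 0.2):
print(expected_mst_weight(50, 0.2, lambda r: r.random()))
```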

PDF Web [BibTex]


On the discardability of data in Support Vector Classification problems

Del Favero, S., Varagnolo, D., Dinuzzo, F., Schenato, L., Pillonetto, G.

In pages: 3210-3215, IEEE, Piscataway, NJ, USA, 50th IEEE Conference on Decision and Control and European Control Conference (CDC - ECC), December 2011 (inproceedings)

Abstract
We analyze the problem of data-set reduction for support vector classification. The work is also motivated by distributed problems, where sensors collect binary measurements at different locations while moving inside an environment that needs to be divided into a collection of regions labeled in two different ways. The aim is to let each agent retain and exchange only those measurements that are most informative for the collective reconstruction of the decision boundary. For the case of separable classes, we provide the exact conditions and an efficient algorithm to determine whether an element in the training set can become a support vector when new data arrive. The analysis is then extended to the non-separable case, deriving a sufficient discardability condition and a general data selection scheme for classification. Numerical experiments on the distributed problem show that the proposed procedure allows the agents to exchange only a small amount of the collected data while obtaining a highly predictive decision boundary.
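
As a crude illustration of discardability (not the paper's exact condition): after fitting a max-margin classifier, points whose functional margin is well above one do not define the current boundary, so a naive reduction keeps only points at or near the margin. The paper's conditions additionally account for points that may become support vectors as new data arrive.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y = np.repeat([-1, 1], 200)

clf = SVC(kernel='linear', C=1e6).fit(X, y)   # large C ~ hard margin

# Functional margin of every training point under the current model.
margin = y * clf.decision_function(X)

# Points far outside the margin are a crude proxy for "discardable".
keep = margin <= 1.1                          # tolerance of 0.1 at the margin
print(f"retained {keep.sum()} of {len(X)} points")
```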

PDF Web DOI [BibTex]


Information, learning and falsification

Balduzzi, D.

In pages: 1-4, NIPS Philosophy and Machine Learning Workshop, December 2011 (inproceedings)

Abstract
There are (at least) three approaches to quantifying information. The first, algorithmic information or Kolmogorov complexity, takes events as strings and, given a universal Turing machine, quantifies the information content of a string as the length of the shortest program producing it [1]. The second, Shannon information, takes events as belonging to ensembles and quantifies the information resulting from observing the given event in terms of the number of alternate events that have been ruled out [2]. The third, statistical learning theory, has introduced measures of capacity that control (in part) the expected risk of classifiers [3]. These capacities quantify the expectations regarding future data that learning algorithms embed into classifiers. Solomonoff and Hutter have applied algorithmic information to prove remarkable results on universal induction. Shannon information provides the mathematical foundation for communication and coding theory. However, both approaches have shortcomings. Algorithmic information is not computable, severely limiting its practical usefulness. Shannon information refers to ensembles rather than actual events: it makes no sense to compute the Shannon information of a single string – or rather, there are many answers to this question depending on how a related ensemble is constructed. Although there are asymptotic results linking algorithmic and Shannon information, it is unsatisfying that there is such a large gap – a difference in kind – between the two measures. This note describes a new method of quantifying information, effective information, that links algorithmic information to Shannon information, and also links both to capacities arising in statistical learning theory [4, 5]. After introducing the measure, we show that it provides a non-universal analog of Kolmogorov complexity. We then apply it to derive basic capacities in statistical learning theory: empirical VC-entropy and empirical Rademacher complexity. A nice byproduct of our approach is an interpretation of the explanatory power of a learning algorithm in terms of the number of hypotheses it falsifies [6], counted in two different ways for the two capacities. We also discuss how effective information relates to information gain, Shannon and mutual information.

PDF Web [BibTex]


A general linear non-Gaussian state-space model: Identifiability, identification, and applications

Zhang, K., Hyvärinen, A.

In JMLR Workshop and Conference Proceedings Volume 20, pages: 113-128, (Editors: Hsu, C.-N., W.S. Lee), MIT Press, Cambridge, MA, USA, 3rd Asian Conference on Machine Learning (ACML), November 2011 (inproceedings)

Abstract
State-space modeling provides a powerful tool for system identification and prediction. In linear state-space models the data are usually assumed to be Gaussian and the models have certain structural constraints such that they are identifiable. In this paper we propose a non-Gaussian state-space model which does not have such constraints. We prove that this model is fully identifiable. We then propose an efficient two-step method for parameter estimation: one first extracts the subspace of the latent processes based on the temporal information of the data, and then performs multichannel blind deconvolution, making use of both the temporal information and non-Gaussianity. We conduct a series of simulations to illustrate the performance of the proposed method. Finally, we apply the proposed model and parameter estimation method on real data, including major world stock indices and magnetoencephalography (MEG) recordings. Experimental results are encouraging and show the practical usefulness of the proposed model and method.

PDF Web [BibTex]


Non-stationary correction of optical aberrations

Schuler, C., Hirsch, M., Harmeling, S., Schölkopf, B.

In pages: 659-666, (Editors: DN Metaxas and L Quan and A Sanfeliu and LJ Van Gool), IEEE, Piscataway, NJ, USA, 13th IEEE International Conference on Computer Vision (ICCV), November 2011 (inproceedings)

Abstract
Taking a sharp photo at several megapixel resolution traditionally relies on high grade lenses. In this paper, we present an approach to alleviate image degradations caused by imperfect optics. We rely on a calibration step to encode the optical aberrations in a space-variant point spread function and obtain a corrected image by non-stationary deconvolution. By including the Bayer array in our image formation model, we can perform demosaicing as part of the deconvolution.

PDF Web DOI [BibTex]


Learning low-rank output kernels

Dinuzzo, F., Fukumizu, K.

In JMLR Workshop and Conference Proceedings Volume 20, pages: 181-196, (Editors: Hsu, C.-N., W.S. Lee), JMLR, Cambridge, MA, USA, 3rd Asian Conference on Machine Learning (ACML), November 2011 (inproceedings)

Abstract
Output kernel learning techniques make it possible to simultaneously learn a vector-valued function and a positive semidefinite matrix that describes the relationships between the outputs. In this paper, we introduce a new formulation that imposes a low-rank constraint on the output kernel and operates directly on a factor of the kernel matrix. First, we investigate the connection between output kernel learning and a regularization problem for an architecture with two layers. Then, we show that a variety of methods such as nuclear norm regularized regression, reduced-rank regression, principal component analysis, and low rank matrix approximation can be seen as special cases of the output kernel learning framework. Finally, we introduce a block coordinate descent strategy for learning low-rank output kernels.

PDF Web [BibTex]


Stability Condition for Teleoperation System with Packet Loss

Hong, A., Cho, JH., Lee, DY.

In pages: 760-761, 2011 KSME Annual Fall Conference, November 2011 (inproceedings)

Abstract
This paper focuses on the stability condition of a teleoperation system subject to packet loss in the communication channel. The communication channel between master and slave causes packet loss, which leads to performance degradation and instability of the teleoperation system. We consider a two-channel control architecture for the teleoperation system, in which the control inputs to the remote site are produced from the positions of the master and the slave. In this paper, the teleoperation system is modeled in the discrete domain so as to include the packet loss process. The stability condition for the teleoperation system with packet loss is then discussed in terms of input-to-state stability. Finally, the stability condition is presented via an LMI approach.

[BibTex]


Fast removal of non-uniform camera shake

Hirsch, M., Schuler, C., Harmeling, S., Schölkopf, B.

In pages: 463-470, (Editors: DN Metaxas and L Quan and A Sanfeliu and LJ Van Gool), IEEE, Piscataway, NJ, USA, 13th IEEE International Conference on Computer Vision (ICCV), November 2011 (inproceedings)

Abstract
Camera shake leads to non-uniform image blurs. State-of-the-art methods for removing camera shake model the blur as a linear combination of homographically transformed versions of the true image. While this is conceptually interesting, the resulting algorithms are computationally demanding. In this paper we develop a forward model based on the efficient filter flow framework, incorporating the particularities of camera shake, and show how an efficient algorithm for blur removal can be obtained. Comprehensive comparisons on a number of real-world blurry images show that our approach is not only substantially faster, but it also leads to better deblurring results.
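
The forward model behind the efficient filter flow framework can be sketched as a sum of windowed convolutions: overlapping patches, each blurred with its own local PSF, are blended back into one image. The window shape and normalization below are our simplifications of that scheme.

```python
import numpy as np
from scipy.signal import fftconvolve

def space_variant_blur(img, kernels, P):
    """Blur `img` with a spatially varying PSF, modeled as a blend of
    P*P overlapping patches, each convolved with its own local kernel.

    kernels: list of P*P small 2D PSFs, ordered row-major over patches.
    Assumes P >= 2 so that patches fit inside the image.
    """
    H, W = img.shape
    ph, pw = 2 * H // P, 2 * W // P        # patches overlap by roughly half
    win = np.outer(np.hamming(ph), np.hamming(pw))
    out = np.zeros((H, W))
    norm = np.full((H, W), 1e-12)          # accumulated window weights
    for i in range(P):
        for j in range(P):
            y0 = i * (H - ph) // max(P - 1, 1)
            x0 = j * (W - pw) // max(P - 1, 1)
            patch = img[y0:y0 + ph, x0:x0 + pw] * win
            out[y0:y0 + ph, x0:x0 + pw] += fftconvolve(
                patch, kernels[i * P + j], mode='same')
            norm[y0:y0 + ph, x0:x0 + pw] += win
    return out / norm                      # renormalize the window overlap
```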

PDF Web DOI [BibTex]


Home 3D body scans from noisy image and range data

Weiss, A., Hirshberg, D., Black, M.

In Int. Conf. on Computer Vision (ICCV), pages: 1951-1958, IEEE, Barcelona, November 2011 (inproceedings)

Abstract
The 3D shape of the human body is useful for applications in fitness, games and apparel. Accurate body scanners, however, are expensive, limiting the availability of 3D body models. We present a method for human shape reconstruction from noisy monocular image and range data using a single inexpensive commodity sensor. The approach combines low-resolution image silhouettes with coarse range data to estimate a parametric model of the body. Accurate 3D shape estimates are obtained by combining multiple monocular views of a person moving in front of the sensor. To cope with varying body pose, we use a SCAPE body model which factors 3D body shape and pose variations. This enables the estimation of a single consistent shape while allowing pose to vary. Additionally, we describe a novel method to minimize the distance between the projected 3D body contour and the image silhouette that uses analytic derivatives of the objective function. We propose a simple method to estimate standard body measurements from the recovered SCAPE model and show that the accuracy of our method is competitive with commercial body scanning systems costing orders of magnitude more.

pdf YouTube poster Project Page Project Page [BibTex]


Attenuation correction in MR-BrainPET with segmented T1-weighted MR images of the patient’s head: A comparative study with CT

Wagenknecht, G., Rota Kops, E., Mantlik, F., Fried, E., Pilz, T., Hautzel, H., Tellmann, L., Pichler, B., Herzog, H.

In pages: 2261-2266, IEEE, Piscataway, NJ, USA, IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), October 2011 (inproceedings)

Abstract
Our method for attenuation correction (AC) in MR-BrainPET with segmented T1-weighted MR images of the patient's head was applied to data from different MR-BrainPET scanners (Jülich, Tübingen) and compared to CT-based results. The study objectives presented in this paper are twofold. The first objective is to examine whether the segmentation method developed for and successfully applied to 3D MP-RAGE data can also be used to segment other T1-weighted MR data such as 3D FLASH data. The second aim is to show whether the similarity of segmented MR-based AC (SBA) and CT-based AC (CBA) obtained at HR+ PET can also be confirmed for BrainPET, for which the new AC method is intended. In order to reach the first objective, 14 segmented MR data sets (three 3D MP-RAGE data sets from Jülich and eleven 3D FLASH data sets from Tübingen) were compared to the respective CT data based on the Dice coefficient and scatter plots. For bone, a CT threshold of HU > 500 was applied. Dice coefficients (mean±std) for the upper cranial part of the skull, the skull above cavities, and the caudal part including the cerebellum are 0.73±0.1, 0.79±0.04, and 0.49±0.02 for the Jülich data and 0.71±0.1, 0.72±0.1, and 0.60±0.05 for the Tübingen data. To reach the second aim, SBA and CBA were compared for six subjects based on a VOI (AAL atlas) analysis. Mean absolute relative difference (maRD) values are maRD(JüFVBW1-FDG): 0.99%±0.83%, maRD(JüFVBW2-FDG): 0.90%±0.89%, and maRD(JüEP-Flumazenil): 1.85%±1.25% for the Jülich data and maRD(TuTP02-FDG): 2.99%±1.65%, maRD(TuNP01-FDG): 5.37%±2.29%, and maRD(TuNP02-FDG): 6.52%±1.69% for the three best-segmented Tübingen data sets. The results show similar segmentation quality for both T1-weighted MR sequence types. The application to AC in BrainPET shows a high similarity to CT-based AC if the standardized ACF value for bone used in SBA is in good accordance with the bone density of the patient in question.
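
For reference, the Dice coefficient used in this comparison measures the overlap between two binary segmentations A and B as 2|A∩B| / (|A| + |B|); a minimal implementation (variable names are ours):

```python
import numpy as np

def dice(a, b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

# Hypothetical usage with an MR-based and a CT-based bone mask,
# the CT mask thresholded at 500 Hounsfield units as in the study:
# bone_ct = ct_volume > 500
# print(dice(bone_mr, bone_ct))
```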

Web DOI [BibTex]