Header logo is


2018


Thumb xl screen shot 2018 04 19 at 14.57.08
Motion-based Object Segmentation based on Dense RGB-D Scene Flow

Shao, L., Shah, P., Dwaracherla, V., Bohg, J.

IEEE Robotics and Automation Letters, 3(4):3797-3804, IEEE, IEEE/RSJ International Conference on Intelligent Robots and Systems, October 2018 (conference)

Abstract
Given two consecutive RGB-D images, we propose a model that estimates a dense 3D motion field, also known as scene flow. We take advantage of the fact that in robot manipulation scenarios, scenes often consist of a set of rigidly moving objects. Our model jointly estimates (i) the segmentation of the scene into an unknown but finite number of objects, (ii) the motion trajectories of these objects and (iii) the object scene flow. We employ an hourglass, deep neural network architecture. In the encoding stage, the RGB and depth images undergo spatial compression and correlation. In the decoding stage, the model outputs three images containing a per-pixel estimate of the corresponding object center as well as object translation and rotation. This forms the basis for inferring the object segmentation and final object scene flow. To evaluate our model, we generated a new and challenging, large-scale, synthetic dataset that is specifically targeted at robotic manipulation: It contains a large number of scenes with a very diverse set of simultaneously moving 3D objects and is recorded with a commonly-used RGB-D camera. In quantitative experiments, we show that we significantly outperform state-of-the-art scene flow and motion-segmentation methods. In qualitative experiments, we show how our learned model transfers to challenging real-world scenes, visually generating significantly better results than existing methods.

am

Project Page arXiv DOI [BibTex]

2018


Project Page arXiv DOI [BibTex]


Thumb xl screenshot from 2018 06 15 22 59 30
A Value-Driven Eldercare Robot: Virtual and Physical Instantiations of a Case-Supported Principle-Based Behavior Paradigm

Anderson, M., Anderson, S., Berenz, V.

Proceedings of the IEEE, pages: 1,15, October 2018 (article)

Abstract
In this paper, a case-supported principle-based behavior paradigm is proposed to help ensure ethical behavior of autonomous machines. We argue that ethically significant behavior of autonomous systems should be guided by explicit ethical principles determined through a consensus of ethicists. Such a consensus is likely to emerge in many areas in which autonomous systems are apt to be deployed and for the actions they are liable to undertake. We believe that this is the case since we are more likely to agree on how machines ought to treat us than on how human beings ought to treat one another. Given such a consensus, particular cases of ethical dilemmas where ethicists agree on the ethically relevant features and the right course of action can be used to help discover principles that balance these features when they are in conflict. Such principles not only help ensure ethical behavior of complex and dynamic systems but also can serve as a basis for justification of this behavior. The requirements, methods, implementation, and evaluation components of the paradigm are detailed as well as its instantiation in both a simulated and real robot functioning in the domain of eldercare.

am

link (url) DOI [BibTex]


Thumb xl screenshot from 2017 07 27 17 24 14
Playful: Reactive Programming for Orchestrating Robotic Behavior

Berenz, V., Schaal, S.

IEEE Robotics Automation Magazine, 25(3):49-60, September 2018 (article) In press

Abstract
For many service robots, reactivity to changes in their surroundings is a must. However, developing software suitable for dynamic environments is difficult. Existing robotic middleware allows engineers to design behavior graphs by organizing communication between components. But because these graphs are structurally inflexible, they hardly support the development of complex reactive behavior. To address this limitation, we propose Playful, a software platform that applies reactive programming to the specification of robotic behavior.

am

playful website playful_IEEE_RAM link (url) DOI [BibTex]


Thumb xl screen shot 2018 09 19 at 09.33.59
ClusterNet: Instance Segmentation in RGB-D Images

Shao, L., Tian, Y., Bohg, J.

arXiv, September 2018, Submitted to ICRA'19 (article) Submitted

Abstract
We propose a method for instance-level segmentation that uses RGB-D data as input and provides detailed information about the location, geometry and number of {\em individual\/} objects in the scene. This level of understanding is fundamental for autonomous robots. It enables safe and robust decision-making under the large uncertainty of the real-world. In our model, we propose to use the first and second order moments of the object occupancy function to represent an object instance. We train an hourglass Deep Neural Network (DNN) where each pixel in the output votes for the 3D position of the corresponding object center and for the object's size and pose. The final instance segmentation is achieved through clustering in the space of moments. The object-centric training loss is defined on the output of the clustering. Our method outperforms the state-of-the-art instance segmentation method on our synthesized dataset. We show that our method generalizes well on real-world data achieving visually better segmentation results.

am

link (url) [BibTex]

link (url) [BibTex]


Thumb xl grasping
Leveraging Contact Forces for Learning to Grasp

Merzic, H., Bogdanovic, M., Kappler, D., Righetti, L., Bohg, J.

arXiv, September 2018, Submitted to ICRA'19 (article) Submitted

Abstract
Grasping objects under uncertainty remains an open problem in robotics research. This uncertainty is often due to noisy or partial observations of the object pose or shape. To enable a robot to react appropriately to unforeseen effects, it is crucial that it continuously takes sensor feedback into account. While visual feedback is important for inferring a grasp pose and reaching for an object, contact feedback offers valuable information during manipulation and grasp acquisition. In this paper, we use model-free deep reinforcement learning to synthesize control policies that exploit contact sensing to generate robust grasping under uncertainty. We demonstrate our approach on a multi-fingered hand that exhibits more complex finger coordination than the commonly used two- fingered grippers. We conduct extensive experiments in order to assess the performance of the learned policies, with and without contact sensing. While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape.

am mg

video arXiv [BibTex]

video arXiv [BibTex]


Thumb xl teaser image
Probabilistic Recurrent State-Space Models

Doerr, A., Daniel, C., Schiegg, M., Nguyen-Tuong, D., Schaal, S., Toussaint, M., Trimpe, S.

In Proceedings of the International Conference on Machine Learning (ICML), International Conference on Machine Learning (ICML), July 2018 (inproceedings)

Abstract
State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g., LSTMs) proved extremely successful in modeling complex time-series data. Fully probabilistic SSMs, however, unfortunately often prove hard to train, even for smaller problems. To overcome this limitation, we propose a scalable initialization and training algorithm based on doubly stochastic variational inference and Gaussian processes. In the variational approximation we propose in contrast to related approaches to fully capture the latent state temporal correlations to allow for robust training.

am ics

arXiv pdf Project Page [BibTex]

arXiv pdf Project Page [BibTex]


Thumb xl octo turned
Real-time Perception meets Reactive Motion Generation

(Best Systems Paper Finalists - Amazon Robotics Best Paper Awards in Manipulation)

Kappler, D., Meier, F., Issac, J., Mainprice, J., Garcia Cifuentes, C., Wüthrich, M., Berenz, V., Schaal, S., Ratliff, N., Bohg, J.

IEEE Robotics and Automation Letters, 3(3):1864-1871, July 2018 (article)

Abstract
We address the challenging problem of robotic grasping and manipulation in the presence of uncertainty. This uncertainty is due to noisy sensing, inaccurate models and hard-to-predict environment dynamics. Our approach emphasizes the importance of continuous, real-time perception and its tight integration with reactive motion generation methods. We present a fully integrated system where real-time object and robot tracking as well as ambient world modeling provides the necessary input to feedback controllers and continuous motion optimizers. Specifically, they provide attractive and repulsive potentials based on which the controllers and motion optimizer can online compute movement policies at different time intervals. We extensively evaluate the proposed system on a real robotic platform in four scenarios that exhibit either challenging workspace geometry or a dynamic environment. We compare the proposed integrated system with a more traditional sense-plan-act approach that is still widely used. In 333 experiments, we show the robustness and accuracy of the proposed system.

am

arxiv video video link (url) DOI Project Page [BibTex]


Thumb xl meta learning overview
Online Learning of a Memory for Learning Rates

(nominated for best paper award)

Meier, F., Kappler, D., Schaal, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018, accepted (inproceedings)

Abstract
The promise of learning to learn for robotics rests on the hope that by extracting some information about the learning process itself we can speed up subsequent similar learning tasks. Here, we introduce a computationally efficient online meta-learning algorithm that builds and optimizes a memory model of the optimal learning rate landscape from previously observed gradient behaviors. While performing task specific optimization, this memory of learning rates predicts how to scale currently observed gradients. After applying the gradient scaling our meta-learner updates its internal memory based on the observed effect its prediction had. Our meta-learner can be combined with any gradient-based optimizer, learns on the fly and can be transferred to new optimization tasks. In our evaluations we show that our meta-learning algorithm speeds up learning of MNIST classification and a variety of learning control tasks, either in batch or online learning settings.

am

pdf video code [BibTex]

pdf video code [BibTex]


Thumb xl learning ct w asm block diagram detailed
Learning Sensor Feedback Models from Demonstrations via Phase-Modulated Neural Networks

Sutanto, G., Su, Z., Schaal, S., Meier, F.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018 (inproceedings)

am

pdf video [BibTex]

pdf video [BibTex]


no image
Distributed Event-Based State Estimation for Networked Systems: An LMI Approach

Muehlebach, M., Trimpe, S.

IEEE Transactions on Automatic Control, 63(1):269-276, January 2018 (article)

am ics

arXiv (extended version) DOI Project Page [BibTex]

arXiv (extended version) DOI Project Page [BibTex]


no image
Memristor-enhanced humanoid robot control system–Part I: theory behind the novel memcomputing paradigm

Ascoli, A., Baumann, D., Tetzlaff, R., Chua, L. O., Hild, M.

International Journal of Circuit Theory and Applications, 46(1):155-183, 2018 (article)

am

DOI [BibTex]

DOI [BibTex]


Thumb xl img
Combining learned and analytical models for predicting action effects

Kloss, A., Schaal, S., Bohg, J.

arXiv, 2018 (article) Submitted

Abstract
One of the most basic skills a robot should possess is predicting the effect of physical interactions with objects in the environment. This enables optimal action selection to reach a certain goal state. Traditionally, dynamics are approximated by physics-based analytical models. These models rely on specific state representations that may be hard to obtain from raw sensory data, especially if no knowledge of the object shape is assumed. More recently, we have seen learning approaches that can predict the effect of complex physical interactions directly from sensory input. It is however an open question how far these models generalize beyond their training data. In this work, we investigate the advantages and limitations of neural network based learning approaches for predicting the effects of actions based on sensory input and show how analytical and learned models can be combined to leverage the best of both worlds. As physical interaction task, we use planar pushing, for which there exists a well-known analytical model and a large real-world dataset. We propose to use a convolutional neural network to convert raw depth images or organized point clouds into a suitable representation for the analytical model and compare this approach to using neural networks for both, perception and prediction. A systematic evaluation of the proposed approach on a very large real-world dataset shows two main advantages of the hybrid architecture. Compared to a pure neural network, it significantly (i) reduces required training data and (ii) improves generalization to novel physical interaction.

am

arXiv pdf link (url) [BibTex]


no image
Memristor-enhanced humanoid robot control system–Part II: circuit theoretic model and performance analysis

Baumann, D., Ascoli, A., Tetzlaff, R., Chua, L. O., Hild, M.

International Journal of Circuit Theory and Applications, 46(1):184-220, 2018 (article)

am

DOI [BibTex]

DOI [BibTex]


no image
On Time Optimization of Centroidal Momentum Dynamics

Ponton, B., Herzog, A., Del Prete, A., Schaal, S., Righetti, L.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 5776-5782, IEEE, Brisbane, Australia, May 2018 (inproceedings)

Abstract
Recently, the centroidal momentum dynamics has received substantial attention to plan dynamically consistent motions for robots with arms and legs in multi-contact scenarios. However, it is also non convex which renders any optimization approach difficult and timing is usually kept fixed in most trajectory optimization techniques to not introduce additional non convexities to the problem. But this can limit the versatility of the algorithms. In our previous work, we proposed a convex relaxation of the problem that allowed to efficiently compute momentum trajectories and contact forces. However, our approach could not minimize a desired angular momentum objective which seriously limited its applicability. Noticing that the non-convexity introduced by the time variables is of similar nature as the centroidal dynamics one, we propose two convex relaxations to the problem based on trust regions and soft constraints. The resulting approaches can compute time-optimized dynamically consistent trajectories sufficiently fast to make the approach realtime capable. The performance of the algorithm is demonstrated in several multi-contact scenarios for a humanoid robot. In particular, we show that the proposed convex relaxation of the original problem finds solutions that are consistent with the original non-convex problem and illustrate how timing optimization allows to find motion plans that would be difficult to plan with fixed timing † †Implementation details and demos can be found in the source code available at https://git-amd.tuebingen.mpg.de/bponton/timeoptimization.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Unsupervised Contact Learning for Humanoid Estimation and Control

Rotella, N., Schaal, S., Righetti, L.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 411-417, IEEE, Brisbane, Australia, 2018 (inproceedings)

Abstract
This work presents a method for contact state estimation using fuzzy clustering to learn contact probability for full, six-dimensional humanoid contacts. The data required for training is solely from proprioceptive sensors - endeffector contact wrench sensors and inertial measurement units (IMUs) - and the method is completely unsupervised. The resulting cluster means are used to efficiently compute the probability of contact in each of the six endeffector degrees of freedom (DoFs) independently. This clustering-based contact probability estimator is validated in a kinematics-based base state estimator in a simulation environment with realistic added sensor noise for locomotion over rough, low-friction terrain on which the robot is subject to foot slip and rotation. The proposed base state estimator which utilizes these six DoF contact probability estimates is shown to perform considerably better than that which determines kinematic contact constraints purely based on measured normal force.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Learning Task-Specific Dynamics to Improve Whole-Body Control

Gams, A., Mason, S., Ude, A., Schaal, S., Righetti, L.

In Hua, IEEE, Beijing, China, November 2018 (inproceedings)

Abstract
In task-based inverse dynamics control, reference accelerations used to follow a desired plan can be broken down into feedforward and feedback trajectories. The feedback term accounts for tracking errors that are caused from inaccurate dynamic models or external disturbances. On underactuated, free-floating robots, such as humanoids, high feedback terms can be used to improve tracking accuracy; however, this can lead to very stiff behavior or poor tracking accuracy due to limited control bandwidth. In this paper, we show how to reduce the required contribution of the feedback controller by incorporating learned task-space reference accelerations. Thus, we i) improve the execution of the given specific task, and ii) offer the means to reduce feedback gains, providing for greater compliance of the system. With a systematic approach we also reduce heuristic tuning of the model parameters and feedback gains, often present in real-world experiments. In contrast to learning task-specific joint-torques, which might produce a similar effect but can lead to poor generalization, our approach directly learns the task-space dynamics of the center of mass of a humanoid robot. Simulated and real-world results on the lower part of the Sarcos Hermes humanoid robot demonstrate the applicability of the approach.

am mg

link (url) [BibTex]

link (url) [BibTex]


no image
An MPC Walking Framework With External Contact Forces

Mason, S., Rotella, N., Schaal, S., Righetti, L.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 1785-1790, IEEE, Brisbane, Australia, May 2018 (inproceedings)

Abstract
In this work, we present an extension to a linear Model Predictive Control (MPC) scheme that plans external contact forces for the robot when given multiple contact locations and their corresponding friction cone. To this end, we set up a two-step optimization problem. In the first optimization, we compute the Center of Mass (CoM) trajectory, foot step locations, and introduce slack variables to account for violating the imposed constraints on the Zero Moment Point (ZMP). We then use the slack variables to trigger the second optimization, in which we calculate the optimal external force that compensates for the ZMP tracking error. This optimization considers multiple contacts positions within the environment by formulating the problem as a Mixed Integer Quadratic Program (MIQP) that can be solved at a speed between 100-300 Hz. Once contact is created, the MIQP reduces to a single Quadratic Program (QP) that can be solved in real-time ({\textless}; 1kHz). Simulations show that the presented walking control scheme can withstand disturbances 2-3× larger with the additional force provided by a hand contact.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2017


Thumb xl amd intentiongan
Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets

Hausman, K., Chebotar, Y., Schaal, S., Sukhatme, G., Lim, J.

In Proceedings from the conference "Neural Information Processing Systems 2017., (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (inproceedings)

am

pdf video [BibTex]

2017


pdf video [BibTex]


Thumb xl fig toyex lqr1kernel 1
On the Design of LQR Kernels for Efficient Controller Learning

Marco, A., Hennig, P., Schaal, S., Trimpe, S.

Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC), pages: 5193-5200, IEEE, IEEE Conference on Decision and Control, December 2017 (conference)

Abstract
Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.

am ics pn

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]


Thumb xl robot legos
Interactive Perception: Leveraging Action in Perception and Perception in Action

Bohg, J., Hausman, K., Sankaran, B., Brock, O., Kragic, D., Schaal, S., Sukhatme, G.

IEEE Transactions on Robotics, 33, pages: 1273-1291, December 2017 (article)

Abstract
Recent approaches in robotics follow the insight that perception is facilitated by interactivity with the environment. These approaches are subsumed under the term of Interactive Perception (IP). We argue that IP provides the following benefits: (i) any type of forceful interaction with the environment creates a new type of informative sensory signal that would otherwise not be present and (ii) any prior knowledge about the nature of the interaction supports the interpretation of the signal. This is facilitated by knowledge of the regularity in the combined space of sensory information and action parameters. The goal of this survey is to postulate this as a principle and collect evidence in support by analyzing and categorizing existing work in this area. We also provide an overview of the most important applications of Interactive Perception. We close this survey by discussing the remaining open questions. Thereby, we hope to define a field and inspire future work.

am

arXiv DOI Project Page [BibTex]

arXiv DOI Project Page [BibTex]


Thumb xl teaser
Optimizing Long-term Predictions for Model-based Policy Search

Doerr, A., Daniel, C., Nguyen-Tuong, D., Marco, A., Schaal, S., Toussaint, M., Trimpe, S.

Proceedings of 1st Annual Conference on Robot Learning (CoRL), 78, pages: 227-238, (Editors: Sergey Levine and Vincent Vanhoucke and Ken Goldberg), 1st Annual Conference on Robot Learning, November 2017 (conference)

Abstract
We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, modelbased RL suffers from various imperfections such as noisy input and output data, delays and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive compared to state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.

am ics

PDF Project Page [BibTex]

PDF Project Page [BibTex]


Thumb xl qg net rev
Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning

Li, W., Bohg, J., Fritz, M.

arXiv, November 2017 (article) Submitted

Abstract
Understanding physical phenomena is a key component of human intelligence and enables physical interaction with previously unseen environments. In this paper, we study how an artificial agent can autonomously acquire this intuition through interaction with the environment. We created a synthetic block stacking environment with physics simulation in which the agent can learn a policy end-to-end through trial and error. Thereby, we bypass to explicitly model physical knowledge within the policy. We are specifically interested in tasks that require the agent to reach a given goal state that may be different for every new trial. To this end, we propose a deep reinforcement learning framework that learns policies which are parametrized by a goal. We validated the model on a toy example navigating in a grid world with different target positions and in a block stacking task with different target structures of the final tower. In contrast to prior work, our policies show better generalization across different goals.

am

arXiv [BibTex]


no image
Learning optimal gait parameters and impedance profiles for legged locomotion

Heijmink, E., Radulescu, A., Ponton, B., Barasuol, V., Caldwell, D., Semini, C.

Proceedings International Conference on Humanoid Robots, IEEE, 2017 IEEE-RAS 17th International Conference on Humanoid Robots, November 2017 (conference)

Abstract
The successful execution of complex modern robotic tasks often relies on the correct tuning of a large number of parameters. In this paper we present a methodology for improving the performance of a trotting gait by learning the gait parameters, impedance profile and the gains of the control architecture. We show results on a set of terrains, for various speeds using a realistic simulation of a hydraulically actuated system. Our method achieves a reduction in the gait's mechanical energy consumption during locomotion of up to 26%. The simulation results are validated in experimental trials on the hardware system.

am

paper [BibTex]

paper [BibTex]


no image
A New Data Source for Inverse Dynamics Learning

Kappler, D., Meier, F., Ratliff, N., Schaal, S.

In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, Piscataway, NJ, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017 (inproceedings)

am

[BibTex]

[BibTex]


no image
Bayesian Regression for Artifact Correction in Electroencephalography

Fiebig, K., Jayaram, V., Hesse, T., Blank, A., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 131-136, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

am ei

DOI [BibTex]

DOI [BibTex]


no image
Investigating Music Imagery as a Cognitive Paradigm for Low-Cost Brain-Computer Interfaces

Grossberger, L., Hohmann, M. R., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 160-164, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

am ei

DOI [BibTex]

DOI [BibTex]


Thumb xl screen shot 2017 08 01 at 15.41.10
On the relevance of grasp metrics for predicting grasp success

Rubert, C., Kappler, D., Morales, A., Schaal, S., Bohg, J.

In Proceedings of the IEEE/RSJ International Conference of Intelligent Robots and Systems, September 2017 (inproceedings) Accepted

Abstract
We aim to reliably predict whether a grasp on a known object is successful before it is executed in the real world. There is an entire suite of grasp metrics that has already been developed which rely on precisely known contact points between object and hand. However, it remains unclear whether and how they may be combined into a general purpose grasp stability predictor. In this paper, we analyze these questions by leveraging a large scale database of simulated grasps on a wide variety of objects. For each grasp, we compute the value of seven metrics. Each grasp is annotated by human subjects with ground truth stability labels. Given this data set, we train several classification methods to find out whether there is some underlying, non-trivial structure in the data that is difficult to model manually but can be learned. Quantitative and qualitative results show the complexity of the prediction problem. We found that a good prediction performance critically depends on using a combination of metrics as input features. Furthermore, non-parametric and non-linear classifiers best capture the structure in the data.

am

Project Page [BibTex]

Project Page [BibTex]


no image
Local Bayesian Optimization of Motor Skills

Akrour, R., Sorokin, D., Peters, J., Neumann, G.

Proceedings of the 34th International Conference on Machine Learning, 70, pages: 41-50, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

am ei

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


Thumb xl pilqr cover
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning

Chebotar, Y., Hausman, K., Zhang, M., Sukhatme, G., Schaal, S., Levine, S.

Proceedings of the 34th International Conference on Machine Learning, 70, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

am

pdf video [BibTex]

pdf video [BibTex]


no image
Event-based State Estimation: An Emulation-based Approach

Trimpe, S.

IET Control Theory & Applications, 11(11):1684-1693, July 2017 (article)

Abstract
An event-based state estimation approach for reducing communication in a networked control system is proposed. Multiple distributed sensor agents observe a dynamic process and sporadically transmit their measurements to estimator agents over a shared bus network. Local event-triggering protocols ensure that data is transmitted only when necessary to meet a desired estimation accuracy. The event-based design is shown to emulate the performance of a centralised state observer design up to guaranteed bounds, but with reduced communication. The stability results for state estimation are extended to the distributed control system that results when the local estimates are used for feedback control. Results from numerical simulations and hardware experiments illustrate the effectiveness of the proposed approach in reducing network communication.

am ics

arXiv Supplementary material PDF DOI Project Page [BibTex]

arXiv Supplementary material PDF DOI Project Page [BibTex]


Thumb xl apollo system2 croped
Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

Doerr, A., Nguyen-Tuong, D., Marco, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 5295-5301, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics

PDF arXiv DOI Project Page [BibTex]

PDF arXiv DOI Project Page [BibTex]


Thumb xl learning ct block diagram v2
Learning Feedback Terms for Reactive Planning and Control

Rai, A., Sutanto, G., Schaal, S., Meier, F.

Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (conference)

am

pdf video [BibTex]

pdf video [BibTex]


Thumb xl this one
Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Marco, A., Berkenkamp, F., Hennig, P., Schoellig, A. P., Krause, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 1557-1563, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics pn

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]


Thumb xl cover
Path Integral Guided Policy Search

Chebotar, Y., Kalakrishnan, M., Yahya, A., Li, A., Schaal, S., Levine, S.

Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (conference)

am

pdf video [BibTex]

pdf video [BibTex]


Thumb xl fig  quali  arm
Probabilistic Articulated Real-Time Tracking for Robot Manipulation

(Best Paper of RA-L 2017, Finalist of Best Robotic Vision Paper Award of ICRA 2017)

Garcia Cifuentes, C., Issac, J., Wüthrich, M., Schaal, S., Bohg, J.

IEEE Robotics and Automation Letters (RA-L), 2(2):577-584, April 2017 (article)

Abstract
We propose a probabilistic filtering method which fuses joint measurements with depth images to yield a precise, real-time estimate of the end-effector pose in the camera frame. This avoids the need for frame transformations when using it in combination with visual object tracking methods. Precision is achieved by modeling and correcting biases in the joint measurements as well as inaccuracies in the robot model, such as poor extrinsic camera calibration. We make our method computationally efficient through a principled combination of Kalman filtering of the joint measurements and asynchronous depth-image updates based on the Coordinate Particle Filter. We quantitatively evaluate our approach on a dataset recorded from a real robotic platform, annotated with ground truth from a motion capture system. We show that our approach is robust and accurate even under challenging conditions such as fast motion, significant and long-term occlusions, and time-varying biases. We release the dataset along with open-source code of our approach to allow for quantitative comparison with alternative approaches.

am

arXiv video code and dataset video PDF DOI Project Page [BibTex]


no image
Anticipatory Action Selection for Human-Robot Table Tennis

Wang, Z., Boularias, A., Mülling, K., Schölkopf, B., Peters, J.

Artificial Intelligence, 247, pages: 399-414, 2017, Special Issue on AI and Robotics (article)

Abstract
Abstract Anticipation can enhance the capability of a robot in its interaction with humans, where the robot predicts the humans' intention for selecting its own action. We present a novel framework of anticipatory action selection for human-robot interaction, which is capable to handle nonlinear and stochastic human behaviors such as table tennis strokes and allows the robot to choose the optimal action based on prediction of the human partner's intention with uncertainty. The presented framework is generic and can be used in many human-robot interaction scenarios, for example, in navigation and human-robot co-manipulation. In this article, we conduct a case study on human-robot table tennis. Due to the limited amount of time for executing hitting movements, a robot usually needs to initiate its hitting movement before the opponent hits the ball, which requires the robot to be anticipatory based on visual observation of the opponent's movement. Previous work on Intention-Driven Dynamics Models (IDDM) allowed the robot to predict the intended target of the opponent. In this article, we address the problem of action selection and optimal timing for initiating a chosen action by formulating the anticipatory action selection as a Partially Observable Markov Decision Process (POMDP), where the transition and observation are modeled by the \{IDDM\} framework. We present two approaches to anticipatory action selection based on the \{POMDP\} formulation, i.e., a model-free policy learning method based on Least-Squares Policy Iteration (LSPI) that employs the \{IDDM\} for belief updates, and a model-based Monte-Carlo Planning (MCP) method, which benefits from the transition and observation model by the IDDM. Experimental results using real data in a simulated environment show the importance of anticipatory action selection, and that \{POMDPs\} are suitable to formulate the anticipatory action selection problem by taking into account the uncertainties in prediction. We also show that existing algorithms for POMDPs, such as \{LSPI\} and MCP, can be applied to substantially improve the robot's performance in its interaction with humans.

am ei

DOI Project Page [BibTex]

DOI Project Page [BibTex]


no image
Robot Learning

Peters, J., Lee, D., Kober, J., Nguyen-Tuong, D., Bagnell, J., Schaal, S.

In Springer Handbook of Robotics, pages: 357-394, 15, 2nd, (Editors: Siciliano, Bruno and Khatib, Oussama), Springer International Publishing, 2017 (inbook)

am ei

Project Page [BibTex]

Project Page [BibTex]

2011


Thumb xl screen shot 2015 08 23 at 15.47.13
Multi-Modal Scene Understanding for Robotic Grasping

Bohg, J.

(2011:17):vi, 194, Trita-CSC-A, KTH Royal Institute of Technology, KTH, Computer Vision and Active Perception, CVAP, Centre for Autonomous Systems, CAS, KTH, Centre for Autonomous Systems, CAS, December 2011 (phdthesis)

Abstract
Current robotics research is largely driven by the vision of creating an intelligent being that can perform dangerous, difficult or unpopular tasks. These can for example be exploring the surface of planet mars or the bottom of the ocean, maintaining a furnace or assembling a car. They can also be more mundane such as cleaning an apartment or fetching groceries. This vision has been pursued since the 1960s when the first robots were built. Some of the tasks mentioned above, especially those in industrial manufacturing, are already frequently performed by robots. Others are still completely out of reach. Especially, household robots are far away from being deployable as general purpose devices. Although advancements have been made in this research area, robots are not yet able to perform household chores robustly in unstructured and open-ended environments given unexpected events and uncertainty in perception and execution.In this thesis, we are analyzing which perceptual and motor capabilities are necessary for the robot to perform common tasks in a household scenario. In that context, an essential capability is to understand the scene that the robot has to interact with. This involves separating objects from the background but also from each other.Once this is achieved, many other tasks become much easier. Configuration of object scan be determined; they can be identified or categorized; their pose can be estimated; free and occupied space in the environment can be outlined.This kind of scene model can then inform grasp planning algorithms to finally pick up objects.However, scene understanding is not a trivial problem and even state-of-the-art methods may fail. Given an incomplete, noisy and potentially erroneously segmented scene model, the questions remain how suitable grasps can be planned and how they can be executed robustly.In this thesis, we propose to equip the robot with a set of prediction mechanisms that allow it to hypothesize about parts of the scene it has not yet observed. Additionally, the robot can also quantify how uncertain it is about this prediction allowing it to plan actions for exploring the scene at specifically uncertain places. We consider multiple modalities including monocular and stereo vision, haptic sensing and information obtained through a human-robot dialog system. We also study several scene representations of different complexity and their applicability to a grasping scenario. Given an improved scene model from this multi-modal exploration, grasps can be inferred for each object hypothesis. Dependent on whether the objects are known, familiar or unknown, different methodologies for grasp inference apply. In this thesis, we propose novel methods for each of these cases. Furthermore,we demonstrate the execution of these grasp both in a closed and open-loop manner showing the effectiveness of the proposed methods in real-world scenarios.

am

pdf [BibTex]

2011


pdf [BibTex]


Thumb xl kthexecution
Mind the gap - robotic grasping under incomplete observation

Bohg, J., Johnson-Roberson, M., Leon, B., Felip, J., Gratal, X., Bergstrom, N., Kragic, D., Morales, A.

In Robotics and Automation (ICRA), 2011 IEEE International Conference on, pages: 686-693, May 2011 (inproceedings)

Abstract
We consider the problem of grasp and manipulation planning when the state of the world is only partially observable. Specifically, we address the task of picking up unknown objects from a table top. The proposed approach to object shape prediction aims at closing the knowledge gaps in the robot's understanding of the world. A completed state estimate of the environment can then be provided to a simulator in which stable grasps and collision-free movements are planned. The proposed approach is based on the observation that many objects commonly in use in a service robotic scenario possess symmetries. We search for the optimal parameters of these symmetries given visibility constraints. Once found, the point cloud is completed and a surface mesh reconstructed. Quantitative experiments show that the predictions are valid approximations of the real object shape. By demonstrating the approach on two very different robotic platforms its generality is emphasized.

am

pdf video code data DOI Project Page [BibTex]

pdf video code data DOI Project Page [BibTex]


no image
Learning, planning, and control for quadruped locomotion over challenging terrain

Kalakrishnan, Mrinal, Buchli, Jonas, Pastor, Peter, Mistry, Michael, Schaal, S.

International Journal of Robotics Research, 30(2):236-258, February 2011 (article)

am

[BibTex]

[BibTex]


no image
STOMP: Stochastic trajectory optimization for motion planning

Kalakrishnan, M., Chitta, S., Theodorou, E., Pastor, P., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, May 9-13, 2011, clmc (inproceedings)

Abstract
We present a new approach to motion planning using a stochastic trajectory optimization framework. The approach relies on generating noisy trajectories to explore the space around an initial (possibly infeasible) trajectory, which are then combined to produced an updated trajectory with lower cost. A cost function based on a combination of obstacle and smoothness cost is optimized in each iteration. No gradient information is required for the particular optimization algorithm that we use and so general costs for which derivatives may not be available (e.g. costs corresponding to constraints and motor torques) can be included in the cost function. We demonstrate the approach both in simulation and on a dual-arm mobile manipulation system for unconstrained and constrained tasks. We experimentally show that the stochastic nature of STOMP allows it to overcome local minima that gradient-based optimizers like CHOMP can get stuck in.

am

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


no image
An Experimental Demonstration of a Distributed and Event-based State Estimation Algorithm

(Best Interactive Paper Award (top out of 450))

Trimpe, S., D’Andrea, R.

In Proceedings of the 18th IFAC World Congress, 2011 (inproceedings)

am ics

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Path Integral Control and Bounded Rationality

Braun, D. A., Ortega, P. A., Theodorou, E., Schaal, S.

In IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2011, clmc (inproceedings)

Abstract
Path integral methods [7], [15],[1] have recently been shown to be applicable to a very general class of optimal control problems. Here we examine the path integral formalism from a decision-theoretic point of view, since an optimal controller can always be regarded as an instance of a perfectly rational decision-maker that chooses its actions so as to maximize its expected utility [8]. The problem with perfect rationality is, however, that finding optimal actions is often very difficult due to prohibitive computational resource costs that are not taken into account. In contrast, a bounded rational decision-maker has only limited resources and therefore needs to strike some compromise between the desired utility and the required resource costs [14]. In particular, we suggest an information-theoretic measure of resource costs that can be derived axiomatically [11]. As a consequence we obtain a variational principle for choice probabilities that trades off maximizing a given utility criterion and avoiding resource costs that arise due to deviating from initially given default choice probabilities. The resulting bounded rational policies are in general probabilistic. We show that the solutions found by the path integral formalism are such bounded rational policies. Furthermore, we show that the same formalism generalizes to discrete control problems, leading to linearly solvable bounded rational control policies in the case of Markov systems. Importantly, Bellman?s optimality principle is not presupposed by this variational principle, but it can be derived as a limit case. This suggests that the information- theoretic formalization of bounded rationality might serve as a general principle in control design that unifies a number of recently reported approximate optimal control methods both in the continuous and discrete domain.

am

PDF [BibTex]

PDF [BibTex]


no image
Skill learning and task outcome prediction for manipulation

Pastor, P., Kalakrishnan, M., Chitta, S., Theodorou, E., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, May 9-13, 2011, clmc (inproceedings)

Abstract
Learning complex motor skills for real world tasks is a hard problem in robotic manipulation that often requires painstaking manual tuning and design by a human expert. In this work, we present a Reinforcement Learning based approach to acquiring new motor skills from demonstration. Our approach allows the robot to learn fine manipulation skills and significantly improve its success rate and skill level starting from a possibly coarse demonstration. Our approach aims to incorporate task domain knowledge, where appropriate, by working in a space consistent with the constraints of a specific task. In addition, we also present an approach to using sensor feedback to learn a predictive model of the task outcome. This allows our system to learn the proprioceptive sensor feedback needed to monitor subsequent executions of the task online and abort execution in the event of predicted failure. We illustrate our approach using two example tasks executed with the PR2 dual-arm robot: a straight and accurate pool stroke and a box flipping task using two chopsticks as tools.

am

link (url) Project Page Project Page [BibTex]

link (url) Project Page Project Page [BibTex]


no image
An Iterative Path Integral Stochastic Optimal Control Approach for Learning Robotic Tasks

Theodorou, E., Stulp, F., Buchli, J., Schaal, S.

In Proceedings of the 18th World Congress of the International Federation of Automatic Control, 2011, clmc (inproceedings)

Abstract
Recent work on path integral stochastic optimal control theory Theodorou et al. (2010a); Theodorou (2011) has shown promising results in planning and control of nonlinear systems in high dimensional state spaces. The path integral control framework relies on the transformation of the nonlinear Hamilton Jacobi Bellman (HJB) partial differential equation (PDE) into a linear PDE and the approximation of its solution via the use of the Feynman Kac lemma. In this work, we are reviewing the generalized version of path integral stochastic optimal control formalism Theodorou et al. (2010a), used for optimal control and planing of stochastic dynamical systems with state dependent control and diffusion matrices. Moreover we present the iterative path integral control approach, the so called Policy Improvement with Path Integrals or (PI2 ) which is capable of scaling in high dimensional robotic control problems. Furthermore we present a convergence analysis of the proposed algorithm and we apply the proposed framework to a variety of robotic tasks. Finally with the goal to perform locomotion the iterative path integral control is applied for learning nonlinear limit cycle attractors with adjustable land scape.

am

PDF [BibTex]

PDF [BibTex]


Thumb xl multi modal 2
Enhanced visual scene understanding through human-robot dialog

Johnson-Roberson, M., Bohg, J., Skantze, G., Gustafson, J., Carlson, R., Rasolzadeh, B., Kragic, D.

In Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on, pages: 3342-3348, 2011 (inproceedings)

Abstract
We propose a novel human-robot-interaction framework for robust visual scene understanding. Without any a-priori knowledge about the objects, the task of the robot is to correctly enumerate how many of them are in the scene and segment them from the background. Our approach builds on top of state-of-the-art computer vision methods, generating object hypotheses through segmentation. This process is combined with a natural dialog system, thus including a `human in the loop' where, by exploiting the natural conversation of an advanced dialog system, the robot gains knowledge about ambiguous situations. We present an entropy-based system allowing the robot to detect the poorest object hypotheses and query the user for arbitration. Based on the information obtained from the human-robot dialog, the scene segmentation can be re-seeded and thereby improved. We present experimental results on real data that show an improved segmentation performance compared to segmentation without interaction.

am

pdf video DOI Project Page [BibTex]

pdf video DOI Project Page [BibTex]


Thumb xl battery2
Risk and gain battery management for self-docking mobile robots

Berenz, V., Suzuki, K.

In Robotics and Biomimetics (ROBIO), 2011 IEEE International Conference on, pages: 1766-1771, 2011 (inproceedings)

am

DOI [BibTex]

DOI [BibTex]


no image
Reduced Communication State Estimation for Control of an Unstable Networked Control System

Trimpe, S., D’Andrea, R.

In Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference, 2011 (inproceedings)

am ics

PDF Supplementary material DOI [BibTex]

PDF Supplementary material DOI [BibTex]


no image
Neuromuscular Stochastic Optimal Control of a Tendon Driven Index Finger

Theodorou, E. A., Todorov, E., Valero-Cuevas, F.

In Proceedings of American Control Conference (ACC), 2011, clmc (inproceedings)

Abstract
With the goal to build robotic hands which can reach the levels of dexterity and robustness of the hand, the question of what are the candidate control principles that can handle the nonlinearities, the high dimensionality and the internal noise of biomechanical structures of the complexity of the hand, is still open. In this work we present the first stochastic optimal feedback controller applied to a full tendon driven simulated robotic index finger. In our model we do take into account the full tendon structure of the index finger which consist of 11 tendons based on the underlying physiology and we consider muscle with the typical force - length and force velocity properties. Our feedback controller show robustness against noise and perturbation of the dynamics while it can also successfully handle the nonlinearities and high dimensionality of the robotic index finger. Furthermore as it is shown in the evaluations, it provides the complete time history of the tendon excursions and the tendon velocities of the index finger for the tasks of tapping with zero and nonzero terminal velocities.

am

PDF [BibTex]

PDF [BibTex]


no image
Bayesian robot system identification with input and output noise

Ting, J., D’Souza, A., Schaal, S.

Neural Networks, 24(1):99-108, 2011, clmc (article)

Abstract
For complex robots such as humanoids, model-based control is highly beneficial for accurate tracking while keeping negative feedback gains low for compliance. However, in such multi degree-of-freedom lightweight systems, conventional identification of rigid body dynamics models using CAD data and actuator models is inaccurate due to unknown nonlinear robot dynamic effects. An alternative method is data-driven parameter estimation, but significant noise in measured and inferred variables affects it adversely. Moreover, standard estimation procedures may give physically inconsistent results due to unmodeled nonlinearities or insufficiently rich data. This paper addresses these problems, proposing a Bayesian system identification technique for linear or piecewise linear systems. Inspired by Factor Analysis regression, we develop a computationally efficient variational Bayesian regression algorithm that is robust to ill-conditioned data, automatically detects relevant features, and identifies input and output noise. We evaluate our approach on rigid body parameter estimation for various robotic systems, achieving an error of up to three times lower than other state-of-the-art machine learning methods

am

link (url) [BibTex]

link (url) [BibTex]