2014


Multi-Task Policy Search for Robotics

Deisenroth, M., Englert, P., Peters, J., Fox, D.

In Proceedings of 2014 IEEE International Conference on Robotics and Automation, pages: 3876-3881, IEEE, ICRA, 2014 (inproceedings)

ei

PDF DOI [BibTex]

Sample-Based Information-Theoretic Stochastic Optimal Control

Lioutikov, R., Paraschos, A., Peters, J., Neumann, G.

In Proceedings of 2014 IEEE International Conference on Robotics and Automation, pages: 3896-3902, IEEE, ICRA, 2014 (inproceedings)

ei

PDF DOI [BibTex]

Probabilistic Shortest Path Tractography in DTI Using Gaussian Process ODE Solvers

Schober, M., Kasenburg, N., Feragen, A., Hennig, P., Hauberg, S.

In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014, Lecture Notes in Computer Science Vol. 8675, pages: 265-272, (Editors: P. Golland, N. Hata, C. Barillot, J. Hornegger and R. Howe), Springer, Heidelberg, MICCAI, 2014 (inproceedings)

ei pn

DOI [BibTex]

Estimating Causal Effects by Bounding Confounding

Geiger, P., Janzing, D., Schölkopf, B.

In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pages: 240-249, (Editors: Nevin L. Zhang and Jin Tian), AUAI Press, Corvallis, Oregon, UAI, 2014 (inproceedings)

ei

PDF [BibTex]

Re-ranking Approach to Classification in Large-scale Power-law Distributed Category Systems

Babbar, R., Partalas, I., Gaussier, E., Amini, M.

In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, pages: 1059-1062, (Editors: S Geva and A Trotman and P Bruza and CLA Clarke and K Järvelin), ACM, New York, NY, USA, SIGIR, 2014 (inproceedings)

ei

DOI [BibTex]

Kernel Mean Estimation and Stein Effect

Muandet, K., Fukumizu, K., Sriperumbudur, B., Gretton, A., Schölkopf, B.

In Proceedings of the 31st International Conference on Machine Learning, W&CP 32 (1), pages: 10-18, (Editors: Eric P. Xing and Tony Jebara), JMLR, ICML, 2014 (inproceedings)

ei

PDF [BibTex]

Active Reward Learning

Daniel, C., Viering, M., Metz, J., Kroemer, O., Peters, J.

In Proceedings of Robotics: Science & Systems, (Editors: Fox, D., Kavraki, LE., and Kurniawati, H.), RSS, 2014 (inproceedings)

ei

PDF [BibTex]

Multi-modal filtering for non-linear estimation

Kamthe, S., Peters, J., Deisenroth, M.

In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pages: 7979-7983, IEEE, ICASSP, 2014 (inproceedings)

ei

PDF DOI [BibTex]

Inferring latent structures via information inequalities

Chaves, R., Luft, L., Maciel, T., Gross, D., Janzing, D., Schölkopf, B.

In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pages: 112-121, (Editors: NL Zhang and J Tian), AUAI Press, Corvallis, Oregon, UAI, 2014 (inproceedings)

ei

PDF [BibTex]

Policy Search For Learning Robot Control Using Sparse Data

Bischoff, B., Nguyen-Tuong, D., van Hoof, H., McHutchon, A., Rasmussen, C., Knoll, A., Peters, J., Deisenroth, M.

In Proceedings of 2014 IEEE International Conference on Robotics and Automation, pages: 3882-3887, IEEE, ICRA, 2014 (inproceedings)

ei

PDF DOI [BibTex]

Learning to Unscrew a Light Bulb from Demonstrations

Manschitz, S., Kober, J., Gienger, M., Peters, J.

In Proceedings for the joint conference of ISR 2014, 45th International Symposium on Robotics and Robotik 2014, 2014 (inproceedings)

ei

[BibTex]

Towards Neurofeedback Training of Associative Brain Areas for Stroke Rehabilitation

Özdenizci, O., Meyer, T., Cetin, M., Grosse-Wentrup, M.

In Proceedings of the 6th International Brain-Computer Interface Conference, (Editors: G Müller-Putz and G Bauernfeind and C Brunner and D Steyrl and S Wriessnegger and R Scherer), 2014 (inproceedings)

ei

PDF DOI [BibTex]

Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature

Gunter, T., Osborne, M., Garnett, R., Hennig, P., Roberts, S.

In Advances in Neural Information Processing Systems 27, pages: 2789-2797, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), Curran Associates, Inc., 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (inproceedings)

ei pn

Web link (url) [BibTex]

A Self-Tuning LQR Approach Demonstrated on an Inverted Pendulum

Trimpe, S., Millane, A., Doessegger, S., D’Andrea, R.

In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, 2014 (inproceedings)

am ics

PDF Supplementary material DOI [BibTex]

Fast Newton methods for the group fused lasso

Wytock, M., Sra, S., Kolter, J. Z.

In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pages: 888-897, (Editors: Zhang, N. L. and Tian, J.), AUAI Press, UAI, 2014 (inproceedings)

ei

link (url) [BibTex]

Learning coupling terms for obstacle avoidance

Rai, A., Meier, F., Ijspeert, A., Schaal, S.

In International Conference on Humanoid Robotics, pages: 512-518, IEEE, 2014, clmc (inproceedings)

Abstract
Autonomous manipulation in dynamic environments is important for robots to perform everyday tasks. For this, a manipulator should be capable of interpreting the environment and planning an appropriate movement. At least two possible approaches to this exist in the literature. Usually, a planning system is used to generate a complex movement plan that satisfies all constraints. Alternatively, a simple plan could be chosen and modified with sensory feedback to accommodate additional constraints by equipping the controller with features that remain dormant most of the time, except when specific situations arise. Dynamic Movement Primitives (DMPs) form a robust and versatile starting point for such a controller that can be modified online using a non-linear term, called the coupling term. This can prove to be a fast and reactive way of avoiding obstacles in a human-like fashion. We propose a method to learn this coupling term from human demonstrations, starting with simple features and making it more robust so that it avoids a larger range of obstacles. We test the ability of our coupling term to model different kinds of obstacle avoidance behaviours in humans and use this learnt coupling term to avoid obstacles in a reactive manner. This line of research aims at pushing the boundary of reactive control strategies to more complex scenarios, such that complex and usually computationally more expensive planning methods can be avoided as much as possible.

am

link (url) Project Page [BibTex]

Efficient Structured Matrix Rank Minimization

Yu, A. W., Ma, W., Yu, Y., Carbonell, J., Sra, S.

Advances in Neural Information Processing Systems 27, pages: 1350-1358, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), Curran Associates, Inc., 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (conference)

ei

link (url) [BibTex]

Towards building a Crowd-Sourced Sky Map

Lang, D., Hogg, D., Schölkopf, B.

In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, JMLR W&CP 33, pages: 549–557, (Editors: S. Kaski and J. Corander), JMLR.org, AISTATS, 2014 (inproceedings)

ei

link (url) [BibTex]

Active Microrheology of the Vitreous of the Eye applied to Nanorobot Propulsion

Qiu, T., Schamel, D., Mark, A. G., Fischer, P.

In 2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), pages: 3801-3806, IEEE International Conference on Robotics and Automation ICRA, 2014, Best Automation Paper Award – Finalist. (inproceedings)

Abstract
Biomedical applications of micro or nanorobots require active movement through complex biological fluids. These are generally non-Newtonian (viscoelastic) fluids that are characterized by complicated networks of macromolecules that have size-dependent rheological properties. It has been suggested that an untethered microrobot could assist in retinal surgical procedures. To do this it must navigate the vitreous humor, a hydrated double network of collagen fibrils and high molecular-weight, polyanionic hyaluronan macromolecules. Here, we examine the characteristic size that potential robots must have to traverse vitreous relatively unhindered. We have constructed magnetic tweezers that provide a large gradient of up to 320 T/m to pull sub-micron paramagnetic beads through biological fluids. A novel two-step electrical discharge machining (EDM) approach is used to construct the tips of the magnetic tweezers with a resolution of 30 μm and a high aspect ratio of ~17:1 that restricts the magnetic field gradient to the plane of observation. We report measurements on porcine vitreous. In agreement with structural data and passive Brownian diffusion studies, we find that unhindered active propulsion through the eye calls for nanorobots with cross-sections of less than 500 nm.

pf

[BibTex]

Incremental Local Gaussian Regression

Meier, F., Hennig, P., Schaal, S.

In Advances in Neural Information Processing Systems 27, pages: 972-980, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014, clmc (inproceedings)

am ei pn

PDF link (url) [BibTex]

Learning to Deblur

Schuler, C. J., Hirsch, M., Harmeling, S., Schölkopf, B.

In NIPS 2014 Deep Learning and Representation Learning Workshop, 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (inproceedings)

ei

link (url) [BibTex]

Efficient Bayesian Local Model Learning for Control

Meier, F., Hennig, P., Schaal, S.

In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, pages: 2244 - 2249, IROS, 2014, clmc (inproceedings)

Abstract
Model-based control is essential for compliant control and force control in many modern complex robots, like humanoid or disaster robots. Due to many unknown and hard to model nonlinearities, analytical models of such robots are often only very rough approximations. However, modern optimization controllers frequently depend on reasonably accurate models, and degrade greatly in robustness and performance if model errors are too large. For a long time, machine learning has been expected to provide automatic empirical model synthesis, yet so far, research has only generated feasibility studies but no learning algorithms that run reliably on complex robots. In this paper, we combine two promising worlds of regression techniques to generate a more powerful regression learning system. On the one hand, locally weighted regression techniques are computationally efficient, but hard to tune due to a variety of data dependent meta-parameters. On the other hand, Bayesian regression has rather automatic and robust methods to set learning parameters, but becomes quickly computationally infeasible for big and high-dimensional data sets. By reducing the complexity of Bayesian regression in the spirit of local model learning through variational approximations, we arrive at a novel algorithm that is computationally efficient and easy to initialize for robust learning. Evaluations on several datasets demonstrate very good learning performance and the potential for a general regression learning tool for robotics.

am ei pn

PDF link (url) DOI [BibTex]

Stability Analysis of Distributed Event-Based State Estimation

Trimpe, S.

In Proceedings of the 53rd IEEE Conference on Decision and Control, Los Angeles, CA, 2014 (inproceedings)

Abstract
An approach for distributed and event-based state estimation that was proposed in previous work [1] is analyzed and extended to practical networked systems in this paper. Multiple sensor-actuator-agents observe a dynamic process, sporadically exchange their measurements over a broadcast network according to an event-based protocol, and estimate the process state from the received data. The event-based approach was shown in [1] to mimic a centralized Luenberger observer up to guaranteed bounds, under the assumption of identical estimates on all agents. This assumption, however, is unrealistic (it is violated by a single packet drop or slight numerical inaccuracy) and removed herein. By means of a simulation example, it is shown that non-identical estimates can actually destabilize the overall system. To achieve stability, the event-based communication scheme is supplemented by periodic (but infrequent) exchange of the agents' estimates and reset to their joint average. When the local estimates are used for feedback control, the stability guarantee for the estimation problem extends to the event-based control system.

am ics

PDF Supplementary material DOI Project Page [BibTex]
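
The event-based idea summarized in the abstract above (transmit a measurement only when it matters, and occasionally reset all agents to their joint average) can be illustrated with a minimal scalar sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar model `x[k+1] = a * x[k]`, the trigger rule "send when the measurement deviates from the prediction by more than `threshold`", and the helper names `event_based_estimate` and `reset_to_average`.

```python
def event_based_estimate(measurements, a, gain, threshold, x_hat0=0.0):
    """Scalar event-triggered observer (illustrative assumption, not the paper's protocol).

    A measurement is "transmitted" only if it differs from the model
    prediction by more than `threshold`; otherwise the estimate is
    propagated open loop.
    Returns (estimates, sent_flags).
    """
    x_hat = x_hat0
    estimates, sent = [], []
    for y in measurements:
        x_pred = a * x_hat                      # model-based prediction
        transmit = abs(y - x_pred) > threshold  # hypothetical event trigger
        if transmit:
            # Luenberger-style correction using the transmitted measurement
            x_hat = x_pred + gain * (y - x_pred)
        else:
            x_hat = x_pred
        estimates.append(x_hat)
        sent.append(transmit)
    return estimates, sent


def reset_to_average(agent_estimates):
    """Replace every agent's estimate by the joint average of all estimates."""
    avg = sum(agent_estimates) / len(agent_estimates)
    return [avg] * len(agent_estimates)


if __name__ == "__main__":
    est, sent = event_based_estimate([1.0, 1.0], a=1.0, gain=1.0, threshold=0.5)
    print(est, sent)
    print(reset_to_average([1.0, 3.0]))
```

In this toy setting the first measurement triggers a transmission and the second does not, because the estimate already predicts it within the threshold.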

Towards an optimal stochastic alternating direction method of multipliers

Azadi, S., Sra, S.

Proceedings of the 31st International Conference on Machine Learning, 32, pages: 620-628, (Editors: Xing, E. P. and Jebara, T.), JMLR, ICML, 2014 (conference)

ei

link (url) [BibTex]

Open Problem: Finding Good Cascade Sampling Processes for the Network Inference Problem

Gomez Rodriguez, M., Song, L., Schölkopf, B.

Proceedings of the 27th Conference on Learning Theory, 35, pages: 1276-1279, (Editors: Balcan, M.-F. and Szepesvári, C.), JMLR.org, COLT, 2014 (conference)

ei

PDF [BibTex]

Three-dimensional robotic manipulation and transport of micro-scale objects by a magnetically driven capillary micro-gripper

Giltinan, J., Diller, E., Mayda, C., Sitti, M.

In Robotics and Automation (ICRA), 2014 IEEE International Conference on, pages: 2077-2082, 2014 (inproceedings)

pi

Project Page [BibTex]

Increasing the sensor performance using Au modified high temperature superconducting YBa2Cu3O7-delta thin films

Katzer, C., Stahl, C., Michalowski, P., Treiber, S., Westernhausen, M., Schmidl, F., Seidel, P., Schütz, G., Albrecht, J.

In 507, IOP Pub., Genova, Italy, 2014 (inproceedings)

mms

DOI [BibTex]

Self-Exploration of the Stumpy Robot with Predictive Information Maximization

Martius, G., Jahn, L., Hauser, H., Hafner, V. V.

In Proc. From Animals to Animats, SAB 2014, 8575, pages: 32-42, LNCS, Springer, 2014 (inproceedings)

al

[BibTex]

Dual Execution of Optimized Contact Interaction Trajectories

Toussaint, M., Ratliff, N., Bohg, J., Righetti, L., Englert, P., Schaal, S.

In 2014 IEEE/RSJ Conference on Intelligent Robots and Systems, pages: 47-54, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
Efficient manipulation requires contact to reduce uncertainty. The manipulation literature refers to this as funneling: a methodology for increasing reliability and robustness by leveraging haptic feedback and control of environmental interaction. However, there is a fundamental gap between traditional approaches to trajectory optimization and this concept of robustness by funneling: traditional trajectory optimizers do not discover force feedback strategies. From a POMDP perspective, these behaviors could be regarded as explicit observation actions planned to sufficiently reduce uncertainty, thereby enabling a task. While we are sympathetic to the full POMDP view, solving full continuous-space POMDPs in high dimensions is hard. In this paper, we propose an alternative approach in which trajectory optimization objectives are augmented with new terms that reward uncertainty reduction through contacts, explicitly promoting funneling. This augmentation shifts the responsibility of robustness toward the actual execution of the optimized trajectories. Directly tracing trajectories through configuration space would lose all robustness; dual execution achieves robustness by devising force controllers to reproduce the temporal interaction profile encoded in the dual solution of the optimization problem. This work introduces dual execution in depth and analyzes its performance through robustness experiments both in simulation and on a real-world robotic platform.

am mg

link (url) DOI [BibTex]

Robotic assembly of hydrogels for tissue engineering and regenerative medicine

Tasoglu, S, Diller, E, Guven, S, Sitti, M, Demirci, U

In Journal of Tissue Engineering and Regenerative Medicine, 8, pages: 181-182, 2014 (inproceedings)

pi

Project Page [BibTex]

Learning and Exploration in a Novel Dimensionality-Reduction Task

Ebert, J, Kim, S, Schweighofer, N., Sternad, D, Schaal, S.

In Abstracts of Neural Control of Movement Conference (NCM 2009), Amsterdam, Netherlands, 2014 (inproceedings)

am

[BibTex]

Versatile non-contact micro-manipulation method using rotational flows locally induced by magnetic microrobots

Ye, Z., Edington, C., Russell, A. J., Sitti, M.

In Advanced Intelligent Mechatronics (AIM), 2014 IEEE/ASME International Conference on, pages: 26-31, 2014 (inproceedings)

pi

Project Page [BibTex]

Balancing experiments on a torque-controlled humanoid with hierarchical inverse dynamics

Herzog, A., Righetti, L., Grimminger, F., Pastor, P., Schaal, S.

In 2014 IEEE/RSJ Conference on Intelligent Robots and Systems, pages: 981-988, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
Recently, several hierarchical inverse dynamics controllers based on cascades of quadratic programs have been proposed for application on torque controlled robots. They have important theoretical benefits but have never been implemented on a torque controlled robot where model inaccuracies and real-time computation requirements can be problematic. In this contribution we present an experimental evaluation of these algorithms in the context of balance control for a humanoid robot. The presented experiments demonstrate the applicability of the approach under real robot conditions (i.e. model uncertainty, estimation errors, etc.). We propose a simplification of the optimization problem that allows us to decrease computation time enough to implement it in a fast torque control loop. We implement a momentum-based balance controller which shows robust performance in the face of unknown disturbances, even when the robot is standing on only one foot. In a second experiment, a tracking task is evaluated to demonstrate the performance of the controller with more complicated hierarchies. Our results show that hierarchical inverse dynamics controllers can be used for feedback control of humanoid robots and that momentum-based balance control can be efficiently implemented on a real robot.

am mg

link (url) DOI [BibTex]

Full Dynamics LQR Control of a Humanoid Robot: An Experimental Study on Balancing and Squatting

Mason, S., Righetti, L., Schaal, S.

In 2014 IEEE-RAS International Conference on Humanoid Robots, pages: 374-379, IEEE, Madrid, Spain, 2014 (inproceedings)

Abstract
Humanoid robots operating in human environments require whole-body controllers that can offer precise tracking and well-defined disturbance rejection behavior. In this contribution, we propose an experimental evaluation of a linear quadratic regulator (LQR) using a linearization of the full robot dynamics together with the contact constraints. The advantage of the controller is that it explicitly takes into account the coupling between the different joints to create optimal feedback controllers for whole-body control. We also propose a method to explicitly regulate other tasks of interest, such as the regulation of the center of mass of the robot or its angular momentum. In order to evaluate the performance of linear optimal control designs in a real-world scenario (model uncertainty, sensor noise, imperfect state estimation, etc), we test the controllers in a variety of tracking and balancing experiments on a torque controlled humanoid (e.g. balancing, split plane balancing, squatting, pushes while squatting, and balancing on a wheeled platform). The proposed control framework shows a reliable push recovery behavior competitive with more sophisticated balance controllers, rejecting impulses up to 11.7 Ns with peak forces of 650 N, with the added advantage of great computational simplicity. Furthermore, the controller is able to track squatting trajectories up to 1 Hz without relinearization, suggesting that the linearized dynamics is sufficient for significant ranges of motion.

am mg

link (url) DOI [BibTex]

Curiosity-driven learning with Context Tree Weighting

Peng, Z, Braun, DA

pages: 366-367, IEEE, Piscataway, NJ, USA, 4th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics (IEEE ICDL-EPIROB), October 2014 (conference)

Abstract
In the first simulation, the intrinsic motivation of the agent was given by measuring learning progress through reduction in informational surprise (Figure 1 A-C). This way the agent should first learn the action that is easiest to learn (a1), and then switch to other actions that still allow for learning (a2) and ignore actions that cannot be learned at all (a3). This is exactly what we found in our simple environment. Compared to the original developmental learning algorithm based on learning progress proposed by Oudeyer [2], our Context Tree Weighting approach does not require local experts to do prediction, rather it learns the conditional probability distribution over observations given action in one structure. In the second simulation, the intrinsic motivation of the agent was given by measuring compression progress through improvement in compressibility (Figure 1 D-F). The agent behaves similarly: the agent first concentrates on the action with the most predictable consequence and then switches over to the regular action where the consequence is more difficult to predict, but still learnable. Unlike the previous simulation, random actions are also interesting to some extent because the compressed symbol strings use 8-bit representations, while only 2 bits are required for our observation space. Our preliminary results suggest that Context Tree Weighting might provide a useful representation to study problems of development.

ei

DOI [BibTex]

Structural optimization method towards synthesis of small scale flexure-based mobile grippers

Lum, G. Z., Diller, E., Sitti, M.

In Robotics and Automation (ICRA), 2014 IEEE International Conference on, pages: 2339-2344, 2014 (inproceedings)

pi

[BibTex]

Six-Degrees-of-Freedom Remote Actuation of Magnetic Microrobots

Diller, E. D., Giltinan, J., Lum, G. Z., Ye, Z., Sitti, M.

In Robotics: Science and Systems, 2014 (inproceedings)

pi

[BibTex]

State Estimation for a Humanoid Robot

Rotella, N., Bloesch, M., Righetti, L., Schaal, S.

In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 952-958, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
This paper introduces a framework for state estimation on a humanoid robot platform using only common proprioceptive sensors and knowledge of leg kinematics. The presented approach extends that detailed in prior work on a point-foot quadruped platform by adding the rotational constraints imposed by the humanoid's flat feet. As in previous work, the proposed Extended Kalman Filter accommodates contact switching and makes no assumptions about gait or terrain, making it applicable on any humanoid platform for use in any task. A nonlinear observability analysis is performed on both the point-foot and flat-foot filters and it is concluded that the addition of rotational constraints significantly simplifies singular cases and improves the observability characteristics of the system. Results on a simulated walking dataset demonstrate the performance gain of the flat-foot filter as well as confirm the results of the presented observability analysis.

am mg

link (url) DOI [BibTex]

Monte Carlo methods for exact & efficient solution of the generalized optimality equations

Ortega, PA, Braun, DA, Tishby, N

pages: 4322-4327, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), June 2014 (conference)

Abstract
Previous work has shown that classical sequential decision making rules, including expectimax and minimax, are limit cases of a more general class of bounded rational planning problems that trade off the value and the complexity of the solution, as measured by its information divergence from a given reference. This allows modeling a range of novel planning problems having varying degrees of control due to resource constraints, risk-sensitivity, trust and model uncertainty. However, so far it has been unclear in what sense information constraints relate to the complexity of planning. In this paper, we introduce Monte Carlo methods to solve the generalized optimality equations in an efficient & exact way when the inverse temperatures in a generalized decision tree are of the same sign. These methods highlight a fundamental relation between inverse temperatures and the number of Monte Carlo proposals. In particular, it is seen that the number of proposals is essentially independent of the size of the decision tree.

ei

link (url) DOI [BibTex]

2010


Reinforcement learning of full-body humanoid motor skills

Stulp, F., Buchli, J., Theodorou, E., Schaal, S.

In Humanoid Robots (Humanoids), 2010 10th IEEE-RAS International Conference on, pages: 405-410, December 2010, clmc (inproceedings)

Abstract
Applying reinforcement learning to humanoid robots is challenging because humanoids have a large number of degrees of freedom and state and action spaces are continuous. Thus, most reinforcement learning algorithms would become computationally infeasible and require a prohibitive amount of trials to explore such high-dimensional spaces. In this paper, we present a probabilistic reinforcement learning approach, which is derived from the framework of stochastic optimal control and path integrals. The algorithm, called Policy Improvement with Path Integrals (PI2), has a surprisingly simple form, has no open tuning parameters besides the exploration noise, is model-free, and performs numerically robustly in high dimensional learning problems. We demonstrate how PI2 is able to learn full-body motor skills on a 34-DOF humanoid robot. To demonstrate the generality of our approach, we also apply PI2 in the context of variable impedance control, where both planned trajectories and gain schedules for each joint are optimized simultaneously.

am

link (url) [BibTex]

Learning Table Tennis with a Mixture of Motor Primitives

Mülling, K., Kober, J., Peters, J.

In Proceedings of the 10th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2010), pages: 411-416, IEEE, Piscataway, NJ, USA, 10th IEEE-RAS International Conference on Humanoid Robots (Humanoids), December 2010 (inproceedings)

Abstract
Table tennis is a sufficiently complex motor task for studying complete skill learning systems. It consists of several elementary motions and requires fast movements, accurate control, and online adaptation. To represent the elementary movements needed for robot table tennis, we rely on dynamic systems motor primitives (DMP). While such DMPs have been successfully used for learning a variety of simple motor tasks, they only represent single elementary actions. In order to select and generalize among different striking movements, we present a new approach, called Mixture of Motor Primitives that uses a gating network to activate appropriate motor primitives. The resulting policy enables us to select among the appropriate motor primitives as well as to generalize between them. In order to obtain a fully learned robot table tennis setup, we also address the problem of predicting the necessary context information, i.e., the hitting point in time and space where we want to hit the ball. We show that the resulting setup was capable of playing rudimentary table tennis using an anthropomorphic robot arm.

ei

Web DOI [BibTex]

Learning an interactive segmentation system

Nickisch, H., Rother, C., Kohli, P., Rhemann, C.

In Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2010), pages: 274-281, (Editors: Chellapa, R., P. Anandan, A. N. Rajagopalan, P. J. Narayanan, P. Torr), ACM Press, New York, NY, USA, Seventh Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), December 2010 (inproceedings)

Abstract
Many successful applications of computer vision to image or video manipulation are interactive by nature. However, parameters of such systems are often trained neglecting the user. Traditionally, interactive systems have been treated in the same manner as their fully automatic counterparts. Their performance is evaluated by computing the accuracy of their solutions under some fixed set of user interactions. This paper proposes a new evaluation and learning method which brings the user in the loop. It is based on the use of an active robot user -- a simulated model of a human user. We show how this approach can be used to evaluate and learn parameters of state-of-the-art interactive segmentation systems. We also show how simulated user models can be integrated into the popular max-margin method for parameter learning and propose an algorithm to solve the resulting optimisation problem.

ei

PDF Web DOI [BibTex]

Using an Infinite Von Mises-Fisher Mixture Model to Cluster Treatment Beam Directions in External Radiation Therapy

Bangert, M., Hennig, P., Oelfke, U.

In pages: 746-751, (Editors: Draghici, S., T.M. Khoshgoftaar, V. Palade, W. Pedrycz, M.A. Wani, X. Zhu), IEEE, Piscataway, NJ, USA, Ninth International Conference on Machine Learning and Applications (ICMLA), December 2010 (inproceedings)

Abstract
We present a method for fully automated selection of treatment beam ensembles for external radiation therapy. We reformulate the beam angle selection problem as a clustering problem of locally ideal beam orientations distributed on the unit sphere. For this purpose we construct an infinite mixture of von Mises-Fisher distributions, which is suited in general for density estimation from data on the D-dimensional sphere. Using a nonparametric Dirichlet process prior, our model infers probability distributions over both the number of clusters and their parameter values. We describe an efficient Markov chain Monte Carlo inference algorithm for posterior inference from experimental data in this model. The performance of the suggested beam angle selection framework is illustrated for one intra-cranial, one pancreas, and one prostate case. The infinite von Mises-Fisher mixture model (iMFMM) creates between 18 and 32 clusters, depending on the patient anatomy. This suggests using the iMFMM directly for beam ensemble selection in robotic radio surgery, or to generate low-dimensional input for both subsequent optimization of trajectories for arc therapy and beam ensemble selection for conventional radiation therapy.

ei pn

Web DOI [BibTex]



Online algorithms for submodular minimization with combinatorial constraints

Jegelka, S., Bilmes, J.

In pages: 1-6, NIPS Workshop on Discrete Optimization in Machine Learning: Structures, Algorithms and Applications (DISCML), December 2010 (inproceedings)

Abstract
Building on recent results for submodular minimization with combinatorial constraints, and on online submodular minimization, we address online approximation algorithms for submodular minimization with combinatorial constraints. We discuss two types of algorithms and outline approximation algorithms that integrate into those.

ei

PDF Web [BibTex]



Multi-agent random walks for local clustering

Alamgir, M., von Luxburg, U.

In Proceedings of the IEEE International Conference on Data Mining (ICDM 2010), pages: 18-27, (Editors: Webb, G. I., B. Liu, C. Zhang, D. Gunopulos, X. Wu), IEEE, Piscataway, NJ, USA, IEEE International Conference on Data Mining (ICDM), December 2010 (inproceedings)

Abstract
We consider the problem of local graph clustering where the aim is to discover the local cluster corresponding to a point of interest. The most popular algorithms to solve this problem start a random walk at the point of interest and let it run until some stopping criterion is met. The vertices visited are then considered the local cluster. We suggest a more powerful alternative, the multi-agent random walk. It consists of several “agents” connected by a fixed rope of length l. All agents move independently like a standard random walk on the graph, but they are constrained to have distance at most l from each other. The main insight is that for several agents it is harder to simultaneously travel over the bottleneck of a graph than for just one agent. Hence, the multi-agent random walk has less tendency to mistakenly merge two different clusters than the original random walk. In our paper we analyze the multi-agent random walk theoretically and compare it experimentally to the major local graph clustering algorithms from the literature. We find that our multi-agent random walk consistently outperforms these algorithms.
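
The rope constraint described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the update rule (one randomly chosen agent proposes a move per step, and the move is rejected if it would put that agent more than `rope` hops from any other agent), the hop-count distance, and all function and parameter names are assumptions made for illustration.

```python
import random
from collections import deque


def bfs_within(graph, source, max_dist):
    """Return {vertex: hop distance} for all vertices within max_dist of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_dist:
            continue  # do not expand beyond the rope length
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def multi_agent_walk(graph, start, n_agents, rope, n_steps, seed=0):
    """Run a rope-constrained multi-agent random walk (illustrative rule only).

    graph: dict mapping each vertex to a list of its neighbours.
    Returns the set of vertices visited by any agent.
    """
    rng = random.Random(seed)
    positions = [start] * n_agents
    visited = {start}
    for _ in range(n_steps):
        i = rng.randrange(n_agents)                  # pick one agent
        candidate = rng.choice(graph[positions[i]])  # propose a random-walk step
        reachable = bfs_within(graph, candidate, rope)
        # Assumed rule: accept only if every other agent stays within the rope.
        if all(p in reachable for j, p in enumerate(positions) if j != i):
            positions[i] = candidate
            visited.add(candidate)
    return visited
```

With `n_agents=1` the constraint is vacuous and the sketch reduces to an ordinary random walk; with more agents and a short rope, crossing a single bottleneck edge requires all agents to follow, which is the intuition the abstract gives for fewer mistaken cluster merges.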

ei

PDF Web DOI [BibTex]



Effects of Packet Losses to Stability in Bilateral Teleoperation Systems

Hong, A., Cho, JH., Lee, DY.

In pages: 1043-1044, Korean Society of Mechanical Engineers, Seoul, South Korea, KSME Fall Annual Meeting, November 2010 (inproceedings)

ei

[BibTex]



Combining Real-Time Brain-Computer Interfacing and Robot Control for Stroke Rehabilitation

Gomez Rodriguez, M., Peters, J., Hill, J., Gharabaghi, A., Schölkopf, B., Grosse-Wentrup, M.

In Proceedings of SIMPAR 2010 Workshops, pages: 59-63, Brain-Computer Interface Workshop at SIMPAR: 2nd International Conference on Simulation, Modeling, and Programming for Autonomous Robots, November 2010 (inproceedings)

Abstract
Brain-Computer Interfaces based on electrocorticography (ECoG) or electroencephalography (EEG), in combination with robot-assisted active physical therapy, may support traditional rehabilitation procedures for patients with severe motor impairment due to cerebrovascular brain damage caused by stroke. In this short report, we briefly review the state of the art in this exciting new field, give an overview of the work carried out at the Max Planck Institute for Biological Cybernetics and the University of Tübingen, and discuss challenges that need to be addressed in order to move from basic research to clinical studies.

ei

PDF Web [BibTex]



Learning as a key ability for Human-Friendly Robots

Peters, J., Kober, J., Mülling, K., Krömer, O., Nguyen-Tuong, D., Wang, Z., Gomez Rodriguez, M., Grosse-Wentrup, M.

In pages: 1-2, 3rd Workshop for Young Researchers on Human-Friendly Robotics (HFR), October 2010 (inproceedings)

ei

Web [BibTex]



Closing the sensorimotor loop: Haptic feedback facilitates decoding of arm movement imagery

Gomez Rodriguez, M., Peters, J., Hill, J., Schölkopf, B., Gharabaghi, A., Grosse-Wentrup, M.

In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC 2010), pages: 121-126, IEEE, Piscataway, NJ, USA, IEEE International Conference on Systems, Man and Cybernetics (SMC), October 2010 (inproceedings)

Abstract
Brain-Computer Interfaces (BCIs) in combination with robot-assisted physical therapy may become a valuable tool for neurorehabilitation of patients with severe hemiparetic syndromes due to cerebrovascular brain damage (stroke) and other neurological conditions. A key aspect of this approach is reestablishing the disrupted sensorimotor feedback loop, i.e., determining the intended movement using a BCI and helping a human with impaired motor function to move the arm using a robot. It has not been studied yet, however, how artificially closing the sensorimotor feedback loop affects the BCI decoding performance. In this article, we investigate this issue in six healthy subjects, and present evidence that haptic feedback facilitates the decoding of arm movement intention. The results provide evidence of the feasibility of future rehabilitative efforts combining robot-assisted physical therapy with BCIs.

ei

PDF Web DOI [BibTex]



Learning Probabilistic Discriminative Models of Grasp Affordances under Limited Supervision

Erkan, A., Kroemer, O., Detry, R., Altun, Y., Piater, J., Peters, J.

In Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2010), pages: 1586-1591, IEEE, Piscataway, NJ, USA, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2010 (inproceedings)

Abstract
This paper addresses the problem of learning and efficiently representing discriminative probabilistic models of object-specific grasp affordances, particularly when the number of labeled grasps is extremely limited. The proposed method does not require an explicit 3D model but rather learns an implicit manifold on which it defines a probability distribution over grasp affordances. We obtain hypothetical grasp configurations from visual descriptors that are associated with the contours of an object. While these hypothetical configurations are abundant, labeled configurations are very scarce, as they are acquired via time-consuming experiments carried out by the robot. Kernel logistic regression (KLR) via joint kernel maps is trained to map the hypothesis space of grasps into continuous class-conditional probability values indicating their achievability. We propose a soft-supervised extension of KLR and a framework to combine the merits of semi-supervised and active learning approaches to tackle the scarcity of labeled grasps. Experimental evaluation shows that combining active and semi-supervised learning is favorable in the presence of an oracle. Furthermore, semi-supervised learning outperforms supervised learning, particularly when the labeled data is very limited.

ei

PDF Web DOI [BibTex]
