2013


Probabilistic Object Tracking Using a Range Camera

Wüthrich, M., Pastor, P., Kalakrishnan, M., Bohg, J., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 3195-3202, IEEE, November 2013 (inproceedings)

Abstract
We address the problem of tracking the 6-DoF pose of an object while it is being manipulated by a human or a robot. We use a dynamic Bayesian network to perform inference and compute a posterior distribution over the current object pose. Depending on whether a robot or a human manipulates the object, we employ a process model with or without knowledge of control inputs. Observations are obtained from a range camera. As opposed to previous object tracking methods, we explicitly model self-occlusions and occlusions from the environment, e.g., the human or robotic hand. This leads to a strongly non-linear observation model and additional dependencies in the Bayesian network. We employ a Rao-Blackwellised particle filter to compute an estimate of the object pose at every time step. In a set of experiments, we demonstrate the ability of our method to accurately and robustly track the object pose in real time while it is being manipulated by a human or a robot.

am

arXiv Video Code Video DOI Project Page [BibTex]

3-D Object Reconstruction of Symmetric Objects by Fusing Visual and Tactile Sensing

Ilonen, J., Bohg, J., Kyrki, V.

The International Journal of Robotics Research, 33(2):321-341, Sage, October 2013 (article)

Abstract
In this work, we propose to reconstruct a complete 3-D model of an unknown object by fusion of visual and tactile information while the object is grasped. Assuming the object is symmetric, a first hypothesis of its complete 3-D shape is generated. A grasp is executed on the object with a robotic manipulator equipped with tactile sensors. Given the detected contacts between the fingers and the object, the initial full object model including the symmetry parameters can be refined. This refined model will then allow the planning of more complex manipulation tasks. The main contribution of this work is an optimal estimation approach for the fusion of visual and tactile data applying the constraint of object symmetry. The fusion is formulated as a state estimation problem and solved with an iterative extended Kalman filter. The approach is validated experimentally using both artificial and real data from two different robotic platforms.

am

Web DOI Project Page [BibTex]

Learning and Optimization with Submodular Functions

Sankaran, B., Ghazvininejad, M., He, X., Kale, D., Cohen, L.

ArXiv, May 2013 (techreport)

Abstract
In many naturally occurring optimization problems one needs to ensure that the definition of the optimization problem lends itself to solutions that are tractable to compute. In cases where exact solutions cannot be computed tractably, it is beneficial to have strong guarantees on the tractable approximate solutions. In order to operate under these criteria, most optimization problems are cast under the umbrella of convexity or submodularity. In this report we will study design and optimization over a common class of functions called submodular functions. Set functions, and specifically submodular set functions, characterize a wide variety of naturally occurring optimization problems, and the property of submodularity of set functions has deep theoretical consequences with wide-ranging applications. Informally, the property of submodularity of set functions concerns the intuitive principle of diminishing returns. This property states that adding an element to a smaller set has more value than adding it to a larger set. Common examples of submodular monotone functions are entropies, concave functions of cardinality, and matroid rank functions; non-monotone examples include graph cuts, network flows, and mutual information. In this paper we will review the formal definition of submodularity; the optimization of submodular functions, both maximization and minimization; and finally discuss some applications in relation to learning and reasoning using submodular functions.
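The diminishing-returns property described in this abstract can be illustrated with a small sketch. The example below is not taken from the report: it assumes a set-coverage function as the submodular objective, and the names `coverage`, `is_submodular`, and `greedy_max` are hypothetical helpers introduced for illustration. The check is brute force, so it is only practical for tiny ground sets.

```python
from itertools import combinations


def coverage(sets, chosen):
    """Number of distinct items covered by the chosen subsets (a monotone submodular function)."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def is_submodular(f, ground):
    """Brute-force diminishing-returns check:
    for all A subset of B and e not in B, f(A + e) - f(A) >= f(B + e) - f(B)."""
    ground = list(ground)
    subsets = [frozenset(c)
               for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for A in subsets:
        for B in subsets:
            if not A <= B:
                continue
            for e in ground:
                if e in B:
                    continue
                if f(A | {e}) - f(A) < f(B | {e}) - f(B):
                    return False
    return True


def greedy_max(f, ground, k):
    """Greedy maximization under a cardinality constraint (assumes k <= len(ground)):
    repeatedly add the element with the largest marginal gain."""
    chosen = frozenset()
    for _ in range(k):
        best = max((e for e in ground if e not in chosen),
                   key=lambda e: f(chosen | {e}) - f(chosen))
        chosen = chosen | {best}
    return chosen
```

With the assumed toy instance `sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1}}` and `f = lambda S: coverage(sets, S)`, the check passes and `greedy_max(f, sets, 2)` picks subsets 2 and 0, which together cover all seven items. A function such as `lambda S: len(S) ** 2` fails the check, since its marginal gains grow with the set size.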

am

arxiv link (url) [BibTex]

Hypothesis Testing Framework for Active Object Detection

Sankaran, B., Atanasov, N., Le Ny, J., Koletschka, T., Pappas, G., Daniilidis, K.

In IEEE International Conference on Robotics and Automation (ICRA), May 2013, clmc (inproceedings)

Abstract
One of the central problems in computer vision is the detection of semantically important objects and the estimation of their pose. Most of the work in object detection has been based on single image processing and its performance is limited by occlusions and ambiguity in appearance and geometry. This paper proposes an active approach to object detection by controlling the point of view of a mobile depth camera. When an initial static detection phase identifies an object of interest, several hypotheses are made about its class and orientation. The sensor then plans a sequence of view-points, which balances the amount of energy used to move with the chance of identifying the correct hypothesis. We formulate an active M-ary hypothesis testing problem, which includes sensor mobility, and solve it using a point-based approximate POMDP algorithm. The validity of our approach is verified through simulation and experiments with real scenes captured by a Kinect sensor. The results suggest a significant improvement over static object detection.

am

pdf [BibTex]

Action and Goal Related Decision Variables Modulate the Competition Between Multiple Potential Targets

Enachescu, V., Christopoulos, V. N., Schrater, P. R., Schaal, S.

In Abstracts of Neural Control of Movement Conference (NCM 2013), February 2013 (inproceedings)

am

[BibTex]

Hybrid nanocolloids with programmed three-dimensional shape and material composition

Mark, A. G., Gibbs, J. G., Lee, T., Fischer, P.

NATURE MATERIALS, 12(9):802-807, 2013, Max Planck Press Release. (article)

Abstract
Tuning the optical(1,2), electromagnetic(3,4) and mechanical properties of a material requires simultaneous control over its composition and shape(5). This is particularly challenging for complex structures at the nanoscale because surface-energy minimization generally causes small structures to be highly symmetric(5). Here we combine low-temperature shadow deposition with nanoscale patterning to realize nanocolloids with anisotropic three-dimensional shapes, feature sizes down to 20 nm and a wide choice of materials. We demonstrate the versatility of the fabrication scheme by growing three-dimensional hybrid nanostructures that contain several functional materials with the lowest possible symmetry, and by fabricating hundreds of billions of plasmonic nanohelices, which we use as chiral metafluids with record circular dichroism and tunable chiroptical properties.

pf

Video - Fabrication of Designer Nanostructures DOI [BibTex]


Optimal control of reaching includes kinematic constraints

Mistry, M., Theodorou, E., Schaal, S., Kawato, M.

Journal of Neurophysiology, 2013, clmc (article)

Abstract
We investigate adaptation under a reaching task with an acceleration-based force field perturbation designed to alter the nominal straight hand trajectory in a potentially benign manner: pushing the hand off course in one direction before subsequently restoring it towards the target. In this particular task, an explicit strategy to reduce motor effort requires a distinct deviation from the nominal rectilinear hand trajectory. Rather, our results display a clear directional preference during learning, as subjects adapted perturbed curved trajectories towards their initial baselines. We model this behavior using the framework of stochastic optimal control theory and an objective function that trades off the discordant requirements of 1) target accuracy, 2) motor effort, and 3) desired trajectory. Our work addresses the underlying objective of a reaching movement, and we suggest that robustness, particularly against internal model uncertainty, is as essential to the reaching task as terminal accuracy and energy efficiency.

am

PDF [BibTex]

Chiral Colloidal Molecules And Observation of The Propeller Effect

Schamel, D., Pfeifer, M., Gibbs, J. G., Miksch, B., Mark, A. G., Fischer, P.

JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 135(33):12353-12359, 2013 (article)

Abstract
Chiral molecules play an important role in biological and chemical processes, but physical effects due to their symmetry-breaking are generally weak. Several physical chiral separation schemes which could potentially be useful, including the propeller effect, have therefore not yet been demonstrated at the molecular scale. However, it has been proposed that complex nonspherical colloidal particles could act as "colloidal molecules" in mesoscopic model systems to permit the visualization of molecular phenomena that are otherwise difficult to observe. Unfortunately, it is difficult to synthesize such colloids because surface minimization generally favors the growth of symmetric particles. Here we demonstrate the production of large numbers of complex colloids with glancing angle physical vapor deposition. We use chiral colloids to demonstrate the Baranova and Zel'dovich (Baranova, N. B.; Zel'dovich, B. Y. Chem. Phys. Lett. 1978, 57, 435) propeller effect: the separation of a racemic mixture by application of a rotating field that couples to the dipole moment of the enantiomers and screw propels them in opposite directions. The handedness of the colloidal suspensions is monitored with circular differential light scattering. An exact solution for the colloid's propulsion is derived, and comparisons between the colloidal system and the corresponding effect at the molecular scale are made.

pf

Video - Nanospropellers DOI [BibTex]

Fusing visual and tactile sensing for 3-D object reconstruction while grasping

Ilonen, J., Bohg, J., Kyrki, V.

In IEEE International Conference on Robotics and Automation (ICRA), pages: 3547-3554, 2013 (inproceedings)

Abstract
In this work, we propose to reconstruct a complete 3-D model of an unknown object by fusion of visual and tactile information while the object is grasped. Assuming the object is symmetric, a first hypothesis of its complete 3-D shape is generated from a single view. This initial model is used to plan a grasp on the object which is then executed with a robotic manipulator equipped with tactile sensors. Given the detected contacts between the fingers and the object, the full object model including the symmetry parameters can be refined. This refined model will then allow the planning of more complex manipulation tasks. The main contribution of this work is an optimal estimation approach for the fusion of visual and tactile data applying the constraint of object symmetry. The fusion is formulated as a state estimation problem and solved with an iterative extended Kalman filter. The approach is validated experimentally using both artificial and real data from two different robotic platforms.

am

DOI Project Page [BibTex]

Indirect absorption spectroscopy using quantum cascade lasers: mid-infrared refractometry and photothermal spectroscopy

Pfeifer, M., Ruf, A., Fischer, P.

OPTICS EXPRESS, 21(22):25643-25654, 2013 (article)

Abstract
We record vibrational spectra with two indirect schemes that depend on the real part of the index of refraction: mid-infrared refractometry and photothermal spectroscopy. In the former, a quantum cascade laser (QCL) spot is imaged to determine the angles of total internal reflection, which yields the absorption line via a beam profile analysis. In the photothermal measurements, a tunable QCL excites vibrational resonances of a molecular monolayer, which heats the surrounding medium and changes its refractive index. This is observed with a probe laser in the visible. Sub-monolayer sensitivities are demonstrated. (C) 2013 Optical Society of America

pf

DOI [BibTex]


Dynamical Movement Primitives: Learning Attractor Models for Motor Behaviors

Ijspeert, A., Nakanishi, J., Pastor, P., Hoffmann, H., Schaal, S.

Neural Computation, (25):328-373, 2013, clmc (article)

Abstract
Nonlinear dynamical systems have been used in many disciplines to model complex behaviors, including biological motor control, robotics, perception, economics, traffic prediction, and neuroscience. While often the unexpected emergent behavior of nonlinear systems is the focus of investigations, it is of equal importance to create goal-directed behavior (e.g., stable locomotion from a system of coupled oscillators under perceptual guidance). Modeling goal-directed behavior with nonlinear systems is, however, rather difficult due to the parameter sensitivity of these systems, their complex phase transitions in response to subtle parameter changes, and the difficulty of analyzing and predicting their long-term behavior; intuition and time-consuming parameter tuning play a major role. This letter presents and reviews dynamical movement primitives, a line of research for modeling attractor behaviors of autonomous nonlinear dynamical systems with the help of statistical learning techniques. The essence of our approach is to start with a simple dynamical system, such as a set of linear differential equations, and transform those into a weakly nonlinear system with prescribed attractor dynamics by means of a learnable autonomous forcing term. Both point attractors and limit cycle attractors of almost arbitrary complexity can be generated. We explain the design principle of our approach and evaluate its properties in several example applications in motor control and robotics.
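The idea of a simple linear system shaped by a learnable forcing term can be sketched for a point attractor. The code below is a minimal illustration under assumptions, not the authors' implementation: it uses a critically damped spring-damper driven by a phase variable that decays to zero, Gaussian basis functions whose centers and widths are chosen heuristically, and plain Euler integration. The function name `dmp_rollout` and all default gains are hypothetical.

```python
import math


def dmp_rollout(weights, y0=0.0, g=1.0, tau=1.0, duration=3.0, dt=0.001,
                alpha=25.0, alpha_x=25.0 / 3.0):
    """Integrate a one-dimensional point-attractor movement primitive.

    tau * dz/dt = alpha * (beta * (g - y) - z) + forcing(x)
    tau * dy/dt = z
    tau * dx/dt = -alpha_x * x      (phase variable, decays from 1 to 0)
    """
    beta = alpha / 4.0  # critical damping of the linear spring-damper
    n = len(weights)
    # Assumed heuristic: centers equally spaced in time along the decaying
    # phase, widths scaled so neighbouring basis functions overlap.
    centers = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    widths = [n / (c * c) for c in centers]

    x, y, z = 1.0, y0, 0.0
    ys = [y]
    for _ in range(int(round(duration / dt))):
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centers)]
        # Normalized weighted sum of basis functions, gated by the phase x so
        # the forcing term vanishes as x -> 0 and the goal g stays attracting.
        forcing = sum(p * w for p, w in zip(psi, weights)) / (sum(psi) + 1e-10)
        forcing *= x * (g - y0)
        z += dt / tau * (alpha * (beta * (g - y) - z) + forcing)
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)
        ys.append(y)
    return ys
```

Calling `dmp_rollout([0.0] * 10)` gives the unforced spring-damper response, while nonzero weights such as `[50.0] * 10` change the shape of the transient; in both cases the trajectory settles at the goal because the forcing term is multiplied by the decaying phase. In practice the weights would be fitted to a demonstrated trajectory, which this sketch does not do.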

am

link (url) [BibTex]

Plasmonic nanohelix metamaterials with tailorable giant circular dichroism

Gibbs, J. G., Mark, A. G., Eslami, S., Fischer, P.

APPLIED PHYSICS LETTERS, 103(21), 2013, Featured cover article. (article)

Abstract
Plasmonic nanohelix arrays are shown to interact with electromagnetic fields in ways not typically seen with ordinary matter. Chiral metamaterials (CMMs) with feature sizes small with respect to the wavelength of visible light are a promising route to experimentally achieve such phenomena as negative refraction without the need for simultaneously negative ε and μ. Here we not only show that giant circular dichroism in the visible is achievable with hexagonally arranged plasmonic nanohelix arrays, but that we can precisely tune the optical activity via morphology and lattice spacing. The discrete dipole approximation is implemented to support experimental data. (C) 2013 AIP Publishing LLC.

pf

DOI [BibTex]

Learning Objective Functions for Manipulation

Kalakrishnan, M., Pastor, P., Righetti, L., Schaal, S.

In 2013 IEEE International Conference on Robotics and Automation, IEEE, Karlsruhe, Germany, 2013 (inproceedings)

Abstract
We present an approach to learning objective functions for robotic manipulation based on inverse reinforcement learning. Our path integral inverse reinforcement learning algorithm can deal with high-dimensional continuous state-action spaces, and only requires local optimality of demonstrated trajectories. We use L1 regularization in order to achieve feature selection, and propose an efficient algorithm to minimize the resulting convex objective function. We demonstrate our approach by applying it to two core problems in robotic manipulation. First, we learn a cost function for redundancy resolution in inverse kinematics. Second, we use our method to learn a cost function over trajectories, which is then used in optimization-based motion planning for grasping and manipulation tasks. Experimental results show that our method outperforms previous algorithms in high-dimensional settings.

am mg

link (url) DOI [BibTex]

Using Torque Redundancy to Optimize Contact Forces in Legged Robots

Righetti, L., Buchli, J., Mistry, M., Kalakrishnan, M., Schaal, S.

In Redundancy in Robot Manipulators and Multi-Robot Systems, 57, pages: 35-51, Lecture Notes in Electrical Engineering, Springer Berlin Heidelberg, 2013 (incollection)

Abstract
The development of legged robots for complex environments requires controllers that guarantee both high tracking performance and compliance with the environment. More specifically, the control of contact interaction with the environment is of crucial importance to ensure stable, robust and safe motions. In the following, we present an inverse dynamics controller that exploits torque redundancy to directly and explicitly minimize any combination of linear and quadratic costs in the contact constraints and in the commands. Such a result is particularly relevant for legged robots as it allows us to use torque redundancy to directly optimize contact interactions. For example, given a desired locomotion behavior, it can guarantee the minimization of contact forces to reduce slipping on difficult terrains while ensuring high tracking performance of the desired motion. The proposed controller is very simple and computationally efficient, and most importantly it can greatly improve the performance of legged locomotion on difficult terrains as can be seen in the experimental results.

am mg

link (url) [BibTex]

Optimal distribution of contact forces with inverse-dynamics control

Righetti, L., Buchli, J., Mistry, M., Kalakrishnan, M., Schaal, S.

The International Journal of Robotics Research, 32(3):280-298, March 2013 (article)

Abstract
The development of legged robots for complex environments requires controllers that guarantee both high tracking performance and compliance with the environment. More specifically, the control of the contact interaction with the environment is of crucial importance to ensure stable, robust and safe motions. In this contribution we develop an inverse-dynamics controller for floating-base robots under contact constraints that can minimize any combination of linear and quadratic costs in the contact constraints and the commands. Our main result is the exact analytical derivation of the controller. Such a result is particularly relevant for legged robots as it allows us to use torque redundancy to directly optimize contact interactions. For example, given a desired locomotion behavior, we can guarantee the minimization of contact forces to reduce slipping on difficult terrains while ensuring high tracking performance of the desired motion. The main advantages of the controller are its simplicity, computational efficiency and robustness to model inaccuracies. We present detailed experimental results on simulated humanoid and quadruped robots as well as a real quadruped robot. The experiments demonstrate that the controller can greatly improve the robustness of locomotion of the robots.

am mg

link (url) DOI [BibTex]

Learning Task Error Models for Manipulation

Pastor, P., Kalakrishnan, M., Binney, J., Kelly, J., Righetti, L., Sukhatme, G. S., Schaal, S.

In 2013 IEEE International Conference on Robotics and Automation, IEEE, Karlsruhe, Germany, 2013 (inproceedings)

Abstract
Precise kinematic forward models are important for robots to successfully perform dexterous grasping and manipulation tasks, especially when visual servoing is rendered infeasible due to occlusions. A lot of research has been conducted to estimate geometric and non-geometric parameters of kinematic chains to minimize reconstruction errors. However, kinematic chains can include non-linearities, e.g., due to cable stretch and motor-side encoders, that result in significantly different errors for different parts of the state space. Previous work either does not consider such non-linearities or proposes to estimate non-geometric parameters of carefully engineered models that are robot specific. We propose a data-driven approach that learns task error models that account for such unmodeled non-linearities. We argue that in the context of grasping and manipulation, it is sufficient to achieve high accuracy in the task-relevant state space. We identify this relevant state space using previously executed joint configurations and learn error corrections for those. Therefore, our system is developed to generate subsequent executions that are similar to previous ones. The experiments show that our method successfully captures the non-linearities in the head kinematic chain (due to a counterbalancing spring) and the arm kinematic chains (due to cable stretch) of the considered experimental platform, see Fig. 1. The feasibility of the presented error learning approach has also been evaluated in independent DARPA ARM-S testing, contributing to the successful completion of 67 out of 72 grasping and manipulation tasks.

am mg

link (url) DOI [BibTex]


2011


Learning, planning, and control for quadruped locomotion over challenging terrain

Kalakrishnan, M., Buchli, J., Pastor, P., Mistry, M., Schaal, S.

International Journal of Robotics Research, 30(2):236-258, February 2011 (article)

am

[BibTex]

Quantum-Cascade Laser-Based Vibrational Circular Dichroism

Luedeke, S., Pfeifer, M., Fischer, P.

JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 133(15):5704-5707, 2011 (article)

Abstract
Vibrational circular dichroism (VCD) spectra were recorded with a tunable external-cavity quantum-cascade laser (QCL). In comparison with standard thermal light sources in the IR, QCLs provide orders of magnitude more power and are therefore promising for VCD studies in strongly absorbing solvents. The brightness of this novel light source is demonstrated with VCD and IR absorption measurements of a number of compounds, including proline in water.

pf

DOI [BibTex]

STOMP: Stochastic trajectory optimization for motion planning

Kalakrishnan, M., Chitta, S., Theodorou, E., Pastor, P., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, May 9-13, 2011, clmc (inproceedings)

Abstract
We present a new approach to motion planning using a stochastic trajectory optimization framework. The approach relies on generating noisy trajectories to explore the space around an initial (possibly infeasible) trajectory, which are then combined to produce an updated trajectory with lower cost. A cost function based on a combination of obstacle and smoothness cost is optimized in each iteration. No gradient information is required for the particular optimization algorithm that we use and so general costs for which derivatives may not be available (e.g., costs corresponding to constraints and motor torques) can be included in the cost function. We demonstrate the approach both in simulation and on a dual-arm mobile manipulation system for unconstrained and constrained tasks. We experimentally show that the stochastic nature of STOMP allows it to overcome local minima that gradient-based optimizers like CHOMP can get stuck in.
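The sample-and-combine loop described in this abstract can be sketched on a toy planar problem. The code below is a simplified illustration and not the published STOMP algorithm: it weights whole noisy trajectories by their exponentiated cost instead of weighting per time step, smooths the noise by neighbor averaging rather than sampling it from a smoothness-derived covariance, and adds a guard that rejects updates which raise the cost. The names `cost`, `smooth_noise`, and `stomp_like`, the circular obstacle, and all parameter values are assumptions made for the example.

```python
import math
import random


def cost(ys, xs, center=(0.5, 0.0), radius=0.2, smooth_weight=5.0):
    """Obstacle penetration depth plus squared finite-difference accelerations."""
    obstacle = sum(max(0.0, radius - math.hypot(x - center[0], y - center[1]))
                   for x, y in zip(xs, ys))
    accel = sum((ys[i - 1] - 2.0 * ys[i] + ys[i + 1]) ** 2
                for i in range(1, len(ys) - 1))
    return obstacle + smooth_weight * accel


def smooth_noise(n, sigma, rng):
    """Gaussian noise that is zero at both endpoints, smoothed by neighbor averaging."""
    eps = [0.0] + [rng.gauss(0.0, sigma) for _ in range(n - 2)] + [0.0]
    for _ in range(3):
        eps = [0.0] + [(eps[i - 1] + eps[i] + eps[i + 1]) / 3.0
                       for i in range(1, n - 1)] + [0.0]
    return eps


def stomp_like(xs, ys, iterations=50, samples=20, sigma=0.1,
               temperature=0.1, seed=0):
    """Gradient-free trajectory update: sample noisy trajectories, weight them by
    exp(-cost / temperature), and move along the weighted average of the noise."""
    rng = random.Random(seed)
    ys = list(ys)
    best = cost(ys, xs)
    for _ in range(iterations):
        noises = [smooth_noise(len(ys), sigma, rng) for _ in range(samples)]
        costs = [cost([y + e for y, e in zip(ys, eps)], xs) for eps in noises]
        c_min = min(costs)
        weights = [math.exp(-(c - c_min) / temperature) for c in costs]
        total = sum(weights)
        update = [sum(w * eps[i] for w, eps in zip(weights, noises)) / total
                  for i in range(len(ys))]
        candidate = [y + u for y, u in zip(ys, update)]
        c = cost(candidate, xs)
        if c < best:  # guard added for this sketch so the cost never increases
            ys, best = candidate, c
    return ys, best
```

A typical call starts from a straight line through the obstacle, for example `xs = [i / 20 for i in range(21)]` and `ys0 = [0.0] * 21`, then runs `stomp_like(xs, ys0)`. By symmetry the obstacle cost offers no preferred direction along that straight line, which is the kind of situation where sampling can help where a gradient step would not; the endpoints stay fixed because the noise is zero there.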

am

link (url) Project Page [BibTex]

Actively coupled cavity ringdown spectroscopy with low-power broadband sources

Petermann, C., Fischer, P.

OPTICS EXPRESS, 19(11):10164-10173, 2011 (article)

Abstract
We demonstrate a coupling scheme for cavity enhanced absorption spectroscopy that makes use of an intracavity acousto-optical modulator to actively switch light into (and out of) a resonator. This allows cavity ringdown spectroscopy (CRDS) to be implemented with broadband nonlaser light sources with spectral power densities of less than 30 μW/nm. Although the acousto-optical element reduces the ultimate detection limit by introducing additional losses, it permits absorptivities to be measured with a high dynamic range, especially in lossy environments. Absorption measurements for the forbidden transition of gaseous oxygen in air at ~760 nm are presented using a low-coherence cw-superluminescent diode. The same setup was electronically configured to cover absorption losses from 1.8 × 10⁻⁸ cm⁻¹ to 7.5% per roundtrip. This could be of interest in process analytical applications. (C) 2011 Optical Society of America

pf

DOI [BibTex]

Path Integral Control and Bounded Rationality

Braun, D. A., Ortega, P. A., Theodorou, E., Schaal, S.

In IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2011, clmc (inproceedings)

Abstract
Path integral methods [7], [15], [1] have recently been shown to be applicable to a very general class of optimal control problems. Here we examine the path integral formalism from a decision-theoretic point of view, since an optimal controller can always be regarded as an instance of a perfectly rational decision-maker that chooses its actions so as to maximize its expected utility [8]. The problem with perfect rationality is, however, that finding optimal actions is often very difficult due to prohibitive computational resource costs that are not taken into account. In contrast, a bounded rational decision-maker has only limited resources and therefore needs to strike some compromise between the desired utility and the required resource costs [14]. In particular, we suggest an information-theoretic measure of resource costs that can be derived axiomatically [11]. As a consequence we obtain a variational principle for choice probabilities that trades off maximizing a given utility criterion and avoiding resource costs that arise due to deviating from initially given default choice probabilities. The resulting bounded rational policies are in general probabilistic. We show that the solutions found by the path integral formalism are such bounded rational policies. Furthermore, we show that the same formalism generalizes to discrete control problems, leading to linearly solvable bounded rational control policies in the case of Markov systems. Importantly, Bellman's optimality principle is not presupposed by this variational principle, but it can be derived as a limit case. This suggests that the information-theoretic formalization of bounded rationality might serve as a general principle in control design that unifies a number of recently reported approximate optimal control methods both in the continuous and discrete domain.
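One common way to write a trade-off of this kind is given below as a sketch. The notation is assumed for illustration and is not necessarily that of the paper: $U$ is the utility, $p_0$ the default choice probabilities, and $\beta > 0$ a parameter that prices the information-theoretic resource cost, measured here by the relative entropy between $p$ and $p_0$.

```latex
\begin{align}
p^{*} &= \arg\max_{p} \left\{ \sum_{x} p(x)\, U(x)
        \;-\; \frac{1}{\beta} \sum_{x} p(x) \log \frac{p(x)}{p_{0}(x)} \right\}, \\
p^{*}(x) &= \frac{p_{0}(x)\, e^{\beta U(x)}}{\sum_{x'} p_{0}(x')\, e^{\beta U(x')}} .
\end{align}
```

Under these assumptions the maximizer is probabilistic for finite $\beta$. As $\beta \to \infty$ it concentrates on the utility-maximizing choices, which corresponds to the perfectly rational limit, and as $\beta \to 0$ it falls back to the default probabilities $p_0$.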

am

PDF [BibTex]

Skill learning and task outcome prediction for manipulation

Pastor, P., Kalakrishnan, M., Chitta, S., Theodorou, E., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, May 9-13, 2011, clmc (inproceedings)

Abstract
Learning complex motor skills for real world tasks is a hard problem in robotic manipulation that often requires painstaking manual tuning and design by a human expert. In this work, we present a Reinforcement Learning based approach to acquiring new motor skills from demonstration. Our approach allows the robot to learn fine manipulation skills and significantly improve its success rate and skill level starting from a possibly coarse demonstration. Our approach aims to incorporate task domain knowledge, where appropriate, by working in a space consistent with the constraints of a specific task. In addition, we also present an approach to using sensor feedback to learn a predictive model of the task outcome. This allows our system to learn the proprioceptive sensor feedback needed to monitor subsequent executions of the task online and abort execution in the event of predicted failure. We illustrate our approach using two example tasks executed with the PR2 dual-arm robot: a straight and accurate pool stroke and a box flipping task using two chopsticks as tools.

am

link (url) Project Page Project Page [BibTex]

An Iterative Path Integral Stochastic Optimal Control Approach for Learning Robotic Tasks

Theodorou, E., Stulp, F., Buchli, J., Schaal, S.

In Proceedings of the 18th World Congress of the International Federation of Automatic Control, 2011, clmc (inproceedings)

Abstract
Recent work on path integral stochastic optimal control theory (Theodorou et al., 2010a; Theodorou, 2011) has shown promising results in planning and control of nonlinear systems in high-dimensional state spaces. The path integral control framework relies on the transformation of the nonlinear Hamilton-Jacobi-Bellman (HJB) partial differential equation (PDE) into a linear PDE and the approximation of its solution via the Feynman-Kac lemma. In this work, we review the generalized version of the path integral stochastic optimal control formalism (Theodorou et al., 2010a), used for optimal control and planning of stochastic dynamical systems with state-dependent control and diffusion matrices. Moreover, we present the iterative path integral control approach, the so-called Policy Improvement with Path Integrals (PI²), which is capable of scaling to high-dimensional robotic control problems. Furthermore, we present a convergence analysis of the proposed algorithm and apply the proposed framework to a variety of robotic tasks. Finally, with the goal of performing locomotion, the iterative path integral control is applied to learning nonlinear limit cycle attractors with adjustable landscape.

am

PDF [BibTex]

Magnetically actuated propulsion at low Reynolds numbers: towards nanoscale control

Fischer, P., Ghosh, A.

NANOSCALE, 3(2):557-563, 2011 (article)

Abstract
Significant progress has been made in the fabrication of micron and sub-micron structures whose motion can be controlled in liquids under ambient conditions. The aim of many of these engineering endeavors is to be able to build and propel an artificial micro-structure that rivals the versatility of biological swimmers of similar size, e.g., motile bacterial cells. Applications for such artificial "micro-bots" are envisioned to range from microrheology to targeted drug delivery and microsurgery, and require full motion-control under ambient conditions. In this Mini-Review we discuss the construction, actuation, and operation of several devices that have recently been reported, especially systems that can be controlled by and propelled with homogeneous magnetic fields. We describe the fabrication and associated experimental challenges and discuss potential applications.

pf

Video - Nanospropellers DOI [BibTex]


Bayesian robot system identification with input and output noise

Ting, J., D’Souza, A., Schaal, S.

Neural Networks, 24(1):99-108, 2011, clmc (article)

Abstract
For complex robots such as humanoids, model-based control is highly beneficial for accurate tracking while keeping negative feedback gains low for compliance. However, in such multi degree-of-freedom lightweight systems, conventional identification of rigid body dynamics models using CAD data and actuator models is inaccurate due to unknown nonlinear robot dynamic effects. An alternative method is data-driven parameter estimation, but significant noise in measured and inferred variables affects it adversely. Moreover, standard estimation procedures may give physically inconsistent results due to unmodeled nonlinearities or insufficiently rich data. This paper addresses these problems, proposing a Bayesian system identification technique for linear or piecewise linear systems. Inspired by Factor Analysis regression, we develop a computationally efficient variational Bayesian regression algorithm that is robust to ill-conditioned data, automatically detects relevant features, and identifies input and output noise. We evaluate our approach on rigid body parameter estimation for various robotic systems, achieving errors up to three times lower than other state-of-the-art machine learning methods.

am

link (url) [BibTex]


Learning variable impedance control

Buchli, J., Stulp, F., Theodorou, E., Schaal, S.

International Journal of Robotics Research, 2011, clmc (article)

Abstract
One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not trivial to derive variable impedance controllers for practical high degree-of-freedom (DOF) robotic tasks. In this contribution, we accomplish such variable impedance control with the reinforcement learning (RL) algorithm PI2 (Policy Improvement with Path Integrals). PI2 is a model-free, sampling-based learning method derived from first principles of stochastic optimal control. The PI2 algorithm requires no tuning of algorithmic parameters besides the exploration noise. The designer can thus fully focus on cost function design to specify the task. From the viewpoint of robotics, a particularly useful property of PI2 is that it can scale to problems of many DOFs, so that reinforcement learning on real robotic systems becomes feasible. We sketch the PI2 algorithm and its theoretical properties, and how it is applied to gain scheduling for variable impedance control. We evaluate our approach by presenting results on several simulated and real robots. We consider tasks involving accurate tracking through via-points, and manipulation tasks requiring physical contact with the environment. In these tasks, the optimal strategy requires tuning of both a reference trajectory and the impedance of the end-effector. The results show that we can use path integral based reinforcement learning not only for planning but also to derive variable gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide variety of robotic systems and practical applications.

am

link (url) [BibTex]


Iterative path integral stochastic optimal control: Theory and applications to motor control

Theodorou, E. A.

University of Southern California, Los Angeles, CA, 2011 (phdthesis)

am

PDF [BibTex]


Learning of grasp selection based on shape-templates

Herzog, A.

Karlsruhe Institute of Technology, 2011 (mastersthesis)

am

[BibTex]


Weak value amplified optical activity measurements

Pfeifer, M., Fischer, P.

Opt. Express, 19(17):16508-16517, OSA, 2011 (article)

Abstract
We present a new form of optical activity measurement based on a modified weak value amplification scheme. It has recently been shown experimentally that the left- and right-circular polarization components refract with slightly different angles of refraction at a chiral interface causing a linearly polarized light beam to split into two. By introducing a polarization modulation that does not give rise to a change in the optical rotation it is possible to differentiate between the two circular polarization components even after post-selection with a linear polarizer. We show that such a modified weak value amplification measurement permits the sign of the splitting and thus the handedness of the optically active medium to be determined. Angular beam separations of Δθ ∼ 1 nanoradian, which corresponds to a circular birefringence of Δn ∼ 1 × 10⁻⁹, could be measured with a relative error of less than 1%.

pf

link (url) DOI [BibTex]


Understanding haptics by evolving mechatronic systems

Loeb, G. E., Tsianos, G. A., Fishel, J. A., Wettels, N., Schaal, S.

Progress in Brain Research, 192, pages: 129, 2011 (article)

am

[BibTex]


Movement segmentation using a primitive library

Meier, F., Theodorou, E., Stulp, F., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2011), Sept. 25-30, San Francisco, CA, 2011, clmc (inproceedings)

Abstract
Segmenting complex movements into a sequence of primitives remains a difficult problem with many applications in the robotics and vision communities. In this work, we show how the movement segmentation problem can be reduced to a sequential movement recognition problem. To this end, we reformulate the original Dynamic Movement Primitive (DMP) formulation as a linear dynamical system with control inputs. Based on this new formulation, we develop an Expectation-Maximization algorithm to estimate the duration and goal position of a partially observed trajectory. With the help of this algorithm and the assumption that a library of movement primitives is present, we present a movement segmentation framework. We illustrate the usefulness of the new DMP formulation on the two applications of online movement recognition and movement segmentation.

am

link (url) [BibTex]


Learning Force Control Policies for Compliant Manipulation

Kalakrishnan, M., Righetti, L., Pastor, P., Schaal, S.

In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 4639-4644, IEEE, San Francisco, USA, September 2011 (inproceedings)

Abstract
Developing robots capable of fine manipulation skills is of major importance in order to build truly assistive robots. These robots need to be compliant in their actuation and control in order to operate safely in human environments. Manipulation tasks imply complex contact interactions with the external world, and involve reasoning about the forces and torques to be applied. Planning under contact conditions is usually impractical due to computational complexity, and a lack of precise dynamics models of the environment. We present an approach to acquiring manipulation skills on compliant robots through reinforcement learning. The initial position control policy for manipulation is initialized through kinesthetic demonstration. We augment this policy with a force/torque profile to be controlled in combination with the position trajectories. We use the Policy Improvement with Path Integrals (PI2) algorithm to learn these force/torque profiles by optimizing a cost function that measures task success. We demonstrate our approach on the Barrett WAM robot arm equipped with a 6-DOF force/torque sensor on two different manipulation tasks: opening a door with a lever door handle, and picking up a pen off the table. We show that the learnt force control policies allow successful, robust execution of the tasks.

am mg

link (url) DOI [BibTex]


Control of legged robots with optimal distribution of contact forces

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 11th IEEE-RAS International Conference on Humanoid Robots, pages: 318-324, IEEE, Bled, Slovenia, 2011 (inproceedings)

Abstract
The development of agile and safe humanoid robots requires controllers that guarantee both high tracking performance and compliance with the environment. More specifically, the control of contact interaction is of crucial importance for robots that will actively interact with their environment. Model-based controllers such as inverse dynamics or operational space control are very appealing as they offer both high tracking performance and compliance. However, while widely used for fully actuated systems such as manipulators, they are not yet standard controllers for legged robots such as humanoids. Indeed, such robots are fundamentally different from manipulators as they are underactuated due to their floating base and subject to switching contact constraints. In this paper we present an inverse dynamics controller for legged robots that uses torque redundancy to create an optimal distribution of contact constraints. The resulting controller is able to minimize, given a desired motion, any quadratic cost of the contact constraints at each instant of time. In particular we show how this can be used to minimize tangential forces during locomotion, therefore significantly improving the locomotion of legged robots on difficult terrains. In addition to the theoretical result, we present simulations of a humanoid and a quadruped robot, as well as experiments on a real quadruped robot that demonstrate the advantages of the controller.

am mg

link (url) DOI [BibTex]


Learning Motion Primitive Goals for Robust Manipulation

Stulp, F., Theodorou, E., Kalakrishnan, M., Pastor, P., Righetti, L., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 325-331, IEEE, San Francisco, USA, September 2011 (inproceedings)

Abstract
Applying model-free reinforcement learning to manipulation remains challenging for several reasons. First, manipulation involves physical contact, which causes discontinuous cost functions. Second, in manipulation, the end-point of the movement must be chosen carefully, as it represents a grasp which must be adapted to the pose and shape of the object. Finally, there is uncertainty in the object pose, and even the most carefully planned movement may fail if the object is not at the expected position. To address these challenges we 1) present a simplified, computationally more efficient version of our model-free reinforcement learning algorithm PI2; 2) extend PI2 so that it simultaneously learns shape parameters and goal parameters of motion primitives; 3) use shape and goal learning to acquire motion primitives that are robust to object pose uncertainty. We evaluate these contributions on a manipulation platform consisting of a 7-DOF arm with a 4-DOF hand.

am mg

link (url) DOI [BibTex]


Inverse Dynamics Control of Floating-Base Robots with External Constraints: a Unified View

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 IEEE International Conference on Robotics and Automation, pages: 1085-1090, IEEE, Shanghai, China, 2011 (inproceedings)

Abstract
Inverse dynamics controllers and operational space controllers have proved to be very efficient for compliant control of fully actuated robots such as fixed base manipulators. However, legged robots such as humanoids are inherently different as they are underactuated and subject to switching external contact constraints. Recently, several methods have been proposed to create inverse dynamics controllers and operational space controllers for these robots. In an attempt to compare these different approaches, we develop a general framework for inverse dynamics control and show that these methods lead to very similar controllers. We are then able to greatly simplify recent whole-body controllers based on operational space approaches using kinematic projections, bringing them closer to efficient practical implementations. We also generalize these controllers such that they can be optimal under an arbitrary quadratic cost in the commands.

am mg

link (url) DOI [BibTex]


Online movement adaptation based on previous sensor experiences

Pastor, P., Righetti, L., Kalakrishnan, M., Schaal, S.

In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 365-371, IEEE, San Francisco, USA, September 2011 (inproceedings)

Abstract
Personal robots can only become widespread if they are capable of safely operating among humans. In uncertain and highly dynamic environments such as human households, robots need to be able to instantly adapt their behavior to unforeseen events. In this paper, we propose a general framework to achieve very contact-reactive motions for robotic grasping and manipulation. Associating stereotypical movements with particular tasks enables our system to use previous sensor experiences as a predictive model for subsequent task executions. We use dynamical systems, named Dynamic Movement Primitives (DMPs), to learn goal-directed behaviors from demonstration. We exploit their dynamic properties by coupling them with the measured and predicted sensor traces. This feedback loop allows for online adaptation of the movement plan. Our system can create a rich set of possible motions that account for external perturbations and perception uncertainty to generate truly robust behaviors. As an example, we present an application to grasping with the WAM robot arm.

am mg

link (url) DOI [BibTex]


Learning to grasp under uncertainty

Stulp, F., Theodorou, E., Buchli, J., Schaal, S.

In Robotics and Automation (ICRA), 2011 IEEE International Conference on, Shanghai, China, May 9-13, 2011, clmc (inproceedings)

Abstract
We present an approach that enables robots to learn motion primitives that are robust towards state estimation uncertainties. During reaching and preshaping, the robot learns to use fine manipulation strategies to maneuver the object into a pose at which closing the hand to perform the grasp is more likely to succeed. In contrast, common assumptions in grasp planning and motion planning for reaching are that these tasks can be performed independently, and that the robot has perfect knowledge of the pose of the objects in the environment. We implement our approach using Dynamic Movement Primitives and the probabilistic model-free reinforcement learning algorithm Policy Improvement with Path Integrals (PI2). The cost function that PI2 optimizes is a simple boolean that penalizes failed grasps. The key to acquiring robust motion primitives is to sample the actual pose of the object from a distribution that represents the state estimation uncertainty. During learning, the robot will thus optimize the chance of grasping an object from this distribution, rather than at one specific pose. In our empirical evaluation, we demonstrate how the motion primitives become more robust when grasping simple cylindrical objects, as well as more complex, non-convex objects. We also investigate how well the learned motion primitives generalize towards new object positions and other state estimation uncertainty distributions.

am

link (url) [BibTex]

2007


Frequency-domain displacement sensing with a fiber ring-resonator containing a variable gap

Vollmer, F., Fischer, P.

Sensors and Actuators A: Physical, 134(2):410-413, 2007 (article)

Abstract
Ring-resonators are in general not amenable to strain-free (non-contact) displacement measurements. We show that this limitation may be overcome if the ring-resonator, here a fiber-loop, is designed to contain a gap, such that the light traverses a free-space part between two aligned waveguide ends. Displacements are determined with nanometer sensitivity by measuring the associated changes in the resonance frequencies. Miniaturization should increase the sensitivity of the ring-resonator interferometer. Ring geometries that contain an optical circulator can be used to profile reflective samples. (c) 2006 Elsevier B.V. All rights reserved.

pf

DOI [BibTex]


Towards Machine Learning of Motor Skills

Peters, J., Schaal, S., Schölkopf, B.

In Proceedings of Autonome Mobile Systeme (AMS), pages: 138-144, (Editors: K Berns and T Luksch), 2007, clmc (inproceedings)

Abstract
Autonomous robots that can adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. Early approaches to this goal during the heyday of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only a few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. To do so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting.

am ei

PDF DOI [BibTex]


Reinforcement Learning for Optimal Control of Arm Movements

Theodorou, E., Peters, J., Schaal, S.

In Abstracts of the 37th Meeting of the Society of Neuroscience, Neuroscience, 2007, clmc (inproceedings)

Abstract
Everyday motor behavior consists of a plethora of challenging motor skills from discrete movements such as reaching and throwing to rhythmic movements such as walking, drumming and running. How this plethora of motor skills can be learned remains an open question. In particular, is there any unifying computational framework that could model the learning process of this variety of motor behaviors and at the same time be biologically plausible? In this work we aim to give an answer to these questions by providing a computational framework that unifies the learning mechanism of both rhythmic and discrete movements under optimization criteria, i.e., in a non-supervised trial-and-error fashion. Our suggested framework is based on Reinforcement Learning, which is mostly considered too costly to be a plausible mechanism for learning complex limb movement. However, recent work on reinforcement learning with policy gradients combined with parameterized movement primitives allows novel and more efficient algorithms. By using the representational power of such motor primitives we show how rhythmic motor behaviors such as walking, squashing and drumming as well as discrete behaviors like reaching and grasping can be learned with biologically plausible algorithms. Using extensive simulations and different reward functions, we provide results that support the hypothesis that Reinforcement Learning could be a viable candidate for motor learning of human motor behavior when other learning methods like supervised learning are not feasible.

am ei

[BibTex]


Reinforcement learning by reward-weighted regression for operational space control

Peters, J., Schaal, S.

In Proceedings of the 24th Annual International Conference on Machine Learning, pages: 745-750, ICML, 2007, clmc (inproceedings)

Abstract
Many robot control problems of practical importance, including operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.

am ei

link (url) DOI [BibTex]


Policy gradient methods for machine learning

Peters, J., Theodorou, E., Schaal, S.

In Proceedings of the 14th INFORMS Conference of the Applied Probability Society, pages: 97-98, Eindhoven, Netherlands, July 9-11, 2007, clmc (inproceedings)

Abstract
We present an in-depth survey of policy gradient methods as they are used in the machine learning community for optimizing parameterized, stochastic control policies in Markovian systems with respect to the expected reward. Despite having been developed separately in the reinforcement learning literature, policy gradient methods employ likelihood ratio gradient estimators as also suggested in the stochastic simulation optimization community. It is well-known that this approach to policy gradient estimation traditionally suffers from three drawbacks, i.e., large variance, a strong dependence on baseline functions and an inefficient gradient descent. In this talk, we will present a series of recent results which tackle each of these problems. The variance of the gradient estimation can be reduced significantly through recently introduced techniques such as optimal baselines, compatible function approximations and all-action gradients. However, as even the analytically obtainable policy gradients perform unnaturally slowly, it required the step from 'vanilla' policy gradient methods towards natural policy gradients in order to overcome the inefficiency of the gradient descent. This development resulted in the Natural Actor-Critic architecture which can be shown to be very efficient in application to motor primitive learning for robotics.

am ei

[BibTex]


Policy Learning for Motor Skills

Peters, J., Schaal, S.

In Proceedings of 14th International Conference on Neural Information Processing (ICONIP), pages: 233-242, (Editors: Ishikawa, M., K. Doya, H. Miyamoto, T. Yamakawa), 2007, clmc (inproceedings)

Abstract
Policy learning which allows autonomous robots to adapt to novel situations has been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only a few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. To do so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution.

am ei

PDF DOI [BibTex]


Reinforcement learning for operational space control

Peters, J., Schaal, S.

In Proceedings of the 2007 IEEE International Conference on Robotics and Automation, pages: 2111-2116, IEEE Computer Society, ICRA, 2007, clmc (inproceedings)

Abstract
While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting supervised learning problem is ill-defined as it requires learning an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. The important insight that many operational space control algorithms can be reformulated as optimal control problems, however, allows addressing this inverse learning problem in the framework of reinforcement learning. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.

am ei

link (url) DOI [BibTex]


Relative Entropy Policy Search

Peters, J.

CLMC Technical Report: TR-CLMC-2007-2, Computational Learning and Motor Control Lab, Los Angeles, CA, 2007, clmc (techreport)

Abstract
This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems.

am ei

PDF link (url) [BibTex]


Observation of the Faraday effect via beam deflection in a longitudinal magnetic field

Ghosh, A., Hill, W., Fischer, P.

Physical Review A, 76(5), 2007 (article)

Abstract
We show that magnetic-field-induced circular differential deflection of light can be observed in reflection or refraction at a single interface. The difference in the reflection or refraction angles between the two circular polarization components is a function of the magnetic-field strength and the Verdet constant, and permits the observation of the Faraday effect not via polarization rotation in transmission, but via changes in the propagation direction. Deflection measurements do not suffer from n-pi ambiguities and are shown to be another means to map magnetic fields with high axial resolution, or to determine the sign and magnitude of magnetic-field pulses in a single measurement.

pf

DOI [BibTex]


Using reward-weighted regression for reinforcement learning of task space control

Peters, J., Schaal, S.

In Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pages: 262-267, Honolulu, Hawaii, April 1-5, 2007, clmc (inproceedings)

Abstract
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, 'vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated and reused, and new algorithms can be inserted with ease.

am ei

link (url) DOI [BibTex]
