Header logo is


2014


Thumb xl thumb schoenbein2014iros
Omnidirectional 3D Reconstruction in Augmented Manhattan Worlds

Schoenbein, M., Geiger, A.

International Conference on Intelligent Robots and Systems, pages: 716 - 723, IEEE, Chicago, IL, USA, IEEE/RSJ International Conference on Intelligent Robots and System, October 2014 (conference)

Abstract
This paper proposes a method for high-quality omnidirectional 3D reconstruction of augmented Manhattan worlds from catadioptric stereo video sequences. In contrast to existing works we do not rely on constructing virtual perspective views, but instead propose to optimize depth jointly in a unified omnidirectional space. Furthermore, we show that plane-based prior models can be applied even though planes in 3D do not project to planes in the omnidirectional domain. Towards this goal, we propose an omnidirectional slanted-plane Markov random field model which relies on plane hypotheses extracted using a novel voting scheme for 3D planes in omnidirectional space. To quantitatively evaluate our method we introduce a dataset which we have captured using our autonomous driving platform AnnieWAY which we equipped with two horizontally aligned catadioptric cameras and a Velodyne HDL-64E laser scanner for precise ground truth depth measurements. As evidenced by our experiments, the proposed method clearly benefits from the unified view and significantly outperforms existing stereo matching techniques both quantitatively and qualitatively. Furthermore, our method is able to reduce noise and the obtained depth maps can be represented very compactly by a small number of image segments and plane parameters.

avg ps

pdf DOI [BibTex]

2014


pdf DOI [BibTex]


Thumb xl screen shot 2014 07 09 at 15.49.27
Robot Arm Pose Estimation through Pixel-Wise Part Classification

Bohg, J., Romero, J., Herzog, A., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA) 2014, pages: 3143-3150, IEEE International Conference on Robotics and Automation (ICRA), June 2014 (inproceedings)

Abstract
We propose to frame the problem of marker-less robot arm pose estimation as a pixel-wise part classification problem. As input, we use a depth image in which each pixel is classified to be either from a particular robot part or the background. The classifier is a random decision forest trained on a large number of synthetically generated and labeled depth images. From all the training samples ending up at a leaf node, a set of offsets is learned that votes for relative joint positions. Pooling these votes over all foreground pixels and subsequent clustering gives us an estimate of the true joint positions. Due to the intrinsic parallelism of pixel-wise classification, this approach can run in super real-time and is more efficient than previous ICP-like methods. We quantitatively evaluate the accuracy of this approach on synthetic data. We also demonstrate that the method produces accurate joint estimates on real data despite being purely trained on synthetic data.

am ps

video code pdf DOI Project Page [BibTex]

video code pdf DOI Project Page [BibTex]


Thumb xl roser
Simultaneous Underwater Visibility Assessment, Enhancement and Improved Stereo

Roser, M., Dunbabin, M., Geiger, A.

IEEE International Conference on Robotics and Automation, pages: 3840 - 3847 , Hong Kong, China, IEEE International Conference on Robotics and Automation, June 2014 (conference)

Abstract
Vision-based underwater navigation and obstacle avoidance demands robust computer vision algorithms, particularly for operation in turbid water with reduced visibility. This paper describes a novel method for the simultaneous underwater image quality assessment, visibility enhancement and disparity computation to increase stereo range resolution under dynamic, natural lighting and turbid conditions. The technique estimates the visibility properties from a sparse 3D map of the original degraded image using a physical underwater light attenuation model. Firstly, an iterated distance-adaptive image contrast enhancement enables a dense disparity computation and visibility estimation. Secondly, using a light attenuation model for ocean water, a color corrected stereo underwater image is obtained along with a visibility distance estimate. Experimental results in shallow, naturally lit, high-turbidity coastal environments show the proposed technique improves range estimation over the original images as well as image quality and color for habitat classification. Furthermore, the recursiveness and robustness of the technique allows real-time implementation onboard an Autonomous Underwater Vehicles for improved navigation and obstacle avoidance performance.

avg ps

pdf DOI [BibTex]

pdf DOI [BibTex]


Thumb xl schoenbein
Calibrating and Centering Quasi-Central Catadioptric Cameras

Schoenbein, M., Strauss, T., Geiger, A.

IEEE International Conference on Robotics and Automation, pages: 4443 - 4450, Hong Kong, China, IEEE International Conference on Robotics and Automation, June 2014 (conference)

Abstract
Non-central catadioptric models are able to cope with irregular camera setups and inaccuracies in the manufacturing process but are computationally demanding and thus not suitable for robotic applications. On the other hand, calibrating a quasi-central (almost central) system with a central model introduces errors due to a wrong relationship between the viewing ray orientations and the pixels on the image sensor. In this paper, we propose a central approximation to quasi-central catadioptric camera systems that is both accurate and efficient. We observe that the distance to points in 3D is typically large compared to deviations from the single viewpoint. Thus, we first calibrate the system using a state-of-the-art non-central camera model. Next, we show that by remapping the observations we are able to match the orientation of the viewing rays of a much simpler single viewpoint model with the true ray orientations. While our approximation is general and applicable to all quasi-central camera systems, we focus on one of the most common cases in practice: hypercatadioptric cameras. We compare our model to a variety of baselines in synthetic and real localization and motion estimation experiments. We show that by using the proposed model we are able to achieve near non-central accuracy while obtaining speed-ups of more than three orders of magnitude compared to state-of-the-art non-central models.

avg ps

pdf DOI [BibTex]

pdf DOI [BibTex]


Thumb xl fig1
3D nanofabrication on complex seed shapes using glancing angle deposition

Hyeon-Ho, J., Mark, A. G., Gibbs, J. G., Reindl, T., Waizmann, U., Weis, J., Fischer, P.

In 2014 IEEE 27th International Conference on Micro Electro Mechanical Systems (MEMS), pages: 437-440, January 2014 (inproceedings)

Abstract
Three-dimensional (3D) fabrication techniques promise new device architectures and enable the integration of more components, but fabricating 3D nanostructures for device applications remains challenging. Recently, we have performed glancing angle deposition (GLAD) upon a nanoscale hexagonal seed array to create a variety of 3D nanoscale objects including multicomponent rods, helices, and zigzags [1]. Here, in an effort to generalize our technique, we present a step-by-step approach to grow 3D nanostructures on more complex nanoseed shapes and configurations than before. This approach allows us to create 3D nanostructures on nanoseeds regardless of seed sizes and shapes.

pf

DOI [BibTex]

DOI [BibTex]


no image
A Self-Tuning LQR Approach Demonstrated on an Inverted Pendulum

Trimpe, S., Millane, A., Doessegger, S., D’Andrea, R.

In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, 2014 (inproceedings)

am ics

PDF Supplementary material DOI [BibTex]

PDF Supplementary material DOI [BibTex]


no image
Learning coupling terms for obstacle avoidance

Rai, A., Meier, F., Ijspeert, A., Schaal, S.

In International Conference on Humanoid Robotics, pages: 512-518, IEEE, 2014, clmc (inproceedings)

Abstract
Autonomous manipulation in dynamic environments is important for robots to perform everyday tasks. For this, a manipulator should be capable of interpreting the environment and planning an appropriate movement. At least, two possible approaches exist for this in literature. Usually, a planning system is used to generate a complex movement plan that satisfies all constraints. Alternatively, a simple plan could be chosen and modified with sensory feedback to accommodate additional constraints by equipping the controller with features that remain dormant most of the time, except when specific situations arise. Dynamic Movement Primitives (DMPs) form a robust and versatile starting point for such a controller that can be modified online using a non-linear term, called the coupling term. This can prove to be a fast and reactive way of obstacle avoidance in a human-like fashion. We propose a method to learn this coupling term from human demonstrations starting with simple features and making it more robust to avoid a larger range of obstacles. We test the ability of our coupling term to model different kinds of obstacle avoidance behaviours in humans and use this learnt coupling term to avoid obstacles in a reactive manner. This line of research aims at pushing the boundary of reactive control strategies to more complex scenarios, such that complex and usually computationally more expensive planning methods can be avoided as much as possible.

am

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


Thumb xl toc image
Active Microrheology of the Vitreous of the Eye applied to Nanorobot Propulsion

Qiu, T., Schamel, D., Mark, A. G., Fischer, P.

In 2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), pages: 3801-3806, IEEE International Conference on Robotics and Automation ICRA, 2014, Best Automation Paper Award – Finalist. (inproceedings)

Abstract
Biomedical applications of micro or nanorobots require active movement through complex biological fluids. These are generally non-Newtonian (viscoelastic) fluids that are characterized by complicated networks of macromolecules that have size-dependent rheological properties. It has been suggested that an untethered microrobot could assist in retinal surgical procedures. To do this it must navigate the vitreous humor, a hydrated double network of collagen fibrils and high molecular-weight, polyanionic hyaluronan macromolecules. Here, we examine the characteristic size that potential robots must have to traverse vitreous relatively unhindered. We have constructed magnetic tweezers that provide a large gradient of up to 320 T/m to pull sub-micron paramagnetic beads through biological fluids. A novel two-step electrical discharge machining (EDM) approach is used to construct the tips of the magnetic tweezers with a resolution of 30 mu m and high aspect ratio of similar to 17:1 that restricts the magnetic field gradient to the plane of observation. We report measurements on porcine vitreous. In agreement with structural data and passive Brownian diffusion studies we find that the unhindered active propulsion through the eye calls for nanorobots with cross-sections of less than 500 nm.

Best Automation Paper Award – Finalist.

pf

[BibTex]

[BibTex]


no image
Incremental Local Gaussian Regression

Meier, F., Hennig, P., Schaal, S.

In Advances in Neural Information Processing Systems 27, pages: 972-980, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014, clmc (inproceedings)

am ei pn

PDF link (url) [BibTex]

PDF link (url) [BibTex]


no image
Efficient Bayesian Local Model Learning for Control

Meier, F., Hennig, P., Schaal, S.

In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, pages: 2244 - 2249, IROS, 2014, clmc (inproceedings)

Abstract
Model-based control is essential for compliant controland force control in many modern complex robots, like humanoidor disaster robots. Due to many unknown and hard tomodel nonlinearities, analytical models of such robots are oftenonly very rough approximations. However, modern optimizationcontrollers frequently depend on reasonably accurate models,and degrade greatly in robustness and performance if modelerrors are too large. For a long time, machine learning hasbeen expected to provide automatic empirical model synthesis,yet so far, research has only generated feasibility studies butno learning algorithms that run reliably on complex robots.In this paper, we combine two promising worlds of regressiontechniques to generate a more powerful regression learningsystem. On the one hand, locally weighted regression techniquesare computationally efficient, but hard to tune due to avariety of data dependent meta-parameters. On the other hand,Bayesian regression has rather automatic and robust methods toset learning parameters, but becomes quickly computationallyinfeasible for big and high-dimensional data sets. By reducingthe complexity of Bayesian regression in the spirit of local modellearning through variational approximations, we arrive at anovel algorithm that is computationally efficient and easy toinitialize for robust learning. Evaluations on several datasetsdemonstrate very good learning performance and the potentialfor a general regression learning tool for robotics.

am ei pn

PDF link (url) DOI [BibTex]

PDF link (url) DOI [BibTex]


no image
Stability Analysis of Distributed Event-Based State Estimation

Trimpe, S.

In Proceedings of the 53rd IEEE Conference on Decision and Control, Los Angeles, CA, 2014 (inproceedings)

Abstract
An approach for distributed and event-based state estimation that was proposed in previous work [1] is analyzed and extended to practical networked systems in this paper. Multiple sensor-actuator-agents observe a dynamic process, sporadically exchange their measurements over a broadcast network according to an event-based protocol, and estimate the process state from the received data. The event-based approach was shown in [1] to mimic a centralized Luenberger observer up to guaranteed bounds, under the assumption of identical estimates on all agents. This assumption, however, is unrealistic (it is violated by a single packet drop or slight numerical inaccuracy) and removed herein. By means of a simulation example, it is shown that non-identical estimates can actually destabilize the overall system. To achieve stability, the event-based communication scheme is supplemented by periodic (but infrequent) exchange of the agentsâ?? estimates and reset to their joint average. When the local estimates are used for feedback control, the stability guarantee for the estimation problem extends to the event-based control system.

am ics

PDF Supplementary material DOI Project Page [BibTex]

PDF Supplementary material DOI Project Page [BibTex]


no image
Dual Execution of Optimized Contact Interaction Trajectories

Toussaint, M., Ratliff, N., Bohg, J., Righetti, L., Englert, P., Schaal, S.

In 2014 IEEE/RSJ Conference on Intelligent Robots and Systems, pages: 47-54, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
Efficient manipulation requires contact to reduce uncertainty. The manipulation literature refers to this as funneling: a methodology for increasing reliability and robustness by leveraging haptic feedback and control of environmental interaction. However, there is a fundamental gap between traditional approaches to trajectory optimization and this concept of robustness by funneling: traditional trajectory optimizers do not discover force feedback strategies. From a POMDP perspective, these behaviors could be regarded as explicit observation actions planned to sufficiently reduce uncertainty thereby enabling a task. While we are sympathetic to the full POMDP view, solving full continuous-space POMDPs in high-dimensions is hard. In this paper, we propose an alternative approach in which trajectory optimization objectives are augmented with new terms that reward uncertainty reduction through contacts, explicitly promoting funneling. This augmentation shifts the responsibility of robustness toward the actual execution of the optimized trajectories. Directly tracing trajectories through configuration space would lose all robustness-dual execution achieves robustness by devising force controllers to reproduce the temporal interaction profile encoded in the dual solution of the optimization problem. This work introduces dual execution in depth and analyze its performance through robustness experiments in both simulation and on a real-world robotic platform.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Learning and Exploration in a Novel Dimensionality-Reduction Task

Ebert, J, Kim, S, Schweighofer, N., Sternad, D, Schaal, S.

In Abstracts of Neural Control of Movement Conference (NCM 2009), Amsterdam, Netherlands, 2014 (inproceedings)

am

[BibTex]

[BibTex]


no image
Balancing experiments on a torque-controlled humanoid with hierarchical inverse dynamics

Herzog, A., Righetti, L., Grimminger, F., Pastor, P., Schaal, S.

In 2014 IEEE/RSJ Conference on Intelligent Robots and Systems, pages: 981-988, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
Recently several hierarchical inverse dynamics controllers based on cascades of quadratic programs have been proposed for application on torque controlled robots. They have important theoretical benefits but have never been implemented on a torque controlled robot where model inaccuracies and real-time computation requirements can be problematic. In this contribution we present an experimental evaluation of these algorithms in the context of balance control for a humanoid robot. The presented experiments demonstrate the applicability of the approach under real robot conditions (i.e. model uncertainty, estimation errors, etc). We propose a simplification of the optimization problem that allows us to decrease computation time enough to implement it in a fast torque control loop. We implement a momentum-based balance controller which shows robust performance in face of unknown disturbances, even when the robot is standing on only one foot. In a second experiment, a tracking task is evaluated to demonstrate the performance of the controller with more complicated hierarchies. Our results show that hierarchical inverse dynamics controllers can be used for feedback control of humanoid robots and that momentum-based balance control can be efficiently implemented on a real robot.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Full Dynamics LQR Control of a Humanoid Robot: An Experimental Study on Balancing and Squatting

Mason, S., Righetti, L., Schaal, S.

In 2014 IEEE-RAS International Conference on Humanoid Robots, pages: 374-379, IEEE, Madrid, Spain, 2014 (inproceedings)

Abstract
Humanoid robots operating in human environments require whole-body controllers that can offer precise tracking and well-defined disturbance rejection behavior. In this contribution, we propose an experimental evaluation of a linear quadratic regulator (LQR) using a linearization of the full robot dynamics together with the contact constraints. The advantage of the controller is that it explicitly takes into account the coupling between the different joints to create optimal feedback controllers for whole-body control. We also propose a method to explicitly regulate other tasks of interest, such as the regulation of the center of mass of the robot or its angular momentum. In order to evaluate the performance of linear optimal control designs in a real-world scenario (model uncertainty, sensor noise, imperfect state estimation, etc), we test the controllers in a variety of tracking and balancing experiments on a torque controlled humanoid (e.g. balancing, split plane balancing, squatting, pushes while squatting, and balancing on a wheeled platform). The proposed control framework shows a reliable push recovery behavior competitive with more sophisticated balance controllers, rejecting impulses up to 11.7 Ns with peak forces of 650 N, with the added advantage of great computational simplicity. Furthermore, the controller is able to track squatting trajectories up to 1 Hz without relinearization, suggesting that the linearized dynamics is sufficient for significant ranges of motion.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
State Estimation for a Humanoid Robot

Rotella, N., Bloesch, M., Righetti, L., Schaal, S.

In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 952-958, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
This paper introduces a framework for state estimation on a humanoid robot platform using only common proprioceptive sensors and knowledge of leg kinematics. The presented approach extends that detailed in prior work on a point-foot quadruped platform by adding the rotational constraints imposed by the humanoid's flat feet. As in previous work, the proposed Extended Kalman Filter accommodates contact switching and makes no assumptions about gait or terrain, making it applicable on any humanoid platform for use in any task. A nonlinear observability analysis is performed on both the point-foot and flat-foot filters and it is concluded that the addition of rotational constraints significantly simplifies singular cases and improves the observability characteristics of the system. Results on a simulated walking dataset demonstrate the performance gain of the flat-foot filter as well as confirm the results of the presented observability analysis.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2008


no image
Human movement generation based on convergent flow fields: A computational model and a behavioral experiment

Hoffmann, H., Schaal, S.

In Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (inproceedings)

am

link (url) [BibTex]

2008


link (url) [BibTex]


no image
Movement reproduction and obstacle avoidance with dynamic movement primitives and potential fields

Park, D., Hoffmann, H., Pastor, P., Schaal, S.

In IEEE International Conference on Humanoid Robots, 2008., 2008, clmc (inproceedings)

am

PDF [BibTex]

PDF [BibTex]


no image
The dual role of uncertainty in force field learning

Mistry, M., Theodorou, E., Hoffmann, H., Schaal, S.

In Abstracts of the Eighteenth Annual Meeting of Neural Control of Movement (NCM), Naples, Florida, April 29-May 4, 2008, clmc (inproceedings)

Abstract
Force field experiments have been a successful paradigm for studying the principles of planning, execution, and learning in human arm movements. Subjects have been shown to cope with the disturbances generated by force fields by learning internal models of the underlying dynamics to predict disturbance effects or by increasing arm impedance (via co-contraction) if a predictive approach becomes infeasible. Several studies have addressed the issue uncertainty in force field learning. Scheidt et al. demonstrated that subjects exposed to a viscous force field of fixed structure but varying strength (randomly changing from trial to trial), learn to adapt to the mean disturbance, regardless of the statistical distribution. Takahashi et al. additionally show a decrease in strength of after-effects after learning in the randomly varying environment. Thus they suggest that the nervous system adopts a dual strategy: learning an internal model of the mean of the random environment, while simultaneously increasing arm impedance to minimize the consequence of errors. In this study, we examine what role variance plays in the learning of uncertain force fields. We use a 7 degree-of-freedom exoskeleton robot as a manipulandum (Sarcos Master Arm, Sarcos, Inc.), and apply a 3D viscous force field of fixed structure and strength randomly selected from trial to trial. Additionally, in separate blocks of trials, we alter the variance of the randomly selected strength multiplier (while keeping a constant mean). In each block, after sufficient learning has occurred, we apply catch trials with no force field and measure the strength of after-effects. As expected in higher variance cases, results show increasingly smaller levels of after-effects as the variance is increased, thus implying subjects choose the robust strategy of increasing arm impedance to cope with higher levels of uncertainty. Interestingly, however, subjects show an increase in after-effect strength with a small amount of variance as compared to the deterministic (zero variance) case. This result implies that a small amount of variability aides in internal model formation, presumably a consequence of the additional amount of exploration conducted in the workspace of the task.

am

[BibTex]

[BibTex]


no image
Dynamic movement primitives for movement generation motivated by convergent force fields in frog

Hoffmann, H., Pastor, P., Schaal, S.

In Adaptive Motion of Animals and Machines (AMAM), 2008, clmc (inproceedings)

am

PDF [BibTex]

PDF [BibTex]


no image
Behavioral experiments on reinforcement learning in human motor control

Hoffmann, H., Theodorou, E., Schaal, S.

In Abstracts of the Eighteenth Annual Meeting of Neural Control of Movement (NCM), Naples, Florida, April 29-May 4, 2008, clmc (inproceedings)

Abstract
Reinforcement learning (RL) - learning solely based on reward or cost feedback - is widespread in robotics control and has been also suggested as computational model for human motor control. In human motor control, however, hardly any experiment studied reinforcement learning. Here, we study learning based on visual cost feedback in a reaching task and did three experiments: (1) to establish a simple enough experiment for RL, (2) to study spatial localization of RL, and (3) to study the dependence of RL on the cost function. In experiment (1), subjects sit in front of a drawing tablet and look at a screen onto which the drawing pen's position is projected. Beginning from a start point, their task is to move with the pen through a target point presented on screen. Visual feedback about the pen's position is given only before movement onset. At the end of a movement, subjects get visual feedback only about the cost of this trial. We choose as cost the squared distance between target and virtual pen position at the target line. Above a threshold value, the cost was fixed at this value. In the mapping of the pen's position onto the screen, we added a bias (unknown to subject) and Gaussian noise. As result, subjects could learn the bias, and thus, showed reinforcement learning. In experiment (2), we randomly altered the target position between three different locations (three different directions from start point: -45, 0, 45). For each direction, we chose a different bias. As result, subjects learned all three bias values simultaneously. Thus, RL can be spatially localized. In experiment (3), we varied the sensitivity of the cost function by multiplying the squared distance with a constant value C, while keeping the same cut-off threshold. As in experiment (2), we had three target locations. We assigned to each location a different C value (this assignment was randomized between subjects). Since subjects learned the three locations simultaneously, we could directly compare the effect of the different cost functions. As result, we found an optimal C value; if C was too small (insensitive cost), learning was slow; if C was too large (narrow cost valley), the exploration time was longer and learning delayed. Thus, reinforcement learning in human motor control appears to be sen

am

[BibTex]

[BibTex]


no image
Movement generation by learning from demonstration and generalization to new targets

Pastor, P., Hoffmann, H., Schaal, S.

In Adaptive Motion of Animals and Machines (AMAM), 2008, clmc (inproceedings)

am

PDF [BibTex]

PDF [BibTex]


no image
Combining dynamic movement primitives and potential fields for online obstacle avoidance

Park, D., Hoffmann, H., Schaal, S.

In Adaptive Motion of Animals and Machines (AMAM), Cleveland, Ohio, 2008, 2008, clmc (inproceedings)

am

link (url) [BibTex]

link (url) [BibTex]


no image
Computational model for movement learning under uncertain cost

Theodorou, E., Hoffmann, H., Mistry, M., Schaal, S.

In Abstracts of the Society of Neuroscience Meeting (SFN 2008), Washington, DC 2008, 2008, clmc (inproceedings)

Abstract
Stochastic optimal control is a framework for computing control commands that lead to an optimal behavior under a given cost. Despite the long history of optimal control in engineering, it has been only recently applied to describe human motion. So far, stochastic optimal control has been mainly used in tasks that are already learned, such as reaching to a target. For learning, however, there are only few cases where optimal control has been applied. The main assumptions of stochastic optimal control that restrict its application to tasks after learning are the a priori knowledge of (1) a quadratic cost function (2) a state space model that captures the kinematics and/or dynamics of musculoskeletal system and (3) a measurement equation that models the proprioceptive and/or exteroceptive feedback. Under these assumptions, a sequence of control gains is computed that is optimal with respect to the prespecified cost function. In our work, we relax the assumption of the a priori known cost function and provide a computational framework for modeling tasks that involve learning. Typically, a cost function consists of two parts: one part that models the task constraints, like squared distance to goal at movement endpoint, and one part that integrates over the squared control commands. In learning a task, the first part of this cost function will be adapted. We use an expectation-maximization scheme for learning: the expectation step optimizes the task constraints through gradient descent of a reward function and the maximizing step optimizes the control commands. Our computational model is tested and compared with data given from a behavioral experiment. In this experiment, subjects sit in front of a drawing tablet and look at a screen onto which the drawing-pen's position is projected. Beginning from a start point, their task is to move with the pen through a target point presented on screen. Visual feedback about the pen's position is given only before movement onset. At the end of a movement, subjects get visual feedback only about the cost of this trial. In the mapping of the pen's position onto the screen, we added a bias (unknown to subject) and Gaussian noise. Therefore the cost is a function of this bias. The subjects were asked to reach to the target and minimize this cost over trials. In this behavioral experiment, subjects could learn the bias and thus showed reinforcement learning. With our computational model, we could model the learning process over trials. Particularly, the dependence on parameters of the reward function (Gaussian width) and the modulation of movement variance over time were similar in experiment and model.

am

[BibTex]

[BibTex]


no image
A Bayesian approach to empirical local linearizations for robotics

Ting, J., D’Souza, A., Vijayakumar, S., Schaal, S.

In International Conference on Robotics and Automation (ICRA2008), Pasadena, CA, USA, May 19-23, 2008, 2008, clmc (inproceedings)

Abstract
Local linearizations are ubiquitous in the control of robotic systems. Analytical methods, if available, can be used to obtain the linearization, but in complex robotics systems where the the dynamics and kinematics are often not faithfully obtainable, empirical linearization may be preferable. In this case, it is important to only use data for the local linearization that lies within a ``reasonable'' linear regime of the system, which can be defined from the Hessian at the point of the linearization -- a quantity that is not available without an analytical model. We introduce a Bayesian approach to solve statistically what constitutes a ``reasonable'' local regime. We approach this problem in the context local linear regression. In contrast to previous locally linear methods, we avoid cross-validation or complex statistical hypothesis testing techniques to find the appropriate local regime. Instead, we treat the parameters of the local regime probabilistically and use approximate Bayesian inference for their estimation. This approach results in an analytical set of iterative update equations that are easily implemented on real robotics systems for real-time applications. As in other locally weighted regressions, our algorithm also lends itself to complete nonlinear function approximation for learning empirical internal models. We sketch the derivation of our Bayesian method and provide evaluations on synthetic data and actual robot data where the analytical linearization was known.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Do humans plan continuous trajectories in kinematic coordinates?

Hoffmann, H., Schaal, S.

In Abstracts of the Society of Neuroscience Meeting (SFN 2008), Washington, DC 2008, 2008, clmc (inproceedings)

Abstract
The planning and execution of human arm movements is still unresolved. An ongoing controversy is whether we plan a movement in kinematic coordinates and convert these coordinates with an inverse internal model into motor commands (like muscle activation) or whether we combine a few muscle synergies or equilibrium points to move a hand, e.g., between two targets. The first hypothesis implies that a planner produces a desired end-effector position for all time points; the second relies on the dynamics of the muscular-skeletal system for a given control command to produce a continuous end-effector trajectory. To distinguish between these two possibilities, we use a visuomotor adaptation experiment. Subjects moved a pen on a graphics tablet and observed the pen's mapped position onto a screen (subjects quickly adapted to this mapping). The task was to move a cursor between two points in a given time window. In the adaptation test, we manipulated the velocity profile of the cursor feedback such that the shape of the trajectories remained unchanged (for straight paths). If humans would use a kinematic plan and map at each time the desired end-effector position onto control commands, subjects should adapt to the above manipulation. In a similar experiment, Wolpert et al (1995) showed adaptation to changes in the curvature of trajectories. This result, however, cannot rule out a shift of an equilibrium point or an additional synergy activation between start and end point of a movement. In our experiment, subjects did two sessions, one control without and one with velocity-profile manipulation. To skew the velocity profile of the cursor trajectory, we added to the current velocity, v, the function 0.8*v*cos(pi + pi*x), where x is the projection of the cursor position onto the start-goal line divided by the distance start to goal (x=0 at the start point). As result, subjects did not adapt to this manipulation: for all subjects, the true hand motion was not significantly modified in a direction consistent with adaptation, despite that the visually presented motion differed significantly from the control motion. One may still argue that this difference in motion was insufficient to be processed visually. Thus, as a control experiment, we replayed control and modified motions to the subjects and asked which of the two motions appeared 'more natural'. Subjects chose the unperturbed motion as more natural significantly better than chance. In summary, for a visuomotor transformation task, the hypothesis of a planned continuous end-effector trajectory predicts adaptation to a modified velocity profile. The current experiment found no adaptation under such transformation.

am

[BibTex]

[BibTex]

2006


no image
Learning operational space control

Peters, J., Schaal, S.

In Robotics: Science and Systems II (RSS 2006), pages: 255-262, (Editors: Gaurav S. Sukhatme and Stefan Schaal and Wolfram Burgard and Dieter Fox), Cambridge, MA: MIT Press, RSS , 2006, clmc (inproceedings)

Abstract
While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting learning problem is ill-defined as it requires to learn an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-covexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. A first important insight for this paper is that, nevertheless, a physically correct solution to the inverse problem does exits when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on a recent insight that many operational space controllers can be understood in terms of a constraint optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the view of machine learning, the learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward and that employs an expectation-maximization policy search algorithm. Evaluations on a three degrees of freedom robot arm illustrate the feasability of our suggested approach.

am ei

link (url) [BibTex]

2006


link (url) [BibTex]


no image
Reinforcement Learning for Parameterized Motor Primitives

Peters, J., Schaal, S.

In Proceedings of the 2006 International Joint Conference on Neural Networks, pages: 73-80, IJCNN, 2006, clmc (inproceedings)

Abstract
One of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation", called motor primitives. Motor primitives, as used in this paper, are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. While a lot of progress has been made in teaching parameterized motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this paper, we evaluate different reinforcement learning approaches for improving the performance of parameterized motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2005


no image
Natural Actor-Critic

Peters, J., Vijayakumar, S., Schaal, S.

In Proceedings of the 16th European Conference on Machine Learning, 3720, pages: 280-291, (Editors: Gama, J.;Camacho, R.;Brazdil, P.;Jorge, A.;Torgo, L.), Springer, ECML, 2005, clmc (inproceedings)

Abstract
This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing AmariÕs natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regres- sion. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and BradtkeÕs Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Em- pirical evaluations illustrate the effectiveness of our techniques in com- parison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm.

am ei

link (url) DOI [BibTex]

2005


link (url) DOI [BibTex]


no image
Comparative experiments on task space control with redundancy resolution

Nakanishi, J., Cory, R., Mistry, M., Peters, J., Schaal, S.

In Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 3901-3908, Edmonton, Alberta, Canada, Aug. 2-6, IROS, 2005, clmc (inproceedings)

Abstract
Understanding the principles of motor coordination with redundant degrees of freedom still remains a challenging problem, particularly for new research in highly redundant robots like humanoids. Even after more than a decade of research, task space control with redundacy resolution still remains an incompletely understood theoretical topic, and also lacks a larger body of thorough experimental investigation on complex robotic systems. This paper presents our first steps towards the development of a working redundancy resolution algorithm which is robust against modeling errors and unforeseen disturbances arising from contact forces. To gain a better understanding of the pros and cons of different approaches to redundancy resolution, we focus on a comparative empirical evaluation. First, we review several redundancy resolution schemes at the velocity, acceleration and torque levels presented in the literature in a common notational framework and also introduce some new variants of these previous approaches. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm. Surprisingly, one of our simplest algorithms empirically demonstrates the best performance, despite, from a theoretical point, the algorithm does not share the same beauty as some of the other methods. Finally, we discuss practical properties of these control algorithms, particularly in light of inevitable modeling errors of the robot dynamics.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Predicting EMG Data from M1 Neurons with Variational Bayesian Least Squares

Ting, J., D’Souza, A., Yamamoto, K., Yoshioka, T., Hoffman, D., Kakei, S., Sergio, L., Kalaska, J., Kawato, M., Strick, P., Schaal, S.

In Advances in Neural Information Processing Systems 18 (NIPS 2005), (Editors: Weiss, Y.;Schölkopf, B.;Platt, J.), Cambridge, MA: MIT Press, Vancouver, BC, Dec. 6-11, 2005, clmc (inproceedings)

Abstract
An increasing number of projects in neuroscience requires the statistical analysis of high dimensional data sets, as, for instance, in predicting behavior from neural firing, or in operating artificial devices from brain recordings in brain-machine interfaces. Linear analysis techniques remain prevalent in such cases, but classi-cal linear regression approaches are often numercially too fragile in high dimen-sions. In this paper, we address the question of whether EMG data collected from arm movements of monkeys can be faithfully reconstructed with linear ap-proaches from neural activity in primary motor cortex (M1). To achieve robust data analysis, we develop a full Bayesian approach to linear regression that automatically detects and excludes irrelevant features in the data, and regular-izes against overfitting. In comparison with ordinary least squares, stepwise re-gression, partial least squares, and a brute force combinatorial search for the most predictive input features in the data, we demonstrate that the new Bayesian method offers a superior mixture of characteristics in terms of regularization against overfitting, computational efficiency, and ease of use, demonstrating its potential as a drop-in replacement for other linear regression techniques. As neuroscientific results, our analyses demonstrate that EMG data can be well pre-dicted from M1 neurons, further opening the path for possible real-time inter-faces between brains and machines.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Rapbid synchronization and accurate phase-locking of rhythmic motor primitives

Pongas, D., Billard, A., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2005), pages: 2911-2916, Edmonton, Alberta, Canada, Aug. 2-6, 2005, clmc (inproceedings)

Abstract
Rhythmic movement is ubiquitous in human and animal behavior, e.g., as in locomotion, dancing, swimming, chewing, scratching, music playing, etc. A particular feature of rhythmic movement in biology is the rapid synchronization and phase locking with other rhythmic events in the environment, for instance music or visual stimuli as in ball juggling. In traditional oscillator theories to rhythmic movement generation, synchronization with another signal is relatively slow, and it is not easy to achieve accurate phase locking with a particular feature of the driving stimulus. Using a recently developed framework of dynamic motor primitives, we demonstrate a novel algorithm for very rapid synchronizaton of a rhythmic movement pattern, which can phase lock any feature of the movement to any particulur event in the driving stimulus. As an example application, we demonstrate how an anthropomorphic robot can use imitation learning to acquire a complex rumming pattern and keep it synchronized with an external rhythm generator that changes its frequency over time.

am

link (url) [BibTex]

link (url) [BibTex]


no image
A new methodology for robot control design

Peters, J., Mistry, M., Udwadia, F. E., Schaal, S.

In The 5th ASME International Conference on Multibody Systems, Nonlinear Dynamics, and Control (MSNDC 2005), Long Beach, CA, Sept. 24-28, 2005, clmc (inproceedings)

Abstract
Gauss principle of least constraint and its generalizations have provided a useful insights for the development of tracking controllers for mechanical systems (Udwadia,2003). Using this concept, we present a novel methodology for the design of a specific class of robot controllers. With our new framework, we demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic framework, and show experimental verifications on a Sarcos Master Arm robot for some of these controllers. We believe that the suggested approach unifies and simplifies the design of optimal nonlinear control laws for robots obeying rigid body dynamics equations, both with or without external constraints, holonomic or nonholonomic constraints, with over-actuation or underactuation, as well as open-chain and closed-chain kinematics.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Arm movement experiments with joint space force fields using an exoskeleton robot

Mistry, M., Mohajerian, P., Schaal, S.

In IEEE Ninth International Conference on Rehabilitation Robotics, pages: 408-413, Chicago, Illinois, June 28-July 1, 2005, clmc (inproceedings)

Abstract
A new experimental platform permits us to study a novel variety of issues of human motor control, particularly full 3-D movements involving the major seven degrees-of-freedom (DOF) of the human arm. We incorporate a seven DOF robot exoskeleton, and can minimize weight and inertia through gravity, Coriolis, and inertia compensation, such that subjects' arm movements are largely unaffected by the manipulandum. Torque perturbations can be individually applied to any or all seven joints of the human arm, thus creating novel dynamic environments, or force fields, for subjects to respond and adapt to. Our first study investigates a joint space force field where the shoulder velocity drives a disturbing force in the elbow joint. Results demonstrate that subjects learn to compensate for the force field within about 100 trials, and from the strong presence of aftereffects when removing the field in some randomized catch trials, that an inverse dynamics, or internal model, of the force field is formed by the nervous system. Interestingly, while post-learning hand trajectories return to baseline, joint space trajectories remained changed in response to the field, indicating that besides learning a model of the force field, the nervous system also chose to exploit the space to minimize the effects of the force field on the realization of the endpoint trajectory plan. Further applications for our apparatus include studies in motor system redundancy resolution and inverse kinematics, as well as rehabilitation.

am

link (url) [BibTex]

link (url) [BibTex]


no image
A unifying framework for the control of robotics systems

Peters, J., Mistry, M., Udwadia, F. E., Cory, R., Nakanishi, J., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2005), pages: 1824-1831, Edmonton, Alberta, Canada, Aug. 2-6, 2005, clmc (inproceedings)

Abstract
Recently, [1] suggested to derive tracking controllers for mechanical systems using a generalization of GaussÕ principle of least constraint. This method al-lows us to reformulate control problems as a special class of optimal control. We take this line of reasoning one step further and demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sar-cos Master Arm robot for some of the the derived controllers.We believe that the suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equa-tions, both with or without external constraints, with over-actuation or under-actuation, as well as open-chain and closed-chain kinematics.

am

link (url) [BibTex]

link (url) [BibTex]