Header logo is


2017


Thumb xl amd intentiongan
Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets

Hausman, K., Chebotar, Y., Schaal, S., Sukhatme, G., Lim, J.

In Proceedings from the conference "Neural Information Processing Systems 2017., (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (inproceedings)

am

pdf video [BibTex]

2017


pdf video [BibTex]


Thumb xl fig toyex lqr1kernel 1
On the Design of LQR Kernels for Efficient Controller Learning

Marco, A., Hennig, P., Schaal, S., Trimpe, S.

Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC), pages: 5193-5200, IEEE, IEEE Conference on Decision and Control, December 2017 (conference)

Abstract
Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.

am ics pn

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]


no image
Synchronicity Trumps Mischief in Rhythmic Human-Robot Social-Physical Interaction

Fitter, N. T., Kuchenbecker, K. J.

In Proceedings of the International Symposium on Robotics Research (ISRR), Puerto Varas, Chile, December 2017 (inproceedings) In press

Abstract
Hand-clapping games and other forms of rhythmic social-physical interaction might help foster human-robot teamwork, but the design of such interactions has scarcely been explored. We leveraged our prior work to enable the Rethink Robotics Baxter Research Robot to competently play one-handed tempo-matching hand-clapping games with a human user. To understand how such a robot’s capabilities and behaviors affect user perception, we created four versions of this interaction: the hand clapping could be initiated by either the robot or the human, and the non-initiating partner could be either cooperative, yielding synchronous motion, or mischievously uncooperative. Twenty adults tested two clapping tempos in each of these four interaction modes in a random order, rating every trial on standardized scales. The study results showed that having the robot initiate the interaction gave it a more dominant perceived personality. Despite previous results on the intrigue of misbehaving robots, we found that moving synchronously with the robot almost always made the interaction more enjoyable, less mentally taxing, less physically demanding, and lower effort for users than asynchronous interactions caused by robot or human mischief. Taken together, our results indicate that cooperative rhythmic social-physical interaction has the potential to strengthen human-robot partnerships.

hi

[BibTex]

[BibTex]


Thumb xl teaser
Optimizing Long-term Predictions for Model-based Policy Search

Doerr, A., Daniel, C., Nguyen-Tuong, D., Marco, A., Schaal, S., Toussaint, M., Trimpe, S.

Proceedings of 1st Annual Conference on Robot Learning (CoRL), 78, pages: 227-238, (Editors: Sergey Levine and Vincent Vanhoucke and Ken Goldberg), 1st Annual Conference on Robot Learning, November 2017 (conference)

Abstract
We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, modelbased RL suffers from various imperfections such as noisy input and output data, delays and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive compared to state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.

am ics

PDF Project Page [BibTex]

PDF Project Page [BibTex]


no image
A New Data Source for Inverse Dynamics Learning

Kappler, D., Meier, F., Ratliff, N., Schaal, S.

In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, Piscataway, NJ, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017 (inproceedings)

am

[BibTex]

[BibTex]


no image
Bayesian Regression for Artifact Correction in Electroencephalography

Fiebig, K., Jayaram, V., Hesse, T., Blank, A., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 131-136, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

am ei

DOI [BibTex]

DOI [BibTex]


no image
Investigating Music Imagery as a Cognitive Paradigm for Low-Cost Brain-Computer Interfaces

Grossberger, L., Hohmann, M. R., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 160-164, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

am ei

DOI [BibTex]

DOI [BibTex]


Thumb xl screen shot 2017 08 01 at 15.41.10
On the relevance of grasp metrics for predicting grasp success

Rubert, C., Kappler, D., Morales, A., Schaal, S., Bohg, J.

In Proceedings of the IEEE/RSJ International Conference of Intelligent Robots and Systems, September 2017 (inproceedings) Accepted

Abstract
We aim to reliably predict whether a grasp on a known object is successful before it is executed in the real world. There is an entire suite of grasp metrics that has already been developed which rely on precisely known contact points between object and hand. However, it remains unclear whether and how they may be combined into a general purpose grasp stability predictor. In this paper, we analyze these questions by leveraging a large scale database of simulated grasps on a wide variety of objects. For each grasp, we compute the value of seven metrics. Each grasp is annotated by human subjects with ground truth stability labels. Given this data set, we train several classification methods to find out whether there is some underlying, non-trivial structure in the data that is difficult to model manually but can be learned. Quantitative and qualitative results show the complexity of the prediction problem. We found that a good prediction performance critically depends on using a combination of metrics as input features. Furthermore, non-parametric and non-linear classifiers best capture the structure in the data.

am

Project Page [BibTex]

Project Page [BibTex]


no image
Local Bayesian Optimization of Motor Skills

Akrour, R., Sorokin, D., Peters, J., Neumann, G.

Proceedings of the 34th International Conference on Machine Learning, 70, pages: 41-50, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

am ei

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


Thumb xl pilqr cover
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning

Chebotar, Y., Hausman, K., Zhang, M., Sukhatme, G., Schaal, S., Levine, S.

Proceedings of the 34th International Conference on Machine Learning, 70, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

am

pdf video [BibTex]

pdf video [BibTex]


no image
Stiffness Perception during Pinching and Dissection with Teleoperated Haptic Forceps

Ng, C., Zareinia, K., Sun, Q., Kuchenbecker, K. J.

In Proceedings of the International Symposium on Robot and Human Interactive Communication (RO-MAN), pages: 456-463, Lisbon, Portugal, August 2017 (inproceedings)

hi

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Design of a Parallel Continuum Manipulator for 6-DOF Fingertip Haptic Display

Young, E. M., Kuchenbecker, K. J.

In Proceedings of the IEEE World Haptics Conference (WHC), pages: 599-604, Munich, Germany, June 2017, Finalist for best poster paper (inproceedings)

Abstract
Despite rapid advancements in the field of fingertip haptics, rendering tactile cues with six degrees of freedom (6 DOF) remains an elusive challenge. In this paper, we investigate the potential of displaying fingertip haptic sensations with a 6-DOF parallel continuum manipulator (PCM) that mounts to the user's index finger and moves a contact platform around the fingertip. Compared to traditional mechanisms composed of rigid links and discrete joints, PCMs have the potential to be strong, dexterous, and compact, but they are also more complicated to design. We define the design space of 6-DOF parallel continuum manipulators and outline a process for refining such a device for fingertip haptic applications. Following extensive simulation, we obtain 12 designs that meet our specifications, construct a manually actuated prototype of one such design, and evaluate the simulation's ability to accurately predict the prototype's motion. Finally, we demonstrate the range of deliverable fingertip tactile cues, including a normal force into the finger and shear forces tangent to the finger at three extreme points on the boundary of the fingertip.

hi

DOI Project Page [BibTex]

DOI Project Page [BibTex]


no image
High Magnitude Unidirectional Haptic Force Display Using a Motor/Brake Pair and a Cable

Hu, S., Kuchenbecker, K. J.

In Proceedings of the IEEE World Haptics Conference (WHC), pages: 394-399, Munich, Germany, June 2017 (inproceedings)

Abstract
Clever electromechanical design is required to make the force feedback delivered by a kinesthetic haptic interface both strong and safe. This paper explores a onedimensional haptic force display that combines a DC motor and a magnetic particle brake on the same shaft. Rather than a rigid linkage, a spooled cable connects the user to the actuators to enable a large workspace, reduce the moving mass, and eliminate the sticky residual force from the brake. This design combines the high torque/power ratio of the brake and the active output capabilities of the motor to provide a wider range of forces than can be achieved with either actuator alone. A prototype of this device was built, its performance was characterized, and it was used to simulate constant force sources and virtual springs and dampers. Compared to the conventional design of using only a motor, the hybrid device can output higher unidirectional forces at the expense of free space feeling less free.

hi

DOI Project Page [BibTex]

DOI Project Page [BibTex]


no image
A Wrist-Squeezing Force-Feedback System for Robotic Surgery Training

Brown, J. D., Fernandez, J. N., Cohen, S. P., Kuchenbecker, K. J.

In Proceedings of the IEEE World Haptics Conference (WHC), pages: 107-112, Munich, Germany, June 2017 (inproceedings)

Abstract
Over time, surgical trainees learn to compensate for the lack of haptic feedback in commercial robotic minimally invasive surgical systems. Incorporating touch cues into robotic surgery training could potentially shorten this learning process if the benefits of haptic feedback were sustained after it is removed. In this paper, we develop a wrist-squeezing haptic feedback system and evaluate whether it holds the potential to train novice da Vinci users to reduce the force they exert on a bimanual inanimate training task. Subjects were randomly divided into two groups according to a multiple baseline experimental design. Each of the ten participants moved a ring along a curved wire nine times while the haptic feedback was conditionally withheld, provided, and withheld again. The realtime tactile feedback of applied force magnitude significantly reduced the integral of the force produced by the da Vinci tools on the task materials, and this result remained even when the haptic feedback was removed. Overall, our findings suggest that wrist-squeezing force feedback can play an essential role in helping novice trainees learn to minimize the force they exert with a surgical robot.

hi

DOI [BibTex]

DOI [BibTex]


no image
Handling Scan-Time Parameters in Haptic Surface Classification

Burka, A., Kuchenbecker, K. J.

In Proceedings of the IEEE World Haptics Conference (WHC), pages: 424-429, Munich, Germany, June 2017 (inproceedings)

hi

DOI Project Page [BibTex]

DOI Project Page [BibTex]


Thumb xl apollo system2 croped
Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

Doerr, A., Nguyen-Tuong, D., Marco, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 5295-5301, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics

PDF arXiv DOI Project Page [BibTex]

PDF arXiv DOI Project Page [BibTex]


Thumb xl learning ct block diagram v2
Learning Feedback Terms for Reactive Planning and Control

Rai, A., Sutanto, G., Schaal, S., Meier, F.

Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (conference)

am

pdf video [BibTex]

pdf video [BibTex]


Thumb xl this one
Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Marco, A., Berkenkamp, F., Hennig, P., Schoellig, A. P., Krause, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 1557-1563, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics pn

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]


no image
Proton 2: Increasing the Sensitivity and Portability of a Visuo-haptic Surface Interaction Recorder

Burka, A., Rajvanshi, A., Allen, S., Kuchenbecker, K. J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 439-445, Singapore, May 2017 (inproceedings)

Abstract
The Portable Robotic Optical/Tactile ObservatioN PACKage (PROTONPACK, or Proton for short) is a new handheld visuo-haptic sensing system that records surface interactions. We previously demonstrated system calibration and a classification task using external motion tracking. This paper details improvements in surface classification performance and removal of the dependence on external motion tracking, necessary before embarking on our goal of gathering a vast surface interaction dataset. Two experiments were performed to refine data collection parameters. After adjusting the placement and filtering of the Proton's high-bandwidth accelerometers, we recorded interactions between two differently-sized steel tooling ball end-effectors (diameter 6.35 and 9.525 mm) and five surfaces. Using features based on normal force, tangential force, end-effector speed, and contact vibration, we trained multi-class SVMs to classify the surfaces using 50 ms chunks of data from each end-effector. Classification accuracies of 84.5% and 91.5% respectively were achieved on unseen test data, an improvement over prior results. In parallel, we pursued on-board motion tracking, using the Proton's camera and fiducial markers. Motion tracks from the external and onboard trackers agree within 2 mm and 0.01 rad RMS, and the accuracy decreases only slightly to 87.7% when using onboard tracking for the 9.525 mm end-effector. These experiments indicate that the Proton 2 is ready for portable data collection.

hi

DOI Project Page [BibTex]

DOI Project Page [BibTex]

2006


no image
Learning operational space control

Peters, J., Schaal, S.

In Robotics: Science and Systems II (RSS 2006), pages: 255-262, (Editors: Gaurav S. Sukhatme and Stefan Schaal and Wolfram Burgard and Dieter Fox), Cambridge, MA: MIT Press, RSS , 2006, clmc (inproceedings)

Abstract
While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting learning problem is ill-defined as it requires to learn an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-covexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. A first important insight for this paper is that, nevertheless, a physically correct solution to the inverse problem does exits when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on a recent insight that many operational space controllers can be understood in terms of a constraint optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the view of machine learning, the learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward and that employs an expectation-maximization policy search algorithm. Evaluations on a three degrees of freedom robot arm illustrate the feasability of our suggested approach.

am ei

link (url) [BibTex]

2006


link (url) [BibTex]


no image
Reinforcement Learning for Parameterized Motor Primitives

Peters, J., Schaal, S.

In Proceedings of the 2006 International Joint Conference on Neural Networks, pages: 73-80, IJCNN, 2006, clmc (inproceedings)

Abstract
One of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation", called motor primitives. Motor primitives, as used in this paper, are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. While a lot of progress has been made in teaching parameterized motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this paper, we evaluate different reinforcement learning approaches for improving the performance of parameterized motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2000


no image
Reciprocal excitation between biological and robotic research

Schaal, S., Sternad, D., Dean, W., Kotoska, S., Osu, R., Kawato, M.

In Sensor Fusion and Decentralized Control in Robotic Systems III, Proceedings of SPIE, 4196, pages: 30-40, Boston, MA, Nov.5-8, 2000, November 2000, clmc (inproceedings)

Abstract
While biological principles have inspired researchers in computational and engineering research for a long time, there is still rather limited knowledge flow back from computational to biological domains. This paper presents examples of our work where research on anthropomorphic robots lead us to new insights into explaining biological movement phenomena, starting from behavioral studies up to brain imaging studies. Our research over the past years has focused on principles of trajectory formation with nonlinear dynamical systems, on learning internal models for nonlinear control, and on advanced topics like imitation learning. The formal and empirical analyses of the kinematics and dynamics of movements systems and the tasks that they need to perform lead us to suggest principles of motor control that later on we found surprisingly related to human behavior and even brain activity.

am

link (url) [BibTex]

2000


link (url) [BibTex]


no image
Nonlinear dynamical systems as movement primitives

Schaal, S., Kotosaka, S., Sternad, D.

In Humanoids2000, First IEEE-RAS International Conference on Humanoid Robots, CD-Proceedings, Cambridge, MA, September 2000, clmc (inproceedings)

Abstract
This paper explores the idea to create complex human-like movements from movement primitives based on nonlinear attractor dynamics. Each degree-of-freedom of a limb is assumed to have two independent abilities to create movement, one through a discrete dynamic system, and one through a rhythmic system. The discrete system creates point-to-point movements based on internal or external target specifications. The rhythmic system can add an additional oscillatory movement relative to the current position of the discrete system. In the present study, we develop appropriate dynamic systems that can realize the above model, motivate the particular choice of the systems from a biological and engineering point of view, and present simulation results of the performance of such movement primitives. The model was implemented for a drumming task on a humanoid robot

am

link (url) [BibTex]

link (url) [BibTex]


no image
Real Time Learning in Humanoids: A challenge for scalability of Online Algorithms

Vijayakumar, S., Schaal, S.

In Humanoids2000, First IEEE-RAS International Conference on Humanoid Robots, CD-Proceedings, Cambridge, MA, September 2000, clmc (inproceedings)

Abstract
While recent research in neural networks and statistical learning has focused mostly on learning from finite data sets without stringent constraints on computational efficiency, there is an increasing number of learning problems that require real-time performance from an essentially infinite stream of incrementally arriving data. This paper demonstrates how even high-dimensional learning problems of this kind can successfully be dealt with by techniques from nonparametric regression and locally weighted learning. As an example, we describe the application of one of the most advanced of such algorithms, Locally Weighted Projection Regression (LWPR), to the on-line learning of the inverse dynamics model of an actual seven degree-of-freedom anthropomorphic robot arm. LWPR's linear computational complexity in the number of input dimensions, its inherent mechanisms of local dimensionality reduction, and its sound learning rule based on incremental stochastic leave-one-out cross validation allows -- to our knowledge for the first time -- implementing inverse dynamics learning for such a complex robot with real-time performance. In our sample task, the robot acquires the local inverse dynamics model needed to trace a figure-8 in only 60 seconds of training.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Synchronized robot drumming by neural oscillator

Kotosaka, S., Schaal, S.

In The International Symposium on Adaptive Motion of Animals and Machines, Montreal, Canada, August 2000, clmc (inproceedings)

Abstract
Sensory-motor integration is one of the key issues in robotics. In this paper, we propose an approach to rhythmic arm movement control that is synchronized with an external signal based on exploiting a simple neural oscillator network. Trajectory generation by the neural oscillator is a biologically inspired method that can allow us to generate a smooth and continuous trajectory. The parameter tuning of the oscillators is used to generate a synchronized movement with wide intervals. We adopted the method for the drumming task as an example task. By using this method, the robot can realize synchronized drumming with wide drumming intervals in real time. The paper also shows the experimental results of drumming by a humanoid robot.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Real-time robot learning with locally weighted statistical learning

Schaal, S., Atkeson, C. G., Vijayakumar, S.

In International Conference on Robotics and Automation (ICRA2000), San Francisco, April 2000, 2000, clmc (inproceedings)

Abstract
Locally weighted learning (LWL) is a class of statistical learning techniques that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL, memory-based LWL and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional beliefs that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested in up to 50 dimensional learning problems. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing of a humanoid robot arm, and inverse-dynamics learning for a seven degree-of-freedom robot.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Fast learning of biomimetic oculomotor control with nonparametric regression networks

Shibata, T., Schaal, S.

In International Conference on Robotics and Automation (ICRA2000), pages: 3847-3854, San Francisco, April 2000, 2000, clmc (inproceedings)

Abstract
Accurate oculomotor control is one of the essential pre-requisites of successful visuomotor coordination. Given the variable nonlinearities of the geometry of binocular vision as well as the possible nonlinearities of the oculomotor plant, it is desirable to accomplish accurate oculomotor control through learning approaches. In this paper, we investigate learning control for a biomimetic active vision system mounted on a humanoid robot. By combining a biologically inspired cerebellar learning scheme with a state-of-the-art statistical learning network, our robot system is able to acquire high performance visual stabilization reflexes after about 40 seconds of learning despite significant nonlinearities and processing delays in the system.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional spaces

Vijayakumar, S., Schaal, S.

In Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), 1, pages: 288-293, Stanford, CA, 2000, clmc (inproceedings)

Abstract
Locally weighted projection regression is a new algorithm that achieves nonlinear function approximation in high dimensional spaces with redundant and irrelevant input dimensions. At its core, it uses locally linear models, spanned by a small number of univariate regressions in selected directions in input space. This paper evaluates different methods of projection regression and derives a nonlinear function approximator based on them. This nonparametric local learning system i) learns rapidly with second order learning methods based on incremental training, ii) uses statistically sound stochastic cross validation to learn iii) adjusts its weighting kernels based on local information only, iv) has a computational complexity that is linear in the number of inputs, and v) can deal with a large number of - possibly redundant - inputs, as shown in evaluations with up to 50 dimensional data sets. To our knowledge, this is the first truly incremental spatially localized learning method to combine all these properties.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Inverse kinematics for humanoid robots

Tevatia, G., Schaal, S.

In International Conference on Robotics and Automation (ICRA2000), pages: 294-299, San Fransisco, April 24-28, 2000, 2000, clmc (inproceedings)

Abstract
Real-time control of the endeffector of a humanoid robot in external coordinates requires computationally efficient solutions of the inverse kinematics problem. In this context, this paper investigates methods of resolved motion rate control (RMRC) that employ optimization criteria to resolve kinematic redundancies. In particular we focus on two established techniques, the pseudo inverse with explicit optimization and the extended Jacobian method. We prove that the extended Jacobian method includes pseudo-inverse methods as a special solution. In terms of computational complexity, however, pseudo-inverse and extended Jacobian differ significantly in favor of pseudo-inverse methods. Employing numerical estimation techniques, we introduce a computationally efficient version of the extended Jacobian with performance comparable to the original version . Our results are illustrated in simulation studies with a multiple degree-of-freedom robot, and were tested on a 30 degree-of-freedom robot. 

am

link (url) [BibTex]

link (url) [BibTex]


no image
Fast and efficient incremental learning for high-dimensional movement systems

Vijayakumar, S., Schaal, S.

In International Conference on Robotics and Automation (ICRA2000), San Francisco, April 2000, 2000, clmc (inproceedings)

Abstract
We introduce a new algorithm, Locally Weighted Projection Regression (LWPR), for incremental real-time learning of nonlinear functions, as particularly useful for problems of autonomous real-time robot control that re-quires internal models of dynamics, kinematics, or other functions. At its core, LWPR uses locally linear models, spanned by a small number of univariate regressions in selected directions in input space, to achieve piecewise linear function approximation. The most outstanding properties of LWPR are that it i) learns rapidly with second order learning methods based on incremental training, ii) uses statistically sound stochastic cross validation to learn iii) adjusts its local weighting kernels based on only local information to avoid interference problems, iv) has a computational complexity that is linear in the number of inputs, and v) can deal with a large number ofâ??possibly redundant and/or irrelevantâ??inputs, as shown in evaluations with up to 50 dimensional data sets for learning the inverse dynamics of an anthropomorphic robot arm. To our knowledge, this is the first incremental neural network learning method to combine all these properties and that is well suited for complex on-line learning problems in robotics.

am

link (url) [BibTex]

link (url) [BibTex]


no image
On-line learning for humanoid robot systems

Conradt, J., Tevatia, G., Vijayakumar, S., Schaal, S.

In Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), 1, pages: 191-198, Stanford, CA, 2000, clmc (inproceedings)

Abstract
Humanoid robots are high-dimensional movement systems for which analytical system identification and control methods are insufficient due to unknown nonlinearities in the system structure. As a way out, supervised learning methods can be employed to create model-based nonlinear controllers which use functions in the control loop that are estimated by learning algorithms. However, internal models for humanoid systems are rather high-dimensional such that conventional learning algorithms would suffer from slow learning speed, catastrophic interference, and the curse of dimensionality. In this paper we explore a new statistical learning algorithm, locally weighted projection regression (LWPR), for learning internal models in real-time. LWPR is a nonparametric spatially localized learning system that employs the less familiar technique of partial least squares regression to represent functional relationships in a piecewise linear fashion. The algorithm can work successfully in very high dimensional spaces and detect irrelevant and redundant inputs while only requiring a computational complexity that is linear in the number of input dimensions. We demonstrate the application of the algorithm in learning two classical internal models of robot control, the inverse kinematics and the inverse dynamics of an actual seven degree-of-freedom anthropomorphic robot arm. For both examples, LWPR can achieve excellent real-time learning results from less than one hour of actual training data.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Humanoid Robot DB

Kotosaka, S., Shibata, T., Schaal, S.

In Proceedings of the International Conference on Machine Automation (ICMA2000), pages: 21-26, 2000, clmc (inproceedings)

am

[BibTex]

[BibTex]