Header logo is


2007


A Database and Evaluation Methodology for Optical Flow
A Database and Evaluation Methodology for Optical Flow

Baker, S., Scharstein, D., Lewis, J.P., Roth, S., Black, M.J., Szeliski, R.

In Int. Conf. on Computer Vision, ICCV, pages: 1-8, Rio de Janeiro, Brazil, October 2007 (inproceedings)

ps

pdf [BibTex]

2007


pdf [BibTex]


Shining a light on human pose: On shadows, shading and the estimation of pose and shape,
Shining a light on human pose: On shadows, shading and the estimation of pose and shape,

Balan, A., Black, M. J., Haussecker, H., Sigal, L.

In Int. Conf. on Computer Vision, ICCV, pages: 1-8, Rio de Janeiro, Brazil, October 2007 (inproceedings)

ps

pdf YouTube [BibTex]

pdf YouTube [BibTex]


no image
Ensemble spiking activity as a source of cortical control signals in individuals with tetraplegia

Simeral, J. D., Kim, S. P., Black, M. J., Donoghue, J. P., Hochberg, L. R.

Biomedical Engineering Society, BMES, september 2007 (conference)

ps

[BibTex]

[BibTex]


Detailed human shape and pose from images
Detailed human shape and pose from images

Balan, A., Sigal, L., Black, M. J., Davis, J., Haussecker, H.

In IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, pages: 1-8, Minneapolis, June 2007 (inproceedings)

Abstract
Much of the research on video-based human motion capture assumes the body shape is known a priori and is represented coarsely (e.g. using cylinders or superquadrics to model limbs). These body models stand in sharp contrast to the richly detailed 3D body models used by the graphics community. Here we propose a method for recovering such models directly from images. Specifically, we represent the body using a recently proposed triangulated mesh model called SCAPE which employs a low-dimensional, but detailed, parametric model of shape and pose-dependent deformations that is learned from a database of range scans of human bodies. Previous work showed that the parameters of the SCAPE model could be estimated from marker-based motion capture data. Here we go further to estimate the parameters directly from image data. We define a cost function between image observations and a hypothesized mesh and formulate the problem as optimization over the body shape and pose parameters using stochastic search. Our results show that such rich generative models enable the automatic recovery of detailed human shape and pose from images.

ps

pdf YouTube [BibTex]

pdf YouTube [BibTex]


Decoding grasp aperture from motor-cortical population activity
Decoding grasp aperture from motor-cortical population activity

Artemiadis, P., Shakhnarovich, G., Vargas-Irwin, C., Donoghue, J. P., Black, M. J.

In The 3rd International IEEE EMBS Conference on Neural Engineering, pages: 518-521, May 2007 (inproceedings)

ps

pdf [BibTex]

pdf [BibTex]


Multi-state decoding of point-and-click control signals from motor cortical activity in a human with tetraplegia
Multi-state decoding of point-and-click control signals from motor cortical activity in a human with tetraplegia

Kim, S., Simeral, J., Hochberg, L., Donoghue, J. P., Friehs, G., Black, M. J.

In The 3rd International IEEE EMBS Conference on Neural Engineering, pages: 486-489, May 2007 (inproceedings)

Abstract
Basic neural-prosthetic control of a computer cursor has been recently demonstrated by Hochberg et al. [1] using the BrainGate system (Cyberkinetics Neurotechnology Systems, Inc.). While these results demonstrate the feasibility of intracortically-driven prostheses for humans with paralysis, a practical cursor-based computer interface requires more precise cursor control and the ability to “click” on areas of interest. Here we present a practical point and click device that decodes both continuous states (e.g. cursor kinematics) and discrete states (e.g. click state) from single neural population in human motor cortex. We describe a probabilistic multi-state decoder and the necessary training paradigms that enable point and click cursor control by a human with tetraplegia using an implanted microelectrode array. We present results from multiple recording sessions and quantify the point and click performance.

ps

pdf [BibTex]

pdf [BibTex]


no image
Towards Machine Learning of Motor Skills

Peters, J., Schaal, S., Schölkopf, B.

In Proceedings of Autonome Mobile Systeme (AMS), pages: 138-144, (Editors: K Berns and T Luksch), 2007, clmc (inproceedings)

Abstract
Autonomous robots that can adapt to novel situations has been a long standing vision of robotics, artificial intelligence, and cognitive sciences. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two ma jor components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting.

am ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Reinforcement Learning for Optimal Control of Arm Movements

Theodorou, E., Peters, J., Schaal, S.

In Abstracts of the 37st Meeting of the Society of Neuroscience., Neuroscience, 2007, clmc (inproceedings)

Abstract
Every day motor behavior consists of a plethora of challenging motor skills from discrete movements such as reaching and throwing to rhythmic movements such as walking, drumming and running. How this plethora of motor skills can be learned remains an open question. In particular, is there any unifying computa-tional framework that could model the learning process of this variety of motor behaviors and at the same time be biologically plausible? In this work we aim to give an answer to these questions by providing a computational framework that unifies the learning mechanism of both rhythmic and discrete movements under optimization criteria, i.e., in a non-supervised trial-and-error fashion. Our suggested framework is based on Reinforcement Learning, which is mostly considered as too costly to be a plausible mechanism for learning com-plex limb movement. However, recent work on reinforcement learning with pol-icy gradients combined with parameterized movement primitives allows novel and more efficient algorithms. By using the representational power of such mo-tor primitives we show how rhythmic motor behaviors such as walking, squash-ing and drumming as well as discrete behaviors like reaching and grasping can be learned with biologically plausible algorithms. Using extensive simulations and by using different reward functions we provide results that support the hy-pothesis that Reinforcement Learning could be a viable candidate for motor learning of human motor behavior when other learning methods like supervised learning are not feasible.

am ei

[BibTex]

[BibTex]


no image
Reinforcement learning by reward-weighted regression for operational space control

Peters, J., Schaal, S.

In Proceedings of the 24th Annual International Conference on Machine Learning, pages: 745-750, ICML, 2007, clmc (inproceedings)

Abstract
Many robot control problems of practical importance, including operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-base reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Policy gradient methods for machine learning

Peters, J., Theodorou, E., Schaal, S.

In Proceedings of the 14th INFORMS Conference of the Applied Probability Society, pages: 97-98, Eindhoven, Netherlands, July 9-11, 2007, 2007, clmc (inproceedings)

Abstract
We present an in-depth survey of policy gradient methods as they are used in the machine learning community for optimizing parameterized, stochastic control policies in Markovian systems with respect to the expected reward. Despite having been developed separately in the reinforcement learning literature, policy gradient methods employ likelihood ratio gradient estimators as also suggested in the stochastic simulation optimization community. It is well-known that this approach to policy gradient estimation traditionally suffers from three drawbacks, i.e., large variance, a strong dependence on baseline functions and a inefficient gradient descent. In this talk, we will present a series of recent results which tackles each of these problems. The variance of the gradient estimation can be reduced significantly through recently introduced techniques such as optimal baselines, compatible function approximations and all-action gradients. However, as even the analytically obtainable policy gradients perform unnaturally slow, it required the step from ÔvanillaÕ policy gradient methods towards natural policy gradients in order to overcome the inefficiency of the gradient descent. This development resulted into the Natural Actor-Critic architecture which can be shown to be very efficient in application to motor primitive learning for robotics.

am ei

[BibTex]

[BibTex]


no image
Policy Learning for Motor Skills

Peters, J., Schaal, S.

In Proceedings of 14th International Conference on Neural Information Processing (ICONIP), pages: 233-242, (Editors: Ishikawa, M. , K. Doya, H. Miyamoto, T. Yamakawa), 2007, clmc (inproceedings)

Abstract
Policy learning which allows autonomous robots to adapt to novel situations has been a long standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution.

am ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Reinforcement learning for operational space control

Peters, J., Schaal, S.

In Proceedings of the 2007 IEEE International Conference on Robotics and Automation, pages: 2111-2116, IEEE Computer Society, ICRA, 2007, clmc (inproceedings)

Abstract
While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting supervised learning problem is ill-defined as it requires to learn an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. The important insight that many operational space control algorithms can be reformulated as optimal control problems, however, allows addressing this inverse learning problem in the framework of reinforcement learning. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Relative Entropy Policy Search

Peters, J.

CLMC Technical Report: TR-CLMC-2007-2, Computational Learning and Motor Control Lab, Los Angeles, CA, 2007, clmc (techreport)

Abstract
This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems.

am ei

PDF link (url) [BibTex]

PDF link (url) [BibTex]


no image
Less Conservative Polytopic LPV Models for Charge Control by Combining Parameter Set Mapping and Set Intersection

Kwiatkowski, A., Trimpe, S., Werner, H.

In Proceedings of the 46th IEEE Conference on Decision and Control, 2007 (inproceedings)

am ics

DOI [BibTex]

DOI [BibTex]


no image
Using reward-weighted regression for reinforcement learning of task space control

Peters, J., Schaal, S.

In Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pages: 262-267, Honolulu, Hawaii, April 1-5, 2007, 2007, clmc (inproceedings)

Abstract
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark

Riedmiller, M., Peters, J., Schaal, S.

In Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pages: 254-261, ADPRL, 2007, clmc (inproceedings)

Abstract
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease.

am ei

PDF [BibTex]

PDF [BibTex]


no image
Uncertain 3D Force Fields in Reaching Movements: Do Humans Favor Robust or Average Performance?

Mistry, M., Theodorou, E., Hoffmann, H., Schaal, S.

In Abstracts of the 37th Meeting of the Society of Neuroscience, 2007, clmc (inproceedings)

am

PDF [BibTex]

PDF [BibTex]


no image
Applying the episodic natural actor-critic architecture to motor primitive learning

Peters, J., Schaal, S.

In Proceedings of the 2007 European Symposium on Artificial Neural Networks (ESANN), Bruges, Belgium, April 25-27, 2007, clmc (inproceedings)

Abstract
In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists out of actor updates which are achieved using natural stochastic policy gradients while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the Òbuilding blocks of movement generationÓ, called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.

am

link (url) [BibTex]

link (url) [BibTex]


no image
A computational model of human trajectory planning based on convergent flow fields

Hoffman, H., Schaal, S.

In Abstracts of the 37st Meeting of the Society of Neuroscience, San Diego, CA, Nov. 3-7, 2007, clmc (inproceedings)

Abstract
A popular computational model suggests that smooth reaching movements are generated in humans by minimizing a difference vector between hand and target in visual coordinates (Shadmehr and Wise, 2005). To achieve such a task, the optimal joint accelerations may be pre-computed. However, this pre-planning is inflexible towards perturbations of the limb, and there is strong evidence that reaching movements can be modified on-line at any moment during the movement. Thus, next-state planning models (Bullock and Grossberg, 1988) have been suggested that compute the current control command from a function of the goal state such that the overall movement smoothly converges to the goal (see Shadmehr and Wise (2005) for an overview). So far, these models have been restricted to simple point-to-point reaching movements with (approximately) straight trajectories. Here, we present a computational model for learning and executing arbitrary trajectories that combines ideas from pattern generation with dynamic systems and the observation of convergent force fields, which control a frog leg after spinal stimulation (Giszter et al., 1993). In our model, we incorporate the following two observations: first, the orientation of vectors in a force field is invariant over time, but their amplitude is modulated by a time-varying function, and second, two force fields add up when stimulated simultaneously (Giszter et al., 1993). This addition of convergent force fields varying over time results in a virtual trajectory (a moving equilibrium point) that correlates with the actual leg movement (Giszter et al., 1993). Our next-state planner is a set of differential equations that provide the desired end-effector or joint accelerations using feedback of the current state of the limb. These accelerations can be interpreted as resulting from a damped spring that links the current limb position with a virtual trajectory. This virtual trajectory can be learned to realize any desired limb trajectory and velocity profile, and learning is efficient since the time-modulated sum of convergent force fields equals a sum of weighted basis functions (Gaussian time pulses). Thus, linear algebra is sufficient to compute these weights, which correspond to points on the virtual trajectory. During movement execution, the differential equation corrects automatically for perturbations and brings back smoothly the limb towards the goal. Virtual trajectories can be rescaled and added allowing to build a set of movement primitives to describe movements more complex than previously learned. We demonstrate the potential of the suggested model by learning and generating a wide variety of movements.

am

[BibTex]

[BibTex]


Deterministic Annealing for Multiple-Instance Learning
Deterministic Annealing for Multiple-Instance Learning

Gehler, P., Chapelle, O.

In Artificial Intelligence and Statistics (AIStats), 2007 (inproceedings)

ps

pdf [BibTex]

pdf [BibTex]


no image
Point-and-click cursor control by a person with tetraplegia using an intracortical neural interface system

Kim, S., Simeral, J. D., Hochberg, L. R., Friehs, G., Donoghue, J. P., Black, M. J.

Program No. 517.2. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

ps

[BibTex]

[BibTex]


no image
A Computational Model of Arm Trajectory Modification Using Dynamic Movement Primitives

Mohajerian, P., Hoffmann, H., Mistry, M., Schaal, S.

In Abstracts of the 37st Meeting of the Society of Neuroscience, San Diego, CA, Nov 3-7, 2007, clmc (inproceedings)

Abstract
Several scientists used a double-step target-displacement protocol to investigate how an unexpected upcoming new target modifies ongoing discrete movements. Interesting observations are the initial direction of the movement, the spatial path of the movement to the second target, and the amplification of the speed in the second movement. Experimental data show that the above properties are influenced by the movement reaction time and the interstimulus interval between the onset of the first and second target. Hypotheses in the literature concerning the interpretation of the observed data include a) the second movement is superimposed on the first movement (Henis and Flash, 1995), b) the first movement is aborted and the second movement is planned to smoothly connect the current state of the arm with the new target (Hoff and Arbib, 1992), c) the second movement is initiated by a new control signal that replaces the first movement's control signal, but does not take the state of the system into account (Flanagan et al., 1993), and (d) the second movement is initiated by a new goal command, but the control structure stays unchanged, and feed-back from the current state is taken into account (Hoff and Arbib, 1993). We investigate target switching from the viewpoint of Dynamic Movement Primitives (DMPs). DMPs are trajectory planning units that are formalized as stable nonlinear attractor systems (Ijspeert et al., 2002). They are a useful framework for biological motor control as they are highly flexible in creating complex rhythmic and discrete behaviors that can quickly adapt to the inevitable perturbations of dynamically changing, stochastic environments. In this model, target switching is accomplished simply by updating the target input to the discrete movement primitive for reaching. The reaching trajectory in this model can be straight or take any other route; in contrast, the Hoff and Arbib (1993) model is restricted to straight reaching movement plans. In the present study, we use DMPs to reproduce in simulation a large number of target-switching experimental data from the literature and to show that online correction and the observed target switching phenomena can be accomplished by changing the goal state of an on-going DMP, without the need to switch to different movement primitives or to re-plan the movement. :

am

PDF [BibTex]

PDF [BibTex]


no image
Inverse dynamics control with floating base and constraints

Nakanishi, J., Mistry, M., Schaal, S.

In International Conference on Robotics and Automation (ICRA2007), pages: 1942-1947, Rome, Italy, April 10-14, 2007, clmc (inproceedings)

Abstract
In this paper, we address the issues of compliant control of a robot under contact constraints with a goal of using joint space based pattern generators as movement primitives, as often considered in the studies of legged locomotion and biological motor control. For this purpose, we explore inverse dynamics control of constrained dynamical systems. When the system is overconstrained, it is not straightforward to formulate an inverse dynamics control law since the problem becomes an ill-posed one, where infinitely many combinations of joint torques are possible to achieve the desired joint accelerations. The goal of this paper is to develop a general and computationally efficient inverse dynamics algorithm for a robot with a free floating base and constraints. We suggest an approximate way of computing inverse dynamics algorithm by treating constraint forces computed with a Lagrange multiplier method as simply external forces based on FeatherstoneÕs floating base formulation of inverse dynamics. We present how all the necessary quantities to compute our controller can be efficiently extracted from FeatherstoneÕs spatial notation of robot dynamics. We evaluate the effectiveness of the suggested approach on a simulated biped robot model.

am

link (url) [BibTex]

link (url) [BibTex]


Learning Appearances with Low-Rank SVM
Learning Appearances with Low-Rank SVM

Wolf, L., Jhuang, H., Hazan, T.

In Conference on Computer Vision and Pattern Recognition (CVPR), 2007 (inproceedings)

ps

pdf [BibTex]

pdf [BibTex]


no image
Neural correlates of grip aperture in primary motor cortex

Vargas-Irwin, C., Shakhnarovich, G., Artemiadis, P., Donoghue, J. P., Black, M. J.

Program No. 517.10. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

ps

[BibTex]

[BibTex]


no image
Directional tuning in motor cortex of a person with ALS

Simeral, J. D., Donoghue, J. P., Black, M. J., Friehs, G. M., Brown, R. H., Krivickas, L. S., Hochberg, L. R.

Program No. 517.4. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

ps

[BibTex]

[BibTex]


no image
Kernel carpentry for onlne regression using randomly varying coefficient model

Edakunni, N. U., Schaal, S., Vijayakumar, S.

In Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India: Jan. 6-12, 2007, clmc (inproceedings)

Abstract
We present a Bayesian formulation of locally weighted learning (LWL) using the novel concept of a randomly varying coefficient model. Based on this, we propose a mechanism for multivariate non-linear regression using spatially localised linear models that learns completely independent of each other, uses only local information and adapts the local model complexity in a data driven fashion. We derive online updates for the model parameters based on variational Bayesian EM. The evaluation of the proposed algorithm against other state-of-the-art methods reveal the excellent, robust generalization performance beside surprisingly efficient time and space complexity properties. This paper, for the first time, brings together the computational efficiency and the adaptability of Õnon-competitiveÕ locally weighted learning schemes and the modeling guarantees of the Bayesian formulation.

am

link (url) [BibTex]

link (url) [BibTex]


Denoising archival films using a learned {Bayesian} model
Denoising archival films using a learned Bayesian model

Moldovan, T. M., Roth, S., Black, M. J.

(CS-07-03), Brown University, Department of Computer Science, 2007 (techreport)

ps

pdf [BibTex]

pdf [BibTex]


Steerable random fields
Steerable random fields

(Best Paper Award, INI-Graphics Net, 2008)

Roth, S., Black, M. J.

In Int. Conf. on Computer Vision, ICCV, pages: 1-8, Rio de Janeiro, Brazil, 2007 (inproceedings)

ps

pdf [BibTex]

pdf [BibTex]


no image
Toward standardized assessment of pointing devices for brain-computer interfaces

Donoghue, J., Simeral, J., Kim, S., G.M. Friehs, L. H., Black, M.

Program No. 517.16. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

ps

[BibTex]

[BibTex]


no image
A robust quadruped walking gait for traversing rough terrain

Pongas, D., Mistry, M., Schaal, S.

In International Conference on Robotics and Automation (ICRA2007), pages: 1474-1479, Rome, April 10-14, 2007, 2007, clmc (inproceedings)

Abstract
Legged locomotion excels when terrains become too rough for wheeled systems or open-loop walking pattern generators to succeed, i.e., when accurate foot placement is of primary importance in successfully reaching the task goal. In this paper we address the scenario where the rough terrain is traversed with a static walking gait, and where for every foot placement of a leg, the location of the foot placement was selected irregularly by a planning algorithm. Our goal is to adjust a smooth walking pattern generator with the selection of every foot placement such that the COG of the robot follows a stable trajectory characterized by a stability margin relative to the current support triangle. We propose a novel parameterization of the COG trajectory based on the current position, velocity, and acceleration of the four legs of the robot. This COG trajectory has guaranteed continuous velocity and acceleration profiles, which leads to continuous velocity and acceleration profiles of the leg movement, which is ideally suited for advanced model-based controllers. Pitch, yaw, and ground clearance of the robot are easily adjusted automatically under any terrain situation. We evaluate our gait generation technique on the Little-Dog quadruped robot when traversing complex rocky and sloped terrains.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Bayesian Nonparametric Regression with Local Models

Ting, J., Schaal, S.

In Workshop on Robotic Challenges for Machine Learning, NIPS 2007, 2007, clmc (inproceedings)

am

[BibTex]

[BibTex]


A Biologically Inspired System for Action Recognition
A Biologically Inspired System for Action Recognition

Jhuang, H., Serre, T., Wolf, L., Poggio, T.

In International Conference on Computer Vision (ICCV), 2007 (inproceedings)

ps

code pdf [BibTex]

code pdf [BibTex]


no image
Learning an Outlier-Robust Kalman Filter

Ting, J., Theodorou, E., Schaal, S.

CLMC Technical Report: TR-CLMC-2007-1, Los Angeles, CA, 2007, clmc (techreport)

Abstract
We introduce a modified Kalman filter that performs robust, real-time outlier detection, without the need for manual parameter tuning by the user. Systems that rely on high quality sensory data (for instance, robotic systems) can be sensitive to data containing outliers. The standard Kalman filter is not robust to outliers, and other variations of the Kalman filter have been proposed to overcome this issue. However, these methods may require manual parameter tuning, use of heuristics or complicated parameter estimation procedures. Our Kalman filter uses a weighted least squares-like approach by introducing weights for each data sample. A data sample with a smaller weight has a weaker contribution when estimating the current time step?s state. Using an incremental variational Expectation-Maximization framework, we learn the weights and system dynamics. We evaluate our Kalman filter algorithm on data from a robotic dog.

am

PDF [BibTex]

PDF [BibTex]


no image
Task space control with prioritization for balance and locomotion

Mistry, M., Nakanishi, J., Schaal, S.

In IEEE International Conference on Intelligent Robotics Systems (IROS 2007), San Diego, CA: Oct. 29 Ð Nov. 2, 2007, clmc (inproceedings)

Abstract
This paper addresses locomotion with active balancing, via task space control with prioritization. The center of gravity (COG) and foot of the swing leg are treated as task space control points. Floating base inverse kinematics with constraints is employed, thereby allowing for a mobile platform suitable for locomotion. Different techniques of task prioritization are discussed and we clarify differences and similarities of previous suggested work. Varying levels of prioritization for control are examined with emphasis on singularity robustness and the negative effects of constraint switching. A novel controller for task space control of balance and locomotion is developed which attempts to address singularity robustness, while minimizing discontinuities created by constraint switching. Controllers are evaluated using a quadruped robot simulator engaging in a locomotion task.

am

link (url) [BibTex]

link (url) [BibTex]


no image
AREADNE Research in Encoding And Decoding of Neural Ensembles

Shakhnarovich, G., Hochberg, L. R., Donoghue, J. P., Stein, J., Brown, R. H., Krivickas, L. S., Friehs, G. M., Black, M. J.

Program No. 517.8. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

ps

[BibTex]

[BibTex]

2006


no image
Finding directional movement representations in motor cortical neural populations using nonlinear manifold learning

WorKim, S., Simeral, J., Jenkins, O., Donoghue, J., Black, M.

World Congress on Medical Physics and Biomedical Engineering 2006, Seoul, Korea, August 2006 (conference)

ps

[BibTex]

2006


[BibTex]


A non-parametric {Bayesian} approach to spike sorting
A non-parametric Bayesian approach to spike sorting

Wood, F., Goldwater, S., Black, M. J.

In International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, pages: 1165-1169, New York, NY, August 2006 (inproceedings)

ps

pdf [BibTex]

pdf [BibTex]


Predicting {3D} people from {2D} pictures
Predicting 3D people from 2D pictures

(Best Paper)

Sigal, L., Black, M. J.

In Proc. IV Conf. on Articulated Motion and DeformableObjects (AMDO), LNCS 4069, pages: 185-195, July 2006 (inproceedings)

Abstract
We propose a hierarchical process for inferring the 3D pose of a person from monocular images. First we infer a learned view-based 2D body model from a single image using non-parametric belief propagation. This approach integrates information from bottom-up body-part proposal processes and deals with self-occlusion to compute distributions over limb poses. Then, we exploit a learned Mixture of Experts model to infer a distribution of 3D poses conditioned on 2D poses. This approach is more general than recent work on inferring 3D pose directly from silhouettes since the 2D body model provides a richer representation that includes the 2D joint angles and the poses of limbs that may be unobserved in the silhouette. We demonstrate the method in a laboratory setting where we evaluate the accuracy of the 3D poses against ground truth data. We also estimate 3D body pose in a monocular image sequence. The resulting 3D estimates are sufficiently accurate to serve as proposals for the Bayesian inference of 3D human motion over time

ps

pdf pdf from publisher Video [BibTex]

pdf pdf from publisher Video [BibTex]


Specular flow and the recovery of surface structure
Specular flow and the recovery of surface structure

Roth, S., Black, M.

In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, 2, pages: 1869-1876, New York, NY, June 2006 (inproceedings)

Abstract
In scenes containing specular objects, the image motion observed by a moving camera may be an intermixed combination of optical flow resulting from diffuse reflectance (diffuse flow) and specular reflection (specular flow). Here, with few assumptions, we formalize the notion of specular flow, show how it relates to the 3D structure of the world, and develop an algorithm for estimating scene structure from 2D image motion. Unlike previous work on isolated specular highlights we use two image frames and estimate the semi-dense flow arising from the specular reflections of textured scenes. We parametrically model the image motion of a quadratic surface patch viewed from a moving camera. The flow is modeled as a probabilistic mixture of diffuse and specular components and the 3D shape is recovered using an Expectation-Maximization algorithm. Rather than treating specular reflections as noise to be removed or ignored, we show that the specular flow provides additional constraints on scene geometry that improve estimation of 3D structure when compared with reconstruction from diffuse flow alone. We demonstrate this for a set of synthetic and real sequences of mixed specular-diffuse objects.

ps

pdf [BibTex]

pdf [BibTex]


An adaptive appearance model approach for model-based articulated object tracking
An adaptive appearance model approach for model-based articulated object tracking

Balan, A., Black, M. J.

In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, 1, pages: 758-765, New York, NY, June 2006 (inproceedings)

Abstract
The detection and tracking of three-dimensional human body models has progressed rapidly but successful approaches typically rely on accurate foreground silhouettes obtained using background segmentation. There are many practical applications where such information is imprecise. Here we develop a new image likelihood function based on the visual appearance of the subject being tracked. We propose a robust, adaptive, appearance model based on the Wandering-Stable-Lost framework extended to the case of articulated body parts. The method models appearance using a mixture model that includes an adaptive template, frame-to-frame matching and an outlier process. We employ an annealed particle filtering algorithm for inference and take advantage of the 3D body model to predict self occlusion and improve pose estimation accuracy. Quantitative tracking results are presented for a walking sequence with a 180 degree turn, captured with four synchronized and calibrated cameras and containing significant appearance changes and self-occlusion in each view.

ps

pdf [BibTex]

pdf [BibTex]


Measure locally, reason globally: Occlusion-sensitive articulated pose estimation
Measure locally, reason globally: Occlusion-sensitive articulated pose estimation

Sigal, L., Black, M. J.

In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, 2, pages: 2041-2048, New York, NY, June 2006 (inproceedings)

ps

pdf [BibTex]

pdf [BibTex]


Statistical analysis of the non-stationarity of neural population codes
Statistical analysis of the non-stationarity of neural population codes

Kim, S., Wood, F., Fellows, M., Donoghue, J. P., Black, M. J.

In BioRob 2006, The first IEEE / RAS-EMBS International Conference on Biomedical Robotics and Biomechatronics, pages: 295-299, Pisa, Italy, Febuary 2006 (inproceedings)

ps

pdf [BibTex]

pdf [BibTex]


no image
How to choose the covariance for Gaussian process regression independently of the basis

Franz, M., Gehler, P.

In Proceedings of the Workshop Gaussian Processes in Practice, Workshop Gaussian Processes in Practice (GPIP), 2006 (inproceedings)

ei ps

pdf [BibTex]

pdf [BibTex]


no image
Learning operational space control

Peters, J., Schaal, S.

In Robotics: Science and Systems II (RSS 2006), pages: 255-262, (Editors: Gaurav S. Sukhatme and Stefan Schaal and Wolfram Burgard and Dieter Fox), Cambridge, MA: MIT Press, RSS , 2006, clmc (inproceedings)

Abstract
While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting learning problem is ill-defined as it requires to learn an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-covexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. A first important insight for this paper is that, nevertheless, a physically correct solution to the inverse problem does exits when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on a recent insight that many operational space controllers can be understood in terms of a constraint optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the view of machine learning, the learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward and that employs an expectation-maximization policy search algorithm. Evaluations on a three degrees of freedom robot arm illustrate the feasability of our suggested approach.

am ei

link (url) [BibTex]

link (url) [BibTex]


no image
Reinforcement Learning for Parameterized Motor Primitives

Peters, J., Schaal, S.

In Proceedings of the 2006 International Joint Conference on Neural Networks, pages: 73-80, IJCNN, 2006, clmc (inproceedings)

Abstract
One of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation", called motor primitives. Motor primitives, as used in this paper, are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. While a lot of progress has been made in teaching parameterized motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this paper, we evaluate different reinforcement learning approaches for improving the performance of parameterized motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]


The rate adapting poisson model for information retrieval and object recognition
The rate adapting poisson model for information retrieval and object recognition

Gehler, P. V., Holub, A. D., Welling, M.

In Proceedings of the 23rd international conference on Machine learning, pages: 337-344, ICML ’06, ACM, New York, NY, USA, 2006 (inproceedings)

ei ps

project page pdf DOI [BibTex]

project page pdf DOI [BibTex]


no image
Policy gradient methods for robotics

Peters, J., Schaal, S.

In Proceedings of the IEEE International Conference on Intelligent Robotics Systems, pages: 2219-2225, IROS, 2006, clmc (inproceedings)

Abstract
The aquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-structured environments. However, to date only few existing reinforcement learning methods have been scaled into the domains of highdimensional robots such as manipulator, legged or humanoid robots. Policy gradient methods remain one of the few exceptions and have found a variety of applications. Nevertheless, the application of such methods is not without peril if done in an uninformed manner. In this paper, we give an overview on learning with policy gradient methods for robotics with a strong focus on recent advances in the field. We outline previous applications to robotics and show how the most recently developed methods can significantly improve learning performance. Finally, we evaluate our most promising algorithm in the application of hitting a baseball with an anthropomorphic arm.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]


Implicit Wiener Series, Part II: Regularised estimation
Implicit Wiener Series, Part II: Regularised estimation

Gehler, P., Franz, M.

(148), Max Planck Institute, 2006 (techreport)

ei ps

pdf [BibTex]

pdf [BibTex]