
Apprenticeship learning via soft local homomorphisms

2010

Conference Paper



We consider the problem of apprenticeship learning when the expert's demonstration covers only a small part of a large state space. Inverse Reinforcement Learning (IRL) provides an efficient solution to this problem based on the assumption that the expert is acting optimally in a Markov Decision Process (MDP). However, past work on IRL requires an accurate estimate of the frequency of encountering each state feature when the robot follows the expert's policy. Given that the complete policy of the expert is unknown, the feature frequencies can only be estimated empirically from the demonstrated trajectories. In this paper, we propose to use a transfer method, known as soft homomorphism, to generalize the expert's policy to unvisited regions of the state space. The generalized policy can be used either as the robot's final policy or to calculate the feature frequencies within an IRL algorithm. Empirical results show that our approach is able to learn good policies from a small number of demonstrations.
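
As an illustration of the empirical estimate mentioned in the abstract, the following minimal Python sketch averages discounted feature counts over a set of demonstrated trajectories. It is not the authors' implementation and does not cover the soft homomorphism transfer step: the function name `empirical_feature_frequencies`, the `features` callback, and the discount factor `gamma` are all assumptions made for this example.

```python
from typing import Callable, Hashable, List, Sequence


def empirical_feature_frequencies(
    trajectories: Sequence[Sequence[Hashable]],
    features: Callable[[Hashable], Sequence[float]],
    gamma: float = 0.95,  # assumed discount factor, not taken from the paper
) -> List[float]:
    """Average the discounted feature counts over demonstrated trajectories.

    Each trajectory is a sequence of states; `features` (hypothetical) maps a
    state to its feature vector. The state at step t is weighted by gamma**t.
    """
    if not trajectories:
        raise ValueError("need at least one trajectory")

    totals = None
    for trajectory in trajectories:
        for t, state in enumerate(trajectory):
            phi = features(state)
            if totals is None:
                totals = [0.0] * len(phi)
            weight = gamma ** t
            for k, value in enumerate(phi):
                totals[k] += weight * value

    if totals is None:
        raise ValueError("trajectories contain no states")

    # Divide by the number of trajectories to obtain the empirical average.
    return [total / len(trajectories) for total in totals]


def one_hot(state: int, n_states: int = 3) -> List[float]:
    """Toy feature map for the usage example: one indicator per state."""
    return [1.0 if i == state else 0.0 for i in range(n_states)]


demo = [[0, 1], [0, 2]]
frequencies = empirical_feature_frequencies(demo, one_hot, gamma=0.5)
```

In this sketch, features that occur only in states absent from the demonstrations receive a frequency of zero, which shows why an estimate built from a small number of trajectories can be inaccurate when the demonstration covers little of the state space.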

Author(s): Boularias, A. and Chaib-Draa, B.
Journal: Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA 2010)
Pages: 2971-2976
Year: 2010
Month: May
Publisher: IEEE

Department(s): Empirische Inferenz
Bibtex Type: Conference Paper (inproceedings)

DOI: 10.1109/ROBOT.2010.5509717
Event Name: 2010 IEEE International Conference on Robotics and Automation (ICRA 2010)
Event Place: Anchorage, AK, USA

Address: Piscataway, NJ, USA
Digital: 0
ISBN: 978-1-424-45038-1
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik


BibTeX

@inproceedings{6825,
  title = {Apprenticeship learning via soft local homomorphisms},
  author = {Boularias, A. and Chaib-Draa, B.},
  booktitle = {Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA 2010)},
  pages = {2971--2976},
  publisher = {IEEE},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Piscataway, NJ, USA},
  month = may,
  year = {2010},
  month_numeric = {5}
}