Safe Exploration in Finite Markov Decision Processes with Gaussian Processes (Talk)

Matteo Turchetta

ETH Zurich

In classical reinforcement learning agents accept arbitrary short term loss for long term gain when exploring their environment. This is infeasible for safety critical applications such as robotics, where even a single unsafe action may cause system failure or harm the environment. In this work, we address the problem of safely exploring finite Markov decision processes (MDP). We define safety in terms of an a priori unknown safety constraint that depends on states and actions and satisfies certain regularity conditions expressed via a Gaussian process prior. We develop a novel algorithm, SAFEMDP, for this task and prove that it completely explores the safely reachable part of the MDP without violating the safety constraint. Moreover, the algorithm explicitly considers reachability when exploring the MDP, ensuring that it does not get stuck in any state with no safe way out. We demonstrate our method on digital terrain models for the task of exploring an unknown map with a rover.

Details

22 February 2017 • 16:00 - 17:00
AMD Seminar Room (Paul-Ehrlich-Str. 15, 1rst floor)
Autonomous Motion

Organizers

ics

Sebastian Trimpe

Alumni

Safe Exploration in Finite Markov Decision Processes with Gaussian Processes (Talk)

Details

Organizers

Share

Latest News

Links

Contact Us