Ray Summit
Human-in-the-Loop Reinforcement Learning
Tuesday, June 22, 10:35 AM PDT
Pieter Abbeel, Professor, UC Berkeley | Founder, Covariant | Host, The Robot Brains Podcast
Deep reinforcement learning (Deep RL) has seen many successes, including learning to play Atari games and the classical game of Go, as well as robotic locomotion and manipulation. However, now that Deep RL has become fairly capable of optimizing reward, a new challenge has arisen: how to choose the reward function to be optimized. Indeed, this often becomes the key engineering time sink for practitioners. In this talk, I will present some recent progress on human-in-the-loop reinforcement learning. The newly proposed algorithm, PEBBLE, empowers a human supervisor to directly teach an AI agent new skills without the usual extensive reward engineering or curriculum design efforts.
Speakers

Pieter Abbeel
Professor, UC Berkeley | Founder, Covariant | Host, The Robot Brains Podcast