RL Summit

Keynote: Offline reinforcement learning

Tuesday, March 29, 4:00 PM UTC

Opening Keynote on Offline RL by Sergey Levine.


Reinforcement learning (RL) provides an algorithmic framework for rational sequential decision making. However, the kinds of problem domains where RL has been applied successfully seem to differ substantially from the settings where supervised machine learning has been successful. RL algorithms can learn to play Atari or board games, whereas supervised machine learning algorithms can make highly accurate predictions in complex open-world settings.


Virtually all the problems that we want to solve with machine learning are really decision-making problems — deciding which product to show to a customer, deciding how to tag a photo, or deciding how to translate a string of text — so why aren't we solving them all with RL? One of the biggest issues with modern RL is that it does not effectively utilize the kinds of large and highly diverse datasets that have been instrumental to the success of supervised machine learning.


In this talk, I will discuss the technologies that can help us address this issue: enabling RL methods to use large datasets via offline RL. Offline RL algorithms can analyze large, previously collected datasets to extract the most effective policies, and then fine-tune these policies with additional online interaction as needed. I will cover the technical foundations of offline RL, discuss recent algorithmic advances, and present several applications.
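As a concrete illustration of the recipe above, here is a minimal sketch of offline RL in its simplest, tabular form: a Q-function is fit entirely from a fixed dataset of logged transitions, with a conservative penalty (in the spirit of conservative Q-learning) that pessimistically values actions the dataset never took. The environment size, synthetic dataset, and hyperparameters are illustrative assumptions, not details from the talk.

```python
# Illustrative offline RL sketch (assumed setup, not from the talk):
# tabular Q-learning on a fixed dataset of logged transitions, plus a
# simple conservatism term, reminiscent of CQL, that lowers the value
# of actions underrepresented in the data.
import numpy as np

n_states, n_actions = 5, 3
gamma, lr, alpha = 0.99, 0.1, 0.5  # discount, step size, conservatism weight

# Fixed offline dataset of (state, action, reward, next_state) tuples,
# logged by some unknown behavior policy; no further environment access.
rng = np.random.default_rng(0)
dataset = []
for _ in range(500):
    s = int(rng.integers(n_states))
    a = int(rng.integers(n_actions))
    r = float(rng.normal(loc=float(a == s % n_actions)))  # synthetic reward
    dataset.append((s, a, r, (s + 1) % n_states))

Q = np.zeros((n_states, n_actions))
for _ in range(200):  # repeatedly sweep the static dataset
    for s, a, r, s_next in dataset:
        # Standard Q-learning backup computed from logged data only.
        target = r + gamma * Q[s_next].max()
        Q[s, a] += lr * (target - Q[s, a])
        # Conservative regularizer: push all Q-values in this state down
        # slightly, then push the logged action back up, so actions the
        # dataset never took end up pessimistically valued.
        Q[s] -= lr * alpha / n_actions
        Q[s, a] += lr * alpha

# Greedy policy extracted purely from offline data; in the full recipe
# this policy would then be fine-tuned with online interaction.
policy = Q.argmax(axis=1)
print("greedy action per state:", policy)
```

The conservatism term is the key design choice separating this from naive off-policy Q-learning: without it, the max in the backup tends to exploit overestimated values of actions the dataset never observed.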

Speakers

Sergey Levine

Department of Electrical Engineering and Computer Sciences, UC Berkeley

Sergey Levine received a BS and MS in Computer Science in 2009 and a Ph.D. in Computer Science in 2014, all from Stanford University. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as computer vision and graphics. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more.