Reinforcement learning (RL) has achieved great empirical success over the past few decades and has been applied in many fields, such as robotics, healthcare, and game playing. This course will study RL from a theoretical perspective: when and how can we design RL algorithms with provable guarantees? Specifically, we will look at recent theoretical advances in several representative RL problems, such as RL with a generative model, exploration in RL, RL with function approximation, policy optimization in RL, and offline RL. In the first half of the course, students will learn the mathematical tools needed to design and analyze RL algorithms (such as Markov Decision Processes, concentration inequalities, and optimization tools). In the second half, each registered student will present a recent paper on RL theory.

Time and venue: TuTh 3:30pm-4:45pm, Cesar E. Chavez Building 305

D2L course webpage: lecture video recordings can be found at “UA Tools” -> “Panopto”

We will be using Piazza for important announcements and Q&A. Please self-enroll here. Some general rules:

- If you have technical questions, try to pose them as generally as possible, so that they can promote discussion among the class.
- If you have private questions, please make a private Piazza post instead of sending me an email, unless email is necessary. This will help me process your requests much more efficiently.

Office: Gould-Simpson 720

Office Hours: Thursdays 2-3pm, or by email appointment

Most of the lecture materials will be based on the book draft Reinforcement Learning: Theory and Algorithms, by Alekh Agarwal, Nan Jiang, Sham Kakade, and Wen Sun.

Some additional useful materials:

Reinforcement Learning: An Introduction by Richard Sutton and Andrew Barto

Algorithms for Reinforcement Learning by Csaba Szepesvari

Bandit Algorithms by Tor Lattimore and Csaba Szepesvari

Markov Decision Processes: Discrete Stochastic Dynamic Programming by Martin Puterman

RL theory virtual seminars by Gergely Neu, Ciara Pike-Burke, and Csaba Szepesvari

Here are some excellent notes for probability review and linear algebra review from Stanford’s CS 229 course.

I also recommend watching the lecture Street Fighting Mathematics by Ryan O’Donnell for a general introduction to approaching theory-ish problems.

We will be using the following LaTeX template file and style file for scribe notes. See also Prof. Rob Schapire’s suggestions on preparing scribe notes. Please sign up for one scribing slot on the sign-up sheet.

Some useful LaTeX resources: Learn LaTeX in 30 minutes by Overleaf; Introduction to LaTeX by MIT Research Science Institute

All registered students will be asked to give a 60-minute in-class presentation on an RL theory paper of their choice; please sign up for one presentation slot on the sign-up sheet. See the Presentation page for more details.

CSC 696H: Advanced seminar on optimization and sampling by Kobus Barnard

CSC 535: Probabilistic Graphical Models by Kobus Barnard

ISTA 457/INFO 557: Neural Networks by Steven Bethard

CSC 665: Online Learning and Multi-armed Bandits by Kwang-Sung Jun

INFO 521: Introduction to Machine Learning by Clayton Morrison

CSC 665: Advanced Topics in Probabilistic Graphical Models by Jason Pacheco

CSC 580: Principles of Machine Learning by Carlos Scheidegger

ECE 523: Engineering Applications of Machine Learning and Data Analytics by Gregory Ditzler

MIS 601: Statistical Foundations of Machine Learning by Junming Yin

MATH 574M: Statistical Machine Learning by Helen Zhang

CSC 588: Machine Learning Theory by Chicheng Zhang

Many RL theory courses offered at other institutions have good lecture materials, which together offer a diverse set of perspectives on the field. Here are a few examples:

Bandits and RL by Alekh Agarwal and Alex Slivkins

Reinforcement Learning by Shipra Agrawal

Foundations of Reinforcement Learning by Chi Jin

Statistical Reinforcement Learning by Nan Jiang

Foundations of Reinforcement Learning by Wen Sun and Sham Kakade

Theoretical Foundations of Reinforcement Learning by Csaba Szepesvari

Reinforcement Learning by Alessandro Lazaric

COLT 2021 Tutorial: Statistical Foundations of Reinforcement Learning by Akshay Krishnamurthy and Wen Sun

AAAI 2020 and ALT 2019 Tutorials: Exploration-Exploitation in Reinforcement Learning by Ronan Fruit, Mohammad Ghavamzadeh, Alessandro Lazaric, and Matteo Pirotta

FOCS 2020 Tutorial: Theoretical Foundations of Reinforcement Learning by Alekh Agarwal, Akshay Krishnamurthy, and John Langford

ICML 2018 Tutorial: Optimization Perspectives on Learning to Control by Ben Recht