Date Topic Materials
January 6 Introduction to reinforcement learning. Bandit algorithms RL book, chapters 1,2.
Intro slides
Bandit slides
January 11 More on bandits and exploration RL book, chapter 2.
Bandit slides (Definitions, epsilon-greedy, optimistic initialization, UCB)
January 13 More on bandits.
RL book, chapter 2.
Bandit slides (Regret, bounds, gradient-based algorithms)
Assignment 1
January 18 Wrap-up on bandits (other versions of the problem). Finite MDPs, value functions and policies RL book, chapter 3
Slides (will still be slightly revised)
January 20 More on MDPs. Bellman equations, policy evaluation. Policy iteration. Value iteration RL book, chapter 4.
Slides
January 25 Policy evaluation using Monte Carlo methods and temporal-difference learning RL book, Sec. 5.1, 6.1, 6.2, 6.3
Slides
January 27 More on MC and TD, including n-step TD. Control using Monte Carlo and TD, including SARSA (and Q-learning if we have time) RL book, Sec. 5.3, 5.4, 6.4, 6.5, 7.1
Assignment 1 due; Slides
February 1 Q-learning. Starting discussion of convergence results RL book, chapter 7
Slides
February 3 Finish discussion of previous results. Planning and learning RL book, chapter 8
Finishing slides from last time. Planning slides
February 8 Intro to RL with function approximation. Value-based methods RL book Sec. 9.1-9.4
Slides
February 10 More on RL with function approximation. Eligibility traces. Control with function approximation RL book chapter 9
Slides
February 15 Off-policy learning RL book chapter 12
Slides
February 17 No lecture RL book Chapter 10
February 22 No lecture  
February 24 More on off-policy learning RL book, chapter 12
Assignment 2
March 1 Study break
March 3 Study break
March 8 Policy gradient RL book Chapter 13
Slides (with thanks to Hado Van Hasselt)
March 10 More on policy gradient Slides
Assignment 2 due
Assignment 3 to be posted
March 15 More on Deep RL: Model-based, temporal abstraction Slides Project details to be posted
March 17 More on Deep RL
March 22 Distributional RL (Guest lecturer Marc Bellemare) Distributional RL book
Final project document
March 24 Distributional RL (Guest lecturer Marc Bellemare)
March 29 Distributional RL (Guest lecturer Marc Bellemare)
March 31 Distributional RL (Guest lecturer Marc Bellemare)
April 5 Special topics: Batch RL Slides (with thanks to Emma Brunskill)
Assignment 3
April 7 Special topics: Rewards and Tasks Slides Part 1, Part 2
April 12 Wrap-up: Thoughts on RL for AI Slides
Project due (can be turned in without penalties until April 26)