Date Topic Materials
January 4 Introduction to reinforcement learning. Bandit algorithms RL book, chapter 1.
Intro slides
January 9 Bandits: definition of multi-armed bandit, epsilon-greedy exploration, optimism, UCB. RL book, Sec. 2.1-2.7
Bandit slides
January 11 Bandits: regret definition and analysis for epsilon-greedy and UCB, gradient-based bandits
RL book, chapter 2
Assignment 1 posted
January 16 Wrap up of bandits: Gradient-based bandits, Thompson sampling. RL book, chapter 2
January 18 Markov Decision Processes. Value functions. Bellman equations, policy evaluation. Policy iteration. Value iteration RL book, chapter 3
January 23 More on dynamic programming: policy iteration, value iteration, contractions. Policy evaluation using Monte-Carlo Methods and Temporal-Difference learning RL book, Chapter 4, Sec 5.1, 6.1, 6.2, 6.3
January 25 More on TD. Control using Monte Carlo and TD, including SARSA RL book, Sec. 5.3, 5.4, 6.4, 6.5, 7.1

January 30 Q-learning RL book, chapter 7
Assignment 1 due
Assignment 2 posted
February 1 More on value-based RL, function approximation RL book, chapter 8
February 6 More on value-based RL with function approximation RL book Sec. 9.1-9.4
February 8 More on Deep RL RL book chapter 9
February 13 Planning and model-based RL RL book chapter 8
February 15 More on model-based RL and planning RL book chapter 8
Assignment 2 due
Project information posted
February 20 Policy-gradient methods RL book chapter 13
February 22 More on policy gradient. Actor-critic RL book chapter 13
Assignment 3 posted
February 27 More on policy gradient: DDPG, TRPO  
February 29 Wrap up material from the RL book  
March 5 Study break
March 7 Study break
March 12 Hierarchical RL
March 14 More on hierarchical RL
March 19 Offline and batch RL
Assignment 3 due
March 21 More on offline and batch RL
March 26 Where do rewards come from? Inverse RL
March 28 Where do rewards come from? Learning from preferences and human feedback
April 2 Meta-learning; Never-ending / continual RL
April 4 More on never-ending and continual RL
April 9 Wrap-up: Thoughts on RL for AI
Project due April 12