Date Topic Materials
January 8 Introduction to reinforcement learning. Bandit algorithms RL book, chapters 1,2.
Intro slides
Bandit slides
January 10 More on bandits and exploration RL book, chapter 2.
January 15 More on bandits.
RL book, chapter 2.
David Silver's slides, up to slide 25
January 17 Finite MDPs, Bellman equations, policy evaluation RL book, chapter 3
Assignment 1
January 22 Control, optimality equations, value iteration, policy iteration. RL book, chapter 4.
January 24 Monte-Carlo Methods RL book, chapter 5
January 29 Temporal-difference learning methods (including TD(0), SARSA, Q-learning) RL book, chapter 6
Assignment 1 due.
January 31 Wrap-up of TD. Multi-step Bootstrapping RL book, chapter 7
February 5 Planning and learning. Convergence of TD-style methods RL book, chapter 8
Slides on MCTS from Alan Fern
February 7 More on theory of tabular TD Notes to be posted
February 12 No class Assignment 2 posted
February 14 No class David Silver's lecture on RL with function approximation
February 19 On-policy reinforcement learning with function approximation: TD, Sarsa RL book, chapter 9, 10
Slides on prediction, control
Assignment 2
February 21 More on on-policy learning with function approximation: average-reward case, eligibility traces RL book, chapter 12
February 26 More on eligibility traces. Off-policy learning with function approximation RL book, chapter 12
Project ideas posted
February 28 More on value-based RL with function approximation RL book, chapter 12
March 5 Study break
March 7 Study break
March 12 Midterm recap Slides
Assignment 2 due
March 14 In-class midterm exam Midterm from 2018
March 19 LSTD and LSPI. Policy Gradient Methods RL book, chapter 13. Boyan paper; Lagoudakis and Parr paper
Slides from David Silver on least-squares methods (part 3 - batch RL). Slides on policy gradient
Assignment 3 posted
March 21 More on policy gradient-based methods Slides
Project proposal due
March 26 More on policy gradient (Riashat) Slides
Options paper
March 28 Frontiers: Temporal abstraction Slides
April 2 Frontiers: Finish temporal abstraction. Inverse reinforcement learning Pieter Abbeel's slides on inverse RL
April 4 Frontiers: Meta-learning Slides (courtesy of Di Wu)
April 9 Frontiers: More on exploration Slides (based on material from Sergey Levine)
Pseudocounts paper (Bellemare et al., 2016)
Deep exploration via randomized value functions (Osband et al., 2017)
April 11 Final project poster session (in-class) TBD

January 11