Course Description
The goal of this class is to provide an introduction to reinforcement
learning, a very active part of machine learning. Reinforcement learning
is concerned with building computer agents which learn how to predict
and act in a stochastic environment, based on past experience.
Applications of reinforcement learning range from classical control
problems, such as power plant optimization or dynamical system control,
to game playing, inventory control, and many other fields. Notably,
reinforcement learning has also produced very compelling models of
animal and human learning. During this course, we will study theoretical
properties and practical applications of reinforcement learning. We
will follow the second edition of the classic textbook by Sutton &
Barto (available online for free, or from MIT Press), and supplement it
as needed with papers and other materials.
Prerequisites
Knowledge of the Python programming language is required, as is knowledge
of probability/statistics, multivariate calculus and linear algebra.
Example courses at McGill providing sufficient background in probability
are MATH-323 or ECSE-305. Machine learning background, as provided for
example by COMP-451, COMP-551 or COMP-652, is required.
If you have doubts regarding your background, please contact Doina or
Isabeau to discuss it.
Reference Materials
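Required textbook: Sutton & Barto, Reinforcement Learning: An Introduction, second edition (available online for free, or from MIT Press).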
Lecture notes and other relevant materials will be linked to the
lectures web page.
MyCourses will be used as a bulletin board and for access to Ed discussion groups, assignment submission, and grading.
Evaluation
The class grade will be based on the following components:
- 3 Assignments: 54% (18% each assignment)
- Final Project: 36%
- Quizzes: 10%
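As a rough illustration of how the weights above combine, here is a minimal Python sketch. The component names, the `final_grade` helper, and the example scores are hypothetical; it assumes each component is scored out of 100 and that the quiz component is a single aggregate score.

```python
# Weights in percent, taken from the component list above.
# The dictionary keys are illustrative names, not official identifiers.
WEIGHTS = {
    "assignment_1": 18,
    "assignment_2": 18,
    "assignment_3": 18,
    "final_project": 36,
    "quizzes": 10,
}


def final_grade(scores):
    """Weighted average of component scores, each assumed to be out of 100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = {
        "assignment_1": 80,
        "assignment_2": 90,
        "assignment_3": 70,
        "final_project": 85,
        "quizzes": 100,
    }
    print(final_grade(example))
```

Late penalties, if any, are not modeled in this sketch.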
Late submission policy and extensions:
- For assignments and the project, a penalty of 2^{late days}% will be deducted.
- No extension will be possible under any circumstances for the quizzes, because the answers are automatically released after the due date. Each quiz will have roughly 10 questions and you will have more than a week to do it, so there should be no reason to need an extension.
- If you have a valid reason, you may ask the Head TA to waive a few days of late penalty for the assignments or project.
- Even when work is done in groups, all members of the group should contribute to every part of the project; saying that a group member didn't do their part will never be deemed a valid excuse. If you only do one part of a project, you are not learning everything you need to learn in this class.
- After getting your grade back for an assignment or project, you have a 2-week period to voice any concerns; after 2 weeks the grade is final.
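To give a sense of how quickly the 2^{late days}% penalty grows, here is a small Python sketch. The function name is hypothetical, and two choices are assumptions rather than stated policy: an on-time submission (0 late days) is treated as having no penalty, and the penalty is capped at 100%.

```python
def late_penalty_percent(late_days):
    """Penalty in percent for a submission that is `late_days` days late.

    Follows the 2^{late days}% rule. Assumptions (not stated in the policy):
    0 late days means no penalty, and the penalty is capped at 100%.
    """
    if late_days < 0:
        raise ValueError("late_days must be non-negative")
    if late_days == 0:
        return 0.0
    return min(100.0, float(2 ** late_days))


if __name__ == "__main__":
    # Print the penalty for the first week of lateness.
    for days in range(8):
        print(f"{days} day(s) late: {late_penalty_percent(days):.0f}% penalty")
```

Under these assumptions the penalty doubles each day (2%, 4%, 8%, ...) and reaches the cap on the seventh late day.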
Minor changes to the evaluation scheme (if any) will be announced in
class by the Add/Drop deadline (pending in-class discussion and the
estimated total enrollment).
McGill University values academic integrity. Therefore all students
must understand the meaning and consequences of cheating, plagiarism
and other academic offenses under the Code of Student Conduct and
Disciplinary Procedures (see www.mcgill.ca/students/srr/honest for
more information).
In accord with McGill University's Charter of Students' Rights,
students in this course have the right to submit in English or in French
any written work that is to be graded.
In the event of extraordinary circumstances beyond the University's
control, the content and/or evaluation scheme in this course is subject
to change.
Contact
E-mail: comp579@mcgill.ca (not for questions about the material; ask on Ed instead)
Discussion Board: Ed (accessible under the Content tab in MyCourses)
Office hours