Syllabus for Machine Learning (COMP-652A) - Fall 2002

Machine Learning (COMP-652)

Syllabus - Fall 2002

General Information

Location:	Peterson Hall, Room 306
Times:	Tuesday and Thursday, 11:30-1:00
Instructor:	Professor Doina Precup, School of Computer Science.
Office:	McConnell 326.
Phone:	398-6443.
Email:	dprecup@cs.mcgill.ca
Office hours:	Tuesday & Thursday, 1:00-1:30pm. Monday, 2:00-3:00pm. Meetings at other times by appointment only! IMPORTANT: Email is the easiest way to reach me!
Class web page:	http://www.cs.mcgill.ca/~dprecup/courses/ml.html IMPORTANT: This is where class notes, announcements and homework assignments are posted!

Course Description

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. In recent years, many successful applications of machine learning have been developed, ranging from data-mining programs that learn to detect fraudulent credit card transactions, to autonomous vehicles that learn to drive on public highways. At the same time, there have been important advances in the theory and algorithms that form the foundation of this field.

The goal of this class is to provide an overview of the state-of-the-art algorithms used in machine learning. We will discuss both the theoretical properties of these algorithms and their practical applications.

Prerequisites

Basic knowledge of a programming language and of probability and statistics is required. Examples of McGill courses that provide sufficient background are 189-323 and 304-305. Some AI background is recommended, as provided, for instance, by 308-424 or 304-526. If you have doubts about your background, please contact me to discuss it.

Reference Materials

The main textbook for the course is Machine Learning by Tom Mitchell (McGraw-Hill, 1997). Since the book does not cover many new and exciting developments in the field, we will supplement it with research papers. These will be distributed in class or posted on the web page, as appropriate. The class slides will also be posted on the web page.

Class Requirements

The class grade will be based on the following components:

Seven problem-solving assignments - 35%
Reading assignments - 15%
Two in-class exams - 20%
Class project - 30%
Participation in class discussions - up to 2% extra credit.

Minor changes to the evaluation scheme (if any) will be announced in class by Tuesday, September 10 (pending in-class discussion and the estimated total enrollment).
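
As an illustration only, the weighting listed above can be sketched in a few lines of Python. The sketch assumes that each component is scored as a percentage between 0 and 100 and that the participation credit is added on top as percentage points; neither detail is stated in this syllabus, and the names WEIGHTS and final_grade are made up for the example.

```python
# Component weights as listed in this syllabus (they sum to 100%).
WEIGHTS = {
    "assignments": 0.35,  # seven problem-solving assignments
    "readings": 0.15,     # reading assignments
    "exams": 0.20,        # two in-class exams
    "project": 0.30,      # class project
}
MAX_PARTICIPATION_BONUS = 2.0  # "up to 2% extra credit"


def final_grade(scores, participation_bonus=0.0):
    """Weighted course grade in percent.

    Assumption: every entry of `scores` is a percentage in [0, 100], and
    the participation bonus is added as percentage points after weighting.
    """
    # Clip the bonus to the allowed range [0, 2].
    bonus = min(max(participation_bonus, 0.0), MAX_PARTICIPATION_BONUS)
    base = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return base + bonus


example = {"assignments": 90, "readings": 80, "exams": 70, "project": 85}
print(round(final_grade(example, participation_bonus=1.0), 2))
```

If the evaluation scheme changes after the in-class discussion, only the WEIGHTS table in this sketch would need to be updated.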

The assignments will include both written work and implementations. The implementation work will involve programming algorithms that we discuss in class and reporting the results they produce. The ability to program in a high-level language, such as C/C++, Java, LISP, or Matlab, under the UNIX operating system is assumed, but the use of these particular languages is not mandatory. You may use any programming language, as long as you can demonstrate that your implementation is correct. The written work will involve solving problems related to the theoretical part of the course and explaining the results of your implementations. During the course we will also use programs written by other machine learning researchers. Instructions on how to use these programs will be provided as needed.

The reading assignments will involve summarizing and/or critiquing research papers in the field. You are also expected to participate actively in class discussions; this participation is part of your grade for the reading assignments. When reading the book and other papers, always think about questions that you may have, and bring them to class.

The final project will involve choosing a topic, either from a list of suggestions that will be provided or based on your own interests. You will be required to read papers related to the topic, implement or find available implementations of the algorithms discussed, and perform experimental work comparing these algorithms. By October 15, you will have to hand in a short write-up of your intended work (details will be provided later). At the end of the semester, you will write a final report and you will prepare a 15-minute class presentation of the topic you investigated. Due to the high number of students currently registered, we will only select a subset of the students for presentations. These will be scheduled outside of our lecture time, during December.

Homework Policy

Assignments should be submitted in class on the day they are due. In addition, the answers to programming problems must be submitted electronically, using the handin system. For late assignments, 10 points will be deducted from the grade for each day late, for the first 5 days. No credit is given for assignments submitted more than 5 days late, unless you have a medical problem.
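
The late-day rule above amounts to simple arithmetic, sketched below in Python. The sketch assumes that an assignment is graded out of 100 points and that a grade cannot fall below zero; neither is stated in this syllabus. Medical exceptions are handled individually and are not modelled. The function name late_grade is made up for the example.

```python
def late_grade(raw_grade, days_late):
    """Assignment grade after the late deduction.

    Assumptions: the grade is out of 100 points and is floored at zero.
    Medical exceptions are not modelled here.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 5:
        # More than 5 days late: no credit.
        return 0.0
    # 10 points off for each late day, for the first 5 days.
    return max(0.0, raw_grade - 10.0 * days_late)


for days in (0, 2, 5, 6):
    print(days, late_grade(85.0, days))
```

For example, under these assumptions an assignment that would have earned 85 points receives 65 points when handed in two days late.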

Reading assignments are due in class. If you cannot come to class, please submit your assignment by e-mail, in plain text, PostScript or PDF format.

All assignments are INDIVIDUAL! You may discuss the problems with your colleagues, but you must submit your own individual work. Please acknowledge all sources you use in your homework (papers, code or ideas from someone else). Please refer to the McGill Academic Integrity web page for information on plagiarism and, especially, on how to avoid it.

Tentative Schedule

Introduction (1 lecture)
Concept Learning and Version Spaces (1 lecture)
Bayesian Learning (1 lecture)
Decision Trees (2 lectures)
Linear units, gradient descent (1 lecture)
Sigmoid Neural Networks (1 lecture)
Empirical evaluation of learning algorithms (2 lectures)
Computational Learning Theory (2 lectures)
Instance-based learning (1 lecture)
Ensemble methods (1 lecture)
Support Vector Machines (2 lectures)
Genetic Algorithms (1 lecture)
Reinforcement learning (4 lectures)
Learning probabilities (1 lecture)
Clustering (1 lecture)
Factor analysis (1 lecture)
In-class examinations (2 lectures)
Wrap-up (1 lecture)

IMPORTANT: The schedule is subject to change. Up-to-date information about the schedule and assigned readings will be posted on the class web page.