COMP-598 Topics in Computer Science: Applied Machine Learning

Syllabus - Fall 2014


General Information

Location: Trottier, ENGTR 2120
Times: Tuesday / Thursday, 11:30am-1:00pm
Instructor:
Prof. Joelle Pineau, School of Computer Science
Email: jpineau@cs.mcgill.ca
Office: McConnell 106N
Office hours: Thursday, 1:00-2:30pm
Teaching assistants:
Pierre-Luc Bacon
Email: pbacon@cs.mcgill.ca
Office: McConnell 111
Office hours: Wednesday, 11am
Angus Leigh
Email: angus.leigh@cs.mcgill.ca
Office: McConnell 111
Office hours: Wednesday, 11am
Class web page:
http://www.cs.mcgill.ca/~jpineau/comp598

Course Description

The course will cover selected topics and new developments in data mining and machine learning, with a particular emphasis on good methods and practices for effective deployment of real systems. We will study commonly used algorithms and techniques, including clustering, neural networks, support vector machines, and decision trees. We will also discuss methods to address practical issues such as feature selection and dimensionality reduction, error estimation and empirical validation, algorithm design and parallelization, and handling of large datasets.

Course content (subject to minor changes):

  1. Linear regression. Linear classification.
  2. Performance evaluation, overfitting, cross-validation, bias-variance analysis, error estimation.
  3. Naive Bayes.
  4. Decision trees. Regression trees and ensemble methods.
  5. Cost-sensitive learning.
  6. Support vector machines.
  7. Artificial neural networks. Deep learning.
  8. Feature selection. Dimensionality reduction. Regularization.
  9. Online / streaming data.
  10. Data structures and Map-Reduce.
  11. Unsupervised learning and clustering. Semi-supervised learning.
  12. Applications.

Reference Materials

There is no required textbook. Lecture notes and references will be available from the course web page. The following texts can also be very useful:
  1. Trevor Hastie, Robert Tibshirani and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition. Springer. 2009. Available free online.
  2. Christopher Bishop. Pattern Recognition and Machine Learning. Springer. 2007.
  3. Kevin Murphy. Machine Learning: A Probabilistic Perspective. The MIT Press. 2012.
  4. David MacKay. Information Theory, Inference and Learning Algorithms. Cambridge University Press. 2003.
  5. Richard Duda, Peter Hart and David Stork. Pattern Classification. 2nd Edition. Wiley & Sons. 2001.

Prerequisites / Antirequisites

Basic knowledge of a programming language is required. Basic knowledge of probability/statistics, calculus and linear algebra is also required. Example courses at McGill providing sufficient background in probability are MATH-323 or ECSE-305. Some AI background is recommended, as provided, for instance, by COMP-424 or ECSE-526, but is not required. If you have doubts regarding your background, please contact me to discuss it.

Students who took COMP-652 in 2012 or before CANNOT take COMP-598. Starting in Fall 2013, COMP-598 and COMP-652 were designed to avoid significant overlap; you can take either or both.

The course is intended for hard-working, technically skilled, highly motivated students. Participants will be expected to display initiative, creativity, scientific rigour, critical thinking, and good communication skills.

Evaluation Criteria

The class grade will be based on the following components:

The weekly exercises will consist of quizzes (in class) or practical work (take-home) designed to develop basic understanding of the course material as we progress through the topics. They are optional and will not be graded, but solutions will be provided. The exercises will provide good practice for the midterm.

The midterm is designed to assess in-depth understanding of fundamental methods and algorithms. It will be scheduled towards the end of the semester (November). There is no final exam.

The data analysis case studies will require reading, writing, programming and experiments to gain hands-on experience with the application of recent machine learning methods, including concepts covered in the lectures and concepts drawn from the literature. Students will be responsible for characterizing the problem, developing methods of analysis, and presenting the results of their work. Some case studies will be individual; others will be done in groups (usually of 3 or fewer).

We will use a peer-review system to evaluate the data analysis case studies. Each student will be asked to read and evaluate submissions of their colleagues. The emphasis will be placed on providing constructive feedback on the methodology and presentation.

Evaluation Policy

All course work should be submitted online (details to be given in class), by 11:59pm, on the assigned due date. Late work will be subject to a 50% penalty, and can be submitted anytime up to the last day of class (or 1 week after the deadline, whichever comes later).

No make-up midterm will be given.

Some of the course work will be individual; other components can be completed in groups. It is the responsibility of each student to understand the policy for each piece of work, and to ask the instructor if it is not clear. It is also the responsibility of each student to carefully acknowledge all sources (papers, code, books, websites, individual communications) using an appropriate referencing style when submitting work.

We will use automated systems to detect possible cases of text or software plagiarism. Cases that warrant further investigation will be referred to the university disciplinary officers. Students who have concerns about how to properly use and acknowledge third-party software should consult the course instructor or TAs.

McGill University values academic integrity. Therefore all students must understand the meaning and consequences of cheating, plagiarism and other academic offences under the Code of Student Conduct and Disciplinary Procedures (see www.mcgill.ca/students/srr/honest/ for more information).

In accord with McGill University's Charter of Students' Rights, students in this course have the right to submit in English or in French any written work that is to be graded.

In the event of extraordinary circumstances beyond the University's control, the content and/or evaluation scheme in this course is subject to change.