Applied Machine Learning

Winter 2020 (COMP551-002)


Class: Tuesdays and Thursdays 4:05 pm-5:25 pm, MAASS 10
Instructor: Siamak Ravanbakhsh
Office hours: Wednesdays 4:30 pm-5:30 pm, ENGMC 325
Sections: Another section is offered by Prof. Reihaneh Rabbany
TA Contact and Office Hours
Yanlin Zhang (yanlin.zhang2), Head TA
Jin Dong (jin.dong), Wed 3-4pm, MC 108
Haque Ishfaq (haque.ishfaq), Mon 1-2pm, MC 112
Martin Klissarov (martin.klissarov), Tue 3-4pm, MC 112
Kian Ahrabian (kian.ahrabian), Mon 1-2pm, MC 321
Samin Yeasar Arnob (samin.arnob), Wed 9-10am, MC 438
Tianzi Yang (tianzi.yang), Mon 4-5pm, Trottier 3090
Zhilong Chen (zhilong.chen), Tue 4-5pm, MC 229
David Venuto (david.venuto), Wed 1-2pm, MC 321
Yutong Yan (yutong.yan), Wed 2-3pm, MC 108
Arnab Kumar Mondal (arnab.mondal), Wed 11am-12pm, MC 202

Course Description

This course covers a selected set of topics in machine learning and data mining, with an emphasis on good methods and practices for deploying real systems. Most of the course deals with commonly used supervised learning techniques and, to a lesser degree, unsupervised methods. Topics include the fundamentals of algorithms for linear and logistic regression, decision trees, support vector machines, clustering, and neural networks, as well as key techniques for feature selection, dimensionality reduction, error estimation, and empirical validation.


This course requires programming skills (Python) and basic knowledge of probability, calculus, and linear algebra, as provided by courses similar to MATH-323 or ECSE-305. For more information, see the course prerequisites and restrictions at McGill’s webpage.

Course Material

Assignments, announcements, slides, project descriptions and other course materials are posted on myCourses.


There is no required textbook, but the topics are covered by the following books:
[Bishop] Pattern Recognition and Machine Learning by Christopher Bishop (2007)
[GBC] Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (2016)
[Murphy] Machine Learning: A Probabilistic Perspective by Kevin Murphy (2012)
[HTF] The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (2009)

Other Related References
Information Theory, Inference, and Learning Algorithms, by David MacKay (2003)
Bayesian Reasoning and Machine Learning, by David Barber (2012)
Understanding Machine Learning: From Theory to Algorithms, by Shai Shalev-Shwartz and Shai Ben-David (2014)
Foundations of Machine Learning, by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (2018)
Dive into Deep Learning, by Aston Zhang, Zachary Lipton, Mu Li, and Alexander J. Smola (2019)
Mathematics for Machine Learning, by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong (2019)
A Course in Machine Learning, by Hal Daumé III (2017)
Hands-on Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron (2017)


Probability and Linear Algebra
Jan 21st, 11:35am-12:55pm, Strathcona Anatomy & Dentistry M-1
Jan 21st, 6:05pm-7:55pm, RPHYS 112
Python / NumPy
Jan 20th, 6:05pm-7:55pm, ENGMC 304
Jan 21st, 11:35am-12:55pm, Strathcona Anatomy & Dentistry M-1
Feb 3rd, 6:05pm-7:25pm, ENGMC 304
Feb 4th, 4:05pm-5:25pm, BURN 1B24
Feb 6th, 6:05pm-7:55pm, ENGMC 304
Feb 24th, 6:05pm-7:55pm, MAASS 112
Feb 25th, 6:05pm-7:55pm, LEA 232
Feb 26th, 3:05pm-4:55pm, EDUC 624

Tentative Outline

Syllabus and Introduction (short version)
- optional reading: review linear algebra (sections 1-3), and probability theory.

K-Nearest Neighbours and Some Important Concepts (short version)
- chapter 1 [Murphy]
- optional reading: Domingos, Pedro M. "A few useful things to know about machine learning." Commun. ACM 55.10 (2012): 78-87.

Linear Regression (short version)
- 7.1-7.4 [Murphy], 3.1-3.2.2 [HTF], 3-3.1.3 [Bishop]

Logistic Regression (short version) and Naive Bayes (short version)
- 8.1-8.3.3 [Murphy], 4.4-4.4.3 [HTF], 4.1-4.1.3 + 4.3-4.3.3 [Bishop]

Regularization, Bias-Variance (short version)
- 3.1.4-3.3 [Bishop]

Gradient Descent (short version)
- optional reading: Ruder, S., 2016. An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747.

Linear Support Vector Machines (short version)
- 4.5-4.5.2 [HTF], 4.1.1-4.1.3 + 4.1.7 + 7.1-7.1.4 excluding kernels [Bishop]

Decision Trees (short version)
- 9.2 [HTF], 16.1-16.2.6 [Murphy], 14.4 [Bishop]
- a beautiful visual guide to decision trees; also bias-variance in decision trees.

Bootstrap, Bagging and Boosting (short version)
- 10-10.13 + 15-15.3.2 [HTF], 16.4 [Murphy], 14.3 [Bishop]
- see demonstrations here and here.

Multilayer Perceptrons (short version)
- 6-6.5 + parts of 7 [GBC]
- see the demo at tensorflow playground.

Gradient Computation and Autodiff (short version)
- 6.5 + 8.2 [GBC]
- a nice blog post on computational graph and backpropagation.
- visualizations of the loss landscape in deep learning.

Convolutional Neural Networks (short version)
- 9 [GBC]
- a nice blog post on feature visualization in convnets.
- Dumoulin, V. and Visin, F., 2016. A guide to convolution arithmetic for deep learning. arXiv preprint arXiv:1603.07285.

Recurrent Neural Networks
Dimensionality Reduction
Bayesian Inference and Conjugate Priors
Bayesian Linear Regression
Kernel trick
Gaussian Processes


Evaluation will be based on the following components:
Weekly quizzes (15%) online in myCourses
Mini-projects (50%) group assignments
Late midterm exam (35%) March 30th, 6:05 pm-8:55 pm
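
As a rough illustration of how the three weights above combine, here is a minimal Python sketch. It assumes each component is scored as a percentage from 0 to 100 and that the final grade is a plain weighted sum. The function name final_grade and the example scores are hypothetical, and the sketch ignores late penalties and any rounding or letter-grade conversion.

```python
# Weights taken from the evaluation list above.
WEIGHTS = {"quizzes": 0.15, "mini_projects": 0.50, "midterm": 0.35}


def final_grade(scores):
    """Return the weighted course grade (0-100).

    `scores` maps each component name to a percentage in [0, 100].
    Assumption: the grade is a plain weighted sum of the components.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"quizzes": 90.0, "mini_projects": 80.0, "midterm": 70.0}
print(f"Weighted grade: {final_grade(example):.1f}")
```

Because the mini-projects carry half of the weight, a 10-point change in the mini-project average moves the final grade by 5 points under this assumption.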

Late Submission

All due dates are 11:59 pm in Montreal unless stated otherwise. No make-up quizzes will be given. For mini-projects, late work will be automatically subject to a 20% penalty and can be submitted up to 5 days after the deadline. If you experience barriers to learning in this course, submitting the projects, etc., please do not hesitate to discuss them with me. As a point of reference, you can reach the Office for Students with Disabilities at 514-398-6009.

Academic Integrity

“McGill University values academic integrity. Therefore, all students must understand the meaning and consequences of cheating, plagiarism and other academic offences under the Code of Student Conduct and Disciplinary Procedures” (see for more information). (Approved by Senate on 29 January 2003)