Course Description
This course covers a selected set of topics in machine learning and data mining, with an emphasis on good methods and practices for deployment of real systems. Most sections cover commonly used supervised learning techniques and, to a lesser degree, unsupervised methods. Topics include the fundamentals of linear and logistic regression, decision trees, support vector machines, clustering, and neural networks, as well as key techniques for feature selection and dimensionality reduction, error estimation, and empirical validation.
Prerequisites:
This course requires programming skills (Python) and basic knowledge of probability,
calculus and linear algebra provided by courses similar to MATH-323 or ECSE-305.
For more information, see the course prerequisites and restrictions at McGill’s webpage.
Course Material:
Assignments, announcements, slides, project descriptions and other course materials are posted on myCourses.
Textbooks:
There is no required textbook, but the topics are covered by the following books:
[Bishop]
Pattern Recognition and Machine Learning by Christopher Bishop (2007)
[GBC]   
Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (2016)
[Murphy]
Machine Learning: A Probabilistic Perspective by Kevin Murphy (2012)
[HTF]  
The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009)
Other Related References
Information Theory, Inference, and Learning Algorithms, by David MacKay (2003)
Bayesian Reasoning and Machine Learning, by David Barber (2012)
Understanding Machine Learning: From Theory to Algorithms, by Shai Shalev-Shwartz and Shai Ben-David (2014)
Foundations of Machine Learning, by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (2018)
Dive into Deep Learning, by Aston Zhang, Zachary Lipton, Mu Li, and Alexander J. Smola (2019)
Mathematics for Machine Learning, by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong (2019)
A Course in Machine Learning, by Hal Daumé III (2017)
Hands-on Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron (2017)
Tutorials:
Probability and Linear Algebra
Jan 21st, 11:35am-12:55pm, Strathcona Anatomy & Dentistry M-1
Jan 21st, 6:05pm-7:55pm, RPHYS 112
Python / NumPy
Jan 20th, 6:05pm-7:55pm, ENGMC 304
Jan 21st, 11:35am-12:55pm, Strathcona Anatomy & Dentistry M-1
Scikit-Learn
Feb 3rd, 6:05pm-7:25pm, ENGMC 304
Feb 4th, 4:05pm-5:25pm, BURN 1B24
Feb 6th, 6:05pm-7:55pm, ENGMC 304
PyTorch
Feb 24th, 6:05pm-7:55pm, MAASS 112
Feb 25th, 6:05pm-7:55pm, LEA 232
Feb 26th, 3:05pm-4:55pm, EDUC 624
Tentative Outline
- Syllabus and Introduction (short version)
- - optional reading: review linear algebra (sections 1-3), and probability theory.
- K-Nearest Neighbours and Some Important Concepts (short version)
- - chapter 1 [Murphy]
- - optional reading: Domingos, Pedro M. "A few useful things to know about machine learning." Commun. ACM 55.10 (2012): 78-87.
- Linear Regression (short version)
- - 7.1-7.4 [Murphy], 3.1-3.22 [HTF], 3-3.1.3 [Bishop]
- Logistic Regression (short version) and Naive Bayes (short version)
- - 8.1-8.3.3 [Murphy], 4.4-4.4.3 [HTF], 4.1-4.1.3 + 4.3-4.3.3 [Bishop]
- Regularization, Bias-Variance (short version)
- - 3.1.4-3.3 [Bishop]
- Gradient Descent (short version)
- - optional reading: Ruder, S., 2016. An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747.
- Linear Support Vector Machines (short version)
- - 4.5-4.5.2 [HTF], 4.11-4.13 + 4.1.7 + 7.1-7.14 excluding Kernels [Bishop]
- Decision Trees (short version)
- - 9.2 [HTF], 16.1-16.2.6 [Murphy], 14.4 [Bishop]
- - a beautiful visual guide to decision trees; also bias-variance in decision trees.
- Bootstrap, Bagging and Boosting (short version)
- - 10-10.13 + 15-15.3.2 [HTF], 16.4 [Murphy], 14.3 [Bishop]
- - see demonstrations here and here.
- Multilayer Perceptrons (short version)
- - 6-6.5 + parts of 7 [GBC]
- - see the demo at tensorflow playground.
- Gradient Computation and Autodiff (short version)
- - 6.5 + 8.2 [GBC]
- - a nice blog post on computational graph and backpropagation.
- - visualizations of the loss landscape in deep learning.
- Convolutional Neural Networks (short version)
- - 9 [GBC]
- - a nice blog post on feature visualization in convnets.
- - Dumoulin, V. and Visin, F., 2016. A guide to convolution arithmetic for deep learning. arXiv preprint arXiv:1603.07285.
- Recurrent Neural Networks
- Dimensionality Reduction
- Clustering
- Bayesian Inference and Conjugate Priors
- Bayesian Linear Regression
- Kernel Trick
- Gaussian Processes