Lecture Schedule
| Date | Topic | Materials |
| --- | --- | --- |
| Sep. 1 | Introduction. Types of machine learning. Linear regression. Overfitting and cross-validation | Lecture 1 slides Bishop, Sec. 1.1 NY Times Article on Statistics. If you need to catch up on the math, see the brief probability review and linear algebra review from Stanford University |
| Sep. 6 | Overfitting and bias-variance error decomposition. Linear models with basis functions. Gradient descent | Lecture 2 slides Bishop, Sec. 3.1, 3.2 |
| Sep. 8 | More on linear methods for regression. Analysis of least-squares as maximum-likelihood learning. L2 and L1 regularization. Bayesian learning | Lecture 3 slides Bishop, Sec. 3.1, 3.3 |
| Sep. 13 | Classification. Generative vs. discriminative learning. Naive Bayes. | Lecture 4 slides Bishop, Sec. 4.2 |
| Sep. 15 | Gaussian Discriminant Analysis. Logistic regression. | Lecture 5 slides Bishop, Sec. 4.3 |
| Sep. 20 | Feed-forward neural networks | Lecture 6 slides Bishop, Sec. 5.1, 5.2, 5.3, 5.4, 5.5 |
| Sep. 22 | Instance-based learning | Lecture 7 slides Bishop, Sec. 2.5 |
| Sep. 27 | Decision trees | Lecture 8 slides Bishop, Sec. 1.6, 14.4 |
| Sep. 29 | Ensemble methods. Boosting | Lecture 9 slides Bishop, Sec. 14.2, 14.3 Ensemble methods in machine learning by Tom Dietterich (2000). See also the boosting web site |
| Oct. 4 | Perceptrons. Support Vector Machines | Lecture 10 slides Bishop, Sec. 4.1, 7.1. See also the kernel machines web site |
| Oct. 6 | Non-linear SVM. Kernels. Kernelizing other algorithms. | Lecture 11 slides Bishop, Sec. 6.1, 6.2, 7.1 See also the kernel machines web site |
| Oct. 11 | Computational learning theory. PAC model. VC dimension. | Lecture 12 slides Mitchell, Chapter 7 See also the computational learning theory web site |
| Oct. 13 | Wrap-up of computational learning theory. Experimental comparisons of algorithms | Lecture 13 slides Mitchell, Chapter 5 See also ROC Graphs: Notes and Practical Considerations for Researchers by Tom Fawcett (2004). |
| Oct. 18 | Active learning and semi-supervised learning | Lecture 14 slides Active learning tutorial by Burr Settles (2010) |
| Oct. 20 | Clustering | Lecture 15 slides Bishop, Sec. 9.1 |
| Oct. 25 | Expectation Maximization (EM) and mixture of Gaussians. | Lecture 16 slides Bishop, Sec. 9.2, 9.3, 9.4 |
| Oct. 27 | Dimensionality reduction. PCA. | Lecture 17 slides Bishop, Sec. 12.1 |
| Nov. 1 | Midterm exam. | Covering lectures 1-12. You are allowed one double-sided "cheat sheet". Midterm from 2010. |
| Nov. 3 | Kernel PCA and other non-linear dimensionality reduction methods | Lecture 18 Bishop, Sec. 12.3, 12.4 |
| Nov. 8 | Introduction to graphical models. Undirected models. Belief propagation | Lecture 19 Bishop, Sec. 8.3, 8.4 See also graphical models tutorial by Koller et al. (2007). |
| Nov. 10 | No class | |
| Nov. 15 | Time series data. Hidden Markov Models | Lecture 20 slides Bishop, Sec. 13.1, 13.2 See also HMM tutorial by Rabiner (1989) |
| Nov. 17 | More on time series data. Kalman and particle filters. Bayes nets. Importance sampling. Gibbs sampling | Lecture 21 slides Bishop, Sec. 8.1, 8.2, 13.3 See also particle filter tutorial by Arulampalam et al. (2002). See also graphical models tutorial by Koller et al. (2007). |
| Nov. 22 | MCMC methods in general. Learning in graphical models revisited | Lecture 22 |
| Nov. 24 | No lecture | |
| Nov. 29 | Reinforcement learning | Lecture 23 |
| Dec. 1 | More reinforcement learning. Current frontiers and open problems in machine learning | |