Lecture Schedule
| Date | Topic | Materials |
| Sep.6 | Introduction. Types of machine learning. Linear regression. Overfitting. | Lecture 1 slides. Bishop, Sec. 1.1. NY Times article on statistics. If you need to catch up on the math, see the brief probability review and linear algebra review from Stanford University. |
| Sep.11 | Overfitting, cross-validation and bias-variance error decomposition. Linear models with basis functions. Gradient descent | Lecture 2 slides Bishop, Sec. 3.1, 3.2 |
| Sep.13 | More on linear methods for regression. Analysis of least-squares as maximum-likelihood learning. L2 and L1 regularization. Bayesian learning | Lecture 3 slides Bishop, Sec. 3.1, 3.3 |
| Sep.18 | More on regularization. Classification. Generative vs. discriminative learning. Logistic regression | Lecture 4 slides Bishop, Sec. 4.2. |
| Sep.20 | Logistic regression. Feed-forward neural networks. | Lecture 5 slides Bishop, Sec. 4.3 |
| Sep.25 | Naive Bayes. Gaussian Discriminant Analysis | Lecture 6 slides Bishop, Sec. 5.1, 5.2, 5.3, 5.4, 5.5 |
| Sep.27 | Instance-based (non-parametric) learning | Lecture 7 slides Bishop, Sec. 2.5 |
| Oct.2 | Decision trees | Lecture 8 slides Bishop, Sec. 1.6, 14.4 |
| Oct. 4 | Support vector machines. Kernels | Lecture 9 slides |
| Oct. 9 | Kernelizing other algorithms. Wrap-up of SVMs | Lecture 10 slides Bishop, Sec. 4.1, 7.1. See also the kernel machines web site |
| Oct.16 | Ensemble methods. Bagging. Boosting | Lecture 11 slides Bishop, Sec. 6.1, 6.2, 7.1 See also the kernel machines web site |
| Oct. 18 | Experimental analysis. Active learning | Lecture 12 slides. Active learning tutorial by Burr Settles (2010). Mitchell, Chapter 5. See also ROC Graphs: Notes and Practical Considerations for Researchers by Tom Fawcett (2004). |
| Oct. 23 and 25 | Computational learning theory | Lecture 13 and 14 slides |
| Oct. 30 | Clustering | Lecture 15 slides Bishop, Sec. 9.1 |
| Nov. 1 | Midterm exam. | Covering lectures 1-13. You are allowed one double-sided "cheat sheet". Midterm from 2011. Midterm from 2009. |
| Nov. 6 | Expectation Maximization (EM) and mixture of Gaussians. Introduction to dimensionality reduction | Lecture 16 slides Bishop, Sec. 9.2, 9.3, 9.4 |
| Nov. 8 | PCA, Kernel PCA and other dimensionality reduction methods. | Lecture 17 slides Bishop, Sec. 12.1, 12.3, 12.4 |
| Nov. 15 | Time series data. Hidden Markov Models | Lecture 18 slides Bishop, Sec. 13.1, 13.2 See also HMM tutorial by Rabiner (1989) |
| Nov. 20 | More on time series data. Kalman and particle filters. Bayes nets. Importance sampling. Gibbs sampling | Bishop, Sec. 8.1, 8.2, 13.3. See also particle filter tutorial by Arulampalam et al (2002) and graphical models tutorial by Koller et al (2007). |
| Nov. 13 | Introduction to graphical models. Undirected models. Belief propagation | Bishop, Sec. 8.3, 8.4 See also graphical models tutorial by Koller et al (2007). |
| Nov. 22 | MCMC methods in general. Learning in graphical models revisited | |
| Nov. 27 | Reinforcement learning | |
| Nov. 29 | More reinforcement learning | |
| Dec. 4 | Frontiers and current trends in machine learning | |