CSC 588: Machine Learning Theory - Spring 2021
Tentative schedule
Date | Topics | Scribe notes | Handwritten notes | Additional readings | Homework
---|---|---|---|---|---
Jan 14 | Introduction, motivation, course mechanics | | | SSBD App B.1 and B.2 | HW0 (calibration)
Jan 19 | The PAC learning framework; the consistency algorithm | Scribe note (Please feel free to make a copy of this as a starting point when scribing) | Note 2 | SSBD Chap 2, Sec 3.1 | ||
Jan 21 | Analysis of the consistency algorithm; agnostic PAC learning; Hoeffding’s Inequality | Note 3 | SSBD Chap 2, Sec 3.1 | |||
Jan 26 | Proof of Hoeffding’s inequality; Bernstein’s inequality | Scribe note by Brian Toner | Note 4 | SSBD B.4, B.5 | ||
Jan 28 | Analysis of ERM; VC theory | Scribe note by Alonso Granados | Note 5 | SSBD Chap 4, Sec 6.1, 6.2 | ||
Feb 2 | VC dimension examples; Sauer’s Lemma | Scribe note by Katherine Best | Note 6 | SSBD Sec 6.2 - 6.5.1 | HW1 | |
Feb 4 | Proof of Sauer’s Lemma; VC dimension of composite hypothesis classes | Scribe note by Marium Yousuf | Note 7 | SSBD Sec 6.2 - 6.5.1 | ||
Feb 9 | Uniform convergence via Rademacher complexity | Scribe note by Brady Gales | Note 8 | SSBD Sec 6.5.2, 26.1, 28.1 | ||
Feb 11 | Proof of uniform convergence of VC classes | Scribe note by Yinan Li | Note 9 | SSBD Sec 6.5.2, 26.1, 28.1 | ||
Feb 16 | Lower bounds of PAC learning with VC classes; fundamental theorem of statistical learning | Scribe note by Xiaolan Gu | Note 10 | SSBD Sec 5.1 | ||
Feb 18 | Error decomposition in statistical learning; model selection | Scribe note by Jie Bian | Note 11 | SSBD Chap 7 | ||
Feb 23 | Finish structural risk minimization (SRM); AdaBoost and its training error analysis | Scribe note by Yichen Li | Note 12 | SSBD Chap 10 | |
Feb 25 | No class - Reading day | |||||
Mar 2 | Weak learnability implies “linear separability” through the minimax theorem; Margin-based generalization error bounds for AdaBoost and linear classification | Scribe note by Ryan Sullivant | Note 13 | SSBD 26.1, 26.2, 26.4 | HW2 |
Mar 4 | Proof of margin-based generalization error bounds; Contraction inequality of Rademacher complexity; SVM formulations | Scribe note by Ruby Abrams | Note 14 | SSBD 26.3 | ||
Mar 9 | No class - Reading day | |||||
Mar 11 | ℓ2-norm-based margin bounds; Extensions of SVM; Regularized loss minimization formulations | Scribe note by Shahriar Golchin | Note 15 | SSBD 26.3; Chap 15; Spectrally-normalized margin bounds for neural networks | |
Mar 16 | Stability, strong convexity, and regularization | Scribe note by Adrienne Kinney | Note 16 | SSBD Chap 13 | ||
Mar 18 | Stability-fitting tradeoff; online learning: definitions and examples | Scribe note by Sarah Luca | Note 17 | SSBD Chap 13.4, O Chap 1, Haipeng Luo’s online learning lecture notes 1 (on which this lecture is heavily based) | |
Mar 23 | Online to batch conversion; Azuma’s Inequality; online gradient descent | Scribe note by Sheldon Deeny | Note 18 | O Chap 3, Chap 2 | ||
Mar 25 | Analysis of online gradient descent; Online mirror descent: basic definitions (class meeting cancelled; Pre-recorded lecture on Panopto) | Scribe note by Caleb Dahlke | Note 19 | O Chap 2, Sec 6.1-6.3 | ||
Mar 30 | Online mirror descent examples: p-norm, exponential weights; Fenchel conjugate | Scribe note by Erik Wessel | Note 20 | O Theorem 2.19, Secs 5.2.1, 6.4.1, 6.6, 6.7 | HW3 |
Apr 1 | Online mirror descent analysis; Online learning odds & ends: unknown time horizon, lower bounds, Follow the Regularized Leader | Scribe note by Yao Zhao | Note 21 | O 6.4, 5.1, 7.1 | ||
Apr 6 | Online gradient descent for strongly convex functions; kernel methods | Scribe note by Zisu Wang | Note 22 | SSBD 14.4.4, 14.5.3, 15.5, 16.2, 16.3 | ||
Apr 8 | Finish kernel methods; online Newton step for exp-concave functions | Scribe note by Robert Vacareanu | Note 23 | SSBD 16.3, O 7.9 | ||
Apr 13 | Finish online Newton step; begin multi-armed bandits (MAB) | | Note 24 | LS Chap 4 | |
Apr 15 | Explore-then-commit; Upper confidence bound (UCB) algorithm and analysis | Scribe note by Jesse Friedbaum | Note 25 | LS Chap 6,7 | ||
Apr 20 | Finish UCB analysis; Adversarial MAB; EXP3 algorithm | Scribe note by Bohan Li | Note 26 | LS Chap 11 | HW4 | |
Apr 22 | Stochastic linear contextual bandits and the LinUCB/OFUL algorithm | Scribe note by Dan Li | Note 27 | LS Chap 19, 20 | ||
Apr 27 | Episodic MDPs, Optimistic Q-learning (based on the original Optimistic Q-Learning paper and Haipeng Luo’s lecture note) | | Note 28 (page 7 onwards will not appear in the exam) | Chi Jin’s RL theory course notes; RL theory book by Alekh Agarwal, Nan Jiang, Sham Kakade, and Wen Sun | |
Apr 29 | Project presentation I | |||||
May 4 | Project presentation II |||||
Schedule archives from earlier offerings of this course

Date | Topics | Notes | Additional readings | Homework
---|---|---|---|---
Nov 7 | Prediction with expert advice (1) | | SSBD Sec 21.2 |
Nov 12 | Prediction with expert advice (2) | | |
Nov 14 | Stochastic multi-armed bandits (1) | | LS Chaps 6, 7 |
Nov 19 | Stochastic multi-armed bandits (2) | | |
Nov 21 | Adversarial multi-armed bandits | | LS Chap 11 |
… | Stochastic linear bandits (1) | | LS Chap 19 |
Dec 3 | Stochastic linear bandits (2) | | LS Chap 20 |

Date | Topics | Notes | Additional readings | Homework
---|---|---|---|---
Aug 27 | Introduction, motivation, course mechanics | slides; probability review | SSBD App B.1 and B.2 | HW0
Aug 29 | Concentration of measure | concentration of measure note 1 | SSBD App B.4 |
Sep 3 | Chernoff bound for Bernoulli random variables, McDiarmid’s Inequality | concentration of measure note 2 | SSBD App B.3, Lemma 26.4 | HW0 due in class
Sep 5 | The PAC learning framework, finite classes | PAC learning note | SSBD Chap 2, Sec 3.1 |
Sep 10 | Agnostic PAC learning, infinite classes | | SSBD Sec 3.2-3.3, Chap 4, Sec 6.1 |
Sep 12 | VC Theory, Sauer’s Lemma | VC Theory note | SSBD Sec 6.2 - 6.5.1 |
Sep 17 | Rademacher complexity | Uniform convergence / Rademacher complexity note | SSBD Sec 6.5.2, 26.1, 28.1 | HW1
Sep 19 | Information-theoretic lower bounds of PAC learning | | SSBD 28.2.1 |
Sep 24 | Le Cam’s method | | SSBD 28.2.1 |
Sep 26 | Assouad’s method, fundamental theorem of statistical learning | Lower bound note | SSBD Sec 28.2.2, Sec 5.1 | HW2
Oct 1 | Support Vector Machine (SVM); the maximum margin hyperplane problem | | SSBD Chap 15, Secs 26.2, 26.3 |
Oct 3 | Soft margin, nonlinear mapping, the dual of SVM | | SSBD Sec 15.4 |
Oct 8 | Kernel trick | | SSBD Chap 16 |
Oct 10 | Margin bounds, generalization bounds of SVM | | SSBD Sec 26.3 |
Oct 15 | Proof of margin bounds via Rademacher complexity | SVM note | SSBD Secs 26.2, 26.4 |
Oct 17 | Model selection, structural risk minimization | Model selection note | SSBD Chap 7; Modern bias-variance tradeoff | Midterm (take home)
Oct 22 | Boosting: AdaBoost | | SSBD Chap 10 |
Oct 24 | Boosting and margin bound; two styles of margin bounds | Boosting note | SSBD Sec 26.2; Boosted ResNet paper |
Oct 29 | Intro to online learning; online classification | | SSBD Sec 21.1 | HW3
Oct 31 | Minimax analysis: Littlestone’s dimension; the expert problem | Online classification note | SSBD Sec 21.1.1 |
Nov 5 | Cover’s impossibility result; decision-theoretic online learning | Handwritten notes | SSBD Sec 21.2 |
Nov 7 | Hedge; online-to-batch conversion | Online-to-batch conversion notes | SSBD Sec 21.2.1; Hedge paper; Loss range adaptivity |
Nov 12 | Convexity, subgradients | | SSBD Sec 12.1, 14.2 |
Nov 14 | Bregman divergence, norms, dual norms | | |
Nov 19 | Lipschitzness, strong convexity and smoothness | | SSBD Sec 12.1, 13.3 | HW4
Nov 21 | Legendre-Fenchel conjugacy, online convex optimization, Follow the Regularized Leader (FTRL) | OCO note | Sections 2.7 and 2.3 of Shalev-Shwartz’s survey |
Nov 26 | Analysis of FTRL; FTRL examples | | H Sec 5.3 |
Nov 28 | No class - Thanksgiving holiday | | |
Dec 3 | Online linear classification with margins; FTRL with adaptive regularization | | Section 3.3 of Shalev-Shwartz’s survey; H Sec 5.6 |
Dec 5 | AdaGrad, online optimization of strongly convex functions | | H Sec 3.3 |
Dec 10 | Final presentation | | | Topics to review
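
As a small companion to the concentration-of-measure lectures that open both schedules (Hoeffding’s inequality and its use in analyzing ERM, covered Jan 21-28 in the Spring 2021 offering), here is a minimal sketch of a numerical sanity check of Hoeffding’s bound. The sample size, deviation, and Bernoulli mean below are arbitrary illustrative choices, not values from the course.

```python
# Numerical check of Hoeffding's inequality: for i.i.d. X_1, ..., X_n in [0, 1]
# with mean mu, the sample mean X_bar satisfies
#     P(|X_bar - mu| >= eps) <= 2 * exp(-2 * n * eps**2).
# All parameters below are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, eps, mu = 200, 0.1, 0.3   # sample size, deviation, Bernoulli mean
trials = 100_000             # number of independent sample means to draw

# Each draw of binomial(n, mu) / n is one sample mean of n Bernoulli(mu) variables.
means = rng.binomial(n, mu, size=trials) / n
empirical = np.mean(np.abs(means - mu) >= eps)
bound = 2 * np.exp(-2 * n * eps**2)

print(f"empirical deviation probability: {empirical:.5f}")  # roughly 0.002
print(f"Hoeffding bound:                 {bound:.5f}")      # roughly 0.037
```

The empirical deviation probability comes out well below the bound, as the inequality guarantees; Hoeffding is distribution-free, so the gap narrows or widens depending on the variance of the underlying distribution.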