CSC 588: Machine Learning Theory - Spring 2021

Tentative schedule

| Date | Topics | Scribe notes | Handwritten notes | Additional readings | Homework |
| --- | --- | --- | --- | --- | --- |
| Jan 14 | Introduction, motivation, course mechanics | | | SSBD App B.1 and B.2 | HW0 (calibration) |
| Jan 19 | The PAC learning framework; the consistency algorithm | Scribe note (please feel free to make a copy of this as a starting point when scribing) | Note 2 | SSBD Chap 2, Sec 3.1 | |
| Jan 21 | Analysis of the consistency algorithm; agnostic PAC learning; Hoeffding's inequality | | Note 3 | SSBD Chap 2, Sec 3.1 | |
| Jan 26 | Proof of Hoeffding's inequality; Bernstein's inequality | Scribe note by Brian Toner | Note 4 | SSBD App B.4, B.5 | |
| Jan 28 | Analysis of ERM; VC theory | Scribe note by Alonso Granados | Note 5 | SSBD Chap 4, Sec 6.1, 6.2 | |
| Feb 2 | VC dimension examples; Sauer's Lemma | Scribe note by Katherine Best | Note 6 | SSBD Sec 6.2-6.5.1 | HW1 |
| Feb 4 | Proof of Sauer's Lemma; VC dimension of composite hypothesis classes | Scribe note by Marium Yousuf | Note 7 | SSBD Sec 6.2-6.5.1 | |
| Feb 9 | Uniform convergence via Rademacher complexity | Scribe note by Brady Gales | Note 8 | SSBD Sec 6.5.2, 26.1, 28.1 | |
| Feb 11 | Proof of uniform convergence for VC classes | Scribe note by Yinan Li | Note 9 | SSBD Sec 6.5.2, 26.1, 28.1 | |
| Feb 16 | Lower bounds for PAC learning with VC classes; the fundamental theorem of statistical learning | Scribe note by Xiaolan Gu | Note 10 | SSBD Sec 5.1 | |
| Feb 18 | Error decomposition in statistical learning; model selection | Scribe note by Jie Bian | Note 11 | SSBD Chap 7 | |
| Feb 23 | Finish SRM; AdaBoost and its training error analysis | Scribe note by Yichen Li | Note 12 | SSBD Chap 10 | |
| Feb 25 | No class - Reading day | | | | |
| Mar 2 | Weak learnability implies "linear separability" via the minimax theorem; margin-based generalization error bounds for AdaBoost and linear classification | Scribe note by Ryan Sullivant | Note 13 | SSBD Sec 26.1, 26.2, 26.4 | HW2 |
| Mar 4 | Proof of the margin-based generalization error bounds; the contraction inequality for Rademacher complexity; SVM formulations | Scribe note by Ruby Abrams | Note 14 | SSBD Sec 26.3 | |
| Mar 9 | No class - Reading day | | | | |
| Mar 11 | ℓ₂-norm-based margin bounds; extensions of SVM; regularized loss minimization formulations | Scribe note by Shahriar Golchin | Note 15 | SSBD Sec 26.3, Chap 15; Spectrally-normalized margin bounds for neural networks | |
| Mar 16 | Stability, strong convexity, and regularization | Scribe note by Adrienne Kinney | Note 16 | SSBD Chap 13 | |
| Mar 18 | Stability-fitting tradeoff; online learning: definitions and examples | Scribe note by Sarah Luca | Note 17 | SSBD Sec 13.4; O Chap 1; Haipeng Luo's online learning lecture notes 1 (on which this lecture is heavily based) | |
| Mar 23 | Online-to-batch conversion; Azuma's inequality; online gradient descent (see the sketch after the schedule) | Scribe note by Sheldon Deeny | Note 18 | O Chap 3, Chap 2 | |
| Mar 25 | Analysis of online gradient descent; online mirror descent: basic definitions (class meeting cancelled; pre-recorded lecture on Panopto) | Scribe note by Caleb Dahlke | Note 19 | O Chap 2, Sec 6.1-6.3 | |
| Mar 30 | Online mirror descent examples: p-norm, exponential weights; Fenchel conjugates | Scribe note by Erik Wessel | Note 20 | O Theorem 2.19, Sec 5.2.1, 6.4.1, 6.6, 6.7 | HW3 |
| Apr 1 | Online mirror descent analysis; online learning odds and ends: unknown time horizon, lower bounds, Follow the Regularized Leader | Scribe note by Yao Zhao | Note 21 | O Sec 6.4, 5.1, 7.1 | |
| Apr 6 | Online gradient descent for strongly convex functions; kernel methods | Scribe note by Zisu Wang | Note 22 | SSBD Sec 14.4.4, 14.5.3, 15.5, 16.2, 16.3 | |
| Apr 8 | Finish kernel methods; online Newton step for exp-concave functions | Scribe note by Robert Vacareanu | Note 23 | SSBD Sec 16.3; O Sec 7.9 | |
| Apr 13 | Finish online Newton step; begin multi-armed bandits (MAB) | | Note 24 | LS Chap 4 | |
| Apr 15 | Explore-then-commit; the upper confidence bound (UCB) algorithm and its analysis (see the sketch after the schedule) | Scribe note by Jesse Friedbaum | Note 25 | LS Chap 6, 7 | |
| Apr 20 | Finish the UCB analysis; adversarial MAB; the EXP3 algorithm | Scribe note by Bohan Li | Note 26 | LS Chap 11 | HW4 |
| Apr 22 | Stochastic linear contextual bandits and the LinUCB/OFUL algorithm | Scribe note by Dan Li | Note 27 | LS Chap 19, 20 | |
| Apr 27 | Episodic MDPs; optimistic Q-learning (based on the original Optimistic Q-Learning paper and Haipeng Luo's lecture note) | | Note 28 (page 7 onwards will not appear in the exam) | Chi Jin's RL theory course notes; RL theory book by Alekh Agarwal, Nan Jiang, Sham Kakade, and Wen Sun | |
| Apr 29 | Project presentation I | | | | |
| May 4 | Project presentation II | | | | |
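Some of the algorithms in the schedule are short enough to pin down in code. As a companion to the Mar 23-25 lectures, here is a minimal sketch of projected online gradient descent with the standard step size η_t = D/(G√t), which yields the O(DG√T) regret bound covered in class. The interface (`projected_ogd`, `grad`, `project`) is an illustrative assumption of this sketch, not the course's reference implementation.

```python
import numpy as np

def projected_ogd(grad, project, x0, T, D, G):
    """Projected online gradient descent (a minimal sketch).

    grad(t, x) : a subgradient of the round-t loss at x
    project(x) : Euclidean projection onto the feasible set
    D, G       : diameter of the feasible set and a bound on subgradient
                 norms, giving the standard step size eta_t = D / (G sqrt(t))
    Returns the iterates x_1, ..., x_T.
    """
    x = np.asarray(x0, dtype=float)
    iterates = [x.copy()]
    for t in range(1, T):
        eta = D / (G * np.sqrt(t))           # step size behind the O(DG sqrt(T)) regret bound
        x = project(x - eta * grad(t, x))    # descend on the current loss, then project back
        iterates.append(x.copy())
    return iterates

# Example: linear losses f_t(x) = <z_t, x> over the unit Euclidean ball.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Z = rng.standard_normal((1000, 5))
    project = lambda x: x / max(1.0, np.linalg.norm(x))  # projection onto the unit ball
    xs = projected_ogd(lambda t, x: Z[t], project, np.zeros(5), T=1000,
                       D=2.0, G=float(np.linalg.norm(Z, axis=1).max()))
```

Because the step size decays with t, the method is anytime: it does not need the horizon T in advance, which connects to the unknown-time-horizon discussion on Apr 1.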
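In the same spirit, the Apr 15-20 lectures center on the UCB algorithm for stochastic bandits. The sketch below assumes i.i.d. rewards in [0, 1] supplied by a caller-provided `draw_reward` function (a hypothetical interface): it pulls each arm once and thereafter plays the arm maximizing the empirical mean plus a √(2 log t / n) exploration bonus.

```python
import numpy as np

def ucb1(draw_reward, n_arms, T):
    """UCB1 for stochastic bandits with rewards in [0, 1] (a minimal sketch).

    draw_reward(a) : samples a reward for arm a, assumed i.i.d. per arm
    Returns the empirical means and pull counts after T rounds.
    """
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    for t in range(1, T + 1):
        if t <= n_arms:
            a = t - 1                                   # initialization: pull every arm once
        else:
            bonus = np.sqrt(2.0 * np.log(t) / counts)   # optimism in the face of uncertainty
            a = int(np.argmax(means + bonus))
        r = draw_reward(a)
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]          # incremental empirical-mean update
    return means, counts

# Example: two Bernoulli arms with means 0.4 and 0.6.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = [0.4, 0.6]
    means, counts = ucb1(lambda a: float(rng.random() < mu[a]), n_arms=2, T=10_000)
```

The constant 2 in the bonus is the classical UCB1 choice; the lecture's version of the confidence width may be tuned differently, so treat this one as an assumption of the sketch.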
