- *Deep Learning*: https://www.deeplearningbook.org
- *Dive into Deep Learning*: https://d2l.ai
Week | Date | Topics | Reading |
---|---|---|---|
1 | 9/3 | Course Introduction | |
2 | 9/10 | Review of Linear Models | |
3 | 9/17 | No class (Mid-Autumn Festival) | |
4 | 9/24 | Machine Learning Basics | DL Ch. 5 |
5 | 10/1 | Multilayer Perceptron | D2L Ch. 5 & DL Ch. 6 |
6 | 10/8 | Regularization for Deep Learning | DL Ch. 7 |
7 | 10/15 | Optimization for DL Models | D2L Ch. 12 & DL Ch. 8 |
8 | 10/22 | Project Proposal | |
9 | 10/29 | Implementation of DL Models | D2L Ch. 6 |
10 | 11/5 | Convolutional Networks | D2L Ch. 7, 8 & DL Ch. 9 |
11 | 11/12 | Recurrent Networks | D2L Ch. 9, 10 & DL Ch. 10 |
12 | 11/19 | Hyperparameter Optimization and Tuning | D2L Ch. 19 & DL Ch. 11 |
13 | 11/26 | Generative Models: Autoencoders, GAN, Diffusion models | D2L Ch. 20 & DL Ch. 14 |
14 | 12/3 | Additional Topics: Attention Mechanisms and Gaussian Process | D2L Ch. 11, 18 |
15-16 | 12/10-17 | Final Project Presentation | |
In 1962, Novikoff proved the first theorem about the perceptron learning algorithm (PLA). If
the norm of the training vectors \(z\) is bounded by some constant \(R\) (\(\|z\| \leq R\)), and
(linear separability) the training data can be separated with margin \(\rho\) by some unit-norm hyperplane: \[ \sup_{\|w\|=1} \min_i y_i(z_i \cdot w) > \rho, \]
then after at most \(N \leq \frac{R^2}{\rho^2}\) update steps, a hyperplane that separates the training data will be constructed.
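The mistake bound can be checked empirically. The sketch below (a minimal NumPy illustration, with a hypothetical toy dataset generated from a fixed unit-norm separator) runs the PLA to convergence and verifies that the number of updates stays below \(R^2/\rho^2\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: labels come from a fixed unit-norm separator w_star,
# and points too close to the boundary are dropped to enforce a margin.
w_star = np.array([0.6, 0.8])               # unit norm: 0.36 + 0.64 = 1
X = rng.uniform(-1, 1, size=(200, 2))
X = X[np.abs(X @ w_star) > 0.1]             # enforce a positive margin
y = np.sign(X @ w_star)

R = np.max(np.linalg.norm(X, axis=1))       # bound on ||z||
rho = np.min(y * (X @ w_star))              # margin achieved by w_star

# Perceptron learning algorithm: update only on misclassified points.
w = np.zeros(2)
mistakes = 0
converged = False
while not converged:
    converged = True
    for z, label in zip(X, y):
        if label * (z @ w) <= 0:            # mistake (or on the boundary)
            w += label * z                  # PLA update
            mistakes += 1
            converged = False

print(mistakes, (R / rho) ** 2)             # mistakes never exceed R^2 / rho^2
```

Since the data is linearly separable by construction, the loop terminates, and Novikoff's theorem guarantees `mistakes <= (R / rho) ** 2` regardless of the order in which the points are visited.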
Novikoff’s result and Rosenblatt’s experiments raised several questions,
which led to the development of statistical learning theory during the 1970s and 1980s.
Important results include:
… networks with one internal layer and an arbitrary continuous sigmoidal function can approximate continuous functions with arbitrary precision, provided that no constraints are placed on the number of nodes or the size of the weights.
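This universal approximation property can be illustrated numerically. The sketch below (a minimal NumPy example; the target function and all sizes are arbitrary choices, not from the original statement) builds one internal layer of sigmoidal units with random weights and fits only the output weights by least squares, instead of training the whole network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: an arbitrary continuous function on [0, 1].
def f(x):
    return np.sin(2 * np.pi * x)

x = np.linspace(0, 1, 200)[:, None]
n_hidden = 100                              # "no constraints on the number of nodes"

# One internal layer of sigmoidal units with random weights and biases.
W = rng.normal(scale=20.0, size=(1, n_hidden))
b = rng.normal(scale=20.0, size=n_hidden)
H = 1.0 / (1.0 + np.exp(-(x @ W + b)))      # hidden-layer activations

# Output weights fit by least squares on the hidden features.
c, *_ = np.linalg.lstsq(H, f(x).ravel(), rcond=None)
approx = H @ c

max_err = np.max(np.abs(approx - f(x).ravel()))
print(max_err)                              # shrinks as n_hidden grows
```

Fixing the hidden layer and solving a linear problem for the output weights is enough to show the approximation power of a single sigmoidal layer; the theorem itself, of course, concerns the best choice of all weights.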
Goodfellow et al. (2016) pointed out:
The second wave of neural networks research lasted until the mid-1990s. Ventures based on neural networks and other AI technologies began to make unrealistically ambitious claims while seeking investments. When AI research did not fulfill these unreasonable expectations, investors were disappointed.