Course notes for CS231n, Convolutional Neural Networks for Visual Recognition.
2. Image Classification pipeline
- Challenges
- Viewpoint variation
- Illumination
- Deformation
- Occlusion
- Background clutter
- Intraclass variation
- Nearest Neighbor
- Compute the distances between each test image and the training images
- \(L_n\) norm
- \((\sum_{i}{|X^i|^n})^{\frac{1}{n}}\)
- \(L_n\) distance
- \((\sum_{i}{|X_1^i - X_2^i|^n})^{\frac{1}{n}}\)
- Method
- For each test image, find the nearest training image
- Use the label of that training image as the prediction
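- Complexity
- Training: \(O(1)\) (only stores the training data)
- Testing: \(O(n)\) per test image, where \(n\) here is the number of training images (not the norm order above)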
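
The distance formula and the nearest neighbor method above can be sketched in NumPy as follows. This is a minimal illustration, not the course's reference code: the function names `ln_distance` and `nearest_neighbor_predict` and the toy two-pixel "images" are assumptions made for the example.

```python
import numpy as np

def ln_distance(x1, x2, n=1):
    """L_n distance between two flattened images (n=1: Manhattan, n=2: Euclidean)."""
    return np.sum(np.abs(x1 - x2) ** n) ** (1.0 / n)

def nearest_neighbor_predict(train_x, train_y, test_x, n=1):
    """For each test image, return the label of the closest training image."""
    preds = []
    for x in test_x:
        # Distance from this test image to every training image:
        # this is the O(N) per-test-image cost, N = number of training images.
        dists = np.sum(np.abs(train_x - x) ** n, axis=1) ** (1.0 / n)
        preds.append(train_y[np.argmin(dists)])
    return np.array(preds)

# Toy data (hypothetical): four "images" with two pixels each.
train_x = np.array([[0.0, 0.0], [1.0, 1.0], [10.0, 10.0], [11.0, 11.0]])
train_y = np.array([0, 0, 1, 1])
test_x = np.array([[0.5, 0.2], [9.0, 10.5]])

nn_preds = nearest_neighbor_predict(train_x, train_y, test_x, n=1)
print(nn_preds)  # [0 1]
```

"Training" here amounts to keeping `train_x` and `train_y` in memory, which is why all of the cost shows up at test time.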
- K-Nearest Neighbor
- Find the \(k\) nearest training images to the test image
- Predict the label that receives the most votes among them
- K-Nearest Neighbors Demo
- Never used on images
- Very slow at test time
- Distance metrics on pixels are not informative
- Dataset splitting
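
As a rough sketch of the majority vote described above, the snippet below uses the L1 distance and `collections.Counter`. The function name `knn_predict` and the toy data (including one deliberately mislabeled point) are assumptions for illustration only.

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    dists = np.sum(np.abs(train_x - x), axis=1)   # L1 distance to every training point
    nearest = np.argsort(dists)[:k]               # indices of the k closest points
    votes = Counter(train_y[nearest].tolist())    # count labels among the neighbors
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): the last point is a label-1 outlier sitting inside class 0.
train_x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                    [5.0, 5.0], [6.0, 5.0], [0.4, 0.4]])
train_y = np.array([0, 0, 0, 1, 1, 1])
query = np.array([0.3, 0.3])

pred_k1 = knn_predict(train_x, train_y, query, k=1)  # follows the outlier: 1
pred_k3 = knn_predict(train_x, train_y, query, k=3)  # outvoted by two class-0 neighbors: 0
```

With `k=1` the prediction is decided by the single mislabeled neighbor, while `k=3` smooths it out, which is the usual motivation for voting over several neighbors.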
- train, test
- train, validation, test
- Cross-Validation
- Split the training data into folds
- Use each fold in turn as the validation set and average the results
- Useful for small datasets, but not used too frequently in deep learning
- errorbar, violin plot https://matplotlib.org/gallery/index.html#statistics
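
The cross-validation procedure above might look like the following sketch, here used to compare a few values of \(k\) for a k-nearest-neighbor classifier. The helper names `knn_accuracy` and `cross_validate`, the choice of 5 folds, and the synthetic two-blob dataset are all assumptions made for the example.

```python
import numpy as np

def knn_accuracy(train_x, train_y, val_x, val_y, k):
    """Accuracy of an L1 k-nearest-neighbor classifier on a validation set."""
    correct = 0
    for x, y in zip(val_x, val_y):
        dists = np.sum(np.abs(train_x - x), axis=1)
        nearest_labels = train_y[np.argsort(dists)[:k]]
        pred = np.bincount(nearest_labels).argmax()  # majority vote
        correct += int(pred == y)
    return correct / len(val_y)

def cross_validate(x, y, k, num_folds=5):
    """Use each fold once as validation; return mean and std of the accuracies."""
    x_folds = np.array_split(x, num_folds)
    y_folds = np.array_split(y, num_folds)
    scores = []
    for i in range(num_folds):
        # Train on every fold except fold i, validate on fold i.
        tr_x = np.concatenate(x_folds[:i] + x_folds[i + 1:])
        tr_y = np.concatenate(y_folds[:i] + y_folds[i + 1:])
        scores.append(knn_accuracy(tr_x, tr_y, x_folds[i], y_folds[i], k))
    return float(np.mean(scores)), float(np.std(scores))

# Synthetic data (hypothetical): two Gaussian blobs, shuffled.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
perm = rng.permutation(len(y))
x, y = x[perm], y[perm]

cv_results = {k: cross_validate(x, y, k) for k in (1, 3, 5)}
for k, (mean_acc, std_acc) in cv_results.items():
    print(f"k={k}: accuracy {mean_acc:.3f} +/- {std_acc:.3f}")
```

The per-hyperparameter mean and standard deviation are the kind of values one could pass to matplotlib's `plt.errorbar` to draw the error-bar plot mentioned above.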
- Linear Classifier
- \(f(\mathbf x, W) = W\mathbf x+b\)
- Has hard cases where the classes are not linearly separable
- XOR
- Concentric circles
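
To make the score function \(f(\mathbf x, W) = W\mathbf x + b\) and the XOR hard case concrete, here is a small sketch. The shapes (a 3072-pixel image and 10 classes) are an assumption borrowed from CIFAR-10-sized inputs, the weights are random rather than trained, and the grid search over a 2D linear classifier is only an illustration, not a proof.

```python
import numpy as np

def linear_scores(x, W, b):
    """Class scores f(x, W) = W x + b for one flattened image x."""
    return W @ x + b

# Assumed shapes: 3072 pixels (32 x 32 x 3), 10 classes; weights are untrained.
rng = np.random.default_rng(0)
x = rng.random(3072)
W = rng.normal(0, 0.01, (10, 3072))
b = np.zeros(10)
scores = linear_scores(x, W, b)          # one score per class, shape (10,)
predicted_class = int(np.argmax(scores))

# XOR: the label is 1 exactly when the two inputs differ.
xor_x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
xor_y = np.array([0, 1, 1, 0])

# Try many linear decision rules w1*x1 + w2*x2 + b0 > 0 on a coarse grid.
grid = np.linspace(-2, 2, 21)
best_correct = 0
for w1 in grid:
    for w2 in grid:
        for b0 in grid:
            pred = (xor_x @ np.array([w1, w2]) + b0 > 0).astype(int)
            best_correct = max(best_correct, int(np.sum(pred == xor_y)))

print(best_correct)  # 3: no line on this grid classifies all four XOR points
```

No single line gets all four XOR points right because any straight boundary that puts (0, 1) and (1, 0) on the positive side also pulls in either (0, 0) or (1, 1).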