CS231n Convolutional Neural Networks for Visual Recognition

@(Machine Learning and Artificial Intelligence)[Machine Learning, CNN]

video link

Lecture 1 | Introduction to Convolutional Neural Networks for Visual Recognition

  1. History:
    • 1960s: recognize & reconstruct
    • Object recognition is hard, so researchers first tackled object segmentation
    • Feature-based segmentation
    • SVM, boosting: complex; prone to overfitting (data quality keeps changing)
  2. The two most classic datasets:
    • PASCAL Visual Object Challenge(object detection benchmark )
    • ImageNet Large Scale Visual Recognition Challenge
  3. The basic CNN algorithm was proposed by LeCun et al. in 1998; it took off in 2012 after its breakthrough on ImageNet. Reasons for the resurgence: ever larger-scale integrated circuits, the rapid development of GPUs, and explosive growth in both the quality and quantity of data.
  4. Prerequisites for studying CNNs: calculus, linear algebra, CS229

Lecture 2 | Image Classification

  1. Data Driven Approach
    1. Collect a dataset of images and labels
    2. Use Machine Learning to train a classifier
    3. Evaluate the classifier on new images

k-Nearest Neighbors(kNN)

  1. Among the k nearest neighbors, the class with the most members wins; the test image is assigned to that class (see the sketch after this list).
  2. hyperparameters: choices about the algorithm that we set rather than learn
    How to set proper hyperparameters: split the dataset into train, validation, and test sets
    • train set (most of the data)
    • validation set: evaluate candidate hyperparameters
    • test set: test once at the very end
  3. k-Nearest Neighbors on raw image pixels is never used in practice
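A minimal NumPy sketch of the idea, assuming images are flattened into row vectors and using L2 distance; the function name and defaults are illustrative, not the course assignment's own class structure:

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=5):
    """Predict a label for each row of X_test by majority vote among the
    k nearest rows of X_train under L2 distance.
    X_train, X_test: (N, D) arrays of flattened images; y_train: (N,) int labels."""
    preds = np.zeros(len(X_test), dtype=y_train.dtype)
    for i, x in enumerate(X_test):
        # L2 distance from this test image to every training image
        dists = np.sqrt(np.sum((X_train - x) ** 2, axis=1))
        # labels of the k closest training images
        nearest = y_train[np.argsort(dists)[:k]]
        # majority vote: the most common label wins
        preds[i] = np.bincount(nearest).argmax()
    return preds

# k is a hyperparameter: try several values, keep whichever does best on the
# validation split, and evaluate on the test split exactly once at the end.
```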

Linear Classification

  1. Super important, and helps us build CNNs
  2. Parametric approach: image (array of numbers) → f(x, W) (score function) → 10 numbers giving class scores
    • x: input
    • W: weights or parameters
    • b: bias
      Assuming there are 10 classes, the result is a 10×1 column vector in which each entry represents how likely the image belongs to that class; the larger the number, the more likely.
  3. As an example, the computation for classifying a 4-pixel image into 3 classes with a given W is sketched in the code after this list.
    Visualization of the training result: each row of the learned W can be reshaped back into an image and viewed as a template for its class.
  4. A linear classifier can be understood as a line in the plane: each classifier divides the plane into regions belonging to different classes.
    Consequently, some linearly non-separable problems cannot be solved by one layer of linear classifiers, because the two classes cannot be separated by a single straight line in the plane, e.g. XOR.
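A small NumPy sketch of the score function f(x, W) = Wx + b for a 4-pixel image and 3 classes, as referenced in point 3 above; the specific numbers are placeholders, not necessarily the ones on the slide:

```python
import numpy as np

# A flattened 4-pixel image (toy values, chosen only for illustration)
x = np.array([56.0, 231.0, 24.0, 2.0])

# One row of weights per class (3 classes x 4 pixels) and one bias per class
W = np.array([[ 0.2, -0.5,  0.1,  2.0 ],
              [ 1.5,  1.3,  2.1,  0.0 ],
              [ 0.0,  0.25, 0.2, -0.3 ]])
b = np.array([1.1, 3.2, -1.2])

# f(x, W) = Wx + b: one score per class; the highest score is the prediction
scores = W @ x + b
print(scores)           # three class scores
print(scores.argmax())  # index of the predicted class
```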

Lecture 3 | Loss Functions and Optimization

  1. loss function: quantify how good or bad our current classifier is, given a dataset $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ is an image and $y_i$ is an (integer) label.
    1. $L = \frac{1}{N}\sum_i L_i(f(x_i, W), y_i)$
    2. Multiclass SVM loss: let $s = f(x_i, W)$ be the vector of class scores; then
      $L_i = \sum_{j \neq y_i} \max(0, s_j - s_{y_i} + 1)$

      If all the scores s are very small (roughly 0), the loss equals the number of classes minus one, which is a useful debugging check.
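A short NumPy sketch of the multiclass SVM loss defined above for a single example; the function name is illustrative:

```python
import numpy as np

def svm_loss_single(scores, y):
    """Multiclass SVM (hinge) loss L_i for one example.
    scores: (C,) array of class scores s = f(x_i, W); y: index of the correct class."""
    margins = np.maximum(0, scores - scores[y] + 1.0)  # fixed margin of 1
    margins[y] = 0.0                                    # the sum skips j == y_i
    return margins.sum()

# Debugging check from the lecture: with all scores ~0, each of the C-1 wrong
# classes contributes max(0, 0 - 0 + 1) = 1, so the loss is C - 1.
print(svm_loss_single(np.zeros(3), y=0))  # -> 2.0 for 3 classes
```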
  4. The W that achieves zero loss is not unique; for example, 2W also achieves zero loss.
  5. We should focus not on performance on the training data, but on performance on the test data.

    The regularization term makes the optimization prefer a simpler W.
  6. Common forms of regularization: examples (a sketch of L2 regularization follows below).
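As one concrete example, a sketch of L2 regularization added to the data loss; data_loss, reg, and the function name are placeholders chosen here, assuming W is a NumPy weight matrix:

```python
import numpy as np

def regularized_loss(data_loss, W, reg=1e-3):
    """Full objective: mean data loss plus an L2 penalty on the weights,
    L = (1/N) * sum_i L_i  +  reg * sum(W ** 2).
    A larger reg pushes the optimizer toward a simpler (smaller-norm) W."""
    return data_loss + reg * np.sum(W * W)
```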