Lecture 6: Training Neural Networks (1)

Each neuron performs a dot product of the input with its weights, adds the bias, and applies a non-linearity (the activation function). In plain terms: the neuron takes the dot product of the input pixels and the weights and adds the bias, producing something like a score, and that score is then passed through a non-linear function (the activation function).

A single neuron can be used to implement a binary classifier (e.g. binary Softmax or binary SVM classifiers).
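A minimal sketch of that computation in NumPy. The sigmoid is just one possible activation function, and the names `sigmoid` and `neuron_forward` as well as the numeric values are assumptions made for illustration, not part of the lecture.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, one common choice of activation function."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(x, w, b):
    """Forward pass of a single neuron: activation(w . x + b)."""
    score = np.dot(w, x) + b   # dot product of input and weights, plus bias
    return sigmoid(score)      # non-linearity applied to the score

# Hypothetical values, chosen only so the arithmetic is easy to follow.
x = np.array([1.0, 2.0, 3.0])    # input
w = np.array([0.5, -1.0, 0.5])   # weights
b = 0.0                          # bias
out = neuron_forward(x, w, b)    # score = 0.5 - 2.0 + 1.5 = 0.0, sigmoid(0) = 0.5
```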
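As a rough illustration of the binary Softmax case, the sigmoid output of one neuron can be read as the probability of class 1 and thresholded at 0.5. The function name `predict_binary` and the weights below are hypothetical; this is a sketch of the idea, not a trained classifier.

```python
import numpy as np

def predict_binary(x, w, b):
    """Binary Softmax (logistic regression) view of a single neuron."""
    score = np.dot(w, x) + b
    prob_class1 = 1.0 / (1.0 + np.exp(-score))  # sigmoid output read as P(y = 1 | x)
    label = int(prob_class1 > 0.5)              # predict class 1 if probability > 0.5
    return label, prob_class1

# Hypothetical weights and inputs for illustration.
w = np.array([2.0, -1.0])
b = -0.5
label_a, p_a = predict_binary(np.array([1.0, 0.0]), w, b)  # score = 1.5
label_b, p_b = predict_binary(np.array([0.0, 1.0]), w, b)  # score = -1.5
```

Thresholding the probability at 0.5 is equivalent to checking whether the score is positive, so the decision boundary is linear in the input.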



Neural Network architectures

Layer-wise organization

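  • Naming convention: when we speak of an N-layer neural network, the input layer is not counted. A "single-layer" network therefore has no hidden layers and maps the input layer directly to the output layer.
  • Output layer: the output layer is not followed by an activation function, because the last layer represents the class scores.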
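Both conventions can be sketched on a small network. The layer sizes, the random initialization, and the choice of ReLU for the hidden layer are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 inputs, one hidden layer of 5 units, 3 classes.
layer_sizes = [4, 5, 3]
num_layers = len(layer_sizes) - 1   # the input layer is not counted: a 2-layer network

W1 = rng.standard_normal((5, 4)) * 0.01
b1 = np.zeros(5)
W2 = rng.standard_normal((3, 5)) * 0.01
b2 = np.zeros(3)

def two_layer_scores(x):
    """Hidden layer uses an activation; the output layer returns raw class scores."""
    h = np.maximum(0, W1 @ x + b1)   # hidden layer with ReLU (assumed activation)
    return W2 @ h + b2               # output layer: no activation function

scores = two_layer_scores(rng.standard_normal(4))
```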

Feed-forward computation

  • The forward pass of a fully-connected layer corresponds to one matrix multiplication followed by a bias offset and an activation function. In other words, the forward computation is: multiply the matrices, add the bias, then pass the result through an activation function.
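The three steps can be written as one line of NumPy. This sketch assumes a batch of inputs stored row-wise (`X` of shape `(N, D)`, `W` of shape `(D, H)`); the names `fc_forward` and `relu` and the numbers are illustrative, not taken from the lecture.

```python
import numpy as np

def relu(z):
    """ReLU activation, used here only as an example non-linearity."""
    return np.maximum(0, z)

def fc_forward(X, W, b, activation):
    """Fully-connected layer: matrix multiply, bias offset, activation."""
    return activation(X @ W + b)   # b is broadcast across the N rows

# Hypothetical batch of 2 inputs with 2 features, mapped to 3 units.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
W = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]])
b = np.array([0.0, 0.0, 1.0])

out = fc_forward(X, W, b, relu)
# X @ W + b = [[1, 2, -2], [3, 4, -6]]; ReLU zeroes the negative entries.
```

Stacking several such calls, with the last one using the identity instead of `relu`, gives the forward pass of a whole network.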


Setting number of layers and their sizes

  • Straight to the conclusion: The takeaway is that you should not be using smaller networks because you are afraid of overfitting. Instead, you should use as big of a neural network as your computational budget allows, and use other regularization techniques to control overfitting.
    In other words: do not shrink the network to fight overfitting. Use as large a network as you can, and rely on regularization methods (such as L2 regularization, dropout, input noise, or higher weight decay) to avoid overfitting.
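Two of the regularizers named above, L2 regularization and dropout, can be sketched as follows. The scaling convention for the penalty (`reg * sum(W^2)`, with no factor of 0.5), the inverted-dropout formulation, and all names and values are assumptions for illustration.

```python
import numpy as np

def l2_penalty(weights, reg):
    """L2 regularization term: reg times the sum of squared weights over all layers."""
    return reg * sum(np.sum(W * W) for W in weights)

def dropout_forward(h, keep_prob, rng):
    """Inverted dropout at training time: zero out units, rescale the survivors."""
    mask = (rng.random(h.shape) < keep_prob) / keep_prob
    return h * mask

# Hypothetical weights: 6 + 3 entries, all equal to 1.
weights = [np.ones((2, 3)), np.ones((3, 1))]
penalty = l2_penalty(weights, reg=0.1)       # 0.1 * (6 + 3) = 0.9, added to the data loss

rng = np.random.default_rng(0)
h = np.ones((4, 5))                          # hypothetical hidden activations
h_dropped = dropout_forward(h, keep_prob=0.5, rng=rng)  # each entry is 0.0 or 2.0
```

A larger `reg` or a smaller `keep_prob` regularizes more strongly, which is the knob to turn instead of shrinking the network.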