Principles of training multi-layer neural network using backpropagation
Reposted from: http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html

This project describes the teaching process of a multi-layer neural network employing the backpropagation algorithm. To illustrate the process, a three-layer neural network with two inputs and one output, shown in the picture below, is used.

Each neuron is composed of two units. The first unit adds the products of the weight coefficients and the input signals. The second unit realises a nonlinear function, called the neuron activation function. Signal e is the adder output signal, and y = f(e) is the output signal of the nonlinear element. Signal y is also the output signal of the neuron.

To teach the neural network we need a training data set. The training data set consists of input signals (x1 and x2) assigned with the corresponding target (desired output) z. Network training is an iterative process: in each iteration the weight coefficients of the nodes are modified using new data from the training data set. The modification is calculated with the algorithm described below. Each teaching step starts with forcing both input signals from the training set. After this stage we can determine the output signal values for each neuron in each network layer. The pictures below illustrate how the signal propagates through the network. Symbols w(xm)n represent the weights of the connections between network input xm and neuron n in the input layer; symbols yn represent the output signal of neuron n.

In the next step of the algorithm, the output signal of the network y is compared with the desired output value (the target) z, which is found in the training data set. The difference d between them is called the error signal.

It is impossible to compute the error signals of the internal neurons directly, because the output values of these neurons are unknown. For many years an effective method for training multi-layer networks was unknown; only in the 1980s was the backpropagation algorithm worked out. Its idea is to propagate the error signal d (computed in a single teaching step) back to all the neurons whose output signals served as inputs, so outputs become inputs.

The weight coefficients wmn used to propagate the errors back are equal to those used when computing the output value. Only the direction of data flow is changed (signals are propagated from outputs to inputs, one layer after the other). This technique is used for all network layers. If the propagated errors come from several neurons, they are added. The illustration is below.

When the error signal for each neuron has been computed, the weight coefficients of each neuron's input nodes may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are modified.

Coefficient η (the learning rate) affects the network teaching speed. There are a few techniques for selecting this parameter. The first method is to start the teaching process with a large value of the parameter and, while the weight coefficients are being established, decrease it gradually. The second, more complicated method starts teaching with a small parameter value; during the teaching process the parameter is increased as the teaching advances, and then decreased again in the final stage. Starting the teaching process with a low parameter value makes it possible to determine the signs of the weight coefficients. The Python sketches below walk through these steps in turn.
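As a concrete sketch of the two units of a single neuron, here is a minimal Python version. The sigmoid activation and the function names are assumptions made for illustration; the tutorial does not fix a particular f.

```python
import math

def adder(weights, inputs):
    # First unit: sum of the products of weight coefficients and input signals.
    return sum(w * x for w, x in zip(weights, inputs))

def f(e):
    # Second unit: the nonlinear neuron activation function.
    # A sigmoid is assumed here purely for illustration.
    return 1.0 / (1.0 + math.exp(-e))

def neuron(weights, inputs):
    e = adder(weights, inputs)  # adder output signal e
    return f(e)                 # y = f(e), also the neuron's output signal
```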
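The forward pass is then a loop over the layers, each neuron reading the output signals of the previous layer. The 2-3-2-1 layer sizes follow the figure of the original tutorial, which is not reproduced here, so treat them as an assumption; the weight values are arbitrary illustration numbers.

```python
def forward(layers, x):
    """layers: one list of weight vectors per layer (one vector per neuron).
    Returns the signals of every stage: the inputs, then each layer's outputs."""
    signals = [x]
    for layer in layers:
        x = [neuron(w, x) for w in layer]
        signals.append(x)
    return signals

# Two inputs -> 3 neurons -> 2 neurons -> 1 output neuron.
layers = [
    [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]],
    [[0.2, -0.3, 0.1], [0.4, 0.1, -0.2]],
    [[0.5, -0.4]],
]
signals = forward(layers, [1.0, 0.0])
y = signals[-1][0]  # network output signal y
```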
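Next, a sketch of the two error steps described above: the output error d = z - y, and its propagation back through the same weights wmn, with errors arriving from several neurons added together. `backpropagate_errors` is a hypothetical helper name.

```python
def backpropagate_errors(layers, d_output):
    """Returns one list of error signals per layer, input side first."""
    errors = [d_output]
    # Walk from the output layer back, reusing the forward weights w_mn.
    for layer in reversed(layers[1:]):
        d_next = errors[0]
        n_prev = len(layer[0])  # number of neurons feeding this layer
        # Error of an upstream neuron: sum of w * d over all neurons it feeds.
        errors.insert(0, [sum(layer[n][m] * d_next[n] for n in range(len(layer)))
                          for m in range(n_prev)])
    return errors

z = 1.0          # desired output (target) from the training set
d = [z - y]      # error signal of the output-layer neuron
errors = backpropagate_errors(layers, d)
```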
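The weight-update step follows the formulas referred to above: each weight moves by η · d · df(e)/de · x. For the sigmoid assumed earlier, df(e)/de = y(1 - y), so the stored output signals can be reused; with a different activation, this line would change.

```python
eta = 0.5  # learning-rate coefficient eta (illustration value)

def update_weights(layers, signals, errors, eta):
    # Pair each layer with its input signals x, its error signals d,
    # and its output signals y.
    for layer, xs, ds, ys in zip(layers, signals, errors, signals[1:]):
        for w, d, y in zip(layer, ds, ys):
            dfde = y * (1.0 - y)  # derivative of the assumed sigmoid
            for m in range(len(w)):
                w[m] += eta * d * dfde * xs[m]

update_weights(layers, signals, errors, eta)
```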
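Finally, the two η-selection strategies can be sketched as schedules over the training epochs; the decay constants and bounds below are made-up illustration values, not part of the tutorial.

```python
def decreasing_eta(epoch, eta0=0.9, decay=0.99):
    # First method: start with a large value and decrease it gradually
    # while the weight coefficients are being established.
    return eta0 * decay ** epoch

def rise_then_fall_eta(epoch, total_epochs, eta_min=0.01, eta_max=0.5):
    # Second method: start small (which helps determine the signs of the
    # weights), increase as teaching advances, then decrease at the end.
    half = total_epochs / 2.0
    frac = 1.0 - abs(epoch - half) / half  # ramps 0 -> 1 -> 0 over training
    return eta_min + (eta_max - eta_min) * frac
```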