Neural Networks - Examples and intuitions II
Abstract: This article is the transcript of video 70, "Examples and Intuitions II," from Chapter 9, "Neural Networks: Learning," of Andrew Ng's Machine Learning course. I transcribed the subtitles while studying the videos and lightly edited them to make them more concise and easier to read, for future reference. I am sharing them here in the hope that they help others; corrections are welcome and sincerely appreciated.
————————————————
In this video, I’d like to keep working through our example to show how a neural network can compute nonlinear hypotheses.
In the last video, we saw how a neural network can be used to compute the functions $x_1 \text{ AND } x_2$ and $x_1 \text{ OR } x_2$ when $x_1$ and $x_2$ are binary, that is, when they take on values 0 and 1. We can also have a network compute negation, that is, the function $\text{NOT } x_1$. Let me write down the weights associated with this network. We have only one input feature $x_1$ in this case, plus the bias unit $+1$. If I associate these with the weights $+10$ and $-20$, then my hypothesis computes $h_\Theta(x) = g(10 - 20x_1)$. So when $x_1 = 0$, my hypothesis computes $g(10 - 20 \cdot 0) = g(10)$, which is approximately 1. And when $x_1 = 1$, this is $g(-10)$, which is approximately 0. If you look at these values, this is essentially the NOT function. So to include negation, the general idea is to put a large negative weight in front of the variable you want to negate: the $-20$ multiplied by $x_1$ is what ends up negating $x_1$. In an example that I hope you will work out yourself, if you want to compute a function like $(\text{NOT } x_1) \text{ AND } (\text{NOT } x_2)$, part of that would be putting large negative weights in front of $x_1$ and $x_2$. It is feasible to get a neural network with just one output unit to compute this as well. This logical function equals 1 if and only if $x_1 = x_2 = 0$: the $\text{NOT } x_1$ means $x_1$ must be 0, and the $\text{NOT } x_2$ means $x_2$ must be 0 as well. Hopefully, you can figure out how to make a small neural network that computes this logical function.
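To make the arithmetic concrete, here is a minimal Python sketch of these single-unit networks, using the weights given above (the function names are just illustrative, not from the course code):

```python
import numpy as np

def g(z):
    """Sigmoid activation: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def not_x1(x1):
    """NOT x1: bias weight +10, input weight -20, i.e. g(10 - 20*x1)."""
    return g(10 - 20 * x1)

def not_x1_and_not_x2(x1, x2):
    """(NOT x1) AND (NOT x2): weights 10, -20, -20."""
    return g(10 - 20 * x1 - 20 * x2)

for x1 in (0, 1):
    print(f"NOT {x1} = {not_x1(x1):.4f}")  # ~1 for x1=0, ~0 for x1=1
for x1 in (0, 1):
    for x2 in (0, 1):
        # ~1 only when both inputs are 0
        print(f"(NOT {x1}) AND (NOT {x2}) = {not_x1_and_not_x2(x1, x2):.4f}")
```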
Now, let's take the three pieces we have, the network for computing $x_1 \text{ AND } x_2$, the network for computing $(\text{NOT } x_1) \text{ AND } (\text{NOT } x_2)$, and one last network for computing $x_1 \text{ OR } x_2$, and put them together to compute the function $x_1 \text{ XNOR } x_2$. Just to remind you, if these axes are $x_1$ and $x_2$, then the function we want to compute has negative examples here and here, and positive examples there and there. So, clearly, we’ll need a nonlinear decision boundary in order to separate the positive and negative examples.

Let’s draw the network. I’m going to take my inputs $+1$, $x_1$, $x_2$, and create my first hidden unit, which I’m going to call $a_1^{(2)}$ because it’s the first hidden unit of layer 2. I’m going to copy over the weights from the red network, the $x_1 \text{ AND } x_2$ network: $-30, 20, 20$. Next, let me create a second hidden unit, which I’m going to call $a_2^{(2)}$, the second hidden unit of layer 2, and copy over the cyan network in the middle, so I have the weights $10, -20, -20$. Let’s fill in some of the truth-table values. For the red network, we know it computes $x_1 \text{ AND } x_2$, so $a_1^{(2)}$ will be approximately $0, 0, 0, 1$, depending on the values of $x_1$ and $x_2$. And for $a_2^{(2)}$, the cyan network, the function outputs $1, 0, 0, 0$ for the four values of $x_1$ and $x_2$. Finally, I’m going to create my output unit $a_1^{(3)}$, which will output $h_\Theta(x)$. I’m going to copy over the green network for OR, and I’ll need a $+1$ bias unit here, so I draw that in, with the weights $-10, 20, 20$; we saw earlier that this computes the OR function. So, let’s go down the truth-table entries. The first entry is $0 \text{ OR } 1$, which is 1; the next is $0 \text{ OR } 0$, which is 0; then $0 \text{ OR } 0$, which is 0; and $1 \text{ OR } 0$, which is 1. And thus $h_\Theta(x)$ is equal to 1 either when $x_1$ and $x_2$ are both 0, or when $x_1$ and $x_2$ are both 1. Concretely, $h_\Theta(x)$ outputs 1 at exactly these two locations, and it outputs 0 otherwise. And thus, with this neural network, which has an input layer, one hidden layer, and one output layer, we end up with a nonlinear decision boundary that computes the XNOR function.

The more general intuition is that the input layer just has the raw inputs; the hidden layer computes some slightly more complex functions of those inputs; and by adding yet another layer we end up with even more complex nonlinear functions. Neural networks can compute pretty complicated functions: when you have multiple layers, the second layer computes relatively simple functions of the inputs, the third layer builds on those to compute even more complex functions, and the layer after that can compute functions that are more complex still.
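Putting the three pieces together, here is a minimal sketch of the full forward pass, with the weight matrices written out from the values above (variable names like `Theta1` are illustrative, not from the course code):

```python
import numpy as np

def g(z):
    """Sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-z))

# Layer-1 weights: row 0 is x1 AND x2 (-30, 20, 20),
# row 1 is (NOT x1) AND (NOT x2) (10, -20, -20).
Theta1 = np.array([[-30,  20,  20],
                   [ 10, -20, -20]])
# Layer-2 weights: OR over the two hidden units (-10, 20, 20).
Theta2 = np.array([[-10, 20, 20]])

def xnor(x1, x2):
    a1 = np.array([1, x1, x2])                   # input layer with bias unit
    a2 = np.concatenate(([1], g(Theta1 @ a1)))   # hidden layer with bias unit
    a3 = g(Theta2 @ a2)                          # output layer
    return a3[0]

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, int(round(xnor(x1, x2))))
# Prints the XNOR truth table:
# 0 0 1
# 0 1 0
# 1 0 0
# 1 1 1
```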
To wrap up this video, I want to show you a fun example of an application of a neural network that captures this intuition of the deeper layers computing more complex features.
I want to show you a video that I got from a good friend of mine, Yann LeCun. Yann is a professor at New York University (NYU), and he was one of the early pioneers of neural network research; he’s something of a legend in the field, and his ideas are used in all sorts of products and applications throughout the world now. I want to show you a video from some of his early work, in which he was using a neural network to recognize handwriting, to do handwritten digit recognition. You might remember that at the start of this class, I said one of the early successes of neural networks was using them to read zip codes, to help us send mail along, that is, to read postal codes. This is one of the attempts, one of the algorithms, used to address that problem. In the video, this area here is the input area that shows a handwritten character presented to the network. This column here shows a visualization of the features computed by the first hidden layer of the network: different features, different edges and lines and so on, being detected. This is the visualization of the next hidden layer; it’s kind of hard to understand the deeper hidden layers, and that’s the visualization of what the next hidden layer is computing. You’ll probably have a hard time seeing what’s going on much beyond the first hidden layer. But finally, all of these learned features get fed to the output layer, and shown over here is the final answer, the final predicted value for what handwritten digit the neural network thinks it is being shown. So, let’s take a look at the video. I hope you enjoyed it, and that it gave you some intuition about the sorts of pretty complicated functions neural networks can learn: the network takes an image as input, just the raw pixels; the first hidden layer computes some set of features; the next hidden layer computes even more complex features; and these features can then be used by what is essentially a final layer of logistic regression classifiers to make accurate predictions about what digit the network sees.