pytorch入门学习:神经网络
定义网络
继承torch.nn
的nn.Module
我们要做的就是定义forward前向传播函数。
import torch
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
# 1 input image channel, 6 output channels, 5x5 square convolution
# kernel
self.conv1 = nn.Conv2d(1, 6, 5)
self.conv2 = nn.Conv2d(6, 16, 5)
# an affine operation: y = Wx + b
self.fc1 = nn.Linear(16 * 5 * 5, 120)
self.fc2 = nn.Linear(120, 84)
self.fc3 = nn.Linear(84, 10)
def forward(self, x):
# Max pooling over a (2, 2) window
x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
# If the size is a square you can only specify a single number
x = F.max_pool2d(F.relu(self.conv2(x)), 2)
x = x.view(-1, self.num_flat_features(x)) # 相当于resize:batch dimension×输入图片的维度
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x
def num_flat_features(self, x):
size = x.size()[1:] # all dimensions except the batch dimension
num_features = 1
for s in size:
num_features *= s
return num_features
net = Net()
print(net)
输出:
查看参数
########################################################################
# You just have to define the ``forward`` function, and the ``backward``
# function (where gradients are computed) is automatically defined for you
# using ``autograd``.
# You can use any of the Tensor operations in the ``forward`` function.
#
# The learnable parameters of a model are returned by ``net.parameters()``
params = list(net.parameters())
print(len(params))
for i in range(len(params)):
print(i,':',params[i].size()) # conv1's .weight
输出:
????这里不太懂????
以下是猜测:
6,1,5,5表示参数w(也就是权重)的size
6表示参数b(偏置)的size
输出和反向传播 Processing inputs and calling backward
处理输入,获得输出:
########################################################################
# Let try a random 32x32 input
# Note: Expected input size to this net(LeNet) is 32x32. To use this net on
# MNIST dataset, please resize the images from the dataset to 32x32.
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)
输出:
Zero the gradient buffers of all parameters and backprops with random gradients:
net.zero_grad()
out.backward(torch.randn(1, 10))
总结一下 Recap
-
torch.Tensor
- A multi-dimensional array with support for autograd operations like backward(). Also holds the gradient w.r.t. the tensor. -
nn.Module
- Neural network module. Convenient way of encapsulating parameters, with helpers for moving them to GPU, exporting, loading, etc. -
nn.Parameter
- A kind of Tensor, that is automatically registered as a parameter when assigned as an attribute to a Module. -
autograd.Function
- Implements forward and backward definitions of an autograd operation. Every Tensor operation, creates at least a single Function node, that connects to functions that created a Tensor and encodes its history.
笔记:
torch.nn只支持输入是mini-batches,不支持单张图像样本。
For example, nn.Conv2d
will take in a 4D Tensor of nSamples
x nChannels
x Height
x Width
.
损失函数和反向传播
nn package
中有几种损失函数
如,mse损失(均方差)。
# Loss Function
# -------------
# A loss function takes the (output, target) pair of inputs, and computes a
# value that estimates how far away the output is from the target.
#
# There are several different
# `loss functions <http://pytorch.org/docs/nn.html#loss-functions>`_ under the
# nn package .
# A simple loss is: ``nn.MSELoss`` which computes the mean-squared error
# between the input and the target.
#
# For example:
output = net(input)
target = torch.randn(10) # a dummy target, for example
target = target.view(1, -1) # make it the same shape as output
criterion = nn.MSELoss()
loss = criterion(output, target)
print(loss)
数据在网络中的流程:
反向传播
print(loss.grad_fn) # MSELoss
print(loss.grad_fn.next_functions[0][0]) # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0]) # ReLU
输出:
BP算法
BP算法介绍
激励传播和权重更新这两个过程反复循环迭代,直到网络对输入的响应达到预定的目标范围为止。
激励传播
用到的函数:loss.backward()
需要先将梯度清零,否则是在已有的梯度上进行计算的
########################################################################
# Backprop
# --------
# To backpropagate the error all we have to do is to ``loss.backward()``.
# You need to clear the existing gradients though, else gradients will be
# accumulated to existing gradients.
#
#
# Now we shall call ``loss.backward()``, and have a look at conv1's bias
# gradients before and after the backward.
net.zero_grad() # zeroes the gradient buffers of all parameters
print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)
loss.backward()
print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)
输出:
更新权重
随机梯度下降:Stochastic Gradient Descent (SGD):
weight = weight - learning_rate * gradient
########################################################################
# Now, we have seen how to use loss functions.
#
# **Read Later:**
#
# The neural network package contains various modules and loss functions
# that form the building blocks of deep neural networks. A full list with
# documentation is `here <http://pytorch.org/docs/nn>`_.
#
# **The only thing left to learn is:**
#
# - Updating the weights of the network
#
# Update the weights
# ------------------
# The simplest update rule used in practice is the Stochastic Gradient
# Descent (SGD):
#
# ``weight = weight - learning_rate * gradient``
#
# We can implement this using simple python code:
#
# .. code:: python
#
# learning_rate = 0.01
# for f in net.parameters():
# f.data.sub_(f.grad.data * learning_rate)
#
# However, as you use neural networks, you want to use various different
# update rules such as SGD, Nesterov-SGD, Adam, RMSProp, etc.
# To enable this, we built a small package: ``torch.optim`` that
# implements all these methods. Using it is very simple:
import torch.optim as optim
# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)
# in your training loop:
optimizer.zero_grad() # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step() # Does the update
输出: