Building a Neural Network with Keras (Part 3): Recognizing Handwritten Digits

Today we move on to building a neural network that recognizes handwritten digits.
Source code: https://download.csdn.net/download/rance_king/11010836

Import the packages

import numpy as np
import matplotlib.pyplot as plt
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
# keras.utils exposes to_categorical directly; the np_utils path is not importable in newer Keras releases
from keras.utils import to_categorical
import random
np.random.seed(0)

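Load the data and confirm that it has the expected format. One extra point worth knowing here: the assert statement checks a condition. If the condition holds, execution continues; if it does not, an AssertionError is raised with the given message.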

(X_train, y_train), (X_test, y_test) = mnist.load_data()
assert(X_train.shape[0] ==  y_train.shape[0]), "The number of images is not equal to the number of labels."
assert(X_test.shape[0] == y_test.shape[0]), "The number of images is not equal to the number of labels."
assert(X_train.shape[1:] == (28,28)), "The dimensions of the images are not 28x28"
assert(X_test.shape[1:] == (28,28)), "The dimensions of the images are not 28x28"
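
To see both outcomes of an assert without loading MNIST, here is a minimal sketch. The helper check_shapes and the sample counts are made up for illustration; they are not part of the original code.

```python
def check_shapes(n_images, n_labels):
    """Hypothetical helper: raises AssertionError when the two counts differ."""
    assert n_images == n_labels, "The number of images is not equal to the number of labels."
    return True

# Matching counts: the assert passes silently and execution continues.
ok = check_shapes(60000, 60000)

# Mismatched counts: the assert raises AssertionError carrying the message.
try:
    check_shapes(60000, 59999)
    error_message = None
except AssertionError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

Two caveats: asserts are skipped entirely when Python runs with the -O flag, and the message must sit outside the parentheses. Writing assert(condition, "message") tests a non-empty tuple, which is always true.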

Next, visualize the downloaded dataset by laying out sample images in a tidy grid.

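# Count how many samples each label has; this empty list collects one count per label
num_of_samples = []

cols = 5
num_classes = 10
# plt.subplots creates a grid of subplots inside one figure: nrows rows, ncols columns, figsize is the overall (width, height).
fig, axs = plt.subplots(nrows=num_classes, ncols=cols, figsize=(5, 8))
# fig is the whole figure and axs holds the individual subplots; tight_layout adjusts the spacing so they do not overlap
fig.tight_layout()
# Fill the subplot grid with images
for i in range(cols):
    for j in range(num_classes):
        # The filtering trick again: j is the class label, so y_train == j gives a boolean
        # array that selects the matching images from X_train
        x_selected = X_train[y_train == j]
        # Pick a random image from the selection; X has shape (60000, 28, 28), so the first index picks the image.
        # random.randint includes both endpoints, so the upper bound must be len(x_selected) - 1.
        # cmap (colormap) renders the image in grayscale
        axs[j][i].imshow(x_selected[random.randint(0, len(x_selected) - 1), :, :], cmap=plt.get_cmap("gray"))
        # Hide the axes
        axs[j][i].axis("off")
        # Put a title on the middle column (index 2) and record the sample count once per class
        if i == 2:
            axs[j][i].set_title(str(j))
            num_of_samples.append(len(x_selected))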
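
The boolean-mask filtering used in the loop is easier to follow on a tiny array. The sketch below uses made-up stand-ins for X_train and y_train (six 2x2 "images" with labels 0 to 2); only the indexing pattern matches the code above.

```python
import numpy as np

# Toy stand-ins for X_train / y_train (hypothetical data, not MNIST)
images = np.arange(6 * 2 * 2).reshape(6, 2, 2)
labels = np.array([0, 1, 2, 1, 0, 1])

mask = labels == 1        # boolean array, True where the label equals 1
selected = images[mask]   # keeps only the images whose mask entry is True

# Per-class counts, the same information num_of_samples collects
counts = np.bincount(labels, minlength=3).tolist()

print(mask)
print(selected.shape)
print(counts)
```

np.bincount is one possible shortcut for the per-class counts; the loop above gets the same numbers by calling len on each selection.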

Plot the collected counts to see how the training set is distributed across classes

print(num_of_samples)
plt.figure(figsize=(12, 4))
plt.bar(range(0, num_classes), num_of_samples)
plt.title("Distribution of the training dataset")
plt.xlabel("Class number")
plt.ylabel("Number of images")

Result: a bar chart showing how many training images each class has.

Now preprocess the dataset: convert the labels of the training and test sets to one-hot form

y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
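
To make the one-hot form concrete, here is a small NumPy re-implementation. The function one_hot is an illustrative stand-in written for this sketch; the post itself relies on Keras' to_categorical.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Illustrative one-hot encoder (not the Keras implementation)."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # Set a single 1 per row, at the column given by the label
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

encoded = one_hot([5, 0, 4], 10)
print(encoded[0])                   # the row for label 5
print(np.argmax(encoded, axis=1))   # argmax recovers the original labels
```

Each row has exactly one 1, which is why the label can always be recovered with argmax.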

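Next, normalize the training and test images. This is a preprocessing step whose goal is to confine all values to one fixed range, which makes the data easier to learn from. The highest possible brightness of a pixel here is 255, so dividing by 255 scales every value into the range 0 to 1.
What "easier to learn from" really means is that all images are mapped into the same coordinate system. For example, two images that look the same to us may differ in exposure or gray level and so be judged different by the network, even though they depict the same thing. Putting both images on a common scale reduces, to some extent, the influence of such factors on the features being learned. Note that dividing by a fixed constant only rescales the values; removing exposure differences between individual images would need a per-image normalization, such as subtracting each image's mean and dividing by its standard deviation.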

X_train = X_train/255 
X_test = X_test/255
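
A quick check of what the division does to the value range and the dtype. The three-pixel array is made up for illustration; MNIST pixels are stored as 8-bit unsigned integers in the same way.

```python
import numpy as np

# A made-up row of 8-bit pixels: black, mid-gray, white
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

scaled = pixels / 255                          # true division yields float64
scaled32 = pixels.astype("float32") / 255      # optional: float32 halves the memory

print(scaled.dtype, scaled.min(), scaled.max())
print(scaled32.dtype)
```

Casting to float32 first is optional and is not done in the original code; it is shown only as a common memory-saving variant.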

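Then flatten the data, so that each 28x28 image becomes a vector of 784 values

num_pixels = 784  # 28 * 28; it has to be defined before the reshape below uses it
X_train = X_train.reshape(X_train.shape[0], num_pixels)
X_test = X_test.reshape(X_test.shape[0], num_pixels)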
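
The reshape keeps the sample axis and unrolls each image row by row. A minimal sketch with two made-up 28x28 arrays shows where a given pixel ends up:

```python
import numpy as np

# Two made-up "images" whose pixel values are just running indices
batch = np.arange(2 * 28 * 28).reshape(2, 28, 28)
flat = batch.reshape(batch.shape[0], 28 * 28)

# With NumPy's default row-major order, pixel (row, col) lands at index row * 28 + col
row, col = 1, 3
print(flat.shape)
print(flat[0, row * 28 + col] == batch[0, row, col])
```

The spatial layout is lost after flattening, which is fine for the fully connected layers used here.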

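Now create the neural network. This time the activation function is ReLU, which outputs 0 for inputs below 0 and passes inputs above 0 through unchanged. Recalling the earlier activation functions: step_function, used in the perceptron, forces the output to be either 0 or 1; sigmoid squashes the output into the range between 0 and 1 and was used earlier for linear classification; ReLU leaves the output unbounded above and only requires it to be non-negative.
The activation function of the output layer is softmax, which produces a probability for each class in a multi-class problem. The intuitive way to get a probability would be x1 / (x1 + x2 + x3 + ...), but that can give negative values when some of the xi are negative, so the natural exponential is introduced: softmax(x)_i = e^(x_i) / (e^(x_1) + e^(x_2) + ...). The details of softmax deserve further study, but for now it is enough to use it.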
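
The activation functions discussed above can be written in a few lines of NumPy. This is a sketch for intuition, not how Keras implements them; the score vector is made up.

```python
import numpy as np

def relu(x):
    # 0 for negative inputs, identity for positive inputs
    return np.maximum(0, x)

def sigmoid(x):
    # squashes any real number into the open interval (0, 1)
    return 1 / (1 + np.exp(-x))

def softmax(x):
    # subtracting the max does not change the result but avoids overflow in exp
    exps = np.exp(x - np.max(x))
    return exps / exps.sum()

scores = np.array([2.0, -1.0, 0.5])   # made-up raw outputs for three classes

naive = scores / scores.sum()          # the "intuitive" ratio: contains a negative value
probs = softmax(scores)                # all positive, sums to 1

print(relu(scores))
print(sigmoid(0.0))
print(naive)
print(probs, probs.sum())
```

The naive ratio fails as soon as one score is negative, while the exponential makes every term positive before normalizing.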

num_pixels = 784

def create_model():
    model = Sequential()
    model.add(Dense(10, input_dim=num_pixels, activation='relu'))
    model.add(Dense(30, activation='relu'))
    model.add(Dense(10, activation='relu'))
    # The output layer has 10 units, one per digit 0-9, and uses the softmax activation function
    model.add(Dense(num_classes, activation='softmax'))
    # Newer Keras versions spell the lr argument learning_rate
    model.compile(Adam(lr=0.01), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

Instantiate the network and view its summary

model = create_model()
model.summary()  # summary() prints the report itself; wrapping it in print() adds a stray "None"
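
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The sketch below applies that rule to the layer sizes from create_model; dense_params is a helper name made up for this example.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input size followed by the sizes of the four Dense layers in create_model
layer_sizes = [784, 10, 30, 10, 10]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)
print(total)
```

Almost all of the parameters sit in the first layer, because it connects all 784 pixels to its 10 units.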

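Train the model and plot how the loss falls over the epochs. A convenient detail: the single parameter validation_split holds out part of the training data (0.1 means one tenth) as a validation set. That part takes no part in learning and is evaluated automatically after every epoch, which gives an estimate of how well the model handles data it has not seen.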

history = model.fit(X_train, y_train, validation_split=0.1, epochs = 10, batch_size = 200, verbose = 1, shuffle = 1)
 
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.legend(['loss', 'val_loss'])
plt.title('Loss')
plt.xlabel('epoch')
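
The numbers behind the fit call can be worked out with plain arithmetic. This sketch assumes the behavior described in the Keras documentation, where validation_split takes the held-out samples from the end of the arrays before any shuffling; the variable names are made up for the example.

```python
import math

n_samples = 60000          # size of the MNIST training set
validation_split = 0.1
batch_size = 200

# Assumed split rule: the first 90% train, the last 10% validate
split_at = int(n_samples * (1.0 - validation_split))
n_train = split_at
n_val = n_samples - split_at

# Number of weight updates (batches) per epoch
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)
```

Because the held-out part is taken from the end, data that is sorted by class should be shuffled before calling fit; the MNIST arrays loaded here are already mixed.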

Test the model

score = model.evaluate(X_test, y_test, verbose=0)
print(type(score))
print('Test score:', score[0])
print('Test accuracy:', score[1])
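
For this model, evaluate returns a list holding the loss followed by the accuracy metric. The sketch below recomputes both quantities in NumPy on two made-up predictions, to show what the numbers mean; it is an approximation for intuition, not Keras' internal code.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip to avoid log(0), then average the per-sample losses
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    # A prediction counts as correct when the most probable class matches the label
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_prob, axis=1)))

# Made-up one-hot labels and softmax outputs for two samples and three classes
y_true = np.array([[1, 0, 0], [0, 1, 0]], dtype=float)
y_prob = np.array([[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]])

loss = categorical_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(loss, acc)
```

The second sample is misclassified, so the accuracy is one half, and its low probability for the true class dominates the loss.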

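Now put the result to a real test: download an image from the web, convert it, and feed it to the trained model to get a prediction. The code is explained line by line.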

import requests
from PIL import Image
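# URL of the image
url = 'https://www.researchgate.net/profile/Jose_Sempere/publication/221258631/figure/fig1/AS:[email protected]/Handwritten-digit-2.png'
# Send a GET request to fetch the image
response = requests.get(url, stream=True)
# Open the image
img = Image.open(response.raw)
# Display the image with a gray colormap
plt.imshow(img, cmap=plt.get_cmap('gray'))
# Import the OpenCV library
import cv2
# Convert img to a numpy array, resize it to 28x28 and convert it to grayscale
img = np.asarray(img)
img = cv2.resize(img, (28, 28))
# PIL loads images in RGB channel order, hence RGB2GRAY rather than BGR2GRAY
img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
# Bitwise NOT inverts every pixel, which turns the written strokes white on a dark background, as in MNIST
img = cv2.bitwise_not(img)
plt.imshow(img, cmap=plt.get_cmap('gray'))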
 
img = img/255
img = img.reshape(1, 784)
 
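# predict_classes was removed from recent Keras releases; the argmax of predict() gives the same class index
prediction = np.argmax(model.predict(img), axis=-1)
print("predicted digit:", str(prediction))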
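
Two steps of this pipeline can be checked without OpenCV or a trained model. For 8-bit images, a bitwise NOT is the same as 255 minus the pixel value, and the predicted digit is the index of the largest probability. The pixel values and the probability vector below are made up for illustration.

```python
import numpy as np

# Made-up 2x2 grayscale patch: light background (255, 200), dark strokes (30, 0)
gray = np.array([[255, 200], [30, 0]], dtype=np.uint8)

inverted = np.invert(gray)   # bitwise NOT on uint8, NumPy's counterpart of cv2.bitwise_not
same = 255 - gray            # arithmetic view of the same operation

# Made-up softmax output for a single image with four classes
probs = np.array([[0.05, 0.10, 0.70, 0.15]])
predicted = np.argmax(probs, axis=-1)

print(inverted)
print(predicted)
```

Looking at the full probability vector from model.predict, rather than only the argmax, also shows how confident the model was about its answer.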

Prediction done! And it guessed wrong, ha. So much for a model that boasts ninety-something percent accuracy, hmm...
