Implementing a Convolutional Neural Network for MNIST Handwritten Digit Recognition with TensorFlow

0. Introduction to the MNIST dataset:

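MNIST is a dataset of 70,000 handwritten digit images (60,000 for training and 10,000 for testing), each of size 28*28*1, that is, 28 by 28 pixels with a single grayscale channel. It comes from the National Institute of Standards and Technology (NIST) in the United States. The training set consists of digits written by 250 different people, 50% of them high school students and 50% employees of the Census Bureau. The test set contains handwritten digits in the same proportion.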

[Figure: screenshot of sample images from the dataset]

1. Directory structure of the MNIST dataset:


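train-images* contains the training images

train-labels* contains the labels for the training images

t10k-images* contains the test images

t10k-labels* contains the correct labels for the test images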
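The four files are gzip-compressed IDX files, and each begins with a small big-endian header. As a rough sketch of how that header could be inspected without TensorFlow, the helper below (a hypothetical name, `parse_idx_header`, not part of the original code) decodes the magic number and the dimensions. The bytes it is run on here are synthetic and built only for illustration.

```python
import struct

# Magic numbers defined by the IDX format used for the MNIST files.
IMAGE_MAGIC = 2051  # 0x00000803: unsigned bytes, 3 dimensions
LABEL_MAGIC = 2049  # 0x00000801: unsigned bytes, 1 dimension


def parse_idx_header(raw):
    """Return (kind, dims) for the first bytes of a decompressed IDX file."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic == LABEL_MAGIC:
        return "labels", (count,)
    if magic == IMAGE_MAGIC:
        rows, cols = struct.unpack(">II", raw[8:16])
        return "images", (count, rows, cols)
    raise ValueError("unexpected magic number: %d" % magic)


# Synthetic header for illustration only (not read from a real file).
fake_header = struct.pack(">IIII", IMAGE_MAGIC, 60000, 28, 28)
print(parse_idx_header(fake_header))
```

With a real download, the first 16 bytes could be obtained with `gzip.open(path, 'rb').read(16)` and passed to the same function.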

2. Convolutional network structure used for this training:

[Figure: LeNet convolutional network structure]


[Figure: TensorBoard visualization of the model trained in this experiment]


This experiment uses a convolutional structure similar to the LeNet model:

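Layer	Type	Size
Layer 1	MNIST input layer	28*28*1
Layer 2	Conv1 (first convolution layer)	32*5*5 (32 filters of 5*5)
Layer 3	Pooling layer 1	Size: 2*2, stride: 2,2
Layer 4	Conv2 (second convolution layer)	64*5*5 (64 filters of 5*5)
Layer 5	Pooling layer 2	Size: 2*2, stride: 2,2
Layer 6	Fully connected layer 1	Size: 1024
Layer 7	Activation layer 1	Function: ReLU
Layer 8	Fully connected layer 2	Size: 10
Layer 9	Activation layer 2	Function: Softmax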

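3. Training environment:

1. CUDA 9.0 runtime

2. cuDNN 7.05 runtime

3. Nvidia GTX 960 graphics card, 4 GB of video memory

4. Ubuntu Linux, 64-bit

5. 8 GB of system memory

6. tensorflow-gpu framework (the code below uses the TensorFlow 1.x graph API)

7. TensorBoard for visualizing the training process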

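4. Overview of the computation

Convolution layers:

The input to the convolution layers is a 28*28 two-dimensional matrix. The first convolution layer has 32 filters (convolution kernels), so it produces 32 feature maps. The convolution uses zero padding, so the output keeps the 28*28 size. The pooling layers use max pooling. After the first convolution the data has shape 28*28*32, and after the first max pooling it has shape 14*14*32. The second convolution layer has 64 filters. It also uses zero padding so that the height and width are unchanged, giving 14*14*64, and the second max pooling reduces this to 7*7*64.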
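The shape arithmetic above can be checked with a few lines of plain Python. This is only a sketch: it assumes 'SAME' padding everywhere, for which the spatial output size is ceil(input size / stride), and the helper names `same_padding_output` and `trace_shapes` are made up for this illustration.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size for 'SAME' padding: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


def trace_shapes(height, width, layers):
    """layers: list of (name, stride, out_channels or None for pooling)."""
    channels = 1
    shapes = []
    for name, stride, out_channels in layers:
        height = same_padding_output(height, stride)
        width = same_padding_output(width, stride)
        if out_channels is not None:
            channels = out_channels  # a convolution sets the channel count
        shapes.append((name, (height, width, channels)))
    return shapes


LAYERS = [
    ("conv1", 1, 32),    # stride 1, 32 filters
    ("pool1", 2, None),  # 2*2 max pooling, stride 2
    ("conv2", 1, 64),    # stride 1, 64 filters
    ("pool2", 2, None),  # 2*2 max pooling, stride 2
]

for name, shape in trace_shapes(28, 28, LAYERS):
    print(name, shape)
```

The last line printed is the 7*7*64 shape that feeds the fully connected layers.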


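Fully connected layers:

The final output of the convolution layers has shape 7*7*64. The first fully connected layer has 1024 neurons, so the weights between the flattened input and this layer, weights_1, have shape (7*7*64) × 1024, and bias_1 has shape 1*1024. The fc_1 layer uses the ReLU activation function.

The second fully connected layer has 10 neurons, because the labels to be predicted fall into only 10 classes. For the fc_2 layer, weights_2 is 1024*10 and the bias is 1*10. Its activation function is Softmax, which outputs the probability that the input belongs to each class.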
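To get a feel for where the parameters sit, the weight and bias shapes can be turned into counts. The sketch below assumes the 5*5 kernels listed in the structure table; `conv_params` and `dense_params` are illustrative helper names, not functions from the original program.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus one bias per output neuron."""
    return n_in * n_out + n_out


counts = [
    ("conv1", conv_params(5, 5, 1, 32)),
    ("conv2", conv_params(5, 5, 32, 64)),
    ("fc1", dense_params(7 * 7 * 64, 1024)),
    ("fc2", dense_params(1024, 10)),
]

for name, count in counts:
    print(name, count)
print("total", sum(count for _, count in counts))
```

Under these assumptions almost all of the roughly 3.27 million parameters belong to the first fully connected layer.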

5. Code

# -*- coding: utf-8 -*-
"""
    Description: a convolutional neural network program
"""
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
batch_size = 100
n_batch = mnist.train.num_examples // batch_size

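This imports the TensorFlow framework and the MNIST dataset. one_hot converts each training label into a vector, e.g. 9: 0000000001; 2: 0010000000

batch_size: the number of examples in each batch (the data is trained in batches)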
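As a small illustration of the one-hot encoding, the conversion can be written by hand. `one_hot` and `from_one_hot` are hypothetical helpers for this sketch; the original program relies on `one_hot=True` in `read_data_sets` instead.

```python
def one_hot(label, num_classes=10):
    """Return a list with a 1 at index `label` and 0 everywhere else."""
    vec = [0] * num_classes
    vec[label] = 1
    return vec


def from_one_hot(vec):
    """Recover the class index from a one-hot vector."""
    return vec.index(max(vec))


for digit in (9, 2):
    encoded = one_hot(digit)
    print(digit, "".join(str(bit) for bit in encoded))
```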

# Initialize weights
def weight_variable(shape, name):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial, name=name)

# Initialize biases
def bias_variable(shape, name):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial, name=name)

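These two functions initialize the weights and the biases respectively. shape is the shape (dimensions) of the weight tensor w or of the bias being initialized. The weights are drawn from a truncated normal distribution with standard deviation 0.1, and the biases start at the constant 0.1.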
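`tf.truncated_normal` discards any sample that lies more than two standard deviations from the mean and draws it again. A minimal plain-Python sketch of that behaviour, assuming a mean of 0 and using the hypothetical name `truncated_normal_sample`, might look like this:

```python
import random


def truncated_normal_sample(n, stddev=0.1, seed=0):
    """Draw n values from N(0, stddev^2), redrawing any beyond 2 * stddev."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        candidate = rng.gauss(0.0, stddev)
        if abs(candidate) <= 2.0 * stddev:
            values.append(candidate)
    return values


weights = truncated_normal_sample(1000)
print(min(weights), max(weights))
```

The practical effect is that no initial weight is larger than 0.2 in magnitude, which avoids a few extreme starting values.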

# Convolution layer
def conv2d(x, w):
    return tf.nn.conv2d(x, w, strides=[1,1,1,1], padding='SAME')

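x: the input to be convolved (here, a batch of 28*28*1 images)

w: the convolution kernel weights, a four-dimensional tensor: [kernel height, kernel width, number of input channels, number of kernels]

strides: the stride along each dimension of the input, a one-dimensional vector [1, height stride, width stride, 1]

padding: whether to pad or not (zero padding, 'SAME', is used here so that the image data is still 28*28 after convolution)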
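To make the effect of zero padding concrete, here is a deliberately naive single-channel version in plain Python. It is a sketch under simplifying assumptions (one input channel, one kernel, stride 1, odd kernel size), and `conv2d_same` is an illustrative name rather than anything from TensorFlow. Like `tf.nn.conv2d`, it does not flip the kernel.

```python
def conv2d_same(image, kernel):
    """Stride-1 convolution with zero padding so the output matches the input size."""
    height, width = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    pad_h, pad_w = k_h // 2, k_w // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for di in range(k_h):
                for dj in range(k_w):
                    y = i + di - pad_h
                    x = j + dj - pad_w
                    # Positions outside the image count as zeros (the padding).
                    if 0 <= y < height and 0 <= x < width:
                        total += image[y][x] * kernel[di][dj]
            out[i][j] = total
    return out


image = [[1.0] * 4 for _ in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]
result = conv2d_same(image, kernel)
print(len(result), len(result[0]))
```

The output has the same height and width as the input. Values near the border are smaller because part of the kernel overlaps the zero padding.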

# Pooling layer
def max_pool_2x2(h_conv):
    return tf.nn.max_pool(h_conv, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

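A 2*2 pooling window is used with a stride of 2 in both the x and y directions, so each pooling step halves the height and the width of the image.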
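The halving is easy to see on a tiny example. The sketch below assumes a single channel whose height and width are both even, so no padding is needed; `max_pool_2x2_plain` is a hypothetical stand-in for the TensorFlow call.

```python
def max_pool_2x2_plain(feature_map):
    """2*2 max pooling with stride 2 on one channel with even height and width."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            )
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(max_pool_2x2_plain(grid))  # [[6, 8], [14, 16]]
```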

# Name scopes
with tf.name_scope('input'):
    x = tf.placeholder(tf.float32, [None, 784], name='x-input')
    y = tf.placeholder(tf.float32, [None, 10], name='y-input')

    with tf.name_scope('x_image'):
        x_image = tf.reshape(x, [-1, 28, 28, 1], name='x_image')


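Input layer data: for each example, x is a vector of length 784 and y is a vector of length 10. The leading None dimension is the batch size.

x_image: x reshaped into the 28*28*1 input expected by the convolution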
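The reshape is row-major: pixel k of the flat vector ends up at row k // 28 and column k % 28. A plain-Python sketch of the same layout for a single image, with the channel dimension left out and `reshape_to_image` as a made-up helper name:

```python
def reshape_to_image(flat, height=28, width=28):
    """Row-major reshape of a flat pixel list into `height` rows of `width` values."""
    if len(flat) != height * width:
        raise ValueError("expected %d values, got %d" % (height * width, len(flat)))
    return [flat[r * width:(r + 1) * width] for r in range(height)]


flat = list(range(784))  # stand-in pixel values 0..783
image = reshape_to_image(flat)
print(len(image), len(image[0]))  # 28 28
print(image[1][0])                # 28, the first pixel of the second row
```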


with tf.name_scope('conv1'):
    with tf.name_scope('w_conv1'):
        w_conv1 = weight_variable([5, 5, 1, 32], 'w_conv1')
    with tf.name_scope('b_conv1'):
        b_conv1 = bias_variable([32], name='b_conv1')
    with tf.name_scope('conv2d_1'):
        conv2d_1 = conv2d(x_image, w_conv1) + b_conv1
    with tf.name_scope('relu_1'):
        h_conv1 = tf.nn.relu(conv2d_1)
    with tf.name_scope('pool_1'):
        h_pool1 = max_pool_2x2(h_conv1)

First convolution layer: filter size 5*5, number of filters 32, activation function ReLU, pooling: max pooling


with tf.name_scope('conv2'):
    with tf.name_scope('w_conv2'):
        w_conv2 = weight_variable([5, 5, 32, 64], 'w_conv2')
    with tf.name_scope('b_conv2'):
        b_conv2 = bias_variable([64], name='b_conv2')
    with tf.name_scope('conv2d_2'):
        conv2d_2 = conv2d(h_pool1, w_conv2) + b_conv2
    with tf.name_scope('relu_2'):
        h_conv2 = tf.nn.relu(conv2d_2)
    with tf.name_scope('pool_2'):
        h_pool2 = max_pool_2x2(h_conv2)


Second convolution layer: filter size 5*5, number of filters 64, activation function ReLU

Pooling method: max pooling


# Fully connected layer 1
with tf.name_scope('fc1'):
    with tf.name_scope('w_fc1'):
        w_fc1 = weight_variable([7*7*64, 1024], 'w_fc1')
    with tf.name_scope('b_fc1'):
        b_fc1 = bias_variable([1024], 'b_fc1')
    # Flatten the output of pooling layer 2 to one dimension per example
    with tf.name_scope('h_pool2_flat'):
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64], name='h_pool2_flat')
    with tf.name_scope('fc1_result'):
        fc1_result1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)

    # Dropout to prevent overfitting
    with tf.name_scope('keep_prob1'):
        keep_prob = tf.placeholder(tf.float32, name='keep_prob1')
    with tf.name_scope('h_fc1_drop1'):
        h_fc1_drop1 = tf.nn.dropout(fc1_result1, keep_prob=keep_prob, name='h_fc1_drop1')

Fully connected layer 1: 1024 neurons, w_fc1: [7*7*64, 1024], b_fc1: [1024], activation function: ReLU, followed by dropout
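`tf.nn.dropout` keeps each activation with probability keep_prob and scales the kept ones by 1 / keep_prob, so the expected sum is unchanged. Below is a minimal sketch of that rule in plain Python. The function name `dropout` and the explicit `rng` argument are choices made for this illustration.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and rescale it, else zero it."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10

print(dropout(activations, 0.5, rng))  # training: a mix of 0.0 and 2.0
print(dropout(activations, 1.0, rng))  # evaluation: values pass through unchanged
```

This is why the training step feeds keep_prob 0.5 while the accuracy and summary evaluations feed 1.0.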


# Fully connected layer 2
with tf.name_scope('fc2'):
    with tf.name_scope('w_fc2'):
        w_fc2 = weight_variable([1024, 10], 'w_fc2')
    with tf.name_scope('b_fc2'):
        b_fc2 = bias_variable([10], 'b_fc2')
    with tf.name_scope('fc1_result2'):
        # Keep the raw scores (logits) separate from the softmax probabilities
        logits = tf.matmul(h_fc1_drop1, w_fc2) + b_fc2
        fc1_result2 = tf.nn.softmax(logits)

Fully connected layer 2: 10 neurons, w_fc2: [1024, 10], b_fc2: [10], activation function: Softmax. It returns the probability of every class.

# Cross-entropy cost function
with tf.name_scope('cross_entropy'):
    # softmax_cross_entropy_with_logits applies softmax itself, so pass the logits
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits),
                                   name='cross_entropy')
    tf.summary.scalar('cross_entropy', cross_entropy)

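Cross-entropy:

Discrete case: $H(p,q) = -\sum_{x} p(x) \log q(x)$

Continuous case: $H(p,q) = -\int p(x) \log q(x)\,dx$

Note: when using cross-entropy, take care to handle vanishing gradients and infinite values, for example when a predicted probability reaches 0 and its logarithm diverges.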
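One common way to avoid the infinities mentioned in the note is to compute the loss directly from the raw scores with the log-sum-exp trick, rather than taking the logarithm of a probability that may have underflowed to 0. The plain-Python sketch below handles a single example; `softmax_cross_entropy` is an illustrative name, and this is one possible formulation rather than the exact code TensorFlow runs.

```python
import math


def softmax_cross_entropy(logits, one_hot_label):
    """H(p, q) = -sum_x p(x) * log q(x), with q = softmax(logits), computed stably."""
    largest = max(logits)
    # log(sum(exp(z))) computed after subtracting the maximum to avoid overflow.
    log_sum = largest + math.log(sum(math.exp(z - largest) for z in logits))
    return -sum(p * (z - log_sum) for z, p in zip(logits, one_hot_label))


print(softmax_cross_entropy([0.0, 0.0], [1, 0]))     # log(2), two equally likely classes
print(softmax_cross_entropy([1000.0, 0.0], [0, 1]))  # large but finite
```

Taking `math.log` of the second example's softmax output would fail, because the probability of the true class underflows to exactly 0.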

       
# Optimizer
with tf.name_scope('train'):
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

Optimization algorithm: Adam, with a learning rate of 1e-4
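For readers who have not met Adam before, a single update for one scalar parameter is sketched below. It follows the update rule as published in the Adam paper, with the default beta1, beta2 and epsilon of `tf.train.AdamOptimizer`. TensorFlow's own implementation rearranges the formula slightly, so tiny numerical differences are possible, and `adam_step` is a name invented for this example.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the updated (param, m, v) after step t (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)               # bias correction for the zero start
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


param, m, v = 1.0, 0.0, 0.0
for step in range(1, 4):
    param, m, v = adam_step(param, grad=1.0, m=m, v=v, t=step)
    print(step, param)
```

With a constant gradient, each step moves the parameter by roughly the learning rate, independent of the gradient's scale.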

 
with tf.name_scope('accuracy'):
    with tf.name_scope('correct_prediction'):
        correct_prediction = tf.equal(tf.argmax(fc1_result2, 1), tf.argmax(y, 1))
    with tf.name_scope('accuracy'):
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.summary.scalar('accuracy', accuracy)
# Merge all summaries
merged = tf.summary.merge_all()
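The accuracy node compares the index of the largest predicted probability with the index of the 1 in the one-hot label, then averages the matches. The same computation on ordinary lists, as a sketch with made-up helper names `argmax` and `batch_accuracy`:

```python
def argmax(values):
    """Index of the largest element (the first one if there are ties)."""
    return max(range(len(values)), key=values.__getitem__)


def batch_accuracy(predictions, labels):
    """Fraction of rows where the predicted class equals the labelled class."""
    hits = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(hits) / float(len(hits))


# Three toy examples with three classes each (illustrative numbers only).
predictions = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
labels = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]
print(batch_accuracy(predictions, labels))  # two of the three rows match
```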

Now train the network:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter('/home/wangheng/path/train', sess.graph)
    test_writer = tf.summary.FileWriter('/home/wangheng/path/test', sess.graph)

    for i in range(3001):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        sess.run(train_step, feed_dict={x: batch_xs, y:batch_ys, keep_prob:0.5})
        # Record the summaries computed on the training batch
        summary = sess.run(merged, feed_dict={x: batch_xs, y:batch_ys, keep_prob:1.0})
        train_writer.add_summary(summary, i)

        # Record the summaries computed on a test batch
        batch_xs,  batch_ys = mnist.test.next_batch(batch_size)
        summary = sess.run(merged, feed_dict={x:batch_xs, y:batch_ys, keep_prob:1.0})
        test_writer.add_summary(summary, i)

        if i% 100 ==0:
            test_acc = sess.run(accuracy, feed_dict={x:mnist.test.images, y:mnist.test.labels, keep_prob:1.0})
            train_acc = sess.run(accuracy, feed_dict={x:mnist.train.images[:10000], y:mnist.train.labels[:10000], keep_prob:1.0})
            print('Iter'+str(i)+',Testing Accuracy='+str(test_acc)+',Train Accuracy='+str(train_acc))

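6. Results

Training ran for 3000 iterations and took about 20 minutes.

The accuracy at the start was not very high, only a little over 80 percent.

After 3000 training iterations, the accuracy reached a maximum of 98.16%, which is already a very good result.

[Figure: accuracy curve]

[Figure: error rate curve]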
