【Hung-yi Lee ML Notes】14 Keras 2.0 and 15 Keras Demo

Steps of a deep learning experiment:

1 Define the input space. Convert each sample, such as an image, into a vector; multiple image vectors form a matrix. For a typical image, every pixel is one dimension, so a 28*28 image is flattened into a 28*28 = 784-dimensional vector (as sketched after this list).

2 Define the output. For binary or multi-class classification, the output is a vector with one dimension per class.

3 Decide on the hidden layers (how many layers, and how many neurons in each).
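
A small sketch of steps 1 and 2, using a random NumPy array as a stand-in for an image and the digit 3 as an example label (both are made up for illustration):

import numpy as np

image = np.random.rand(28, 28)      # stand-in for one 28*28 grayscale image
x = image.reshape(28 * 28)          # step 1: flatten, one dimension per pixel -> 784-dim vector

y = np.zeros(10)                    # step 2: 10-class output vector
y[3] = 1                            # one-hot encoding of the label "3"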



Keras:


Declare the model: model = Sequential()


Add a layer of neurons: model.add(Dense(...))  // Dense is a fully connected layer; Conv2D can be used instead for a convolutional layer. [The layer given input_dim is generally the input layer.]

    input_dim is the dimensionality of the input vector; units is how many neurons the layer has (its output size). activation selects the activation function used by the layer's neurons, e.g. relu, sigmoid, softmax (all outputs lie between 0 and 1 and sum to 1, so they can be read as probabilities), softplus, etc. You can also define your own activation function.

    Layers added afterwards no longer need input_dim, because a layer's input dimension is simply the units of the previous layer; only its own units and activation need to be given. [Hidden layers: no input_dim, just units and activation in order.]

    For the last layer (the output layer), units is determined by the dimensionality of the desired output space y.
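
A minimal sketch of the three kinds of layers described above (the 500-unit hidden size is just an illustrative choice, not the value used in the demo below):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(input_dim=28 * 28, units=500, activation='relu'))  # input layer: input_dim required
model.add(Dense(units=500, activation='relu'))                     # hidden layer: input size inferred from previous units
model.add(Dense(units=10, activation='softmax'))                   # output layer: units = dimensionality of y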



Configuration

  Use model.compile to specify the loss function (loss), the optimizer (the optimization method, based on gradient descent), and finally metrics, the evaluation metric(s) to report.
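
A minimal sketch of the call, assuming the model above; the loss and learning rate are illustrative choices:

from keras.optimizers import SGD

model.compile(loss='categorical_crossentropy',  # loss function
              optimizer=SGD(lr=0.1),            # gradient-descent-based optimizer
              metrics=['accuracy'])             # metric reported during training/evaluation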


Training: model.fit(x_train, y_train, batch_size, epochs). Both x_train and y_train are NumPy arrays. x_train is 2-D: rows are samples, columns are features. y_train (the labels) is also 2-D: rows are samples; for an n-class problem each row is an n-dimensional one-hot vector (class indices start from 0).
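
A minimal sketch of the shapes involved, assuming 10000 training images flattened to 784 features and one-hot labels over 10 classes:

# x_train: NumPy array of shape (10000, 784)  -- rows are samples, columns are features
# y_train: NumPy array of shape (10000, 10)   -- rows are samples, each row a 10-dim one-hot vector
model.fit(x_train, y_train, batch_size=100, epochs=20)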



Saving and loading the model: model.save / load_model
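
A minimal sketch of saving and reloading a trained model; the file name is just an example:

from keras.models import load_model

model.save('mnist_model.h5')            # write architecture, weights, and optimizer state to an HDF5 file
model = load_model('mnist_model.h5')    # restore it later without rebuilding or retraining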

Using the model

One use is evaluation (on the training or test set): model.evaluate(x_test, y_test) returns a list whose first element is the loss on that data and whose second element is the accuracy (for classification).

The other is prediction: model.predict(x_test) returns the model's outputs on new samples.
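
A minimal sketch of both calls, assuming x_test/y_test are prepared the same way as the training data and x_new stands for new, unlabeled samples (a hypothetical variable):

score = model.evaluate(x_test, y_test)  # score[0] = loss on this set, score[1] = accuracy
predictions = model.predict(x_new)      # one row of class probabilities per new sample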



import keras
import os
import numpy as np
import struct
import gzip
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers import Conv2D, MaxPooling2D, Flatten
from keras.optimizers import SGD, Adam
from keras.utils import np_utils
from keras.datasets import mnist


os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # suppress the noisy TensorFlow warning: "Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2"
trainfile_X = os.getcwd() + '/dataset/MNIST/train-images-idx3-ubyte.gz'
trainfile_y = os.getcwd() + '/dataset/MNIST/train-labels-idx1-ubyte.gz'
testfile_X = os.getcwd() + '/dataset/MNIST/t10k-images-idx3-ubyte.gz'
testfile_y = os.getcwd() + '/dataset/MNIST/t10k-labels-idx1-ubyte.gz'


def read_data(image_url, label_url):
    with gzip.open(label_url) as flbl:
        magic, num = struct.unpack(">II", flbl.read(8))            # IDX header: magic number, item count
        label = np.frombuffer(flbl.read(), dtype=np.int8)          # frombuffer replaces the deprecated np.fromstring
    with gzip.open(image_url, 'rb') as fimg:
        magic, num, rows, cols = struct.unpack(">IIII", fimg.read(16))
        image = np.frombuffer(fimg.read(), dtype=np.uint8).reshape(len(label), rows, cols)
    return (image, label)


def load_data():
    # (x_train, y_train),(x_test, y_test) = mnist.load_data()
    (x_train, y_train) = read_data(trainfile_X, trainfile_y)
    (x_test, y_test) = read_data(testfile_X, testfile_y)
    # print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
    number = 10000
    x_train = x_train[0:number]  # keep only the first 10000 samples
    y_train = y_train[0:number]
    x_train = x_train.reshape(number, 28*28)  # flatten each 28x28 image into a 784-dim row vector
    x_test = x_test.reshape(x_test.shape[0], 28*28)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    # print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
    y_train = np_utils.to_categorical(y_train, 10)  # one-hot encode labels into 10-dim vectors
    y_test = np_utils.to_categorical(y_test, 10)
    x_train = x_train / 255  # scale pixel values from [0, 255] to [0, 1]
    x_test = x_test / 255
    return (x_train, y_train), (x_test, y_test)

(x_train, y_train),(x_test, y_test) = load_data()
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)

# the Sequential model
model = Sequential()

# Stacking layers is as easy as .add():
model.add(Dense(input_dim=28*28, units=688, activation='relu'))  # input layer: specify the input dimensionality
model.add(Dense(units=688, activation='sigmoid'))  # hidden layer: input size defaults to the previous layer's units
# for i in range(11):
#     model.add(Dense(units=688, activation='sigmoid'))  # hidden layer: input size defaults to the previous layer's units
model.add(Dense(units=10, activation='softmax'))  # output layer: units must match the dimensionality of the label vectors

# configure its learning process with .compile()
model.compile(loss='mse', optimizer=SGD(lr=0.1), metrics=['accuracy'])  # SGD(lr=0.01, momentum=0.9, nesterov=True)
# model.compile(loss='mse', optimizer='sgd', metrics=['accuracy'])  # alternative: pass the optimizer by name

# train, x_train and y_train are Numpy arrays --just like in the Scikit-Learn API.
model.fit(x_train, y_train, batch_size=100, epochs=20)  # fit can also be given a validation set (validation_data=...) to evaluate during training

# evaluate the model on the test set
loss_and_metrics = model.evaluate(x_test, y_test, batch_size=100)
print("Test Acc", loss_and_metrics[1])

# Or generate predictions on new data:
classes = model.predict(x_test, batch_size=100)
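
model.predict returns the raw softmax probabilities for each test image; a small follow-up (not part of the original demo) to turn them into class labels:

predicted_labels = np.argmax(classes, axis=1)  # index of the most probable digit per image
print(predicted_labels[:10])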