How to use batch norm when restoring a model?

Problem description:

I have a small problem with using batch normalization when restoring a model in TensorFlow.

Below is my batch norm function (from here):

def _batch_normalization(self, input_tensor, is_training, batch_norm_epsilon, decay=0.999):
    """Batch normalization for dense nets.

    Args:
        input_tensor: `tensor`, the input tensor to be normalized.
        is_training: `bool`, if True update the mean/variance using a moving average,
            else use the stored mean/variance.
        batch_norm_epsilon: `float`, epsilon param for batch normalization.
        decay: `float`, decay param for the moving-average update, default is 0.999.

    Returns:
        The normalized tensor.
    """
    # batch normalization operates along the channels dimension.
    input_shape_channels = int(input_tensor.get_shape()[-1])

    # scale and beta are used in the formula: scale * (x - E(x)) / sqrt(var(x)) + beta
    scale = tf.Variable(tf.ones([input_shape_channels]))
    beta = tf.Variable(tf.zeros([input_shape_channels]))

    # global mean and var are the moving-averaged mean and var.
    global_mean = tf.Variable(tf.zeros([input_shape_channels]), trainable=False)
    global_var = tf.Variable(tf.ones([input_shape_channels]), trainable=False)

    # if training, update the mean and var; else use the trained mean/var directly.
    if is_training:
        # batch norm over all axes except the channel axis.
        axis = list(range(len(input_tensor.get_shape()) - 1))
        batch_mean, batch_var = tf.nn.moments(input_tensor, axes=axis)

        # update the moving mean and var.
        train_mean = tf.assign(global_mean, global_mean * decay + batch_mean * (1 - decay))
        train_var = tf.assign(global_var, global_var * decay + batch_var * (1 - decay))
        with tf.control_dependencies([train_mean, train_var]):
            return tf.nn.batch_normalization(input_tensor,
                                             batch_mean, batch_var, beta, scale,
                                             batch_norm_epsilon)
    else:
        return tf.nn.batch_normalization(input_tensor,
                                         global_mean, global_var, beta, scale,
                                         batch_norm_epsilon)

I train the model and save it with tf.train.Saver(). Below is the test code:

def inference(self, images_for_predict):
    """Load the pre-trained model and do the inference.

    Args:
        images_for_predict: `tensor`, images to run the pre-trained model on.

    Returns:
        The predicted labels.
    """
    tf.reset_default_graph()
    images, labels, _, _, prediction, accuracy, saver = self._build_graph(1, False)

    predictions = []
    correct = 0
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # saver = tf.train.import_meta_graph('./models/dense_nets_model/dense_nets.ckpt.meta')
        # saver.restore(sess, tf.train.latest_checkpoint('./models/dense_nets_model/'))
        saver.restore(sess, './models/dense_nets_model/dense_nets.ckpt')
        # build the argmax op once, outside the loop, so each iteration
        # doesn't add new ops to the graph.
        predicted_label = tf.argmax(prediction, 1)
        for i in range(100):
            pred, corr = sess.run(
                [predicted_label, accuracy],
                feed_dict={
                    images: [images_for_predict.images[i]],
                    labels: [images_for_predict.labels[i]]})
            correct += corr
            predictions.append(pred[0])
    print("PREDICTIONS:", predictions)
    print("ACCURACY:", correct / 100)

But the predictions are always bad, like this:

('PREDICTIONS:', [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]) 

('ACCURACY:', 0.080000000000000002) 

Some hints: images_for_predict = mnist.test, and the self._build_graph method takes two parameters: batch_size and is_training.

Can anyone help me?

Looking at your batch norm implementation: when you load your model, you need to keep the graph built by images, labels, _, _, prediction, accuracy, saver = self._build_graph(1, False) and load the weight values from the checkpoint, but not the meta graph. I think saver.restore(sess, './models/dense_nets_model/dense_nets.ckpt') currently restores the meta graph too (sorry if I'm wrong), so you need to restore only the "data" part of it.
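To make the distinction concrete, here is a minimal sketch of the two restore paths, reusing the question's checkpoint paths and _build_graph signature; you want the first one, where the graph is the one built for inference:

# Option A: rebuild the inference graph in code, then load only the
# checkpoint values into it (graph built with batch_size=1, is_training=False).
tf.reset_default_graph()
images, labels, _, _, prediction, accuracy, saver = self._build_graph(1, False)
with tf.Session() as sess:
    saver.restore(sess, './models/dense_nets_model/dense_nets.ckpt')

# Option B: import the graph that was saved during training from the
# .meta file, then load the values into that graph instead.
tf.reset_default_graph()
saver = tf.train.import_meta_graph('./models/dense_nets_model/dense_nets.ckpt.meta')
with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint('./models/dense_nets_model/'))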

Otherwise, you are just using the graph built for training, in which the mean and variance used in batch norm are the ones obtained from the current batch. But at test time your batch size is 1, so normalizing by the batch's mean and variance always drives the data to 0, hence the constant output.
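You can check this with a few lines of NumPy: with a batch of one, the batch mean equals the input itself and the batch variance is zero, so the normalized output collapses to beta no matter what the input is:

import numpy as np

x = np.array([[3.2, -1.5, 0.7]])   # a "batch" holding a single example
mean = x.mean(axis=0)              # equals x itself
var = x.var(axis=0)                # all zeros
eps, scale, beta = 0.001, 1.0, 0.0
y = scale * (x - mean) / np.sqrt(var + eps) + beta
print(y)                           # [[0. 0. 0.]]: every input maps to beta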

In any case, I'd suggest using tf.layers.batch_normalization instead, with an is_training placeholder that you feed to your network...


Thanks! But if my test batch size is 1, how can it fit a model that was trained with a batch size larger than 1? – Yang


Hi gdelab, I changed my batch_norm to tf.layers.batch_normalization(input_tensor, training=is_training), but it doesn't seem to work. I've updated the GitHub post; could you help me? – Yang

After trying a lot of approaches, I solved this problem. Here is what I did.

First of all, thanks to @gdelab. I used tf.layers.batch_normalization instead, so my batch norm function now looks like this:

def _batch_normalization(self, input_tensor, is_training): 
    return tf.layers.batch_normalization(input_tensor, training=is_training) 

The is_training param acts like a placeholder: is_training = tf.placeholder(tf.bool).

When building the graph, remember to wrap your optimizer with this code:

extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
# make the train step depend on the batch norm update ops, so the
# moving mean/variance actually get updated during training.
with tf.control_dependencies(extra_update_ops):
    train_step = tf.train.AdamOptimizer(self.learning_rate).minimize(cross_entropy)

That's because the update ops that tf.layers.batch_normalization adds for the moving mean and variance are not automatically added as dependencies of the train op, so if you don't do anything extra they never run.
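With the is_training placeholder and the wrapped train_step in place, the feeds look roughly like this (batch_images, batch_labels, test_images and test_labels just stand in for your own data):

# during training: normalize with batch statistics and update the moving averages
sess.run(train_step, feed_dict={images: batch_images,
                                labels: batch_labels,
                                is_training: True})

# at test time: normalize with the stored moving mean/variance instead
pred = sess.run(prediction, feed_dict={images: test_images,
                                       labels: test_labels,
                                       is_training: False})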

Then start training the network, and after training is finished, save the model with code like this:

saver = tf.train.Saver(var_list=tf.global_variables()) 
savepath = saver.save(sess, 'here_is_your_personal_model_path') 

Note that the var_list=tf.global_variables() param makes sure TensorFlow saves all the params, including the global mean/var, which are set to be non-trainable.
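To double-check, you can list the variables: the moving statistics created by tf.layers.batch_normalization show up in tf.global_variables() but not in tf.trainable_variables() (the exact names depend on your variable scopes):

# the moving statistics are created as non-trainable variables, e.g.
# '.../batch_normalization/moving_mean:0' and '.../moving_variance:0'
print([v.name for v in tf.global_variables() if 'moving_' in v.name])
print([v.name for v in tf.trainable_variables() if 'moving_' in v.name])  # []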

When restoring and testing the model, do it like this:

# build the graph in the same way as for training:
images, labels, _, _, prediction, accuracy, saver = self._build_graph(1, False) 
saver = tf.train.Saver() 
saver.restore(sess, 'here_is_your_personal_model_path') 

Now you can test the model. Hope it helps, thanks!