Can someone briefly explain the `_linear` function from the TensorFlow RNN code?

Problem description:

Here is the core function used by the TensorFlow RNN cell implementation.

Linear map:

def _linear(args, output_size, bias, bias_start=0.0, scope=None):
  """Linear map: sum_i(args[i] * W[i]), where W[i] is a variable.

  Args:
    args: a 2D Tensor or a list of 2D, batch x n, Tensors.
    output_size: int, second dimension of W[i].
    bias: boolean, whether to add a bias term or not.
    bias_start: starting value to initialize the bias; 0 by default.
    scope: VariableScope for the created subgraph; defaults to "Linear".

  Returns:
    A 2D Tensor with shape [batch x output_size] equal to
    sum_i(args[i] * W[i]), where W[i]s are newly created matrices.

  Raises:
    ValueError: if some of the arguments has unspecified or wrong shape.
  """
  if args is None or (nest.is_sequence(args) and not args):
    raise ValueError("`args` must be specified")
  if not nest.is_sequence(args):
    args = [args]

  # Calculate the total size of arguments on dimension 1.
  total_arg_size = 0
  shapes = [a.get_shape().as_list() for a in args]
  for shape in shapes:
    if len(shape) != 2:
      raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
    if not shape[1]:
      raise ValueError("Linear expects shape[1] of arguments: %s" % str(shapes))
    else:
      total_arg_size += shape[1]

  # Now the computation.
  with vs.variable_scope(scope or "Linear"):
    matrix = vs.get_variable("Matrix", [total_arg_size, output_size])
    if len(args) == 1:
      res = math_ops.matmul(args[0], matrix)
    else:
      res = math_ops.matmul(array_ops.concat(1, args), matrix)
    if not bias:
      return res
    bias_term = vs.get_variable(
        "Bias", [output_size],
        initializer=init_ops.constant_initializer(bias_start))
    return res + bias_term

As far as I understand, args contains the values that we multiply (dot product) with the weight matrix W[i], then add the bias. What I can't figure out:

When we call vs.get_variable("Matrix", [total_arg_size, output_size]) without the reuse flag set, don't we create a new, randomly initialized weight matrix on every call? I think training would fail in that case. I can't find scope.reuse_variables() or reuse=True anywhere in the rnn_cell.py code, and I can't find where the "Matrix" variable (the weights) gets updated or saved. So it looks like the weights would be random every time. How does this all work? Do we really use a random weight matrix on every call? Maybe someone can explain how _linear works?

_linear computes sum_i(args[i] * W[i]) + bias, where W is a list of matrix variables of size n x output_size and bias is a variable of size output_size.

TensorFlow uses left-multiplication notation: a row vector is multiplied on the left of the matrix. So in _linear, args is a list of row vectors (stacked into batch x n matrices).
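To make the "sum of per-argument products" description concrete, here is a small NumPy sketch (illustrative only, not TensorFlow code; all names are made up) showing that concatenating the inputs and multiplying by one stacked matrix, which is what _linear does with its single "Matrix" variable, gives the same result as sum_i(args[i] * W[i]):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n1, n2, output_size = 4, 3, 5, 2

# Two 2-D "args" tensors of shape [batch, n_i], as _linear expects.
x1 = rng.standard_normal((batch, n1))
x2 = rng.standard_normal((batch, n2))

# Per-argument weight matrices W[i] of shape [n_i, output_size].
W1 = rng.standard_normal((n1, output_size))
W2 = rng.standard_normal((n2, output_size))

# sum_i(args[i] @ W[i]) ...
summed = x1 @ W1 + x2 @ W2

# ... equals concatenating the inputs along dim 1 and stacking the
# weights along dim 0 -- one matmul against one big "Matrix".
concat = np.concatenate([x1, x2], axis=1) @ np.concatenate([W1, W2], axis=0)

assert np.allclose(summed, concat)
```

This is why _linear can allocate a single variable of shape [total_arg_size, output_size] instead of one matrix per argument.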

Where do the matrix W and the bias b come from? They are fetched from the current variable scope, because W and b are variable tensors: they hold learned values that persist across calls.

Check this

The tf.variable_scope documentation says that if the reuse argument of variable_scope is not specified, it defaults to None. That is exactly the case in the code you posted: with vs.variable_scope(scope or "Linear").

When a scope's reuse setting is unspecified (None), it inherits the reuse setting of its parent scope. See here

In your particular case, the weight variables of the RNN cell do not need reuse=True in _linear's variable scope, because the rnn function (defined in rnn.py) that calls the RNN cell sets its own variable scope to reuse, so the variables are shared across time steps.
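A minimal Python sketch of these get_variable semantics may help (this is a toy dict-based store for illustration, not TensorFlow's actual implementation): the first call creates and stores the variable; later calls with reuse enabled return the stored one instead of creating a new random matrix.

```python
import numpy as np

# Hypothetical toy variable store mimicking tf.get_variable behavior.
_store = {}

def get_variable(name, shape, reuse=False):
    if reuse:
        if name not in _store:
            raise ValueError("Variable %s does not exist" % name)
        return _store[name]  # return the already-created variable
    if name in _store:
        raise ValueError("Variable %s already exists" % name)
    _store[name] = np.random.standard_normal(shape)  # created once
    return _store[name]

# First call (time step 0): creates "Linear/Matrix".
w0 = get_variable("Linear/Matrix", (3, 2))
# Later calls (time steps > 0) with reuse on: same object, same weights.
w1 = get_variable("Linear/Matrix", (3, 2), reuse=True)
assert w0 is w1
```

So the weights are only randomly initialized once; every subsequent lookup under a reusing scope returns the same learned tensor.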

See the excerpt from rnn.py below:

def rnn(cell, inputs, initial_state=None, dtype=None,
        sequence_length=None, scope=None):

    ....

    for time, input_ in enumerate(inputs):
      if time > 0: varscope.reuse_variables()
      # pylint: disable=cell-var-from-loop
      call_cell = lambda: cell(input_, state)
      # pylint: enable=cell-var-from-loop
      if sequence_length is not None:
        (output, state) = _rnn_step(
            time=time,
            sequence_length=sequence_length,
            min_sequence_length=min_sequence_length,
            max_sequence_length=max_sequence_length,
            zero_output=zero_output,
            state=state,
            call_cell=call_cell,
            state_size=cell.state_size)
      else:
        (output, state) = call_cell()
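The effect of varscope.reuse_variables() at time > 0 is that every time step applies the same weights. A NumPy sketch of the unrolled loop (a simple tanh cell used purely to illustrate the weight sharing; this is not the actual rnn.py code):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, input_size, state_size = 2, 3, 4

# Created once -- conceptually at time == 0, before reuse kicks in.
W = rng.standard_normal((input_size + state_size, state_size))
b = np.zeros(state_size)

inputs = [rng.standard_normal((batch, input_size)) for _ in range(5)]
state = np.zeros((batch, state_size))
for x in inputs:
    # Every iteration applies the SAME W and b -- this is what
    # varscope.reuse_variables() guarantees for time > 0.
    state = np.tanh(np.concatenate([x, state], axis=1) @ W + b)
```

If each iteration instead created fresh random weights, the recurrence would be meaningless and nothing could be learned, which is exactly the failure the question worries about.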