Can someone briefly explain the `_linear` function from the TensorFlow RNN code?

Problem description:

Here is the core function used by the TensorFlow RNN cell implementation.

Linear map:

def _linear(args, output_size, bias, bias_start=0.0, scope=None):
  """Linear map: sum_i(args[i] * W[i]), where W[i] is a variable.

  Args:
    args: a 2D Tensor or a list of 2D, batch x n, Tensors.
    output_size: int, second dimension of W[i].
    bias: boolean, whether to add a bias term or not.
    bias_start: starting value to initialize the bias; 0 by default.
    scope: VariableScope for the created subgraph; defaults to "Linear".

  Returns:
    A 2D Tensor with shape [batch x output_size] equal to
    sum_i(args[i] * W[i]), where W[i]s are newly created matrices.

  Raises:
    ValueError: if some of the arguments has unspecified or wrong shape.
  """
  if args is None or (nest.is_sequence(args) and not args):
    raise ValueError("`args` must be specified")
  if not nest.is_sequence(args):
    args = [args]

  # Calculate the total size of arguments on dimension 1.
  total_arg_size = 0
  shapes = [a.get_shape().as_list() for a in args]
  for shape in shapes:
    if len(shape) != 2:
      raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
    if not shape[1]:
      raise ValueError("Linear expects shape[1] of arguments: %s" % str(shapes))
    else:
      total_arg_size += shape[1]

  # Now the computation.
  with vs.variable_scope(scope or "Linear"):
    matrix = vs.get_variable("Matrix", [total_arg_size, output_size])
    if len(args) == 1:
      res = math_ops.matmul(args[0], matrix)
    else:
      res = math_ops.matmul(array_ops.concat(1, args), matrix)
    if not bias:
      return res
    bias_term = vs.get_variable(
        "Bias", [output_size],
        initializer=init_ops.constant_initializer(bias_start))
    return res + bias_term

As far as I understand, args contains the values that we multiply (dot product) with the weight matrix W[i], then add the bias. What I can't figure out:

When we call vs.get_variable("Matrix", [total_arg_size, output_size]) without the reuse flag set, don't we create a new, randomly initialized weight matrix on every call? I think training would fail in that case. I can't find scope.reuse_variables() or reuse=True anywhere in the rnn_cell.py code, and I can't find where the "Matrix" variable (the weights) gets updated or saved. So it looks like the weights would be random every time. How does this all work? Do we really use a random weight matrix on every call? Maybe someone can explain how _linear works?

_linear computes sum_i(args[i] * W[i]) + bias, where W is a list of matrix variables of size n x output_size and bias is a variable of size output_size.

TensorFlow uses left-multiplication notation: a row vector is multiplied on the left of the matrix. So in _linear, args is a list of row vectors (stacked into batch x n matrices).
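To make the "sum of per-argument products" description concrete, here is a small NumPy sketch (illustrative only, not TensorFlow code; all names are made up) showing that concatenating the inputs and multiplying by one stacked matrix, which is what _linear does with its single "Matrix" variable, gives the same result as sum_i(args[i] * W[i]):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n1, n2, output_size = 4, 3, 5, 2

# Two 2-D "args" tensors of shape [batch, n_i], as _linear expects.
x1 = rng.standard_normal((batch, n1))
x2 = rng.standard_normal((batch, n2))

# Per-argument weight matrices W[i] of shape [n_i, output_size].
W1 = rng.standard_normal((n1, output_size))
W2 = rng.standard_normal((n2, output_size))

# sum_i(args[i] @ W[i]) ...
summed = x1 @ W1 + x2 @ W2

# ... equals concatenating the inputs along dim 1 and stacking the
# weights along dim 0 -- one matmul against one big "Matrix".
concat = np.concatenate([x1, x2], axis=1) @ np.concatenate([W1, W2], axis=0)

assert np.allclose(summed, concat)
```

This is why _linear can allocate a single variable of shape [total_arg_size, output_size] instead of one matrix per argument.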

Where do the matrix W and the bias b come from? They are fetched from the current variable scope, because W and b are variable tensors: they hold learned values that persist across calls.

Check this

The tf.variable_scope documentation says that if the reuse argument of variable_scope is not specified, it defaults to None. That is exactly the case in the code you posted: with vs.variable_scope(scope or "Linear").

When a scope's reuse setting is unspecified (None), it inherits the reuse setting of its parent scope. See here

In your particular case, the weight variables of the RNN cell do not need reuse=True in _linear's variable scope, because the rnn function (defined in rnn.py) that calls the RNN cell sets its own variable scope to reuse, so the variables are shared across time steps.
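A minimal Python sketch of these get_variable semantics may help (this is a toy dict-based store for illustration, not TensorFlow's actual implementation): the first call creates and stores the variable; later calls with reuse enabled return the stored one instead of creating a new random matrix.

```python
import numpy as np

# Hypothetical toy variable store mimicking tf.get_variable behavior.
_store = {}

def get_variable(name, shape, reuse=False):
    if reuse:
        if name not in _store:
            raise ValueError("Variable %s does not exist" % name)
        return _store[name]  # return the already-created variable
    if name in _store:
        raise ValueError("Variable %s already exists" % name)
    _store[name] = np.random.standard_normal(shape)  # created once
    return _store[name]

# First call (time step 0): creates "Linear/Matrix".
w0 = get_variable("Linear/Matrix", (3, 2))
# Later calls (time steps > 0) with reuse on: same object, same weights.
w1 = get_variable("Linear/Matrix", (3, 2), reuse=True)
assert w0 is w1
```

So the weights are only randomly initialized once; every subsequent lookup under a reusing scope returns the same learned tensor.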

See the excerpt from rnn.py below:

def rnn(cell, inputs, initial_state=None, dtype=None,
        sequence_length=None, scope=None):

    ....

    for time, input_ in enumerate(inputs):
      if time > 0: varscope.reuse_variables()
      # pylint: disable=cell-var-from-loop
      call_cell = lambda: cell(input_, state)
      # pylint: enable=cell-var-from-loop
      if sequence_length is not None:
        (output, state) = _rnn_step(
            time=time,
            sequence_length=sequence_length,
            min_sequence_length=min_sequence_length,
            max_sequence_length=max_sequence_length,
            zero_output=zero_output,
            state=state,
            call_cell=call_cell,
            state_size=cell.state_size)
      else:
        (output, state) = call_cell()
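The effect of varscope.reuse_variables() at time > 0 is that every time step applies the same weights. A NumPy sketch of the unrolled loop (a simple tanh cell used purely to illustrate the weight sharing; this is not the actual rnn.py code):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, input_size, state_size = 2, 3, 4

# Created once -- conceptually at time == 0, before reuse kicks in.
W = rng.standard_normal((input_size + state_size, state_size))
b = np.zeros(state_size)

inputs = [rng.standard_normal((batch, input_size)) for _ in range(5)]
state = np.zeros((batch, state_size))
for x in inputs:
    # Every iteration applies the SAME W and b -- this is what
    # varscope.reuse_variables() guarantees for time > 0.
    state = np.tanh(np.concatenate([x, state], axis=1) @ W + b)
```

If each iteration instead created fresh random weights, the recurrence would be meaningless and nothing could be learned, which is exactly the failure the question worries about.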