Can someone explain the short `_linear` function from the TensorFlow RNN code to me?
Here is the core function from TensorFlow's RNN implementation:
def _linear(args, output_size, bias, bias_start=0.0, scope=None):
  """Linear map: sum_i(args[i] * W[i]), where W[i] is a variable.

  Args:
    args: a 2D Tensor or a list of 2D, batch x n, Tensors.
    output_size: int, second dimension of W[i].
    bias: boolean, whether to add a bias term or not.
    bias_start: starting value to initialize the bias; 0 by default.
    scope: VariableScope for the created subgraph; defaults to "Linear".

  Returns:
    A 2D Tensor with shape [batch x output_size] equal to
    sum_i(args[i] * W[i]), where W[i]s are newly created matrices.

  Raises:
    ValueError: if some of the arguments has unspecified or wrong shape.
  """
  if args is None or (nest.is_sequence(args) and not args):
    raise ValueError("`args` must be specified")
  if not nest.is_sequence(args):
    args = [args]

  # Calculate the total size of arguments on dimension 1.
  total_arg_size = 0
  shapes = [a.get_shape().as_list() for a in args]
  for shape in shapes:
    if len(shape) != 2:
      raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
    if not shape[1]:
      raise ValueError("Linear expects shape[1] of arguments: %s" % str(shapes))
    else:
      total_arg_size += shape[1]

  # Now the computation.
  with vs.variable_scope(scope or "Linear"):
    matrix = vs.get_variable("Matrix", [total_arg_size, output_size])
    if len(args) == 1:
      res = math_ops.matmul(args[0], matrix)
    else:
      res = math_ops.matmul(array_ops.concat(1, args), matrix)
    if not bias:
      return res
    bias_term = vs.get_variable(
        "Bias", [output_size],
        initializer=init_ops.constant_initializer(bias_start))
  return res + bias_term
As far as I understand, `args` contains the input values, which we multiply (dot product) with the weight matrix `W[i]` and then add `bias` to. What I can't figure out: when we call `vs.get_variable("Matrix", [total_arg_size, output_size])` without the reuse-variables flag set, do we create a new, randomly initialized weight matrix every time? I would expect that to break training. I can't find `scope.reuse_variables()` or `reuse=True` anywhere in the `rnn_cell.py` code, and I can't find where the "Matrix" variable (the weights) is updated or saved. So it looks as if the weights are random every time. How does all of this work? Do we really use a random weight matrix each time? Maybe someone can explain how `_linear` works?
`linear` computes sum_i(args[i] * W[i]) + bias, where `W` is a list of matrix variables of size n x output_size and `bias` is a variable of size output_size.
TensorFlow uses the left-multiplication convention (a row vector multiplied by the transpose of the weight matrix), so in `linear`, `args` is a list of row vectors.
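As a sketch of the math only (plain NumPy rather than TensorFlow; the shapes here are made up for illustration), concatenating the inputs along dimension 1 and multiplying by one big matrix, as `_linear` does, is equivalent to splitting `W` into per-input blocks and summing the per-input products:

```python
import numpy as np

# Illustrative sketch (not TensorFlow): _linear concatenates its 2D inputs
# along dimension 1 and multiplies by a single weight matrix "Matrix",
# which equals sum_i(args[i] @ W[i]) with W split into per-input blocks.
rng = np.random.default_rng(0)
batch, n1, n2, output_size = 4, 3, 5, 2

x1 = rng.standard_normal((batch, n1))            # first input, batch x n1
x2 = rng.standard_normal((batch, n2))            # second input, batch x n2
W = rng.standard_normal((n1 + n2, output_size))  # the "Matrix" variable
b = np.zeros(output_size)                        # the "Bias" variable, bias_start=0.0

# What _linear computes: matmul(concat(1, args), Matrix) + Bias
res_concat = np.concatenate([x1, x2], axis=1) @ W + b

# Equivalent blockwise form: sum_i(args[i] @ W[i]) + Bias
W1, W2 = W[:n1], W[n1:]
res_sum = x1 @ W1 + x2 @ W2 + b

assert np.allclose(res_concat, res_sum)
```

This is why the docstring can describe the result as sum_i(args[i] * W[i]) even though the code performs a single `matmul` over the concatenation.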
Where do the matrix `W` and the bias `b` come from? They are fetched from memory based on the current variable scope; since `W` and `b` are variable tensors, they are learned values.
The `tf.variable_scope` documentation says that if the `reuse` argument of `variable_scope` is not specified, it defaults to `None`. That is exactly the case in the code you quoted: `with vs.variable_scope(scope or "Linear")`.
When a scope's `reuse` is unspecified or `None`, it inherits the `reuse` setting of the parent scope. See here.
In your particular case, the RNN cell's weight variables do not need `reuse=True` specified inside the `_linear` function's variable scope, because the `rnn` function (defined in `rnn.py`) that calls the RNN cell already sets its variable scope to reuse, so the weights are shared.
See the excerpt from `rnn.py` below:
def rnn(cell, inputs, initial_state=None, dtype=None,
        sequence_length=None, scope=None):
  ....
    for time, input_ in enumerate(inputs):
      if time > 0: varscope.reuse_variables()
      # pylint: disable=cell-var-from-loop
      call_cell = lambda: cell(input_, state)
      # pylint: enable=cell-var-from-loop
      if sequence_length is not None:
        (output, state) = _rnn_step(
            time=time,
            sequence_length=sequence_length,
            min_sequence_length=min_sequence_length,
            max_sequence_length=max_sequence_length,
            zero_output=zero_output,
            state=state,
            call_cell=call_cell,
            state_size=cell.state_size)
      else:
        (output, state) = call_cell()
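The net effect of `varscope.reuse_variables()` for `time > 0` is that every time step applies the same weights. A minimal NumPy sketch of the unrolled loop (assuming a vanilla tanh cell in the style of `BasicRNNCell`; the shapes are invented for illustration):

```python
import numpy as np

# Minimal sketch (not TensorFlow) of unrolling an RNN over time: because
# the variable scope is set to reuse after the first step, the same
# "Matrix" W and "Bias" b are applied at every time step.
rng = np.random.default_rng(1)
batch, n_input, n_hidden, n_steps = 2, 3, 4, 5

W = rng.standard_normal((n_input + n_hidden, n_hidden))  # one shared "Matrix"
b = np.zeros(n_hidden)                                   # one shared "Bias"

def cell(x, h):
    # _linear on [inputs, state] followed by tanh, as a vanilla RNN cell does.
    return np.tanh(np.concatenate([x, h], axis=1) @ W + b)

state = np.zeros((batch, n_hidden))
outputs = []
for t in range(n_steps):
    x_t = rng.standard_normal((batch, n_input))
    state = cell(x_t, state)  # same W and b at every step, not fresh ones
    outputs.append(state)

assert len(outputs) == n_steps
assert outputs[0].shape == (batch, n_hidden)
```

If each step really created a new random matrix, as the question worries, nothing would be learnable; the reuse mechanism is what ties all the time steps to one set of trained weights.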