Returning all possible prediction values

Question:

This neural network trains on the inputs [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]] with the labelled outputs [[0.0], [1.0], [1.0], [0.0]]:

import numpy as np 
import tensorflow as tf 

sess = tf.InteractiveSession() 
# a batch of inputs of 2 values each 
inputs = tf.placeholder(tf.float32, shape=[None, 2]) 

# a batch of output of 1 value each 
desired_outputs = tf.placeholder(tf.float32, shape=[None, 1]) 

# [!] define the number of hidden units in the first layer 
HIDDEN_UNITS = 4 
weights_1 = tf.Variable(tf.truncated_normal([2, HIDDEN_UNITS])) 

biases_1 = tf.Variable(tf.zeros([HIDDEN_UNITS])) 

# connect 2 inputs to every hidden unit. Add bias 
layer_1_outputs = tf.nn.sigmoid(tf.matmul(inputs, weights_1) + biases_1) 

print(layer_1_outputs) 

NUMBER_OUTPUT_NEURONS = 1 

biases_2 = tf.Variable(tf.zeros([NUMBER_OUTPUT_NEURONS])) 
weights_2 = tf.Variable(tf.truncated_normal([HIDDEN_UNITS, NUMBER_OUTPUT_NEURONS])) 
finalLayerOutputs = tf.nn.sigmoid(tf.matmul(layer_1_outputs, weights_2) + biases_2) 

tf.global_variables_initializer().run() 

logits = tf.nn.sigmoid(tf.matmul(layer_1_outputs, weights_2) + biases_2) 

training_inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]] 
training_outputs = [[0.0], [1.0], [1.0], [0.0]] 

error_function = 0.5 * tf.reduce_sum(tf.square(logits - desired_outputs)) 
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(error_function) 

for i in range(15): 
    _, loss = sess.run([train_step, error_function], 
         feed_dict={inputs: np.array(training_inputs), 
            desired_outputs: np.array(training_outputs)}) 

print(sess.run(logits, feed_dict={inputs: np.array([[0.0, 1.0]])}))

After training, this network returns [[ 0.61094815]] for the input [[0.0, 1.0]].

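Is [[ 0.61094815]] the value with the highest probability that the trained network assigns to the input [[0.0, 1.0]]? Can the lower-probability values be accessed as well, and not just the most probable one?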

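If I increase the number of training epochs I get better predictions, but here I just want to access all the potential values and their probabilities for a given input.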

Update:

I have updated the code to use multi-class classification with softmax, but the prediction for [[0.0, 1.0, 0.0, 0.0]] is [array([0])]. Have I updated it correctly?

import numpy as np 
import tensorflow as tf 

sess = tf.InteractiveSession() 
# a batch of inputs of 4 values each 
inputs = tf.placeholder(tf.float32, shape=[None, 4]) 

# a batch of outputs of 3 values each (one per class) 
desired_outputs = tf.placeholder(tf.float32, shape=[None, 3]) 

# [!] define the number of hidden units in the first layer 
HIDDEN_UNITS = 4 
weights_1 = tf.Variable(tf.truncated_normal([4, HIDDEN_UNITS])) 

biases_1 = tf.Variable(tf.zeros([HIDDEN_UNITS])) 

# connect the 4 inputs to every hidden unit. Add bias 
layer_1_outputs = tf.nn.softmax(tf.matmul(inputs, weights_1) + biases_1) 

biases_2 = tf.Variable(tf.zeros([3])) 
weights_2 = tf.Variable(tf.truncated_normal([HIDDEN_UNITS, 3])) 
finalLayerOutputs = tf.nn.softmax(tf.matmul(layer_1_outputs, weights_2) + biases_2) 

tf.global_variables_initializer().run() 

logits = tf.nn.softmax(tf.matmul(layer_1_outputs, weights_2) + biases_2) 

training_inputs = [[0.0, 0.0 , 0.0, 0.0], [0.0, 1.0 , 0.0, 0.0], [1.0, 0.0 , 0.0, 0.0], [1.0, 1.0 , 0.0, 0.0]] 
training_outputs = [[0.0,0.0,0.0], [1.0,0.0,0.0], [1.0,0.0,0.0], [0.0,0.0,1.0]] 

error_function = 0.5 * tf.reduce_sum(tf.square(logits - desired_outputs)) 
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(error_function) 

for i in range(15): 
    _, loss = sess.run([train_step, error_function], 
         feed_dict={inputs: np.array(training_inputs), 
            desired_outputs: np.array(training_outputs)}) 

prediction=tf.argmax(logits,1) 
best = sess.run([prediction],feed_dict={inputs: np.array([[0.0, 1.0, 0.0, 0.0]])}) 
print(best)

It prints [array([0])]

Update 2:

Replacing

prediction=tf.argmax(logits,1) 
best = sess.run([prediction],feed_dict={inputs: np.array([[0.0, 1.0, 0.0, 0.0]])}) 
print(best)

with:

prediction=tf.nn.softmax(logits) 
best = sess.run([prediction],feed_dict={inputs: np.array([[0.0, 1.0, 0.0, 0.0]])}) 
print(best)

seems to fix the problem.

So the full source code is now:

import numpy as np 
import tensorflow as tf 

sess = tf.InteractiveSession() 
# a batch of inputs of 4 values each 
inputs = tf.placeholder(tf.float32, shape=[None, 4]) 

# a batch of outputs of 3 values each (one per class) 
desired_outputs = tf.placeholder(tf.float32, shape=[None, 3]) 

# [!] define the number of hidden units in the first layer 
HIDDEN_UNITS = 4 
weights_1 = tf.Variable(tf.truncated_normal([4, HIDDEN_UNITS])) 

biases_1 = tf.Variable(tf.zeros([HIDDEN_UNITS])) 

# connect the 4 inputs to every hidden unit. Add bias 
layer_1_outputs = tf.nn.softmax(tf.matmul(inputs, weights_1) + biases_1) 

biases_2 = tf.Variable(tf.zeros([3])) 
weights_2 = tf.Variable(tf.truncated_normal([HIDDEN_UNITS, 3])) 
finalLayerOutputs = tf.nn.softmax(tf.matmul(layer_1_outputs, weights_2) + biases_2) 

tf.global_variables_initializer().run() 

logits = tf.nn.softmax(tf.matmul(layer_1_outputs, weights_2) + biases_2) 

training_inputs = [[0.0, 0.0 , 0.0, 0.0], [0.0, 1.0 , 0.0, 0.0], [1.0, 0.0 , 0.0, 0.0], [1.0, 1.0 , 0.0, 0.0]] 
training_outputs = [[0.0,0.0,0.0], [1.0,0.0,0.0], [1.0,0.0,0.0], [0.0,0.0,1.0]] 

error_function = 0.5 * tf.reduce_sum(tf.square(logits - desired_outputs)) 
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(error_function) 

for i in range(1500): 
    _, loss = sess.run([train_step, error_function], 
         feed_dict={inputs: np.array(training_inputs), 
            desired_outputs: np.array(training_outputs)}) 

prediction=tf.nn.softmax(logits) 
best = sess.run([prediction],feed_dict={inputs: np.array([[0.0, 1.0, 0.0, 0.0]])}) 
print(best)

It prints

[array([[ 0.49810624, 0.24845563, 0.25343812]], dtype=float32)]

Answer

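Your current network does (logistic) regression, not really classification: given an input x, it tries to evaluate f(x) (here f(x) = x1 XOR x2, but the network does not know that before training), which is regression. To do so, it learns a function f1(x) and tries to make it as close as possible to f(x) on all your training samples. [[ 0.61094815]] is just the value of f1([[0.0, 1.0]]). In this setting there is no such thing as a "probability of being in a class", since there is no class. It is only the user (you) who chooses to interpret f1(x) as the probability that the output is 1. Since you have only 2 classes, this tells you that the probability of the other class is 1-0.61094815 (that is, you are doing classification with the output of the network, but the network itself is not really trained to do that). Used this way, the method is a (widely used) trick to perform classification, but it only works when there are exactly two classes.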
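
To make the two-class reading concrete, here is a minimal sketch in plain Python (no TensorFlow). The helper name `binary_class_probabilities` is hypothetical and only illustrates the interpretation described above; the network itself still outputs a single number.

```python
def binary_class_probabilities(sigmoid_output):
    """Read a single sigmoid output as P(class 1); P(class 0) is its complement."""
    p_one = float(sigmoid_output)
    return {0: 1.0 - p_one, 1: p_one}

# Value the question reports for the input [[0.0, 1.0]]
probs = binary_class_probabilities(0.61094815)
print(probs)

# Most likely class under this interpretation
most_likely = max(probs, key=probs.get)
print(most_likely)
```

Both "possible values" are therefore available from the one output, but only because the user decides to read it as a probability.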

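A real classification network would be built a bit differently: your logits would have shape (batch_size, number_of_classes), so (1, 2) in your case. You apply a softmax to them, and the prediction is then argmax(softmax), with probability max(softmax). You can also get the probability of each output according to the network: probability(class i) = softmax[i]. Here the network really is trained to learn the probability of x being in each class.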
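
As a rough illustration of that recipe, the sketch below computes a softmax over raw scores in plain Python and reads off every class probability, the prediction, and its probability. The scores in `raw_scores` are made-up numbers, not values from the question, and the `softmax` helper is written by hand here rather than taken from TensorFlow.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (shifted for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for one input over 3 classes (illustrative only)
raw_scores = [2.0, 0.5, -1.0]
probabilities = softmax(raw_scores)

# argmax(softmax) is the prediction, max(softmax) is its probability
predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
confidence = probabilities[predicted_class]

for i, p in enumerate(probabilities):
    print("probability(class %d) = %.4f" % (i, p))
print("prediction: class %d with probability %.4f" % (predicted_class, confidence))
```

In the TensorFlow code from the question, the equivalent would presumably be to evaluate the softmax tensor itself with `sess.run` instead of its `tf.argmax`, which returns the whole row of class probabilities.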

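Sorry if my explanation is obscure, or if the difference between regression between 0 and 1 and classification seems philosophical in a setting with 2 classes, but if you add more classes you will probably see what I mean.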

Edit: answer to your two updates.

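In your training samples, the labels (training_outputs) must be probability distributions, i.e. they must sum to 1 for each sample (99% of the time they are of the form (1,0,0), (0,1,0) or (0,0,1)), so your first output [0.0,0.0,0.0] is not valid. If you want to learn XOR on the first two inputs, then the first output should be the same as the last one: [0.0,0.0,1.0].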
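
A small check along these lines can catch such labels before training. This is only a sketch: `is_probability_distribution` is a hypothetical helper, and `fixed_outputs` applies the correction suggested above.

```python
def is_probability_distribution(label, tol=1e-9):
    """True if every entry is non-negative and the entries sum to 1."""
    return all(v >= 0.0 for v in label) and abs(sum(label) - 1.0) <= tol

# Labels as posted in the question
training_outputs = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
invalid_rows = [i for i, row in enumerate(training_outputs)
                if not is_probability_distribution(row)]
print("rows that do not sum to 1:", invalid_rows)

# Corrected labels: the first row now matches the last one
fixed_outputs = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
all_valid = all(is_probability_distribution(row) for row in fixed_outputs)
print("corrected labels valid:", all_valid)
```
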
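prediction=tf.argmax(logits,1) = [array([0])] is completely normal: logits contains your probabilities, and prediction is the predicted class, i.e. the one with the highest probability, which in your case is class 0. In your training set, [0.0, 1.0, 0.0, 0.0] is associated with the output [1.0, 0.0, 0.0], i.e. it belongs to class 0 with probability 1 and to the other classes with probability 0. After enough training, print(best) with prediction=tf.argmax(logits,1) on the input [1.0, 1.0 , 0.0, 0.0] should give you [array([2])], 2 being the index of the class for this input in your training set.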
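
One related observation, offered as a side note rather than something stated above: since `logits` in the posted code is already the output of `tf.nn.softmax`, applying softmax to it a second time keeps the same ranking but pulls the values toward a uniform distribution. The plain-Python sketch below shows the effect on made-up probabilities; the numbers in `already_probabilities` are illustrative, not taken from a real run.

```python
import math

def softmax(values):
    """Softmax over a list of numbers, shifted for numerical stability."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical output of a softmax layer for one input (already sums to 1)
already_probabilities = [0.79, 0.10, 0.11]

# argmax on the probabilities gives the predicted class directly
print("predicted class:", argmax(already_probabilities))

# A second softmax leaves the argmax unchanged but flattens the distribution
squashed = softmax(already_probabilities)
print("after a second softmax:", [round(p, 3) for p in squashed])
```

If the goal is to see the probabilities the network actually assigns, evaluating the existing softmax output once is likely what is wanted.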

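Thanks for your suggestion; I have tried to incorporate it into the neural network. Can you spot the problem with the network in my question update?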

I think I solved the problem and have updated the question.

I edited my answer. – gdelab
