Solving the Two Envelopes Problem with Python and petersburg

In a few previous posts, we’ve explored how petersburg represents uncertain decisions as a directed acyclic graph, with weighted random choices determining which edge to take out of a given node. This turns out to be very similar to Bayesian networks, which will be the subject of a post in the coming weeks. For today, we are going to examine a classical decision theory problem with petersburg and its primary object: the directed acyclic graph (DAG).

The Two Envelopes Problem

The two envelopes problem (or exchange paradox) is a classical problem in decision theory.  It goes:

Of two indistinguishable envelopes, each containing money, one contains twice as much as the other. The subject may pick one envelope and keep the money it contains. Having chosen an envelope at will, but before inspecting it, the subject gets the chance to take the other envelope instead. What is the optimal rational strategy for maximizing the amount of money to be gained?

This seems like a trivial problem, but it’s called a paradox for a reason. A common line of reasoning, taken from Wikipedia in this case, is:

  1. I denote by A the amount in my selected envelope.
  2. The probability that A is the smaller amount is 1/2, and that it is the larger amount is also 1/2.
  3. The other envelope may contain either 2A or A/2.
  4. If A is the smaller amount, then the other envelope contains 2A.
  5. If A is the larger amount, then the other envelope contains A/2.
  6. Thus the other envelope contains 2A with probability 1/2 and A/2 with probability 1/2.
  7. So the expected value of the money in the other envelope is:
    $$\frac{1}{2}(2A) + \frac{1}{2}\left(\frac{A}{2}\right) = \frac{5}{4}A$$
  8. This is greater than A, so I gain on average by swapping.
  9. After the switch, I can denote that content by B and reason in exactly the same manner as above.
  10. I will conclude that the most rational thing to do is to swap back again.
  11. To be rational, I will thus end up swapping envelopes indefinitely.
  12. As it seems more rational to open just any envelope than to swap indefinitely, we have a contradiction.

Clearly, this doesn’t make sense: the rational strategy can’t be to keep switching forever. There are a number of different resolutions to this problem that use different techniques to show that switching does not, in fact, increase your expected payoff. Let’s try modeling this problem as a DAG and simulating it in petersburg to see whether this construct is any simpler or more enlightening.
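
Before turning to the DAG, here is a sketch of one commonly cited resolution, included only for comparison. The symbol x (the smaller of the two amounts) is an assumption introduced here and does not appear in the argument above. With the envelopes fixed at x and 2x, each equally likely to be the one picked:

```latex
\begin{align}
E[\text{keep}]   &= \tfrac{1}{2}\,x + \tfrac{1}{2}\,(2x) = \tfrac{3}{2}\,x \\
E[\text{switch}] &= \tfrac{1}{2}\,(2x) + \tfrac{1}{2}\,x = \tfrac{3}{2}\,x
\end{align}
```

Under this reading, the flaw is in step 7: A stands for x in the case where the other envelope holds 2A, and for 2x in the case where it holds A/2, so the two terms cannot be combined as if A were a single fixed amount.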

Petersburg DAG

The first step in creating a petersburg DAG is to draw a starting node with an edge to each available option. After that, the structure of the outcomes for each option can be drawn out as a separate subgraph. We call this the verbose graph, because it is simple to draw and interpret, but often has many redundant nodes. For the two envelopes problem, we have two options: use the switching strategy, or don’t.

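[Figure: verbose petersburg DAG for the two envelopes problem, with a switching strategy subgraph and a keep strategy subgraph]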

Here, in the switching strategy subgraph, if envelope A is drawn, it is immediately switched for envelope B, while in the keep strategy subgraph there is no switch. The payoff of a final node A or B corresponds to the dollar amount in the respective envelope, and there are no edge costs. M and N represent the edge weights for the initial draw; if each envelope is equally likely to be drawn, the edge weights are equal (M=N).
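
To make the structure concrete, the verbose graph can be sketched in plain Python. This is a hypothetical encoding for illustration only, not the petersburg API: the names `PAYOFF`, `strategy_subgraph`, and `expected_payoff` are assumptions, and the dollar amounts are the ones used for the simulation later in this post.

```python
# Hypothetical plain-Python encoding of the verbose graph (not the petersburg API).
PAYOFF = {"A": 50.0, "B": 100.0}  # dollar amount in each envelope
OTHER = {"A": "B", "B": "A"}      # the envelope you end up with after a switch


def strategy_subgraph(switch, m=1.0, n=1.0):
    """Return (edge_weight, final_envelope) pairs for one strategy subgraph.

    m and n are the edge weights M and N of the initial draw.
    """
    draws = [(m, "A"), (n, "B")]
    return [(w, OTHER[env] if switch else env) for w, env in draws]


def expected_payoff(edges):
    """Expected payoff over the final nodes, normalising the edge weights."""
    total = sum(w for w, _ in edges)
    return sum(w / total * PAYOFF[env] for w, env in edges)


keep = expected_payoff(strategy_subgraph(switch=False))
swap = expected_payoff(strategy_subgraph(switch=True))
print(keep, swap)  # 75.0 75.0 when M = N
```

With unequal weights the two strategies diverge: for example, `strategy_subgraph(True, m=3, n=1)` favours switching, because envelope A (the smaller amount here) is then drawn more often.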

As you can see, there are many redundant nodes in this graph, so we simplify it down to a reduced graph:

[Figure: reduced petersburg DAG for the two envelopes problem]

Even without simulation, this makes it abundantly clear that if M and N are equal, then the two strategies are equivalent. But let’s simulate it anyway with petersburg. In the simulation, we will run games with M=N=1, A=$50, B=$100 until the outcome converges.
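
For readers without petersburg installed, the same experiment can be approximated with the standard library alone. The following is a minimal stand-in sketch, not the petersburg simulation itself: the function names, the fixed number of games, and the seed are all assumptions made for illustration.

```python
import random


def play(switch, rng, m=1.0, n=1.0, a=50.0, b=100.0):
    """Play one game: draw an envelope with weights M:N, optionally switch, return the payoff."""
    drawn = rng.choices(["A", "B"], weights=[m, n])[0]
    if switch:
        drawn = "B" if drawn == "A" else "A"
    return a if drawn == "A" else b


def simulate(switch, games=100_000, seed=0):
    """Return (mean, worst, best) payoff over a fixed number of games."""
    rng = random.Random(seed)  # seeded so repeated runs give the same numbers
    outcomes = [play(switch, rng) for _ in range(games)]
    return sum(outcomes) / games, min(outcomes), max(outcomes)


for name, switch in [("keep", False), ("switch", True)]:
    mean, worst, best = simulate(switch)
    print(f"{name:>6}: mean={mean:.2f} min={worst:.0f} max={best:.0f}")
```

A fixed game count is used here instead of a convergence check to keep the sketch short; both strategies should report a mean close to $75, with the same worst case ($50) and best case ($100).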

[Figure: simulated outcomes for the switching and keep strategies]

So there you have it: not only is the expected outcome of each strategy equal, but the potential for ruin and the potential for windfall are the same as well. The reduced petersburg DAG can help make these kinds of conclusions apparent quickly, often in a more intuitive way than traditional Bayesian probability calculations (though that is what’s happening under the hood regardless). Because of this, these graphs can be a very useful tool for quickly sketching complicated relationships. In future posts, we will explore some more of these classical cases, and some more real-world applications.

Translated from: https://www.pybloggers.com/2016/01/solving-the-two-envelopes-problem-with-python-and-petersburg/