Trapping a process crash in handle_call

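Question:

I have two GenServer modules, A and B. B monitors A and implements handle_info so that it receives the :DOWN message when A crashes.

In my sample code, B makes a synchronous request (handle_call) to A. While processing the request, A crashes. B should receive the :DOWN message, but it does not. Why?

When I replace handle_call with handle_cast, B does receive the :DOWN message. Can you tell me why handle_call does not work while handle_cast does?

Here is the code for a simple example: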

defmodule A do
  use GenServer

  def start_link do
    GenServer.start_link(__MODULE__, :ok, name: :A)
  end

  def fun(fun_loving_person) do
    GenServer.call(fun_loving_person, :have_fun)
  end

  def init(:ok) do
    {:ok, %{}}
  end

  def handle_call(:have_fun, _from, state) do
    # Raise an error to kill process :A
    raise "TooMuchFun"

    {:reply, :ok, state}
  end
end

defmodule B do
  use GenServer

  def start_link do
    GenServer.start_link(__MODULE__, :ok, name: :B)
  end

  def spread_fun(fun_seeker) do
    GenServer.call(:B, {:spread_fun, fun_seeker})
  end

  def init(:ok) do
    {:ok, %{}}
  end

  def handle_call({:spread_fun, fun_seeker}, _from, state) do
    # Monitor :A
    Process.monitor(Process.whereis(:A))

    result = A.fun(fun_seeker)
    {:reply, result, state}
  rescue
    _ ->
      IO.puts "Too much fun rescued"
      {:reply, :error, state}
  end

  # Receive :DOWN message because I monitor :A
  def handle_info({:DOWN, _ref, :process, _pid, _reason}, state) do
    IO.puts "============== DOWN DOWN DOWN =============="
    {:noreply, state}
  end
end

try do
  {:ok, a} = A.start_link
  {:ok, _b} = B.start_link
  :ok = B.spread_fun(a)
rescue
  exception -> IO.puts "============= #{inspect exception, pretty: true}"
end

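Answer

"In my sample code, B makes a synchronous request (handle_call) to A. While processing the request, A crashes. B should receive the :DOWN message, but it does not. Why?"

B does receive the :DOWN message when A crashes, but because B is still inside the handler that is calling A, it has no opportunity to process :DOWN until that handle_call callback completes. The callback never completes, though, because the call to A fails with an exit, and that exit crashes B as well.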

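"When I replace handle_call with handle_cast, B does receive the :DOWN message. Can you tell me why handle_call does not work while handle_cast does?"

Calls are synchronous and casts are asynchronous. With a cast, the handle_call callback in B that casts to A completes right away, and B is then free to process the :DOWN message. B does not crash because a cast implicitly ignores any failure when attempting to send the message; it is "fire and forget".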
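To see the cast behaviour in isolation, here is a minimal sketch. CrashyA and WatcherB are made-up names rather than the modules from the question, and both servers are started with GenServer.start (unlinked) so that the crash does not take the calling script down; that choice is an assumption made for the demo only.

```elixir
defmodule CrashyA do
  use GenServer

  def init(:ok), do: {:ok, %{}}

  # Crash while handling the asynchronous request.
  def handle_cast(:have_fun, _state), do: raise("TooMuchFun")
end

defmodule WatcherB do
  use GenServer

  def init(a), do: {:ok, %{a: a, notify: nil}}

  def handle_call(:spread_fun, {caller, _tag}, state) do
    Process.monitor(state.a)
    # cast returns immediately, so this callback finishes normally
    GenServer.cast(state.a, :have_fun)
    {:reply, :ok, %{state | notify: caller}}
  end

  # B is idle again by the time A dies, so :DOWN is handled here.
  def handle_info({:DOWN, _ref, :process, _pid, _reason}, state) do
    send(state.notify, :b_saw_down)
    {:noreply, state}
  end
end

{:ok, a} = GenServer.start(CrashyA, :ok)
{:ok, b} = GenServer.start(WatcherB, a)
:ok = GenServer.call(b, :spread_fun)

receive do
  :b_saw_down -> IO.puts("B handled :DOWN")
after
  1_000 -> IO.puts("no :DOWN within 1s")
end
```

A's crash is still logged as an error, but B stays alive and handles the :DOWN message.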

It seems to me that what you actually want is to handle A crashing while you are calling it, which is trivially done like so:

def handle_call({:spread_fun, fun_seeker}, _from, state) do
  # Monitor :A
  Process.monitor(Process.whereis(:A))

  result = A.fun(fun_seeker)
  {:reply, result, state}
rescue
  # Exceptions raised inside this callback itself
  _ ->
    IO.puts "Too much fun rescued"
    {:reply, :error, state}
catch
  # Exits raised by the failed call to A
  :exit, reason ->
    {:reply, {:error, reason}, state}
end

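This catches the exit that occurs when the remote process is not alive, dies, or times out. You can also match on specific exit reasons, such as :noproc, by specifying the reason you want to guard against in the catch clause.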
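As a rough sketch of matching on specific exit reasons: SafeCall below is a hypothetical helper, not something from the question, and the error atoms it returns are invented for illustration. It relies on GenServer.call exiting with a {reason, call_info} tuple when the target is missing or the call times out.

```elixir
defmodule SafeCall do
  # Hypothetical wrapper that turns exits from GenServer.call into tagged tuples.
  def call(server, request, timeout \\ 5_000) do
    {:ok, GenServer.call(server, request, timeout)}
  catch
    # No process is registered under that name, or it is already dead
    :exit, {:noproc, _call_info} -> {:error, :not_running}
    # The server did not reply within `timeout` milliseconds
    :exit, {:timeout, _call_info} -> {:error, :timeout}
    # Anything else, e.g. the server crashed while handling the request
    :exit, reason -> {:error, {:exit, reason}}
  end
end

# Nothing is registered as :no_such_server, so the call exits with :noproc.
IO.inspect(SafeCall.call(:no_such_server, :ping))
```

More specific clauses have to come before the catch-all `:exit, reason` clause, since clauses are tried in order.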

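It is not clear to me that you need the monitor at all. I guess it depends on what you want to use it for, but in the example you gave I would say you do not.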

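Thanks a lot! Awesome explanation. I have a RabbitMQ message consumer (module B) that depends on another module (A). When A crashes and B cannot reject and requeue the message, I do not want the message to be stuck in limbo. Do you know a better way?

Well, one approach is to have B drop the message when A fails, and let the message expire and be redelivered automatically by RabbitMQ (assuming the queue is configured properly). Another is to use the "catch" strategy above and retry the message a few times before giving up. Yet another is to have B hand the message off to a process responsible for handling it, which frees B to keep consuming other messages while that process retries until it succeeds; but that only works if message order does not matter, and you then have a flow-control problem to deal with. – bitwalker

It seems to me you are best off either finding a way for B to reject/requeue the message (which should certainly be doable if you are catching the exit), or dropping failed messages and letting them expire and be requeued automatically. It really depends on what guarantees you need to provide; if message order matters, retrying is probably the better option. – bitwalker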
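The "retry a few times before giving up" idea from the comments could be sketched roughly as follows. Retry, the default of 3 attempts, and the 100 ms pause are all assumptions for illustration; the RabbitMQ reject/requeue step is deliberately left out.

```elixir
defmodule Retry do
  # Hypothetical helper: retry a GenServer.call a limited number of times.
  def call_with_retry(server, request, attempts \\ 3)

  def call_with_retry(_server, _request, 0), do: {:error, :gave_up}

  def call_with_retry(server, request, attempts) do
    {:ok, GenServer.call(server, request)}
  catch
    :exit, _reason ->
      # Give the target a moment to be restarted before trying again
      Process.sleep(100)
      call_with_retry(server, request, attempts - 1)
  end
end

# Nothing is registered as :no_such_server, so every attempt fails.
IO.inspect(Retry.call_with_retry(:no_such_server, :ping))
```

On `{:error, :gave_up}` the consumer would then decide whether to reject, requeue, or drop the message.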
