The nameserver crashed on shutdown and I could not restart it because it was left hanging, waiting for a rogue agent to shut down, which apparently is the expected behavior.
Surprisingly enough the error message shown was:
TimeoutError: Chances are [] were not shutdown after 10.0 s!
So it would appear that the agent was still alive after the call to async_kill_agents, but it effectively died in the milliseconds between us checking whether it was alive and the TimeoutError being raised just after that. I find this very strange, especially considering that we set a default timeout of 10 seconds, which should be plenty for any kind of agent to shut down.
It probably has something to do with the agent being unresponsive and having broken the connection between it and the nameserver, but it's hard to know for sure until we can get a reproducible case.
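To make the suspected race concrete, here is a minimal sketch of how an empty list could end up in that message. It is not the actual nameserver code: racy_wait, safer_wait, FakeWorld and run are hypothetical names, and the fake clock only exists to make the timing deterministic. The assumption is that the list of alive agents is queried once inside the polling loop and then again when the error message is built.

```python
import time


def racy_wait(list_alive, timeout, poll=0.5,
              clock=time.monotonic, sleep=time.sleep):
    """Poll until no agent is alive; query the list AGAIN when reporting."""
    deadline = clock() + timeout
    while clock() < deadline:
        if not list_alive():
            return
        sleep(poll)
    # Second, later query: an agent that died after the last poll is missing.
    raise TimeoutError(
        'Chances are %s were not shutdown after %s s!'
        % (list_alive(), timeout))


def safer_wait(list_alive, timeout, poll=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Same loop, but decide and report from one single snapshot."""
    deadline = clock() + timeout
    alive = list_alive()
    while alive and clock() < deadline:
        sleep(poll)
        alive = list_alive()
    if alive:
        raise TimeoutError(
            'Chances are %s were not shutdown after %s s!'
            % (alive, timeout))


class FakeWorld:
    """Hypothetical test double: a fake clock and one agent that dies
    once the clock reaches `dies_at` seconds."""

    def __init__(self, dies_at):
        self.now = 0.0
        self.dies_at = dies_at

    def clock(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds

    def list_alive(self):
        return ['agent'] if self.now < self.dies_at else []


def run(wait, dies_at=10.0, timeout=10.0):
    """Return the error message raised by `wait`, or None if it returned."""
    world = FakeWorld(dies_at)
    try:
        wait(world.list_alive, timeout, clock=world.clock, sleep=world.sleep)
    except TimeoutError as error:
        return str(error)
    return None


print(run(racy_wait))   # Chances are [] were not shutdown after 10.0 s!
print(run(safer_wait))  # None
```

With an agent that dies right at the deadline, the racy variant reproduces the confusing message, while the single-snapshot variant simply returns. Whether this is what actually happened in the nameserver is still a guess until there is a reproducible case.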
To be fair, there are quite a few things that don't seem to work with pypy, so I'm not sure if this counts as "reproducing" the error.
My guess from a few minutes of running this is that pypy must handle threads differently from what we are used to. The ContextTerminated errors that pop up when running this test certainly look like the context is being terminated earlier than we expected.
What is happening on pypy reminds me of this other test I wrote when I first tried to reproduce the error.
The agent ends up in a very wrong state, and the output looks kind of similar:
I still haven't found a way to reproduce the original Chances are [] were not shutdown error 😞. There could be many factors involved, but what exactly happened is still beyond me.