Error with OSBrain Communication Across Separate Machines? #341
Comments
You can bind to Try to connect to your Linux machine's LAN IP address, which you may find running the command Also, it seems from the screenshots that you are running both scripts from the Windows machine, right? (no Linux machine) Note that in one of the screenshots you are binding the nameserver to the localhost (127.0.0.1), which means you would not be able to connect to it from the LAN. |
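To make the loopback vs. LAN distinction above concrete, here is a small standard-library sketch. It is not part of osBrain, and the helper name `describe_bind_address` is made up for illustration. It only classifies an address by who could reach a server bound to it.

```python
import ipaddress


def describe_bind_address(host):
    """Classify an IP literal by who can reach a server bound to it."""
    ip = ipaddress.ip_address(host)
    if ip.is_loopback:
        # e.g. 127.0.0.1: only processes on the same machine can connect.
        return 'loopback'
    if ip.is_unspecified:
        # e.g. 0.0.0.0: listen on every interface. Useful for binding,
        # but not an address a remote client can connect to.
        return 'wildcard'
    # A concrete interface address, such as a LAN IP.
    return 'specific'


for host in ('127.0.0.1', '0.0.0.0', '192.168.0.42'):
    print(host, '->', describe_bind_address(host))
```

A nameserver bound to a "loopback" address would not be reachable from the other machine, which matches the note about 127.0.0.1 above.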
Also:
😉 |
Hi there, Apologies, I had switched the roles of the machines out of convenience for the screenshots. Please see new attached screenshots with the Linux machine acting as the host (bound to 0.0.0.0) and the Windows machine acting as the agent (reaching out to the IP address of the Linux computer: 192.168.0.42). The port used on both machines is 5020. As you can see, the problem persists. Do you have any ideas on why this might be occurring? Also, thank you for your additional advice on use of ns_proxy.agents() and the lack of need for locate_ns. I will update the code to include those changes next time I am at this computer. Thanks so much. Sam |
@sjanko2 Would you mind sharing the full |
Sure, it is the same code as I put in my original post. See Below. Agent_Computer.py
|
Weird... I am not able to reproduce it. Although I do not have a Windows machine at hand.
|
Hi Peque, Apologies for the delay, it has been a busy two weeks. We connected Linux (Host) to Linux (Agent) and it worked! See attached screenshot and code for the sake of completeness of this forum post. Can you please help us understand why this may be necessary? For our purposes, we may need to connect a Windows machine to a Linux machine in the future. Thanks! Sam Agent_Computer.py
Host_Computer.py
|
@sjanko2 What happens if you try with the Windows machine but you bind to |
I believe we tried that a few weeks ago and had the same result. We suspect that it may be related to the Windows image our university's IT team installs on our computers. Could it have something to do with firewalls? |
Unfortunately I do not have a Windows machine at hand, so I will not be able to help much with debugging. Just giving random ideas/advice hoping to find something... 😅 If you try again with Linux+Windows combination with |
PS: since you are using osBrain over a LAN, firewalls can definitely influence the result. It could be possible that your Windows firewall is affecting somehow the communications. Although the fact that you got a Maybe binding to a specific address ( |
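One way to check whether a firewall is in the way, independently of osBrain, is to test plain TCP reachability of the nameserver port from the agent machine. The following is a minimal sketch using only the standard library; the function name `can_connect` is hypothetical, and a `False` result does not say which of several causes (firewall, wrong IP, nothing listening) is responsible.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (often a firewall silently
        # dropping packets) and unreachable networks.
        return False
```

With the values from this thread, running `can_connect('192.168.0.42', 5020)` on the Windows machine while the nameserver is up on the Linux machine would be the first thing to try. If it returns `False` there but `True` from another Linux machine, the Windows firewall becomes a likely suspect.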
I was able to reproduce the error, but I'm still not sure why it's happening. The main thing I've found is that binding to Apart from that, there seems to be a weird issue when connecting to a Windows host (either from a Linux or a Windows agent), where the connection works just fine, but the nameserver is unable to shut down the agent, and they both block waiting for a response, rather than shutting down. Shutting down the agents before the name server works OK. I'm not sure if this is the same issue or not. The summary is:
I will need to look deeper into the code to see what is going on with these |
It's been a while, but I think I managed to track down the issue. I believe we are registering agents the wrong way. When an agent is registered in the nameserver, it uses its own local address (usually This is also the problem when we bind to We can overcome these problems by always specifying our IP address when A slight modification in ...
# Join nameserver, if can't find then try again
while True:
    try:
        # Activate and register with server
        print('Registering Agent with server...')
        addr = '192.168.0.49:5021'  # Or whatever IP your agent has
        agent_proxy = run_agent('Agent1', ns_addr, addr)
...
@Peque I think we should consider this a bug and find some way to fix it. |
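For anyone who does not want to hard-code the agent's IP, here is a rough sketch of the same idea with the address worked out at runtime. Only `run_agent(name, ns_addr, addr)` comes from the snippet above. The helpers `guess_lan_ip` and `register_agent`, the retry delay, and the broad `except Exception` are assumptions made for illustration, not osBrain's API or documented behaviour.

```python
import socket
import time


def guess_lan_ip(peer_host):
    """Return the local IPv4 address the OS would use to reach peer_host.

    Connecting a UDP socket sends no packets; it only makes the OS pick a
    route. Falls back to loopback when no route is available.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect((peer_host, 9))  # port is arbitrary for this purpose
            return sock.getsockname()[0]
        except OSError:
            return '127.0.0.1'


def register_agent(name, ns_addr, agent_port, retry_delay=2.0):
    """Keep trying to register an agent that advertises an explicit address."""
    # Imported lazily so guess_lan_ip() can be used without osBrain installed.
    from osbrain import run_agent

    ns_host = ns_addr.split(':')[0]
    addr = '{}:{}'.format(guess_lan_ip(ns_host), agent_port)
    while True:
        try:
            print('Registering agent with server...')
            return run_agent(name, ns_addr, addr)
        except Exception as error:  # exact exception type is an assumption
            print('Could not register ({}); retrying...'.format(error))
            time.sleep(retry_delay)
```

Usage on the agent machine would look like `agent_proxy = register_agent('Agent1', '192.168.0.42:5020', 5021)`. On a machine with several network interfaces the guessed address may not be the one you want, in which case passing the IP explicitly as in the snippet above is safer.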
@ocaballeror Thanks for having a look at it. I am not sure we should consider this a bug. Having to specify the address explicitly when setting up a distributed architecture does not sound like a bad idea. An option would be to, by default, make agents bind to X.X.X.X instead of localhost if the nameserver was bound to X.X.X.X. But I am not sure that is a good idea. Opening ports by default to an external network may not be smart. It seems right now the problem is easily fixable in user's code. We should probably update the documentation though, to make it clearer. |
Hello,
Thank you for your development on this wonderful library. I have been using it for the last few years in work towards my PhD and very much appreciate the work you all do!
I have been running simulations with multiple agents connected to a localhost server on one computer for a while now. This works successfully. Now, my coworker and I are attempting to put the scripts on separate computers and have them communicate through a direct Ethernet connection via the OSBrain library, and we are having difficulty. We have one computer hosting the nameserver ("Host_Computer", a Linux OS) and the other computer attempting to register with it ("Agent_Computer", a Windows OS). We get the following errors on the Agent_Computer:
OSError: [WinError 10049] The requested address is not valid in its context
and
Pyro4.errors.CommunicationError: cannot connect to ('0.0.0.0', 5020): [WinError 10049] The requested address is not valid in its context
I have included the scripts I made to illustrate the problem, and attached a screenshot of the full stacktrace on the Agent_Computer (see first screenshot). The Host_Computer runs perfectly and simply "waits" for the Nameserver to be populated with the Agent_Computer (see second screenshot).
I'd just like to note that this code works perfectly fine if run on the same computer (such as with the Multirun plug-in in Pycharm). The Agent_Computer connects to the Host_Computer and all is well. We have only run into this issue when trying to connect on multiple machines.
Any advice you can provide would be great, thanks so much!
Sam
Host_Computer.py
Agent_Computer.py