Does benchmark_pipe support ibv transport and cuda channel? #452
Comments
It should support all combinations of everything, don't pay attention to the help string. :)
When I use
Perhaps you didn't build TensorPipe with InfiniBand support, or the InfiniBand support didn't detect the right hardware/software requirements on your machine and decided to turn itself off. You can check the latter by launching with
After add
I think this may be because I am running in a Docker container (the host supports RDMA) and have not configured the RDMA NIC driver correctly.
Hi, @lw
On client side:
To me this looks like your network isn't set up correctly. I can't help you with that. You should check this yourself or with your administrator, and there are standard diagnostic tools that can help you.
Thanks. Actually, my RDMA (RoCE) network is connected; I tested the connectivity with ib_send_bw.
Hi @lw, I solved this ibv transport error by changing the default kGlobalIdentifierIndex to 3, since GID 3 is available in my environment. It would be great if TensorPipe could automatically detect the available GID or allow users to set it.
Glad to see you figured out the issue! I'm not familiar enough with ibverbs to know how to auto-determine the GID. If you know how and would like to submit a patch, that'd be great!
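For anyone picking up the auto-detection idea, here is a minimal sketch of one possible heuristic: keep the configured index if that GID slot is populated, otherwise fall back to the first populated slot. The names `Gid`, `isZeroGid`, and `pickGidIndex` are hypothetical and not part of TensorPipe. The sketch assumes the GID table has already been read into memory, for example by calling `ibv_query_gid` for every index below the `gid_tbl_len` reported by `ibv_query_port`. It also assumes that an all-zero GID means "unpopulated", and it ignores the RoCE v1/v2 GID type, which a real patch would probably need to consider.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// A GID is 16 raw bytes, the same size as ibverbs' `union ibv_gid`.
using Gid = std::array<std::uint8_t, 16>;

// Assumption: an unpopulated GID table slot reads back as all zeros.
inline bool isZeroGid(const Gid& gid) {
  return std::all_of(gid.begin(), gid.end(),
                     [](std::uint8_t b) { return b == 0; });
}

// Hypothetical helper (not TensorPipe API): return `preferred` if that slot
// is populated, otherwise the lowest populated index, otherwise nullopt.
inline std::optional<std::size_t> pickGidIndex(const std::vector<Gid>& table,
                                               std::size_t preferred = 0) {
  if (preferred < table.size() && !isZeroGid(table[preferred])) {
    return preferred;
  }
  for (std::size_t i = 0; i < table.size(); ++i) {
    if (!isZeroGid(table[i])) {
      return i;
    }
  }
  return std::nullopt;
}
```

With a table where only slot 3 is populated, as in the environment described above, `pickGidIndex(table)` would yield 3 without any manual change to the default. Returning `std::nullopt` when every slot is empty lets the caller disable the ibv transport instead of failing later at connection time.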
It seems like benchmark_pipe only supports the [shm|uv] transports and the [basic] channel. Does benchmark_pipe also support the ibv transport and the cuda channel? Is there a complete example of the different transport and channel combinations, including CPU-CPU and GPU-GPU communication?