The NCCL backend's in-place reduce-scatter uses `sendbuf = recvbuf` here. Per the NCCL documentation, however, an in-place reduce-scatter should have `recvbuf` point at the appropriate offset into `sendbuf`, that is, `recvbuf = sendbuf + rank * recvcount` (see here). This appears to differ from the usual MPI semantics. Our current version works, but I'm not sure whether it is safe to use overlapping buffers without telling NCCL that we are doing so. We may also be missing some performance benefits.
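For reference, here is a minimal sketch of the pointer arithmetic the NCCL docs seem to expect for the in-place case. The helper name `inplace_reduce_scatter_recvbuf` is hypothetical (it is not part of NCCL or of this project), and the call site in the comment assumes `float` data and an already initialized communicator and stream:

```cpp
#include <cstddef>

// Hypothetical helper, for illustration only: returns the receive pointer
// that NCCL documents as the in-place case for ncclReduceScatter,
// i.e. recvbuff == sendbuff + rank * recvcount (offset counted in elements).
template <typename T>
T* inplace_reduce_scatter_recvbuf(T* sendbuf, std::size_t recvcount, int rank) {
  return sendbuf + static_cast<std::size_t>(rank) * recvcount;
}

// Sketch of a call site (assumes float data, and that comm, stream, rank,
// and recvcount are already set up by the backend):
//
//   float* recvbuf = inplace_reduce_scatter_recvbuf(sendbuf, recvcount, rank);
//   ncclReduceScatter(sendbuf, recvbuf, recvcount, ncclFloat, ncclSum,
//                     comm, stream);
```

With this layout each rank's result lands in its own slice of the send buffer rather than at the start of it, so callers that currently read the result from the beginning of the buffer would need to be adjusted.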