@lijh5 My guess is that the bottleneck is memory copy bandwidth: MPI needs a memory copy, at least on the receiver side, whereas the basic ib_send_bw test does not.
Performance issues with UCX UD transmission when packet size is 4K
bytes BW average[MB/sec]
2 11.17
4 23.28
8 46.95
16 93.43
32 187.06
64 374.52
128 750.02
256 1500.93
512 2992.35
1024 5976.25
2048 11844.51
4096 15569.28
Size Bandwidth (MB/s)
2 9.28
4 19.05
8 38.43
16 69.23
32 138.07
64 272.90
128 503.77
256 857.66
512 2063.33
1024 4201.60
2048 8411.68
4096 9389.34
Average bandwidth (MB/s)
9046.25
9864.50
9865.95
9863.96
9852.61
Average bandwidth (MB/s)
10858.72
10861.54
10852.11
10863.82
As can be seen, osu_bw performance is relatively low, especially at 4K, and testing with ucx_perftest shows no improvement either. Are there any optimization methods you would recommend?
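To put a number on the gap, here is a minimal sketch that divides the second per-size table by the first. It assumes the first table (header "bytes BW average[MB/sec]") is the ib_send_bw run and the second (header "Size Bandwidth (MB/s)") is the osu_bw run; the post does not label them, so treat that mapping as an assumption. The function name `ratios` is only illustrative.

```python
# Bandwidth figures copied from the two per-size tables in this post (MB/s).
# Assumption: the first table is the ib_send_bw run, the second is osu_bw.
sizes = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]
ib_send_bw = [11.17, 23.28, 46.95, 93.43, 187.06, 374.52,
              750.02, 1500.93, 2992.35, 5976.25, 11844.51, 15569.28]
osu_bw = [9.28, 19.05, 38.43, 69.23, 138.07, 272.90,
          503.77, 857.66, 2063.33, 4201.60, 8411.68, 9389.34]


def ratios(reference, measured):
    """Return measured/reference for each pair of entries."""
    return [m / r for r, m in zip(reference, measured)]


if __name__ == "__main__":
    # Print the fraction of the reference bandwidth reached at each size.
    for size, ratio in zip(sizes, ratios(ib_send_bw, osu_bw)):
        print(f"{size:>5} B: {ratio:.0%} of the reference bandwidth")
```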
Can ucx_perftest only test one message size at a time? If I want to test message sizes from 2 bytes to 4K, how should I write the command?
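One possible workaround, sketched here rather than confirmed, is to run ucx_perftest once per size from a small wrapper. The sketch assumes the client form `ucx_perftest <server> -t <test> -s <size>`, that `tag_bw` is the test of interest, and that a ucx_perftest server process is listening on the other host for every run. The hostname and the helper names (`message_sizes`, `perftest_command`, `run_sweep`) are hypothetical.

```python
import subprocess


def message_sizes(lo=2, hi=4096):
    """Yield powers of two from lo to hi inclusive."""
    size = lo
    while size <= hi:
        yield size
        size *= 2


def perftest_command(server, size, test="tag_bw"):
    # Assumed client syntax: -t selects the test, -s the message size.
    return ["ucx_perftest", server, "-t", test, "-s", str(size)]


def run_sweep(server):
    # One client invocation per size; assumes the server side is
    # (re)started or still listening before each run.
    for size in message_sizes():
        subprocess.run(perftest_command(server, size), check=True)
```

Calling `run_sweep("server-host")` with the real server name would then cover 2 through 4096 bytes in one go.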
Thank you very much!