[v1.19.x] prov/efa: Use long CTS protocol when memory registration limits are reached #9623

Merged 11 commits on Dec 6, 2023

sunkuamzn
Contributor

Cherry-pick of #9493 and #9533 to v1.19.x branch

FI_ENOMR is returned when the hardware memory registration limit is
reached.

Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit 72a3c31)

The READ_NACK feature is checked before sending an EFA_RDM_READ_NACK_PKT
packet. A receiver sends the EFA_RDM_READ_NACK_PKT packet when it
fails to register a buffer to receive the RDMA read data in the long read
or runting read protocol.

Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit 25b8636)
…ENOMR

The long read protocol can fail with ENOMR if the EFA provider is unable
to register the buffer with the NIC. In that case, we should fall back
to the long CTS protocol instead.

This commit handles the case where the sender fails to register the
source buffer. The sender then switches to the long CTS protocol.

Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit 3810dd0)
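
The sender-side fallback described in the commit above can be sketched in a few lines of Python. This is only an illustrative model, not the provider's C code: `Protocol`, `RegistrationLimitReached`, `choose_sender_protocol`, and the `register_source_buffer` callable are hypothetical names that stand in for the FI_ENOMR condition and the resulting switch to long CTS.

```python
from enum import Enum


class Protocol(Enum):
    LONG_READ = "long read"
    RUNTING_READ = "runting read"
    LONG_CTS = "long CTS"


class RegistrationLimitReached(Exception):
    """Hypothetical stand-in for FI_ENOMR: the NIC cannot register more memory."""


def choose_sender_protocol(preferred, register_source_buffer):
    """Return the protocol the sender ends up using for a read-based transfer.

    register_source_buffer is a hypothetical callable that raises
    RegistrationLimitReached when the source buffer cannot be registered.
    """
    try:
        register_source_buffer()
    except RegistrationLimitReached:
        # Registration failed, so fall back to long CTS.
        return Protocol.LONG_CTS
    # Registration succeeded, so keep the read-based protocol.
    return preferred
```

With a callable that raises `RegistrationLimitReached`, `choose_sender_protocol(Protocol.LONG_READ, ...)` returns `Protocol.LONG_CTS`; with one that succeeds, the preferred protocol is kept.
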
This change is required for the long read NACK protocol, where we get the
msg_id from the ope (operation entry) instead of from the pke (packet entry).

Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit 8755e9c)
The long read protocol can fail with ENOMR if the EFA provider is unable
to register the buffer with the NIC. In that case, we should fall back
to the long CTS protocol.

This commit handles the case where the receiver fails to register the
destination memory. The receiver sends a NACK packet (packet type
EFA_RDM_READ_NACK_PKT) to the sender, and the sender switches to the long
CTS protocol.

Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit 59e0e1f)
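
The receiver-side flow in the commit above (registration fails, the receiver sends a NACK, the sender switches to long CTS) can be modeled roughly as follows. This is a hedged sketch rather than the provider's implementation: the `Sender` and `Receiver` classes, their methods, and the `peer_supports_read_nack` flag are hypothetical, and raising an error when the peer lacks the READ_NACK feature is an assumption, since the commit messages only say the feature is checked before the packet is sent.

```python
from dataclasses import dataclass

READ_NACK_PKT = "EFA_RDM_READ_NACK_PKT"  # packet type named in the commit message


@dataclass
class Sender:
    protocol: str = "long read"

    def on_packet(self, packet_type: str) -> None:
        # A NACK from the receiver makes the sender switch to long CTS.
        if packet_type == READ_NACK_PKT:
            self.protocol = "long CTS"


@dataclass
class Receiver:
    can_register: bool  # False models FI_ENOMR on the destination buffer

    def on_read_request(self, sender: Sender, peer_supports_read_nack: bool) -> str:
        if self.can_register:
            return "read"  # registration succeeded, proceed with the RDMA read
        if not peer_supports_read_nack:
            # Assumption of this sketch: the source does not say what happens here.
            raise RuntimeError("peer does not support READ_NACK")
        sender.on_packet(READ_NACK_PKT)
        return "nack"
```

The same shape applies to the runting read case: only the starting value of `Sender.protocol` differs.
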
…ENOMR

The runting read protocol can fail with ENOMR if the EFA provider is unable
to register the buffer with the NIC. In that case, we should fall back
to the long CTS protocol instead.

This commit handles the case where the sender fails to register the
source buffer. The sender then switches to the long CTS protocol.

Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit a29640f)
The runting read protocol can fail with ENOMR if the EFA provider is unable
to register the buffer with the NIC. In that case, we should fall back
to the long CTS protocol.

This commit handles the case where the receiver fails to register the
destination memory. The receiver sends a NACK packet (packet type
EFA_RDM_READ_NACK_PKT) to the sender, and the sender switches to the long
CTS protocol.

Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit a5b7e59)
Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit 592dd68)
This commit introduces common functions that can be used to directly
manipulate memory registrations on EFA devices using the verbs API.

Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit 6ad6a83)
Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit d3480a5)
Add a pingpong test that exhausts MRs on both client and server.

This test first exhausts MRs on the server and runs a pingpong test.
It then exhausts MRs on the client and runs another pingpong test.

A pytest hook is used to run this test after all other tests, to prevent
MR exhaustion from affecting other tests.

Signed-off-by: Sai Sunku <[email protected]>
(cherry picked from commit a715487)
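
One way a pytest hook can push a test to the end of the run, as the commit above describes, is to reorder the collected items in `pytest_collection_modifyitems`. The sketch below would live in a `conftest.py`. The marker name `run_last` is hypothetical, and the PR's actual hook may select the MR exhaustion test differently.

```python
def pytest_collection_modifyitems(session, config, items):
    """Move tests marked with the hypothetical 'run_last' marker to the end.

    pytest expects this hook to reorder `items` in place, so the list is
    rewritten with slice assignment instead of being rebound.
    """
    last = [item for item in items if item.get_closest_marker("run_last")]
    rest = [item for item in items if not item.get_closest_marker("run_last")]
    items[:] = rest + last
```

A test would opt in with `@pytest.mark.run_last`; the marker would also need to be registered in the pytest configuration to avoid an unknown-marker warning.
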
@sunkuamzn requested a review from a team on December 5, 2023, 18:28
@shijin-aws
Contributor

bot:aws:retest

@shijin-aws merged commit 32a60f8 into ofiwg:v1.19.x on Dec 6, 2023
11 checks passed