Handle 410 Gone response when watching resources #478
Conversation
Codecov Report

@@            Coverage Diff             @@
##             main     #478      +/-   ##
==========================================
+ Coverage   94.61%   94.98%   +0.36%
==========================================
  Files          29       30       +1
  Lines        3141     3909     +768
==========================================
+ Hits         2972     3713     +741
- Misses        169      196      +27
This looks great, thanks! Usually I would ask for tests, but I can't think of a clean way to test this without mocking out the whole call.
I left one small nitpick comment, but otherwise this looks good to go.
I've been thinking some more about this. I'm not confident we are handling this in the right place.
The guidance from Kubernetes is that clients should restart the watch from the most recent resource version when they get a 410. However, this proposal just raises an exception in watch() and handles it in wait(). This doesn't account for folks using watch() directly; they will just get a GoneError every now and then that they need to handle, but really we should be handling it for them.
I think we should move the while True into async_watch() and never raise a GoneError.
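A minimal sketch of what that restructuring could look like, assuming a hypothetical single-shot helper _watch_once() and the GoneError this PR introduces (none of these names are kr8s's actual internals):

```python
import logging

logger = logging.getLogger(__name__)


class GoneError(Exception):
    """Stand-in for the 410 Gone exception this PR introduces."""


async def async_watch(self, since=None):
    """Watch the resource, transparently restarting on 410 Gone."""
    resource_version = since
    while True:
        try:
            # _watch_once() is a hypothetical helper that performs a single
            # watch request and yields (type, object) events until the
            # server closes the connection.
            async for event_type, obj in self._watch_once(since=resource_version):
                resource_version = obj.metadata["resourceVersion"]
                yield event_type, obj
        except GoneError:
            # The requested resourceVersion has been compacted away;
            # restart the watch instead of surfacing the error to callers.
            logger.debug("Watch received 410 Gone, restarting")
            resource_version = None
```

With this shape, callers iterating watch() see an uninterrupted event stream and never need to catch GoneError themselves.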
Force-pushed from e0a9466 to fdb1dce.
Looks like tests are failing on older Python versions. I think you want to use an anyio task group instead of asyncio.TaskGroup() for backward compatibility (and trio support).
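For context, anyio exposes task groups through the anyio.create_task_group() factory rather than a TaskGroup constructor. A minimal sketch of the portable pattern (the worker coroutine is illustrative only):

```python
import anyio


async def worker(n: int) -> None:
    await anyio.sleep(0.1)
    print(f"worker {n} done")


async def main() -> None:
    # anyio.create_task_group() works on Pythons older than 3.11 and under
    # both asyncio and trio, unlike asyncio.TaskGroup(), which is 3.11+
    # and asyncio-only.
    async with anyio.create_task_group() as tg:
        tg.start_soon(worker, 1)
        tg.start_soon(worker, 2)


anyio.run(main)
```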
@jacobtomlinson thanks for your help, it was more complicated than I thought 😅
Force-pushed from eb92322 to 60aaeaf.
I really appreciate you taking the time to get really deep with that test. However, I'm worried that it digs so far into kr8s internals with mocking that it will become brittle in the future when we want to change things.
It's a little annoying that the default etcd behaviour is to cache for 5 mins, so we would need a test that runs for 5 mins to test this properly without mocks.
I think it's probably best to just merge this without a test to save maintenance issues in the future. I've also added some debug logging so that we can verify it is restarting for ourselves.
Closes #476
This MR:
- raises a GoneError when the watch request returns a 410 code,
- handles the GoneError raised while we are waiting on conditions for the object.
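As a rough illustration of that flow (the _check_conditions() helper and method bodies are placeholders, not kr8s's actual code; GoneError is the exception this PR adds):

```python
async def async_wait(self, conditions):
    """Wait for conditions on the object, restarting the watch on 410."""
    while True:
        try:
            async for _event_type, obj in self.async_watch():
                # _check_conditions() is a hypothetical helper that evaluates
                # the requested conditions against the latest object state.
                if self._check_conditions(obj, conditions):
                    return
        except GoneError:
            # The watch expired with 410 Gone; restart it and keep waiting.
            continue
```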