Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix runtime error: comparing uncomparable type #18893

Merged
merged 1 commit into from
Nov 26, 2024

Conversation

ktalg
Copy link
Contributor

@ktalg ktalg commented Nov 14, 2024

Fixes a runtime panic that occurs when KeepAlive is called with a Context implemented by an uncomparable type, which is later canceled. The panic message is:

panic: runtime error: comparing uncomparable type lease_test.uncomparableCtx

goroutine 380 [running]:
go.etcd.io/etcd/client/v3.(*lessor).keepAliveCtxCloser(0xc000327720, {0x13416c8, 0xc00011b9b0}, 0x14c4932ade455905, 0xc002291420)
    /home/kevin/IdeaProjects/etcd/client/v3/lease.go:351 +0x1d6
created by go.etcd.io/etcd/client/v3.(*lessor).KeepAlive in goroutine 156
    /home/kevin/IdeaProjects/etcd/client/v3/lease.go:299 +0x554

To reproduce the issue, the existing test has been updated to include a custom uncomparable Context implementation. Previously, this test consistently caused a panic due to the uncomparable Context type. The updated implementation modifies keepAliveCtxCloser to properly handle such cases.

@k8s-ci-robot
Copy link

Hi @ktalg. Thanks for your PR.

I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ktalg
Copy link
Contributor Author

ktalg commented Nov 16, 2024

Could you kindly review this? @serathius

@ktalg
Copy link
Contributor Author

ktalg commented Nov 19, 2024

Just a friendly reminder to take a look at this PR. Let me know if there’s anything I need to adjust. @ahrtr @jmhbnz @ivanvc @fuweid

}

ctx, cancel := context.WithCancel(context.Background())
rc, kerr := lapi.KeepAlive(uncomparableCtx{Context: ctx}, resp.ID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The easiest workaround is to pass in a pointer &uncomparableCtx{Context: ctx}.

@ahrtr
Copy link
Member

ahrtr commented Nov 19, 2024

Thanks for raising this PR. The root cause is that the etcd client SDK tries to compare two context intances, which might not be comparable.

if c == ctx {

Is this your a real use case in your production code? The easiest workaround is to pass in a pointer of your customized context intance as mentioned in #18893 (comment).

We can attach an unique ID for each context passed to KeepAlive something like below, and compare the IDs when we need to identify the context,

const uniqueContextKey = "uniqueContextKey"
func withUniqueID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, uniqueContextKey, id)
}

func getUniqueID(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(uniqueContextKey).(string)
	return id, ok
}

We also need to review the source code to check if there are other places which compare the context as well.

@ahrtr
Copy link
Member

ahrtr commented Nov 19, 2024

Or we can add a function something like below,

func CompareContexts(ctx1, ctx2 context.Context, key string) bool {
    return ctx1.Value(key) == ctx2.Value(key)
}

I am not sure whether there is an existing linter which can detect direct context comparison like below case. Probably we can add a such linter and integrate into golangci-lint if possible? @ivanvc @mmorel-35 @ldez

if c == ctx {

@ldez
Copy link

ldez commented Nov 19, 2024

Hello,

I am not sure whether there is an existing linter which can detect direct context comparison like below case.

There is no linter on this specific case.

Probably we can add a such linter and integrate into golangci-lint if possible?

The contexts are just structs so they are technically comparable.

https://go.dev/play/p/dyfw2rkHfTi

I understand your current issue, I need to investigate to kown if there are "valid" use cases behind this kind of comparison.

@ahrtr
Copy link
Member

ahrtr commented Nov 19, 2024

There is no linter on this specific case.

Thanks for the confirmation.

The contexts are just structs so they are technically comparable.

Not really.

Firstly the applications (including etcd) interact with with interface context.Context
https://github.com/golang/go/blob/ed07b321aef7632f956ce991dd10fdd7e1abd827/src/context/context.go#L68

Secondly, struct types are comparable if all their field types are comparable. In other words, if any field isn't comparable, then the struct isn't comparable.

Usually the default implementations of contexts included golang std lib are comparable. But users' customized implementation might not be comparable.

@ldez
Copy link

ldez commented Nov 19, 2024

I'm aware of the interface context.Context, I was talking about the standard context.
Yes, you can create any kind of implementation, I think it's a bit niche, but anyway.

I still need to investigate the feasibility and the relevance of a linter on this topic.

@ahrtr
Copy link
Member

ahrtr commented Nov 19, 2024

It might not be reasonable to create a linter for this. Comparing structs or interfaces using an comparison operators (i.e. ==) seems not an anti-pattern. It's totally up to applications.

The problem in this case is that users may pass in a customized implementation of context.Context instead of using the existing implementations in golang std lib.

The simplest solution is as I mentioned above in #18893 (comment) and #18893 (comment). But it's the first time we see such issue, and most likely it isn't a real production use case. So it's low priority to me.

@ldez
Copy link

ldez commented Nov 19, 2024

FYI, I created a quick linter about context comparison to evaluate the relevance.

I analyzed several large projects, and I found no context comparison.

But I found 2 comparisons inside Go itself:

And as you said, struct comparison is not an anti-pattern.

For now, I think it's not worth adding a linter for this.

@ahrtr
Copy link
Member

ahrtr commented Nov 20, 2024

Thanks @ldez

@ktalg Could you please confirm if this is a real use case from your production environment or a hypothetical scenario?

The workaround is to pass in a pointer as mentioned in #18893 (comment)

@ktalg
Copy link
Contributor Author

ktalg commented Nov 21, 2024

@ahrtr

Thank you very much for reviewing this PR and providing detailed feedback!

Could you please confirm if this is a real use case from your production environment or a hypothetical scenario?

I encountered this issue while working on production-grade project code. The problem arose not from creating a new Context implementation but from passing a struct embedding a Context into the KeepAlive function, as illustrated in the test case I added.

This situation highlights two key points:

  1. When a struct embeds an interface type, it automatically becomes an implementation of that interface. https://go.dev/doc/effective_go#embedding
  2. The == operator is not a guaranteed valid operation for interface types; its behavior depends on the specific implementation. https://go.dev/ref/spec#Comparison_operators

I didn't anticipate that KeepAlive would internally use == to compare the type I passed in. My perspective is that robust code should rely solely on the declared contract of an interface, not on implicit assumptions. In this case, it seems to "assume" that the interface implementation is always comparable, which isn't guaranteed.

I believe this change improves the robustness of the code, ensuring it adheres to these principles.

client/v3/lease.go Outdated Show resolved Hide resolved
Copy link

codecov bot commented Nov 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.71%. Comparing base (b1aa164) to head (8176118).
Report is 19 commits behind head on main.

Additional details and impacted files
Files with missing lines Coverage Δ
client/v3/lease.go 90.94% <100.00%> (+0.06%) ⬆️

... and 19 files with indirect coverage changes

@@           Coverage Diff           @@
##             main   #18893   +/-   ##
=======================================
  Coverage   68.71%   68.71%           
=======================================
  Files         420      420           
  Lines       35558    35560    +2     
=======================================
+ Hits        24433    24435    +2     
- Misses       9694     9700    +6     
+ Partials     1431     1425    -6     

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b1aa164...8176118. Read the comment docs.

client/v3/lease.go Outdated Show resolved Hide resolved
client/v3/lease.go Outdated Show resolved Hide resolved
client/v3/lease.go Outdated Show resolved Hide resolved
@ahrtr
Copy link
Member

ahrtr commented Nov 22, 2024

Link to #18935

Please also squash the commits. Also do you have time to backport this to 3.5 and 3.4?

@ahrtr
Copy link
Member

ahrtr commented Nov 22, 2024

/ok-to-test

@ahrtr
Copy link
Member

ahrtr commented Nov 22, 2024

Not sure why the workflow check codecov/project failed. Probably rebasing this PR might help? I see that this PR is based on an very old commit.

@ktalg
Copy link
Contributor Author

ktalg commented Nov 22, 2024

/retest

@ktalg
Copy link
Contributor Author

ktalg commented Nov 22, 2024

@ahrtr It seems I made a mistake when syncing with the upstream repository. This has now been fixed—thank you for pointing it out!

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks

Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, fuweid, ktalg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahrtr
Copy link
Member

ahrtr commented Nov 25, 2024

@ktalg can you update the changelog for both 3.5 and 3.4? See example below,

#18482

@ahrtr ahrtr merged commit 2abb334 into etcd-io:main Nov 26, 2024
37 checks passed
ktalg added a commit to ktalg/etcd that referenced this pull request Nov 26, 2024
@ktalg
Copy link
Contributor Author

ktalg commented Nov 26, 2024

@ahrtr done.should I also update the changelog for version 3.6?

@ahrtr
Copy link
Member

ahrtr commented Nov 26, 2024

should I also update the changelog for version 3.6?

No, 3.6 changelog only is supposed to only include new features in 3.6.

ktalg added a commit to ktalg/etcd that referenced this pull request Nov 26, 2024
ktalg added a commit to ktalg/etcd that referenced this pull request Nov 26, 2024
@ktalg ktalg deleted the fix-comparing-ctx branch November 26, 2024 13:57
@ktalg ktalg restored the fix-comparing-ctx branch November 26, 2024 13:58
@ktalg ktalg deleted the fix-comparing-ctx branch November 26, 2024 13:58
ktalg added a commit to ktalg/etcd that referenced this pull request Nov 26, 2024
…h a Context implemented by an uncomparable type etcd-io#18893

Signed-off-by: KT <[email protected]>
ahrtr added a commit that referenced this pull request Nov 27, 2024
CHANGELOG: Fix runtime panic that occurs when KeepAlive is called with a Context implemented by an uncomparable type #18893
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants