fleet: init fleet webhook and ensure single fleet per namespace #409

Closed
@Xieql Xieql commented Oct 23, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR introduces validation for the Fleet custom resource using webhooks. The primary changes include:

  • Fleet Validation:

    • Added a check to ensure that there is only one Fleet instance per namespace. See "Limit Fleet CRD to One Instance per Namespace" (#382).
    • The kind field of each cluster entry can only be Cluster, AttachedCluster, or CustomCluster. Any other value is rejected as invalid.
  • Unit Tests:

    • Added unit tests to validate the logic in the Fleet webhook.
    • Created sample YAML files for both valid and invalid Fleet configurations to aid testing.

Here is the unit test output:

=== RUN   TestValidFleetValidation
--- PASS: TestValidFleetValidation (0.00s)
=== RUN   TestValidFleetValidation/..\..\examples\fleet\fleet.yaml
    --- PASS: TestValidFleetValidation/..\..\examples\fleet\fleet.yaml (0.00s)
=== RUN   TestValidFleetValidation/..\..\examples\fleet\metric\metric-plugin.yaml
    --- PASS: TestValidFleetValidation/..\..\examples\fleet\metric\metric-plugin.yaml (0.00s)
=== RUN   TestValidFleetValidation/..\..\examples\fleet\metric\monitor-demo\avalanche.yaml
    --- PASS: TestValidFleetValidation/..\..\examples\fleet\metric\monitor-demo\avalanche.yaml (0.00s)
=== RUN   TestValidFleetValidation/..\..\examples\fleet\policy\badpod-demo\badpod.yaml
    --- PASS: TestValidFleetValidation/..\..\examples\fleet\policy\badpod-demo\badpod.yaml (0.00s)
=== RUN   TestValidFleetValidation/..\..\examples\fleet\policy\kyverno.yaml
    --- PASS: TestValidFleetValidation/..\..\examples\fleet\policy\kyverno.yaml (0.00s)
=== RUN   TestValidFleetValidation/..\..\examples\fleet\quickstart.yaml
    --- PASS: TestValidFleetValidation/..\..\examples\fleet\quickstart.yaml (0.00s)
=== RUN   TestInvalidFleetValidation
--- PASS: TestInvalidFleetValidation (0.00s)
=== RUN   TestInvalidFleetValidation/testdata\fleet\invalid-cluster-kind.yaml
    fleet_webhook_test.go:47: Fleet.fleet.kurator.dev "quickstart" is invalid: spec.clusters[1].kind: Invalid value: "invalid-cluster-kind": unsupported cluster kind; please use AttachedCluster to manage your own cluster
    --- PASS: TestInvalidFleetValidation/testdata\fleet\invalid-cluster-kind.yaml (0.00s)
=== RUN   TestInvalidFleetValidation/testdata\fleet\miss-cluster-name-kind.yaml
    fleet_webhook_test.go:47: Fleet.fleet.kurator.dev "quickstart" is invalid: [spec.clusters[0].name: Required value: name is required, spec.clusters[1].kind: Required value: kind is required]
    --- PASS: TestInvalidFleetValidation/testdata\fleet\miss-cluster-name-kind.yaml (0.00s)
PASS
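
For readers without the diff at hand, the cluster checks exercised by the tests above could look roughly like the following. This is a minimal, stdlib-only sketch and not the PR's code: `clusterRef`, `supportedKinds`, and `validateClusters` are hypothetical names, and the real webhook reports Kubernetes field errors rather than plain `error` values.

```go
package main

import "fmt"

// clusterRef is a hypothetical stand-in for one entry of spec.clusters.
type clusterRef struct {
	Name string
	Kind string
}

// supportedKinds holds the three kinds named in the PR description.
var supportedKinds = map[string]bool{
	"Cluster":         true,
	"AttachedCluster": true,
	"CustomCluster":   true,
}

// validateClusters returns one error for every missing name, missing kind,
// or unsupported kind, using the entry's index in the error path.
func validateClusters(clusters []clusterRef) []error {
	var errs []error
	for i, c := range clusters {
		if c.Name == "" {
			errs = append(errs, fmt.Errorf("spec.clusters[%d].name: Required value: name is required", i))
		}
		switch {
		case c.Kind == "":
			errs = append(errs, fmt.Errorf("spec.clusters[%d].kind: Required value: kind is required", i))
		case !supportedKinds[c.Kind]:
			errs = append(errs, fmt.Errorf("spec.clusters[%d].kind: Invalid value: %q: unsupported cluster kind", i, c.Kind))
		}
	}
	return errs
}

func main() {
	errs := validateClusters([]clusterRef{
		{Name: "a", Kind: "Cluster"},
		{Name: "b", Kind: "invalid-cluster-kind"},
	})
	for _, err := range errs {
		fmt.Println(err)
	}
}
```

Collecting every error instead of returning on the first one matches the aggregated message shown in the miss-cluster-name-kind.yaml case.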

Which issue(s) this PR fixes:

part of #404
part of #336
Fixes #382

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

fleet: init fleet webhook and ensure single fleet per namespace

@netlify

netlify bot commented Oct 23, 2023

Deploy Preview for kurator-dev canceled.

Name Link
🔨 Latest commit d6f7e99
🔍 Latest deploy log https://app.netlify.com/sites/kurator-dev/deploys/65361d38b5c1220008b5862e

@kurator-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kevin-wangzefeng for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

return nsLocks[ns]
}

func (wh *FleetWebhook) ValidateUpdate(_ context.Context, oldObj, newObj runtime.Object) error {
Member

For updates, we should prevent changes to fields whose update we do not support. This needs to be reviewed field by field.
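
One possible shape for such an update check is sketched below. It is only an illustration under stated assumptions: the `spec` type and its `Region` and `Replicas` fields are hypothetical and do not come from the Fleet API, and which Fleet fields should actually be immutable is still to be decided.

```go
package main

import (
	"errors"
	"fmt"
)

// spec is a hypothetical, simplified object spec used only for illustration.
type spec struct {
	Region   string // hypothetical field treated as immutable
	Replicas int    // hypothetical field that may change freely
}

// validateUpdate compares the old and new spec and rejects changes to
// fields that are treated as immutable. It returns nil when the update is allowed.
func validateUpdate(oldSpec, newSpec spec) error {
	var errs []error
	if oldSpec.Region != newSpec.Region {
		errs = append(errs, fmt.Errorf(
			"spec.region: Forbidden: field is immutable (was %q, got %q)",
			oldSpec.Region, newSpec.Region))
	}
	// errors.Join returns nil when errs is empty.
	return errors.Join(errs...)
}

func main() {
	err := validateUpdate(spec{Region: "a", Replicas: 1}, spec{Region: "b", Replicas: 1})
	fmt.Println(err)
}
```

Each immutable field would get its own comparison, which keeps the field-by-field review explicit in the code.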


// Check if Fleet instance already exists in the namespace
existing := &v1alpha1.Fleet{}
if err := wh.Client.Get(ctx, client.ObjectKey{Namespace: in.Namespace, Name: in.Name}, existing); err == nil {
Member

Not sure I understand this check.

Member

Kubernetes already prevents creating a Fleet with the same name in the same namespace, so a Get by name adds nothing here.
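
To make the point concrete: a singleton check would have to look at every Fleet in the namespace rather than fetch the incoming name. The sketch below assumes a hypothetical `lister` interface standing in for a client List call; `fleet`, `fakeLister`, and `validateSingleton` are illustrative names, not code from this PR.

```go
package main

import "fmt"

// fleet is a hypothetical, minimal stand-in for the Fleet object.
type fleet struct {
	Namespace string
	Name      string
}

// lister is a hypothetical abstraction over "list all Fleets in a namespace".
type lister interface {
	ListFleets(namespace string) []fleet
}

// fakeLister is an in-memory lister used for the example.
type fakeLister []fleet

func (f fakeLister) ListFleets(namespace string) []fleet {
	var out []fleet
	for _, item := range f {
		if item.Namespace == namespace {
			out = append(out, item)
		}
	}
	return out
}

// validateSingleton rejects the incoming Fleet if the namespace already
// holds a Fleet with a different name. A Fleet with the same name is left
// to the API server, which rejects duplicate names on its own.
func validateSingleton(l lister, in fleet) error {
	for _, existing := range l.ListFleets(in.Namespace) {
		if existing.Name != in.Name {
			return fmt.Errorf("namespace %q already contains fleet %q", in.Namespace, existing.Name)
		}
	}
	return nil
}

func main() {
	existing := fakeLister{{Namespace: "default", Name: "quickstart"}}
	fmt.Println(validateSingleton(existing, fleet{Namespace: "default", Name: "second"}))
}
```

Note that a list-then-admit check like this is still open to a race between two concurrent creates, which is what the locking discussion in this review is about.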


// Utility function to get or create a mutex for a namespace
func getOrCreateMutexForNamespace(ns string) *sync.Mutex {
if _, exists := nsLocks[ns]; !exists {
Member

Data race: nsLocks is read and written concurrently without synchronization.
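
A minimal way to remove the race, assuming nsLocks is a package-level `map[string]*sync.Mutex` as the snippet suggests, is to guard the map itself with a second mutex. The name `nsLocksMu` is hypothetical, and this only fixes the in-process race.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	// nsLocksMu guards nsLocks; the name is hypothetical.
	nsLocksMu sync.Mutex
	// nsLocks maps a namespace to its mutex (assumed shape of the original map).
	nsLocks = map[string]*sync.Mutex{}
)

// getOrCreateMutexForNamespace returns the mutex for ns, creating it on
// first use. The map lookup and insert happen under nsLocksMu, so
// concurrent callers never read or write the map unsynchronized.
func getOrCreateMutexForNamespace(ns string) *sync.Mutex {
	nsLocksMu.Lock()
	defer nsLocksMu.Unlock()
	if _, exists := nsLocks[ns]; !exists {
		nsLocks[ns] = &sync.Mutex{}
	}
	return nsLocks[ns]
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m := getOrCreateMutexForNamespace("default")
			m.Lock()
			defer m.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println("namespaces tracked:", len(nsLocks))
}
```

Running the original and this version under `go run -race` should show the difference; a `sync.Map` would be another option.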


// Ensure only one Fleet instance in a namespace
mutex := getOrCreateMutexForNamespace(in.Namespace)
mutex.Lock()
Member

The lock only serializes creation within a single process; it does not help when the webhook runs with multiple replicas.

One option, as I said in the original issue, is to use a distributed lock.

@hzxuzhonghu
Member

Maybe we can use a resource quota: install the fleet manager together with a ResourceQuota.

@Xieql
Contributor Author

Xieql commented Oct 26, 2023

maybe we can use resource quota, install fleet manager with a resourceQuota

ok

@Xieql
Contributor Author

Xieql commented Nov 24, 2023

Given that the current design, even combined with the resourceQuota settings, is inadequate for ensuring a single fleet per namespace (as detailed in Issue #382), I am closing this pull request.

@Xieql Xieql closed this Nov 24, 2023