[chart/redis_ha][BUG] Startup probe has bad defaults #306

Open
hpfmn opened this issue Nov 22, 2024 · 0 comments

Describe the bug

The newly introduced startup probe gives Redis less time to start up than before, leading to crash loops for servers that take longer to start. The repeated crashes also fill up the PVC, because new temp-rdb files are constantly created.

The "old" liveness probes use these defaults:

    initialDelaySeconds: 30
    periodSeconds: 15
    timeoutSeconds: 15
    successThreshold: 1
    failureThreshold: 5

That allows Redis 30+(5*15)=105 seconds to start up. The new startup probe has these defaults:

    initialDelaySeconds: 5
    periodSeconds: 10
    timeoutSeconds: 15
    successThreshold: 1
    failureThreshold: 3

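By the failureThreshold * periodSeconds rule, that leaves 5+(3*10)=35 seconds of slack for Redis, or roughly 5+(3*15)=50 seconds if every probe attempt runs into the 15-second timeout. Either way this is less than what the liveness probe allows, and the opposite of what a startup probe is made for. To quote the Kubernetes documentation on startup probes: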

"The solution is to set up a startup probe with the same command, HTTP or TCP check, with a failureThreshold * periodSeconds long enough to cover the worst case startup time."

To Reproduce
Steps to reproduce the behavior:

  1. Have a Redis instance that needs a bit longer to start up
  2. Update to a version newer than 4.27.8, where the new startup probe is introduced
  3. Experience crash loops on the first new node that is rolled over
  4. See error

Expected behavior

Use at least the same defaults as for the liveness probe!
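
As a possible workaround until the defaults change, the startup probe could be overridden in the values file so that it gets at least the 105 seconds the liveness probe allows. This is only a sketch: the `redis.startupProbe` key path is an assumption and should be checked against the chart's own values.yaml. The numbers simply mirror the liveness defaults quoted above.

```yaml
# Hypothetical values override. The redis.startupProbe key path is an
# assumption, not confirmed against the chart's values.yaml.
redis:
  startupProbe:
    initialDelaySeconds: 30
    periodSeconds: 15
    timeoutSeconds: 15
    successThreshold: 1
    # 30 + (5 * 15) = 105 seconds, the same budget as the liveness probe
    failureThreshold: 5
```

For instances that need even longer, raising failureThreshold extends the budget by one periodSeconds per extra attempt without slowing down detection once Redis is up.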

hpfmn added the bug label on Nov 22, 2024