This controller checks the status of etcd and restarts control plane components which are in a state of crashloop-backoff over an extensive period of time.

gardener-ci-robot/dependency-watchdog

 
 

Dependency Watchdog

Overview

Dependency Watchdog actively monitors critical services for disruption and recovery. When a critical service is disrupted, it prevents cascading failure by conservatively scaling down the configured dependent resources. When a critical service has just recovered, it expedites the recovery of dependent services and pods.

Avoiding cascading failure is handled by the Prober component, and expediting the recovery of dependent services and pods is handled by the Weeder component. The two components are deployed separately, each as its own pod.
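The division of work between the two components can be illustrated with a minimal Go sketch. It is not taken from the dependency-watchdog code base: the Component and Reaction types, the react function, and the action strings are hypothetical names introduced only for illustration, and the real components watch live cluster state rather than a pair of booleans.

```go
package main

import "fmt"

// Component names the part of Dependency Watchdog that reacts to a
// health transition. Hypothetical type, used only in this sketch.
type Component string

const (
	None   Component = "none"
	Prober Component = "prober"
	Weeder Component = "weeder"
)

// Reaction pairs the responding component with the action that the
// overview attributes to it. Hypothetical type.
type Reaction struct {
	Component Component
	Action    string
}

// react maps a health transition of a critical service to a reaction:
// healthy -> unhealthy is a disruption (Prober scales dependents down),
// unhealthy -> healthy is a recovery (Weeder expedites dependent recovery),
// and an unchanged state needs no reaction in this simplified model.
func react(wasHealthy, isHealthy bool) Reaction {
	switch {
	case wasHealthy && !isHealthy:
		return Reaction{Prober, "scale down dependent resources"}
	case !wasHealthy && isHealthy:
		return Reaction{Weeder, "expedite recovery of dependent pods"}
	default:
		return Reaction{None, "no action"}
	}
}

func main() {
	transitions := []struct{ was, is bool }{
		{true, false}, // disruption
		{false, true}, // recovery
		{true, true},  // steady state
	}
	for _, t := range transitions {
		r := react(t.was, t.is)
		fmt.Printf("healthy %v -> %v: %s (%s)\n", t.was, t.is, r.Action, r.Component)
	}
}
```

Keeping the decision in one pure function makes the split visible at a glance; in the actual project the two responsibilities live in separately deployed pods.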

Current Limitation & Future Scope

In the current offering, the Prober is tailored to a single use case: kube-apiserver connectivity. Its usage can, however, be extended to similar scenarios in which the components involved are different.

Start using or developing the Dependency Watchdog

See our documentation in the /docs directory of this repository, starting with the index.

Feedback and Support

We always look forward to active community engagement.

Please report bugs, or suggest how we can enhance dependency-watchdog to address additional recovery scenarios, via GitHub issues.

Languages

  • Go 94.9%
  • Shell 3.2%
  • Makefile 1.8%
  • Dockerfile 0.1%