Introducing break the glass as a principle #38
5. **Manageable "always"**

   Desired state is able to be updated according to users' SLA expectations to update system state, even if the "source" is unavailable.
@grmhay Thank you for the pull request.
Sorry if I'm misunderstanding you, but the PRINCIPLES.md section of this pull request...
> Manageable "always"
> Desired state is able to be updated according to users' SLA expectations to update system state, even if the "source" is unavailable.

... seems to assume that the source should be centrally managed or always managed with SLA expectations.
In the scenario that you described, it seems that GitHub is used for GitOps:
> We (Morgan Stanley) believe that the situation where the source of truth for desired state (e.g. github.com or a git-equivalent that an enterprise may run) is less available than your users' expected SLA for making configuration changes is being left by the community as an issue for the implementer to overcome.
> Put succinctly, if Github is unavailable and you want to make changes to your System State, there should be one approach and a set of tooling to allow reconciliation after the fact.
> This will both harm adoption of gitops and is inefficient as I believe we shared a common challenge that we can solve once within the project.
> The first step, as this project has so well established, is a glossary of terms to allow us to describe the problem and a draft principle to add. I have included these in this PR.
My concern is that the proposed principle, as written, seems to presuppose GitOps only running as a centralized system and always managed with an SLA.
While GitHub can be centrally managed with an SLA, Git isn't centrally managed at all.
The proposed principle, as written, seems to exclude non-centralized usages of GitOps, Git, Kubernetes, etc.
While GitOps doesn't require Git, I am listing Git below because you referenced Git earlier...
• Git, by design, is a distributed revision control system (DVCS), and not managed as a centralized system
Since we are discussing principles, which need to be applicable in many scenarios: centralized management wouldn't work in disconnected scenarios, such as:
• Kubernetes on fighter jets, e.g. https://www.cncf.io/blog/2021/09/30/how-to-get-robust-gitops-the-u-s-department-of-defense-uses-flux-and-helm/
• Kubernetes at in-store point-of-sale systems, e.g. https://www.cncf.io/blog/2021/02/19/how-a-4-billion-retailer-built-an-enterprise-ready-kubernetes-platform-powered-by-linkerd/
• Kubernetes in air-gapped environments, e.g. https://github.com/cncf/cnf-testsuite/blob/main/AIRGAP.md
• Kubernetes at the edge, e.g. https://www.cncf.io/blog/2021/05/04/kubernetes-at-the-edge-organizations-are-using-edge-technologies-but-there-is-room-to-grow/
While GitOps doesn't require Kubernetes, I listed Kubernetes in links above because Kubernetes is a CNCF project.
Hi @lloydchang. Appreciate your feedback, and apologies for the delay in replying: KubeCon, then a couple of days off. I work in a large enterprise without disconnected scenarios, so it is great to collaborate with someone who has a different perspective!
Reflecting on principle #3, "Software agents automatically pull the desired state declarations from the source": our problem arises when the desired state in the "source" on the "state store" (terms used, I believe, per the Glossary) is less available than the SLA users expect for changing the desired state of the "Software System".
I put this question to the end user organization presenters at GitOpsCon. Judging by most of their answers, the problem is either ignored ("well, if Git/GitLab/... is down, we can't make cluster changes") or unsolved, and I believe that will end up in a bad place for GitOps.
Actually, with your example of the disconnected scenario, doesn't the problem I outline for the enterprise become even more acute? What happens if you are seeking to update the desired state of a Kubernetes cluster (an example software system) but the "state store" is unavailable (e.g. the WAN connection to a branch office holding the cluster is down)? You just can't change the cluster config? Or you break glass and change the cluster config, and then you are left to manually reconcile the desired state expression on the "state store" with what is on your cluster.
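To make that reconciliation step concrete, here is a minimal Python sketch. It is not taken from this PR or from any GitOps tool. It assumes the desired state from the "state store" and the live state read from the cluster have both already been loaded as plain nested dicts with string keys; the function name `drift` and the sample values are hypothetical. It lists every field where the cluster no longer matches the state store, which is roughly what would have to be committed back after a break-glass change.

```python
# Hypothetical sketch: find fields changed on the cluster during a break-glass
# event so they can be written back to the state store afterwards.

MISSING = object()  # sentinel for "key absent on this side"


def drift(desired: dict, live: dict, prefix: str = "") -> dict:
    """Return {dotted_path: (desired_value, live_value)} for each differing leaf."""
    changes = {}
    for key in sorted(set(desired) | set(live)):
        path = f"{prefix}.{key}" if prefix else str(key)
        d = desired.get(key, MISSING)
        l = live.get(key, MISSING)
        if isinstance(d, dict) and isinstance(l, dict):
            # Both sides are mappings: recurse to compare their fields.
            changes.update(drift(d, l, path))
        elif d != l:
            changes[path] = (d, l)
    return changes


# Assumed sample data: what the state store declares vs. what an operator
# left on the cluster while the state store was unreachable.
desired_state = {"spec": {"replicas": 3, "image": "app:1.0"}}
live_state = {"spec": {"replicas": 5, "image": "app:1.0", "paused": True}}

changes = drift(desired_state, live_state)
print(sorted(changes))  # ['spec.paused', 'spec.replicas']
```

A real implementation would also have to decide which side wins for each field; this sketch only reports the differences.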
Note: I also have to fix my commits to have DCO signoff, so I'll amend my commit based on your feedback. Please continue the conversation on my new PR.
@grmhay Generally, re: DCO, I found the following useful. Thank you! @scottrigby wrote in gitops-working-group/gitops-working-group#117
Then, specifically for this #38, from https://github.com/open-gitops/documents/pull/38/checks?check_run_id=3876293036
TL;DR: Break Glass was already added in the RC1 draft at #21, then subsequently removed in the RC2 draft at #22.
Thank you @grmhay for your pull request. I hope my feedback is useful.
It appears the topics "break the glass" and "Break Glass" were discussed in past pull requests #21 and #22, and in the meetings of July 28th, July 7th, and May 19th. Thank you for your patience, because the meeting recordings after May 5th haven't been uploaded yet.
Below are more details:
From @todaywasawesome at #22
Related, below are from meeting notes in https://docs.google.com/document/d/1hxifmCdOV5_FbKloDJRWZQHq0ge-trXJKF-BgV4wHVk/edit
Meeting recordings are uploaded to https://www.youtube.com/channel/UCI6iqYuuI4gZuOCZaks5i1g/videos Status of meeting recordings: My understanding from @scottrigby via #wg-gitops Slack channel at https://cloud-native.slack.com/archives/C01G9DEE88M/p1634261795135100 is:
Thank you @chris-short and @scottrigby for your time for uploading meeting recordings.
Will open a new PR now I know what a DCO signoff is! Sorry...