-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lease read safety #10082
Comments
@pifuant let's say we have 5 nodes on term 2 and node 1 is the leader. Although node 5 can communicate with the other nodes it will not receive heartbeat AppendEntries RPC's from leader node 1 and will timeout thus become a candidate and start the election process. At this time node 5 will increment its term to 3 vote for itself and sends out RequestVote RPCs to the other nodes. Because it can still communicate with a majority of the nodes it could become the leader, which in this case we will assume. Once the remaining nodes increment to term 3 node 1 will remain in term 2 and all requests from it will be rejected by the nodes in term 3. When the new leader is elected it will reset all lease TTLs to its original value plus Lines 424 to 436 in e8b940f
Line 1959 in e8b940f
In v3.4 we will begin to optimize this prevent indefinite auto-renewal (see ref below) It is important to note that if entries were committed thus replicated to a majority of other nodes while node 1 was the leader and before node 5 was elected then node 5 will not contain all records of term 2 required to be elected in term 3. So, in this case, node 5 will not be voted for by the remaining nodes. You can read section 5.4 of the raft paper for more details on safety. ref: #9924, #7320, op-guide: leader-failure, raft paper I hope this helps shed a little light on how etcd would handle this situation. Lease data is still safe. |
@hexfusion thanks Important assumption: node1 is leader , node1 and node 5 has identical log, node 1 can't communicate with node 5 Node 1 don't know that node 5 has became leader and don't know TTL reset, because nobody can tell Node 1 all the thing. Unluckily, if node 1 receives a read request, node 1 just think it is leader and it will read local data in valid lease time. |
I think raft lease read is unsafe, but lease read in "paxos made live" is safe. |
no. l-read needs acks from the quorum. |
@xiang90 no, node 1 don't need acks from the quorum |
@hexfusion @xiang90 please take a look again, I really think raft lease read don't promise safety, unless etcd did extra thing for it. |
@pifuant If you have a question about raft's correctness, probably you should start the discussion in the raft mailing list: https://groups.google.com/forum/#!forum/raft-dev |
@xiang90 thanks |
I have a question about the etcd raft lease Read's safety. etcd-io/raft#166. |
120613: raft: remove unused read-only requests r=nvanbenschoten a=pav-kv This PR removes the read-only requests from `pkg/raft`. We don't use them in CRDB, and the leases implementation is known to be incorrect (e.g. see etcd-io/raft#166 and etcd-io/etcd#10082). Epic: none Release note: none 121956: sql: add accounting for entries in Txn Fingerprint ID cache r=yuzefovich a=yuzefovich This commit fixes a bug that could previously result in the memory accounting leak that was exposed by 88ebd70. Namely, the problem is that we previously unconditionally grew the memory account in `Add` even though if the ID is already present in the cache, it wouldn't be inserted again. As a result, we'd only shrink the account once but might have grown it any number of times for a particular ID. Now we check whether the ID is present in the cache and only grow the account if not. Epic: None Release note: None 122011: backupccl: skip BackuprestoreJobDescription under stress race r=kev-cao a=msbutler Informs #121927 Release note: none Co-authored-by: Pavel Kalinnikov <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Michael Butler <[email protected]>
5 nodes(1, 2, 3, 4, 5), node 1 is leader, connection of between node 1 and node 5 is break, others is ok.
In node 1(master) lease time, if node 5 triggers leader election and become leader, and then write some data. it would break lease read safety, how did etcd deal with this situation?
The text was updated successfully, but these errors were encountered: