Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Majority #93

Merged
merged 5 commits into from
Sep 6, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -118,6 +118,7 @@ A central manifest of specs with descriptions and accounts of their various mode
| 101 | Learn TLA⁺ Proofs | <a href="specifications/LearnProofs">Basic PlusCal algorithms and formal proofs of their correctness</a> | Andrew Helwer | &#10004; | &#10004; | Sequences, Naturals, Integers, TLAPS | &#10004; | |
| 102 | Lexicographically-Least Circular Substring | <a href="specifications/LeastCircularSubstring">Booth's 1980 algorithm to find the lexicographically-least circular substring</a> | Andrew Helwer | | &#10004; | FiniteSets, Integers, Naturals, Sequences | &#10004; | |
| 103 | Distributed Checkpoint Coordination | <a href="specifications/CheckpointCoordination">Algorithm for coordinating checkpoint/snapshot leases in a Paxos ring</a> | Andrew Helwer | | &#10004; | FiniteSets, Naturals, Sequences, TLC | | |
| 104 | Boyer-Moore Majority Vote | <a href="specifications/Majority">Efficient algorithm for finding a majority value</a> | Stephan Merz | &#10004; | &#10004; | Integers, Sequences, FiniteSets, TLAPS | | |

## License

Expand Down
47 changes: 46 additions & 1 deletion manifest.json
Original file line number Diff line number Diff line change
Expand Up @@ -546,7 +546,7 @@
"tlaLanguageVersion": 2,
"features": [],
"models": []
}
}
]
},
{
Expand Down Expand Up @@ -3698,6 +3698,51 @@
"models": []
}
]
},
{
"path": "specifications/Majority",
"title": "Boyer-Moore majority vote algorithm",
"description": "An efficient algorithm for detecting a majority value in a sequence",
"source": "",
"authors": [
"Stephan Merz"
],
"tags": ["beginner"],
"modules": [
{
"path": "specifications/Majority/Majority.tla",
"communityDependencies": [],
"tlaLanguageVersion": 2,
"features": [],
"models": []
},
{
"path": "specifications/Majority/MCMajority.tla",
"communityDependencies": [],
"tlaLanguageVersion": 2,
"features": [],
"models": [
{
"path": "specifications/Majority/MCMajority.cfg",
"runtime": "00:00:05",
"size": "small",
"mode": "exhaustive search",
"config": ["ignore deadlock"],
"features": [],
"result": "success"
}
]
},
{
"path": "specifications/Majority/MajorityProof.tla",
"communityDependencies": [],
"tlaLanguageVersion": 2,
"features": [
"proof"
],
"models": []
}
]
}
]
}
15 changes: 15 additions & 0 deletions specifications/Majority/MCMajority.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
SPECIFICATION Spec

CONSTANTS
A = A
B = B
C = C
bound = 5
Seq <- BoundedSeq

INVARIANTS
TypeOK
Correct
Inv

CHECK_DEADLOCK FALSE
17 changes: 17 additions & 0 deletions specifications/Majority/MCMajority.tla
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
------------------------- MODULE MCMajority ----------------------------------
(****************************************************************************)
(* TLA+ module for model checking the majority vote algorithm for all *)
(* sequences over three elements of bounded length. *)
(****************************************************************************)
EXTENDS Integers
CONSTANTS A, B, C, bound
ASSUME bound \in Nat

Value == {A,B,C}
BoundedSeq(S) == UNION { [1 .. n -> S] : n \in 0 .. bound }

VARIABLES seq, i, cand, cnt

INSTANCE Majority

==============================================================================
79 changes: 79 additions & 0 deletions specifications/Majority/Majority.tla
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
-------------------------------- MODULE Majority -----------------------------
(****************************************************************************)
(* TLA+ specification and proof of the majority vote algorithm due to Boyer *)
(* and Moore. *)
(* R.S. Boyer, J.S. Moore: MJRTY - A Fast Majority Vote Algorithm. *)
(* In: R.S. Boyer (ed.): Automated Reasoning: Essays in Honor of Woody *)
(* Bledsoe, pp. 105-117. Dordrecht, The Netherlands, 1991. *)
(* Originally published in a technical report from 1981. *)
(* The algorithm takes as input a sequence of values, makes one pass over *)
(* the sequence, and reports an element cand such that no element other *)
(* than cand may have an absolute majority in the sequence. *)
(****************************************************************************)
EXTENDS Integers, Sequences, FiniteSets

CONSTANT Value
ASSUME ConstAssump == Value # {}

(****************************************************************************)
(* Although seq is an input to the algorithm, we make it a variable so that *)
(* we can use the model checker to verify the algorithm for all sequences *)
(* up to some bounded size. *)
(****************************************************************************)
VARIABLES
seq, \* input sequence of values, never changes
i, \* next position of sequence to be checked
cand, \* current candidate for having a majority
cnt \* lower bound for the number of occurrences of the candidate so far

vars == <<seq, i, cand, cnt>>

TypeOK ==
/\ seq \in Seq(Value)
/\ i \in 1 .. Len(seq)+1
/\ cand \in Value
/\ cnt \in Nat

Init ==
/\ seq \in Seq(Value)
/\ i = 1
/\ cand \in Value
/\ cnt = 0

Next ==
/\ i <= Len(seq)
/\ i' = i+1 /\ seq' = seq
/\ \/ /\ cnt = 0
/\ cand' = seq[i]
/\ cnt' = 1
\/ /\ cnt # 0 /\ cand = seq[i]
/\ cand' = cand
/\ cnt' = cnt + 1
\/ /\ cnt # 0 /\ cand # seq[i]
/\ cand' = cand
/\ cnt' = cnt - 1

Spec == Init /\ [][Next]_vars /\ WF_vars(Next)

(****************************************************************************)
(* Definitions used for stating correctness. *)
(****************************************************************************)
\* set of indexes in the prefix of the sequence strictly before j holding v
PositionsBefore(v,j) == { k \in 1 .. (j-1) : seq[k] = v }
\* number of times v occurs in that prefix
OccurrencesBefore(v,j) == Cardinality(PositionsBefore(v,j))
\* number of times v occurs in all of the sequence
Occurrences(x) == OccurrencesBefore(x, Len(seq)+1)

\* main correctness property: cand can be the only value that has a majority
Correct ==
i > Len(seq) =>
\A v \in Value : 2 * Occurrences(v) > Len(seq) => v = cand

\* inductive invariant for proving correctness
Inv ==
/\ cnt <= OccurrencesBefore(cand, i)
/\ 2 * (OccurrencesBefore(cand, i) - cnt) <= i - 1 - cnt
/\ \A v \in Value \ {cand} : 2 * OccurrencesBefore(v, i) <= i - 1 - cnt

==============================================================================
99 changes: 99 additions & 0 deletions specifications/Majority/MajorityProof.tla
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
-------------------------- MODULE MajorityProof ------------------------------
EXTENDS Majority, FiniteSetTheorems, TLAPS

(***************************************************************************)
(* Proving type correctness is easy. *)
(***************************************************************************)
LEMMA TypeCorrect == Spec => []TypeOK
<1>1. Init => TypeOK
BY DEF Init, TypeOK
<1>2. TypeOK /\ [Next]_vars => TypeOK'
BY DEF TypeOK, Next, vars
<1>. QED BY <1>1, <1>2, PTL DEF Spec

(***************************************************************************)
(* Auxiliary lemmas about positions and occurrences. *)
(***************************************************************************)
LEMMA PositionsOne == \A v : PositionsBefore(v,1) = {}
BY DEF PositionsBefore

LEMMA PositionsType == \A v, j : PositionsBefore(v,j) \in SUBSET (1 .. j-1)
BY DEF PositionsBefore

LEMMA PositionsFinite ==
ASSUME NEW v, NEW j \in Int
PROVE IsFiniteSet(PositionsBefore(v,j))
BY 1 \in Int, j-1 \in Int, PositionsType, FS_Interval, FS_Subset, Zenon

LEMMA PositionsPlusOne ==
ASSUME TypeOK, NEW j \in 1 .. Len(seq), NEW v
PROVE PositionsBefore(v, j+1) =
IF seq[j] = v THEN PositionsBefore(v,j) \union {j}
ELSE PositionsBefore(v,j)
BY DEF TypeOK, PositionsBefore

LEMMA OccurrencesType == \A v : \A j \in Int : OccurrencesBefore(v,j) \in Nat
BY PositionsFinite, FS_CardinalityType DEF OccurrencesBefore

LEMMA OccurrencesOne == \A v : OccurrencesBefore(v,1) = 0
BY PositionsOne, FS_EmptySet DEF OccurrencesBefore

LEMMA OccurrencesPlusOne ==
ASSUME TypeOK, NEW j \in 1 .. Len(seq), NEW v
PROVE OccurrencesBefore(v, j+1) =
IF seq[j] = v THEN OccurrencesBefore(v,j) + 1
ELSE OccurrencesBefore(v,j)
<1>1. CASE seq[j] = v
<2>1. IsFiniteSet(PositionsBefore(v,j))
BY PositionsFinite
<2>2. j \notin PositionsBefore(v,j)
BY PositionsType
<2>3. PositionsBefore(v, j+1) = PositionsBefore(v,j) \union {j}
BY <1>1, PositionsPlusOne, Zenon
<2>. QED BY <1>1, <2>1, <2>2, <2>3, FS_AddElement DEF OccurrencesBefore
<1>2. CASE seq[j] # v
BY <1>2, PositionsPlusOne, Zenon DEF OccurrencesBefore
<1>. QED BY <1>1, <1>2

(***************************************************************************)
(* We prove correctness based on the inductive invariant. *)
(***************************************************************************)
LEMMA Correctness == Spec => []Correct
<1>1. Init => Inv
BY OccurrencesOne DEF Init, Inv
<1>2. TypeOK /\ Inv /\ [Next]_vars => Inv'
<2>. SUFFICES ASSUME TypeOK, Inv, Next PROVE Inv'
BY DEF Inv, vars, OccurrencesBefore, PositionsBefore
<2>. i <= Len(seq) /\ i' = i+1 /\ seq' = seq
BY DEF Next
<2>0. \A v \in Value : OccurrencesBefore(v, i)' = OccurrencesBefore(v, i')
BY DEF OccurrencesBefore, PositionsBefore
<2>. USE OccurrencesType DEF TypeOK
<2>1. CASE cnt = 0 /\ cand' = seq[i] /\ cnt' = 1
<3>1. i \in PositionsBefore(seq[i], i+1)
BY DEF PositionsBefore
<3>2. 1 <= OccurrencesBefore(seq[i], i+1)
BY <3>1, PositionsFinite, FS_EmptySet DEF OccurrencesBefore
<3>3. 2 * (OccurrencesBefore(seq[i], i+1) - 1) <= (i+1) - 1 - 1
BY <2>1, OccurrencesPlusOne DEF Inv
<3>4. ASSUME NEW v \in Value \ {seq[i]}
PROVE 2 * OccurrencesBefore(v, i+1) <= (i+1) - 1 - 1
BY <2>1, OccurrencesPlusOne DEF Inv
<3>. QED BY <2>0, <2>1, <3>2, <3>3, <3>4 DEF Inv
<2>2. CASE cnt # 0 /\ cand = seq[i] /\ cand' = cand /\ cnt' = cnt + 1
BY <2>0, <2>2, OccurrencesPlusOne DEF Inv
<2>3. CASE cnt # 0 /\ cand # seq[i] /\ cand' = cand /\ cnt' = cnt - 1
<3>10. cnt' <= OccurrencesBefore(cand', i')
BY <2>3, OccurrencesPlusOne DEF Inv
<3>20. 2 * (OccurrencesBefore(cand', i') - cnt') <= i' - 1 - cnt'
BY <2>3, OccurrencesPlusOne DEF Inv
<3>30. ASSUME NEW v \in Value \ {cand'}
PROVE 2 * OccurrencesBefore(v, i') <= i' - 1 - cnt'
BY <2>3, OccurrencesPlusOne DEF Inv
<3>. QED BY <2>0, <2>3, <3>10, <3>20, <3>30 DEF Inv
<2>. QED BY <2>1, <2>2, <2>3 DEF Next
<1>3. TypeOK /\ Inv => Correct
BY OccurrencesType DEF TypeOK, Inv, Correct, Occurrences
<1>. QED BY <1>1, <1>2, <1>3, TypeCorrect, PTL DEF Spec

==============================================================================
23 changes: 23 additions & 0 deletions specifications/Majority/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# Efficient majority vote algorithm

A specification of an efficient algorithm due to R.S. Boyer and J.S. Moore
for computing the only candidate in a sequence who may occur in a strict
majority of positions in the sequence. The algorithm makes a single pass
over the sequence and uses only two integer variables and one variable
holding the current candidate. A very simple second pass over the sequence
could then detect whether the candidate actually holds a majority.

The algorithm was published in a technical report in 1981, but only
appeared in print ten years later.

R.S. Boyer, J.S. Moore: MJRTY - A Fast Majority Vote Algorithm.
In: R.S. Boyer (ed.): Automated Reasoning: Essays in Honor of Woody
Bledsoe, pp. 105-117. Dordrecht, The Netherlands, 1991.


The module Majority contains the specification of the algorithm in TLA+, as well
as its main correctness property, a type-correctness invariant, and an
inductive invariant used for establishing correctness. Module MCMajority
can be used to model check the algorithm for all sequences of three elements
of bounded length. Module MajorityProof contains an interactive proof of
correctness that can be checked using TLAPS.