├── ByzPaxos ├── ByzPaxos.pdf ├── PBFT.pdf ├── README.md └── byz_paxos.pdf ├── CosmosDB ├── CosmosDB.pdf └── README.md ├── FastPaxos ├── FastPaxos.pdf ├── FastPaxos.tla └── README.md ├── FlexiblePaxos ├── FlexiblePaxos.pdf └── README.md ├── GSnapshot ├── GlobalSnapshots.pdf └── README.md ├── MultiPaxos ├── README.md └── SakshamChand_MultiPaxos.pdf ├── Paxos ├── Paxos.pdf ├── Paxos.tla └── README.md ├── README.md ├── Raft ├── README.md └── Raft.pdf └── TiDB └── README.md /ByzPaxos/ByzPaxos.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tlaplus/DrTLAPlus/b74cff5f3616c7a4beb5416203c292e818e87b5a/ByzPaxos/ByzPaxos.pdf -------------------------------------------------------------------------------- /ByzPaxos/README.md: -------------------------------------------------------------------------------- 1 | ## Dr. TLA+ Series - Byzantine Paxos (Shuai Mu) 2 | 3 | ### Time 4 | January 20, 2016 - 10-11:30am PST 5 | 6 | ### Abstract 7 | In this lecture we will discuss how to tolerate Byzantine faults in achieving consensus. We illustrate this by refining Paxos step by step. This should be most fun for those who have become familiar with Paxos-based distributed consensus through the Series. Enough background on Paxos will be covered that the lecture requires no Paxos expertise. 8 | 9 | We will also explain how Byzantine Paxos is connected with the Practical Byzantine Fault Tolerance (PBFT) protocol, proposed by Castro and Liskov in 1999 to tolerate f Byzantine failures with 3f+1 replicas. 10 | 11 | ### Bio 12 | [Shuai Mu](http://mpaxos.com/) is a postdoc at New York University. He recently received his PhD from Tsinghua University. Shuai studies how to improve performance, scalability and consistency in distributed systems. 
13 | 14 | ### Paper and Spec 15 | + [Byzantizing Paxos by Refinement](http://research.microsoft.com/en-us/um/people/lamport/tla/byzsimple.pdf) 16 | + [Practical Byzantine Fault Tolerance](http://pmg.csail.mit.edu/papers/osdi99.pdf) 17 | + [Byzantine Paxos TLA+ specification](http://research.microsoft.com/en-us/um/people/lamport/tla/byzpaxos.html) 18 | 19 | ### Media 20 | + [video](https://www.youtube.com/watch?v=XnfAZHkyOy4) 21 | + [slides](./byz_paxos.pdf) 22 | 23 | [back to schedule](https://github.com/tlaplus/DrTLAPlus) 24 | -------------------------------------------------------------------------------- /ByzPaxos/byz_paxos.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tlaplus/DrTLAPlus/b74cff5f3616c7a4beb5416203c292e818e87b5a/ByzPaxos/byz_paxos.pdf -------------------------------------------------------------------------------- /CosmosDB/CosmosDB.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tlaplus/DrTLAPlus/b74cff5f3616c7a4beb5416203c292e818e87b5a/CosmosDB/CosmosDB.pdf -------------------------------------------------------------------------------- /CosmosDB/README.md: -------------------------------------------------------------------------------- 1 | ## Dr. TLA+ Series - TLA+ specifications of the consistency guarantees provided by Cosmos DB (Murat Demirbas) 2 | 3 | ### Time 4 | November 01, 2018 - 10-11:30am PST 5 | 6 | ### Abstract 7 | Microsoft Azure Cosmos DB provides five well-defined operation consistency properties to clients: strong consistency, bounded staleness, session consistency, consistent prefix, and eventual consistency. Here we provide client-centric TLA+ specifications of these properties to help users understand the consistency guarantees provided to them. We refrain from discussing models of Cosmos DB internals, as these are not very useful or relevant for Cosmos DB users. 
8 | 9 | ### Bio 10 | [Murat Demirbas](http://muratbuffalo.blogspot.com) is a Professor of Computer Science & Engineering at University at Buffalo, SUNY. He is currently on sabbatical with the Microsoft Azure Cosmos DB team. Murat received his Ph.D. from The Ohio State University in 2004 and did a postdoc at the Theory of Distributed Systems Group at MIT in 2005. His research interests are in distributed and networked systems and cloud computing. Murat received an NSF CAREER award in 2008, UB Exceptional Scholars Young Investigator Award in 2010, UB School of Engineering and Applied Sciences Senior Researcher of the Year Award in 2016. He maintains a popular blog on distributed systems at http://muratbuffalo.blogspot.com. 11 | 12 | ### Paper and Spec 13 | see [azure-cosmos-tla](https://github.com/Azure/azure-cosmos-tla) 14 | 15 | ### Media 16 | + [video](https://youtu.be/Ej6dlMBvUBI) 17 | + [slides](./CosmosDB.pdf) 18 | 19 | [back to schedule](https://github.com/tlaplus/DrTLAPlus) 20 | -------------------------------------------------------------------------------- /FastPaxos/FastPaxos.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tlaplus/DrTLAPlus/b74cff5f3616c7a4beb5416203c292e818e87b5a/FastPaxos/FastPaxos.pdf -------------------------------------------------------------------------------- /FastPaxos/FastPaxos.tla: -------------------------------------------------------------------------------- 1 | ---------------------------- MODULE FastPaxos --------------------------- 2 | (***************************************************************************) 3 | (* The module imports two standard modules. Module $Naturals$ defines the *) 4 | (* set $Nat$ of naturals and the ordinary arithmetic operators; module *) 5 | (* $FiniteSets$ defines $IsFiniteSet(S)$ to be true iff $S$ is a finite *) 6 | (* set and defines $Cardinality(S)$ to be the number of elements in $S$, *) 7 | (* if $S$ is finite. 
 *) 8 | (***************************************************************************) 9 | EXTENDS Naturals, FiniteSets 10 | ----------------------------------------------------------------------------- 11 | (***************************************************************************) 12 | (* \centering \large\bf Constants *) 13 | (***************************************************************************) 14 | 15 | (***************************************************************************) 16 | (* $Max(S)$ is defined to be the maximum of a nonempty finite set $S$ of *) 17 | (* numbers. *) 18 | (***************************************************************************) 19 | Max(S) == CHOOSE i \in S : \A j \in S : j \leq i 20 | 21 | (***************************************************************************) 22 | (* The next statement declares the specification's constant parameters, *) 23 | (* which have the following meanings:\\ \s{1}% *) 24 | (* \begin{tabular}{l@{ }l} *) 25 | (* $Val$ & the set of values that may be proposed.\\ *) 26 | (* $Acceptor$ & the set of acceptors.\\ *) 27 | (* $FastNum$ & the set of fast round numbers.\\ *) 28 | (* $Quorum(i)$ & the set of $i$-quorums.\\ *) 29 | (* $Coord$ & the set of coordinators.\\ *) 30 | (* $CoordOf(i)$ & the coordinator of round $i$. *) 31 | (* \end{tabular} *) 32 | (***************************************************************************) 33 | CONSTANTS Val, Acceptor, FastNum, Quorum(_), Coord, CoordOf(_) 34 | 35 | (***************************************************************************) 36 | (* $RNum$ is defined to be the set of positive integers, which is the set *) 37 | (* of round numbers. *) 38 | (***************************************************************************) 39 | RNum == Nat \ {0} 40 | 41 | (***************************************************************************) 42 | (* The following statement asserts the assumption that $FastNum$ is a set *) 43 | (* of round numbers. 
*) 44 | (***************************************************************************) 45 | ASSUME FastNum \subseteq RNum 46 | 47 | (***************************************************************************) 48 | (* $ClassicNum$ is defined to be the set of classic round numbers. *) 49 | (***************************************************************************) 50 | ClassicNum == RNum \ FastNum 51 | 52 | (***************************************************************************) 53 | (* The following assumption asserts that the set of acceptors is finite. *) 54 | (* It is needed to ensure progress. *) 55 | (***************************************************************************) 56 | ASSUME IsFiniteSet(Acceptor) 57 | 58 | (***************************************************************************) 59 | (* The following asserts the assumptions that $Quorum(i)$ is a set of sets *) 60 | (* of acceptors, for every round number $i$, and that the Quorum *) 61 | (* Requirement (Section~\ref{pg:quorum-requirement}, *) 62 | (* page~\pageref{pg:quorum-requirement}) holds. *) 63 | (***************************************************************************) 64 | ASSUME \A i \in RNum : 65 | /\ Quorum(i) \subseteq SUBSET Acceptor 66 | /\ \A j \in RNum : 67 | /\ \A Q \in Quorum(i), R \in Quorum(j) : Q \cap R # {} 68 | /\ (j \in FastNum) => 69 | \A Q \in Quorum(i) : \A R1, R2 \in Quorum(j) : 70 | Q \cap R1 \cap R2 # {} 71 | 72 | (***************************************************************************) 73 | (* The following asserts the assumptions that $CoordOf(i)$ is a *) 74 | (* coordinator, for every round number $i$, and that every coordinator is *) 75 | (* the coordinator of infinitely many classic rounds. 
*) 76 | (***************************************************************************) 77 | ASSUME /\ \A i \in RNum : CoordOf(i) \in Coord 78 | \* /\ \A c \in Coord, i \in Nat : 79 | \* \E j \in ClassicNum : (j > i) /\ (c = CoordOf(j)) 80 | 81 | (***************************************************************************) 82 | (* $any$ and $none$ are defined to be arbitrary, distinct values that are *) 83 | (* not elements of $Val$. *) 84 | (***************************************************************************) 85 | any == CHOOSE v : v \notin Val 86 | none == CHOOSE n : n \notin Val \cup {any} 87 | 88 | (***************************************************************************) 89 | (* $Message$ is defined to be the set of all possible messages. A message *) 90 | (* is a record having a $type$ field indicating what phase message it is, *) 91 | (* a $rnd$ field indicating the round number. What other fields, if any, *) 92 | (* a message has depends on its type. *) 93 | (***************************************************************************) 94 | Message == 95 | [type : {"phase1a"}, rnd : RNum] 96 | \cup [type : {"phase1b"}, rnd : RNum, vrnd : RNum \cup {0}, 97 | vval : Val \cup {any}, acc : Acceptor] 98 | \cup [type : {"phase2a"}, rnd : RNum, val : Val \cup {any}] 99 | \cup [type : {"phase2b"}, rnd : RNum, val : Val, acc : Acceptor] 100 | ----------------------------------------------------------------------------- 101 | (***************************************************************************) 102 | (* \centering\large\bf Variables and State Predicates *) 103 | (***************************************************************************) 104 | 105 | (***************************************************************************) 106 | (* The following statement declares the specification's variables, which *) 107 | (* have all been described above---either in Section~\ref{sec:basic-alg} *) 108 | (* on page~\pageref{pg:variables} or in this appendix. 
 *) 109 | (***************************************************************************) 110 | VARIABLES rnd, vrnd, vval, crnd, cval, amLeader, sentMsg, proposed, 111 | learned, goodSet 112 | 113 | (***************************************************************************) 114 | (* Defining the following tuples of variables makes it more convenient to *) 115 | (* state which variables are left unchanged by the actions. *) 116 | (***************************************************************************) 117 | aVars == <<rnd, vrnd, vval>> \* Acceptor variables. 118 | cVars == <<crnd, cval>> \* Coordinator variables. 119 | oVars == <<amLeader, proposed, learned, goodSet>> \* Most other variables. 120 | vars == <<aVars, cVars, oVars, sentMsg>> \* All variables. 121 | 122 | (***************************************************************************) 123 | (* $TypeOK$ is the type-correctness invariant, asserting that the value of *) 124 | (* each variable is an element of the proper set (its ``type''). Type *) 125 | (* correctness of the specification means that $TypeOK$ is an *) 126 | (* invariant---that is, it is true in every state of every behavior *) 127 | (* allowed by the specification. *) 128 | (***************************************************************************) 129 | TypeOK == 130 | /\ rnd \in [Acceptor -> Nat] 131 | /\ vrnd \in [Acceptor -> Nat] 132 | /\ vval \in [Acceptor -> Val \cup {any}] 133 | /\ crnd \in [Coord -> Nat] 134 | /\ cval \in [Coord -> Val \cup {any, none}] 135 | /\ amLeader \in [Coord -> BOOLEAN] 136 | /\ sentMsg \in SUBSET Message 137 | /\ proposed \in SUBSET Val 138 | /\ learned \in SUBSET Val 139 | /\ goodSet \subseteq Acceptor \cup Coord 140 | 141 | (***************************************************************************) 142 | (* $Init$ is the initial predicate that describes the initial values of *) 143 | (* all the variables. 
*) 144 | (***************************************************************************) 145 | Init == 146 | /\ rnd = [a \in Acceptor |-> 0] 147 | /\ vrnd = [a \in Acceptor |-> 0] 148 | /\ vval = [a \in Acceptor |-> any] 149 | /\ crnd = [c \in Coord |-> 0] 150 | /\ cval = [c \in Coord |-> none] 151 | /\ amLeader \in [Coord -> BOOLEAN] 152 | /\ sentMsg = {} 153 | /\ proposed = {} 154 | /\ learned = {} 155 | /\ goodSet \in SUBSET (Acceptor \cup Coord) 156 | ----------------------------------------------------------------------------- 157 | (***************************************************************************) 158 | (* \centering\large\bf Action Definitions *) 159 | (***************************************************************************) 160 | 161 | (***************************************************************************) 162 | (* $Send(m)$ describes the state change that represents the sending of *) 163 | (* message $m$. It is used as a conjunct in defining the algorithm *) 164 | (* actions. *) 165 | (***************************************************************************) 166 | Send(m) == sentMsg' = sentMsg \cup {m} 167 | 168 | (***************************************************************************) 169 | (* \centering \large\bf Coordinator Actions *) 170 | (***************************************************************************) 171 | 172 | (***************************************************************************) 173 | (* Action $Phase1a(c,i)$ specifies the execution of phase 1a of round $i$ *) 174 | (* by coordinator $c$, described in Section~\ref{sec:basic-alg} (on *) 175 | (* page~\pageref{pg:1a}) and refined by CA2$'$ (Section~\ref{sec:CA2'}, *) 176 | (* page~\pageref{sec:CA2'}). 
 *) 177 | (***************************************************************************) 178 | Phase1a(c, i) == 179 | /\ amLeader[c] 180 | /\ c = CoordOf(i) 181 | /\ crnd[c] < i 182 | /\ \/ crnd[c] = 0 183 | \/ \E m \in sentMsg : /\ crnd[c] < m.rnd 184 | /\ m.rnd < i 185 | \/ /\ crnd[c] \in FastNum 186 | /\ i \in ClassicNum 187 | /\ crnd' = [crnd EXCEPT ![c] = i] 188 | /\ cval' = [cval EXCEPT ![c] = none] 189 | /\ Send([type |-> "phase1a", rnd |-> i]) 190 | /\ UNCHANGED <<aVars, oVars>> 191 | 192 | (***************************************************************************) 193 | (* $MsgsFrom(Q, i, phase)$ is defined to be the set of messages in *) 194 | (* $sentMsg$ of type $phase$ (which may equal $"phase1b"$ or $"phase2b"$) *) 195 | (* sent in round $i$ by the acceptors in the set $Q$. *) 196 | (***************************************************************************) 197 | MsgsFrom(Q, i, phase) == 198 | {m \in sentMsg : (m.type = phase) /\ (m.acc \in Q) /\ (m.rnd = i)} 199 | 200 | (***************************************************************************) 201 | (* If $M$ is the set of round $i$ phase 1b messages sent by the acceptors *) 202 | (* in a quorum $Q$, then $IsPickableVal(Q, i, M, v)$ is true iff the rule *) 203 | (* of Figure~\ref{fig:fast-paxos-choice} *) 204 | (* (page~\pageref{fig:fast-paxos-choice}) allows the coordinator to send *) 205 | (* the value $v$ in a phase 2a message for round~$i$. 
 *) 206 | (***************************************************************************) 207 | IsPickableVal(Q, i, M, v) == 208 | LET vr(a) == (CHOOSE m \in M : m.acc = a).vrnd 209 | vv(a) == (CHOOSE m \in M : m.acc = a).vval 210 | k == Max({vr(a) : a \in Q}) 211 | V == {vv(a) : a \in {b \in Q : vr(b) = k}} 212 | O4(w) == \E R \in Quorum(k) : 213 | \A a \in R \cap Q : (vr(a) = k) /\ (vv(a) = w) 214 | IN IF k = 0 THEN \/ v \in proposed 215 | \/ /\ i \in FastNum 216 | /\ v = any 217 | ELSE IF Cardinality(V) = 1 218 | THEN v \in V 219 | ELSE IF \E w \in V : O4(w) 220 | THEN v = CHOOSE w \in V : O4(w) 221 | ELSE v \in proposed 222 | 223 | (***************************************************************************) 224 | (* Action $Phase2a(c,v)$ specifies the execution of phase 2a by *) 225 | (* coordinator $c$ with value $v$, as described in *) 226 | (* Section~\ref{sec:basic-alg} (on page~\pageref{pg:2a}) and *) 227 | (* Section~\ref{sec:picking} (page~\pageref{sec:picking}), and refined by *) 228 | (* CA2$'$ (Section~\ref{sec:CA2'}, page~\pageref{sec:CA2'}). *) 229 | (***************************************************************************) 230 | Phase2a(c, v) == 231 | LET i == crnd[c] 232 | IN /\ i # 0 233 | /\ cval[c] = none 234 | /\ amLeader[c] 235 | /\ \E Q \in Quorum(i) : 236 | /\ \A a \in Q : \E m \in MsgsFrom(Q, i, "phase1b") : m.acc = a 237 | /\ IsPickableVal(Q, i, MsgsFrom(Q, i, "phase1b"), v) 238 | /\ cval' = [cval EXCEPT ![c] = v] 239 | /\ Send([type |-> "phase2a", rnd |-> i, val |-> v]) 240 | /\ UNCHANGED <<crnd, aVars, oVars>> 241 | 242 | (***************************************************************************) 243 | (* $P2bToP1b(Q, i)$ is defined to be the set of round $i+1$ phase~1b *) 244 | (* messages implied by the round $i$ phase~2b messages sent by the *) 245 | (* acceptors in the set $Q$, as explained in *) 246 | (* Section~\ref{sec:collision-recovery}. 
 *) 247 | (***************************************************************************) 248 | P2bToP1b(Q, i) == 249 | {[type |-> "phase1b", rnd |-> i+1, vrnd |-> i, 250 | vval |-> m.val, acc |-> m.acc] : m \in MsgsFrom(Q, i, "phase2b")} 251 | 252 | (***************************************************************************) 253 | (* Action $CoordinatedRecovery(c, v)$ specifies the coordinated recovery *) 254 | (* described in Section~\ref{pg:coord-recovery}, *) 255 | (* page~\pageref{pg:coord-recovery}. With this action, coordinator $c$ *) 256 | (* attempts to recover from a collision in round $crnd[c]$ by sending *) 257 | (* round $crnd[c]+1$ phase~2a messages for the value $v$. Although CA2$'$ *) 258 | (* (Section~\ref{sec:CA2'}, page~\pageref{sec:CA2'}) implies that this *) 259 | (* action should be performed only if $crnd[c]+1$ is a classic round, that *) 260 | (* restriction is not required for correctness and is omitted from the *) 261 | (* specification. *) 262 | (***************************************************************************) 263 | CoordinatedRecovery(c, v) == 264 | LET i == crnd[c] 265 | IN /\ amLeader[c] 266 | /\ cval[c] = any 267 | /\ c = CoordOf(i+1) 268 | /\ \E Q \in Quorum(i+1) : 269 | /\ \A a \in Q : \E m \in P2bToP1b(Q, i) : m.acc = a 270 | /\ IsPickableVal(Q, i+1, P2bToP1b(Q, i), v) 271 | /\ cval' = [cval EXCEPT ![c] = v] 272 | /\ crnd' = [crnd EXCEPT ![c] = i+1] 273 | /\ Send([type |-> "phase2a", rnd |-> i+1, val |-> v]) 274 | /\ UNCHANGED <<aVars, oVars>> 275 | 276 | (***************************************************************************) 277 | (* $coordLastMsg(c)$ is defined to be the last message that coordinator *) 278 | (* $c$ sent, if $crnd[c]>0$. 
 *) 279 | (***************************************************************************) 280 | coordLastMsg(c) == 281 | IF cval[c] = none 282 | THEN [type |-> "phase1a", rnd |-> crnd[c]] 283 | ELSE [type |-> "phase2a", rnd |-> crnd[c], val |-> cval[c]] 284 | 285 | 286 | (***************************************************************************) 287 | (* In action $CoordRetransmit(c)$, coordinator $c$ retransmits the last *) 288 | (* message it sent. This action is a stuttering action (meaning it does *) 289 | (* not change the value of any variable, so it is a no-op) if that message *) 290 | (* is still in $sentMsg$. However, this action is needed because $c$ *) 291 | (* might have failed after first sending the message and subsequently have *) 292 | (* been repaired after the message was removed from $sentMsg$. *) 293 | (***************************************************************************) 294 | CoordRetransmit(c) == 295 | /\ amLeader[c] 296 | /\ crnd[c] # 0 297 | /\ Send(coordLastMsg(c)) 298 | /\ UNCHANGED <<aVars, cVars, oVars>> \* amLeader, proposed, learned, goodSet 299 | 300 | (***************************************************************************) 301 | (* $CoordNext(c)$ is the next-state action of coordinator $c$---that is, *) 302 | (* the disjunct of the algorithm's complete next-state action that *) 303 | (* represents actions of that coordinator. 
 *) 304 | (***************************************************************************) 305 | CoordNext(c) == 306 | \/ \E i \in RNum : Phase1a(c, i) 307 | \/ \E v \in Val \cup {any} : Phase2a(c, v) 308 | \/ \E v \in Val : CoordinatedRecovery(c, v) 309 | \/ CoordRetransmit(c) 310 | ----------------------------------------------------------------------------- 311 | (***************************************************************************) 312 | (* \centering \large\bf Acceptor Actions *) 313 | (***************************************************************************) 314 | 315 | (***************************************************************************) 316 | (* Action $Phase1b(i, a)$ specifies the execution of phase 1b for round *) 317 | (* $i$ by acceptor $a$, described in Section~\ref{sec:basic-alg} on *) 318 | (* page~\pageref{pg:1a}. *) 319 | (***************************************************************************) 320 | Phase1b(i, a) == 321 | /\ rnd[a] < i 322 | /\ [type |-> "phase1a", rnd |-> i] \in sentMsg 323 | /\ rnd' = [rnd EXCEPT ![a] = i] 324 | /\ Send([type |-> "phase1b", rnd |-> i, vrnd |-> vrnd[a], vval |-> vval[a], 325 | acc |-> a]) 326 | /\ UNCHANGED <<vrnd, vval, cVars, oVars>> 327 | 328 | (***************************************************************************) 329 | (* Action $Phase2b(i, a, v)$ specifies the execution of phase 2b for round *) 330 | (* $i$ by acceptor $a$, upon receipt of either a phase~2a message or a *) 331 | (* proposal (for a fast round) with value $v$. It is described in *) 332 | (* Section~\ref{sec:basic-alg} on page~\pageref{pg:1a} and *) 333 | (* Section~\ref{sec:basic-fast} on page~\pageref{pg:fast-2b}. 
 *) 334 | (***************************************************************************) 335 | Phase2b(i, a, v) == 336 | /\ rnd[a] \leq i 337 | /\ vrnd[a] < i 338 | /\ \E m \in sentMsg : 339 | /\ m.type = "phase2a" 340 | /\ m.rnd = i 341 | /\ \/ m.val = v 342 | \/ /\ m.val = any 343 | /\ v \in proposed 344 | /\ rnd' = [rnd EXCEPT ![a] = i] 345 | /\ vrnd' = [vrnd EXCEPT ![a] = i] 346 | /\ vval' = [vval EXCEPT ![a] = v] 347 | /\ Send([type |-> "phase2b", rnd |-> i, val |-> v, acc |-> a]) 348 | /\ UNCHANGED <<cVars, oVars>> 349 | 350 | (***************************************************************************) 351 | (* Action $UncoordinatedRecovery(i, a, v)$ specifies uncoordinated *) 352 | (* recovery, described in Section~\ref{pg:uncoord-recovery} on *) 353 | (* page~\pageref{pg:uncoord-recovery}. With this action, acceptor $a$ *) 354 | (* attempts to recover from a collision in round $i$ by sending a round *) 355 | (* $i+1$ phase~2b message with value $v$. *) 356 | (***************************************************************************) 357 | UncoordinatedRecovery(i, a, v) == 358 | /\ i+1 \in FastNum 359 | /\ rnd[a] \leq i 360 | /\ \E Q \in Quorum(i+1) : 361 | /\ \A b \in Q : \E m \in P2bToP1b(Q, i) : m.acc = b 362 | /\ IsPickableVal(Q, i+1, P2bToP1b(Q, i), v) 363 | /\ rnd' = [rnd EXCEPT ![a] = i+1] 364 | /\ vrnd' = [vrnd EXCEPT ![a] = i+1] 365 | /\ vval' = [vval EXCEPT ![a] = v] 366 | /\ Send([type |-> "phase2b", rnd |-> i+1, val |-> v, acc |-> a]) 367 | /\ UNCHANGED <<cVars, oVars>> 368 | 369 | (***************************************************************************) 370 | (* $accLastMsg(a)$ is defined to be the last message sent by acceptor $a$, *) 371 | (* if $rnd[a]>0$. 
 *) 372 | (***************************************************************************) 373 | accLastMsg(a) == 374 | IF vrnd[a] < rnd[a] 375 | THEN [type |-> "phase1b", rnd |-> rnd[a], vrnd |-> vrnd[a], 376 | vval |-> vval[a], acc |-> a] 377 | ELSE [type |-> "phase2b", rnd |-> rnd[a], val |-> vval[a], 378 | acc |-> a] 379 | 380 | (***************************************************************************) 381 | (* In action $AcceptorRetransmit(a)$, acceptor $a$ retransmits the last *) 382 | (* message it sent. *) 383 | (***************************************************************************) 384 | AcceptorRetransmit(a) == 385 | /\ rnd[a] # 0 386 | /\ Send(accLastMsg(a)) 387 | /\ UNCHANGED <<aVars, cVars, oVars>> \* amLeader, proposed, learned, goodSet 388 | 389 | (***************************************************************************) 390 | (* $AcceptorNext(a)$ is the next-state action of acceptor $a$---that is, *) 391 | (* the disjunct of the next-state action that represents actions of that *) 392 | (* acceptor. *) 393 | (***************************************************************************) 394 | AcceptorNext(a) == 395 | \/ \E i \in RNum : \/ Phase1b(i, a) 396 | \/ \E v \in Val : Phase2b(i, a, v) 397 | \/ \E i \in FastNum, v \in Val : UncoordinatedRecovery(i, a, v) 398 | \/ AcceptorRetransmit(a) 399 | ----------------------------------------------------------------------------- 400 | (***************************************************************************) 401 | (* \centering \large\bf Other Actions *) 402 | (***************************************************************************) 403 | 404 | (***************************************************************************) 405 | (* Action $Propose(v)$ represents the proposal of a value $v$ by some *) 406 | (* proposer. 
 *) 407 | (***************************************************************************) 408 | Propose(v) == 409 | /\ proposed' = proposed \cup {v} 410 | /\ UNCHANGED <<aVars, cVars, amLeader, sentMsg, learned, goodSet>> 411 | 412 | (***************************************************************************) 413 | (* Action $Learn(v)$ represents the learning of a value $v$ by some *) 414 | (* learner. *) 415 | (***************************************************************************) 416 | Learn(v) == 417 | /\ \E i \in RNum : 418 | \E Q \in Quorum(i) : 419 | \A a \in Q : 420 | \E m \in sentMsg : /\ m.type = "phase2b" 421 | /\ m.rnd = i 422 | /\ m.val = v 423 | /\ m.acc = a 424 | /\ learned' = learned \cup {v} 425 | /\ UNCHANGED 426 | <<aVars, cVars, amLeader, sentMsg, proposed, goodSet>> 427 | 428 | (***************************************************************************) 429 | (* Action $LeaderSelection$ allows an arbitrary change to the values of *) 430 | (* $amLeader[c]$, for all coordinators $c$. Since this action may be *) 431 | (* performed at any time, the specification makes no assumption about the *) 432 | (* outcome of leader selection. (However, progress is guaranteed only *) 433 | (* under an assumption about the values of $amLeader[c]$.) *) 434 | (***************************************************************************) 435 | LeaderSelection == 436 | /\ amLeader' \in [Coord -> BOOLEAN] 437 | /\ UNCHANGED <<aVars, cVars, sentMsg, proposed, learned, goodSet>> 438 | 439 | 440 | (***************************************************************************) 441 | (* Action $FailOrRepair$ allows an arbitrary change to the set $goodSet$. *) 442 | (* Since this action may be performed at any time, the specification makes *) 443 | (* no assumption about what agents are good. (However, progress is *) 444 | (* guaranteed only under an assumption about the value of $goodSet$.) 
 *) 445 | (***************************************************************************) 446 | FailOrRepair == 447 | /\ goodSet' \in SUBSET (Coord \cup Acceptor) 448 | /\ UNCHANGED <<aVars, cVars, amLeader, sentMsg, proposed, learned>> 449 | 450 | (***************************************************************************) 451 | (* Action $LoseMsg(m)$ removes message $m$ from $sentMsg$. It is always *) 452 | (* enabled unless $m$ is the last message sent by an acceptor or *) 453 | (* coordinator in $goodSet$. Hence, the only assumption the *) 454 | (* specification makes about message loss is that the last message sent by *) 455 | (* an agent in $goodSet$ is not lost. Because $sentMsg$ includes *) 456 | (* messages in an agent's output buffer, this effectively means that a *) 457 | (* non-failed process always has the last message it sent in its output *) 458 | (* buffer, ready to be retransmitted. *) 459 | (***************************************************************************) 460 | LoseMsg(m) == 461 | /\ ~ \/ /\ m.type \in {"phase1a", "phase2a"} 462 | /\ m = coordLastMsg(CoordOf(m.rnd)) 463 | /\ CoordOf(m.rnd) \in goodSet 464 | /\ amLeader[CoordOf(m.rnd)] 465 | \/ /\ m.type \in {"phase1b", "phase2b"} 466 | /\ m = accLastMsg(m.acc) 467 | /\ m.acc \in goodSet 468 | /\ sentMsg' = sentMsg \ {m} 469 | /\ UNCHANGED <<aVars, cVars, oVars>> \* amLeader, proposed, learned, goodSet 470 | 471 | (***************************************************************************) 472 | (* Action $OtherAction$ is the disjunction of all actions other than ones *) 473 | (* performed by acceptors or coordinators, plus the $LeaderSelection$ *) 474 | (* action (which represents leader-selection actions performed by the *) 475 | (* coordinators). 
 *) 476 | (***************************************************************************) 477 | OtherAction == 478 | \/ \E v \in Val : Propose(v) \/ Learn(v) 479 | \/ LeaderSelection \/ FailOrRepair 480 | \/ \E m \in sentMsg : LoseMsg(m) 481 | 482 | 483 | (***************************************************************************) 484 | (* $Next$ is the algorithm's complete next-state action. *) 485 | (***************************************************************************) 486 | Next == 487 | \/ \E c \in Coord : CoordNext(c) 488 | \/ \E a \in Acceptor : AcceptorNext(a) 489 | \/ OtherAction 490 | ----------------------------------------------------------------------------- 491 | (***************************************************************************) 492 | (* \centering\large\bf Temporal Formulas *) 493 | (***************************************************************************) 494 | 495 | (***************************************************************************) 496 | (* Formula $Fairness$ specifies the fairness requirements as the *) 497 | (* conjunction of weak fairness formulas. Intuitively, it states *) 498 | (* approximately the following: *) 499 | (* \begin{itemize} *) 500 | (* \item[] A coordinator $c$ in $goodSet$ must perform some action if it *) 501 | (* can, and it must perform a $Phase1a(c,i)$ action for a classic round *) 502 | (* $i$ if it can. *) 503 | (* *) 504 | (* \item[] An acceptor in $goodSet$ must perform some action if it can. *) 505 | (* *) 506 | (* \item[] A value that can be learned must be learned. *) 507 | (* \end{itemize} *) 508 | (* It is not obvious that these fairness requirements suffice to imply the *) 509 | (* progress property, and that fairness of each individual acceptor and *) 510 | (* coordinator action is not needed. 
Part of the reason is that formula *) 511 | (* $Fairness$ does not allow an agent in $goodSet$ to do nothing but *) 512 | (* $Retransmit$ actions if another of its actions is enabled, since all *) 513 | (* but the first retransmission would be a stuttering step, and weak *) 514 | (* fairness of an action $A$ requires a non-stuttering $A$ step to occur *) 515 | (* if it is enabled. *) 516 | (***************************************************************************) 517 | Fairness == 518 | /\ \A c \in Coord : 519 | /\ WF_vars((c \in goodSet) /\ CoordNext(c)) 520 | /\ WF_vars((c \in goodSet) /\ (\E i \in ClassicNum : Phase1a(c, i))) 521 | /\ \A a \in Acceptor : WF_vars((a \in goodSet) /\ AcceptorNext(a)) 522 | /\ \A v \in Val : WF_vars(Learn(v)) 523 | 524 | (***************************************************************************) 525 | (* Formula $Spec$ is the complete specification of the Fast Paxos *) 526 | (* algorithm. *) 527 | (***************************************************************************) 528 | Spec == Init /\ [][Next]_vars /\ Fairness 529 | 530 | (***************************************************************************) 531 | (* $Nontriviality$ asserts that every learned value has been proposed, and *) 532 | (* $Consistency$ asserts that at most one value has been learned. The *) 533 | (* Nontriviality and Consistency conditions for consensus *) 534 | (* (Section~\ref{sec:problem}) are equivalent to the invariance of these *) 535 | (* state predicates. 
*) 536 | (***************************************************************************) 537 | Nontriviality == learned \subseteq proposed 538 | Consistency == Cardinality(learned) \leq 1 539 | 540 | (***************************************************************************) 541 | (* The following theorem asserts that the state predicates $TypeOK$, *) 542 | (* $Nontriviality$, and $Consistency$ are invariants of specification *) 543 | (* $Spec$, which implies that $Spec$ satisfies the safety properties of a *) 544 | (* consensus algorithm. It was checked by the TLC model checker on models *) 545 | (* that were too small to find a real bug in the algorithm but would have *) 546 | (* detected most simple errors in the specification. *) 547 | (***************************************************************************) 548 | THEOREM Spec => [](TypeOK /\ Nontriviality /\ Consistency) 549 | 550 | (***************************************************************************) 551 | (* Because the specification does not explicitly mention proposers and *) 552 | (* learners, condition $LA(p,l,c,Q)$ described on *) 553 | (* page~\pageref{pg:condition-LA} of Section~\ref{pg:condition-LA} is *) 554 | (* replaced by $LA(c,Q)$, which depends only on $c$ and $Q$. Instead of *) 555 | (* asserting that some particular proposer $p$ has proposed a value, it *) 556 | (* asserts that some value has been proposed. *) 557 | (***************************************************************************) 558 | LA(c, Q) == 559 | /\ {c} \cup Q \subseteq goodSet 560 | /\ proposed # {} 561 | /\ \A ll \in Coord : amLeader[ll] \equiv (c = ll) 562 | 563 | (***************************************************************************) 564 | (* The following theorem asserts that $Spec$ satisfies the progress *) 565 | (* property of Fast Paxos, described in Sections \ref{sec:progress} *) 566 | (* and~\ref{sec:fast-progress}. 
The temporal formula $<>[]LA(c,Q)$ *) 567 | (* asserts that $LA(c,Q)$ holds from some time on, and $<>(learned # \{\})$*) 568 | (* asserts that some value is eventually learned. *) 569 | (***************************************************************************) 570 | THEOREM /\ Spec 571 | /\ \E Q \in SUBSET Acceptor : 572 | /\ \A i \in ClassicNum : Q \in Quorum(i) 573 | /\ \E c \in Coord : <>[]LA(c, Q) 574 | => <>(learned # {}) 575 | ============================================================================= 576 | -------------------------------------------------------------------------------- /FastPaxos/README.md: -------------------------------------------------------------------------------- 1 | ## Dr. TLA+ Series - Fast Paxos (Cheng Huang) 2 | 3 | ### Time 4 | August 29, 2016 - 10-11:30am PDT 5 | 6 | ### Abstract 7 | Replicating data across geographically distributed data centers is the new norm in cloud services. Fast Paxos is attractive compared to Classic Paxos because 8 | + replication can be initiated from an arbitrary DC (and there is no need to pump data through a dedicated primary DC); 9 | + replication can be completed in a single WAN round trip. 10 | 11 | This meetup studies Leslie Lamport's seminal paper on Fast Paxos and its TLA+ specification. 12 | 13 | ### Bio 14 | [Dr. Cheng Huang](http://research.microsoft.com/~chengh) is a research scientist and tech lead at Microsoft. Cheng has been with Microsoft Research for 11 years. Most recently, he joined the Azure Storage team full-time to help make the Microsoft cloud more scalable and cost-effective.
15 | 16 | ### Prerequisite 17 | + none 18 | + most helpful to review [Andrew Helwer's lecture on Classic Paxos](../Paxos/README.md) 19 | 20 | ### Paper and Spec 21 | (not required, but helpful to take a quick look) 22 | + [Fast Paxos](https://www.microsoft.com/en-us/research/publication/fast-paxos/) 23 | + [TLA+ specification](./FastPaxos.tla) 24 | 25 | ### Media 26 | + [video](https://www.youtube.com/watch?v=eW6Zv0X53T4) 27 | + [slides](./FastPaxos.pdf) 28 | 29 | [back to schedule](https://github.com/tlaplus/DrTLAPlus) 30 | -------------------------------------------------------------------------------- /FlexiblePaxos/FlexiblePaxos.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tlaplus/DrTLAPlus/b74cff5f3616c7a4beb5416203c292e818e87b5a/FlexiblePaxos/FlexiblePaxos.pdf -------------------------------------------------------------------------------- /FlexiblePaxos/README.md: -------------------------------------------------------------------------------- 1 | ## Dr. TLA+ Series - Flexible Paxos (Heidi Howard) 2 | 3 | ### Time 4 | November 9, 2016 (10am, UK time) 5 | 6 | ### Abstract 7 | The Paxos algorithm is a widely adopted approach to reaching agreement in unreliable asynchronous distributed systems. Since its development in 1998, Paxos has been extensively researched, taught and built upon by systems such as Chubby, Zookeeper and Raft. At its foundation, Paxos uses two phases, each requiring agreement from a quorum of participants to reliably reach consensus. 8 | 9 | This lecture will introduce Flexible Paxos, the simple yet powerful result that each of the phases of Paxos may use non-intersecting quorums. The result means that majorities are no longer the only practical quorum system for Paxos, opening the door to a new breed of performant, scalable and resilient consensus algorithms. 
This lecture will demonstrate how we are able to test this result with only simple modifications to the existing Paxos TLA+ specification. 10 | 11 | ### Bio 12 | [Heidi Howard](http://hh360.user.srcf.net/blog/) is a PhD student in the Systems Research Group at the University of Cambridge. Under the supervision of Professor [Jon Crowcroft](https://www.cl.cam.ac.uk/~jac22/), Heidi is studying how to develop distributed systems which are scalable, consistent and resilient in the face of failures. She also recently interned with [Dahlia Malkhi](https://dahliamalkhi.wordpress.com/) at the [VMware Research Group](https://research.vmware.com/) and has previously worked as a research assistant and undergraduate researcher at the University of Cambridge. 13 | 14 | ### Prerequisite 15 | + none, but may be helpful to review [Andrew Helwer's lecture on Classic Paxos](../Paxos/README.md) 16 | 17 | ### Paper and Spec 18 | (not required, but helpful to take a quick look) 19 | + [Flexible Paxos: Quorum Intersection Revisited](https://arxiv.org/abs/1608.06696), Heidi Howard, Dahlia Malkhi, Alexander Spiegelman 20 | + [TLA+ Spec for Flexible Paxos](https://github.com/fpaxos) 21 | 22 | ### Media 23 | + [video](https://www.youtube.com/watch?v=LX-WK8EmoFE) 24 | + [slides](./FlexiblePaxos.pdf) 25 | 26 | [back to schedule](https://github.com/tlaplus/DrTLAPlus) 27 | -------------------------------------------------------------------------------- /GSnapshot/GlobalSnapshots.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tlaplus/DrTLAPlus/b74cff5f3616c7a4beb5416203c292e818e87b5a/GSnapshot/GlobalSnapshots.pdf -------------------------------------------------------------------------------- /GSnapshot/README.md: -------------------------------------------------------------------------------- 1 | ## Dr. TLA+ Series - Global Snapshots (K. Rustan M. 
Leino) 2 | 3 | ### Time 4 | September 23, 2016 - 10-11:30am PDT 5 | 6 | ### Location 7 | Microsoft Building 99, Research Lecture Room B (99/1927) 8 | + 14820 NE 36th St, Redmond, WA 98052 9 | 10 | ### Abstract 11 | A snapshot of the state of a running program is useful in several ways. For example, it can serve as a check point from which to restart the execution, in case the rest of the execution fails in some way. Another example use of a snapshot is to detect that some stable condition, such as a deadlock, has occurred. This lecture will discuss algorithms for capturing a global snapshot of a distributed, asynchronous system. It will focus on writing a formal specification of such algorithms. 12 | 13 | If you want to do or think about something before the lecture, I suggest: 14 | + Think about how you would write a specification of a Global Snapshot 15 | + Read something about Global Snapshots, such as: 16 | * Chapter 10, Parallel Program Design, by K. M. Chandy and J. Misra (Addison-Wesley, 1988) 17 | * “Distributed Snapshots: Determining Global States of Distributed Systems”, K. M. Chandy and L. Lamport, ACM TOCS, 3:1 (1985) 18 | * “The Distributed Snapshot of K. M. Chandy and L. Lamport”, in Control Flow and Data Flow, ed. M. Broy (Springer, 1985), a version of which is found as [EWD 864](https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD864.html) 19 | 20 | ### Bio 21 | Rustan Leino is Principal Researcher in the [Research in Software Engineering (RiSE)](http://research.microsoft.com/rise) group at Microsoft Research, Redmond and Visiting Professor in the Department of Computing at [Imperial College London](http://www3.imperial.ac.uk/computing). He is known for his work on programming methods and program verification tools, and is a world leader in building automated program verification tools. 
These include the languages and tools [Dafny](http://research.microsoft.com/dafny), [Chalice](http://www.pm.inf.ethz.ch/research/chalice.html), Jennisys, [Spec#](http://research.microsoft.com/specsharp), [Boogie](https://github.com/boogie-org/boogie), Houdini, ESC/Java, and ESC/Modula-3. 22 | 23 | Prior to Microsoft Research, Leino worked at DEC/Compaq SRC. Advised by K. Mani Chandy, he received his PhD from Caltech (1995), before which he designed and wrote object-oriented software as a technical lead in the Windows NT group at Microsoft. Leino collects [thinking puzzles](http://research.microsoft.com/en-us/um/people/leino/puzzles.html) on a popular web page and hosts the [Verification Corner](https://www.youtube.com/channel/UCP2eLEql4tROYmIYm5mA27A) channel on youtube. 24 | 25 | ### Media 26 | + [video](https://www.youtube.com/watch?v=ao58xine3jM) 27 | + [slides](./GlobalSnapshots.pdf) 28 | 29 | [back to the complete schedule of Dr. TLA+ Series](https://github.com/tlaplus/DrTLAPlus) 30 | -------------------------------------------------------------------------------- /MultiPaxos/README.md: -------------------------------------------------------------------------------- 1 | ## Dr. TLA+ Series - Specification and Verification of Multi-Paxos (Saksham Chand) 2 | 3 | ### Time 4 | November 15, 2019 5 | 6 | ### Location 7 | Microsoft Building 99, Research Lecture Room A 1915 Hopper 8 | + 14820 NE 36th St, Redmond, WA 98052 9 | 10 | ### Abstract 11 | A critical problem in distributed systems is distributed consensus---a set of servers aiming to agree on a single value or a continuing sequence of values, called single-value consensus or multi-value consensus, respectively. It is essential in any distributed service that must maintain and replicate a state to tolerate failures caused by machine crashes, network outages, etc. 
To this end, we study Paxos---a well-known algorithm for solving distributed consensus used in many distributed services, including Microsoft's Autopilot and IronRSL. 12 | 13 | In this talk, we detail formal specifications of the exact phases of multi-value Paxos in TLA+, Lamport's Temporal Logic of Actions, and complete proofs of its safety that are mechanically checked in TLAPS, the TLA Proof System. We also discuss general strategies for proving properties involving sets and tuples that helped the proof check succeed in significantly reduced time. The specification and proof are small (55 lines and 787 lines, respectively) and took about 100 seconds to check, in contrast to others that are an order of magnitude larger, more time-consuming, or worse. 14 | 15 | Next, we discuss using message history variables, that is, variables holding all sent messages and all received messages, for verification of distributed algorithms, with variants of Paxos as precise case studies. We show that using history variables, instead of using and maintaining other state variables, yields specifications that are more declarative and easier to understand. It also allows easier proofs to be developed by needing fewer invariants and facilitating proof derivations. Furthermore, the proofs are mechanically checked more efficiently. Our specifications, proofs, and proof checking times were reduced by a quarter or more for single-value Paxos and by about half or more for multi-value Paxos. 16 | 17 | ### Bio 18 | Saksham Chand is a PhD student at the Computer Science Department of Stony Brook University, New York, where he conducts research in specification and verification of distributed algorithms. He supervises graduate students working on Byzantine fault tolerance, fast state machine replication, and blockchain systems. In his free time, he enjoys beaches, snowboarding and singing. He received an M.Sc. degree from the Computer Science Department of Stony Brook University (F '15) and a B.Sc.
degree in Computer Science from National Institute of Technology, Nagpur, India (S '12). Prior to joining Stony Brook University, he worked as a System Software Developer at Nvidia, where he worked on the products Shield and GRID. 19 | 20 | ### Media 21 | + [video](https://youtu.be/uBQSE4MMWhY) 22 | + [slides](./SakshamChand_MultiPaxos.pdf) 23 | 24 | [back to the complete schedule of Dr. TLA+ Series](https://github.com/tlaplus/DrTLAPlus) 25 | -------------------------------------------------------------------------------- /MultiPaxos/SakshamChand_MultiPaxos.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tlaplus/DrTLAPlus/b74cff5f3616c7a4beb5416203c292e818e87b5a/MultiPaxos/SakshamChand_MultiPaxos.pdf -------------------------------------------------------------------------------- /Paxos/Paxos.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tlaplus/DrTLAPlus/b74cff5f3616c7a4beb5416203c292e818e87b5a/Paxos/Paxos.pdf -------------------------------------------------------------------------------- /Paxos/Paxos.tla: -------------------------------------------------------------------------------- 1 | ------------------------------- MODULE Paxos ------------------------------- 2 | (***************************************************************************) 3 | (* This is a TLA+ specification of the Paxos Consensus algorithm, *) 4 | (* described in *) 5 | (* *) 6 | (* Paxos Made Simple: *) 7 | (* http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html#paxos-simple *) 8 | (* *) 9 | (* and a TLAPS-checked proof of its correctness. This was mostly done as *) 10 | (* a test to see how the SMT backend of TLAPS is now working.
*) 11 | (***************************************************************************) 12 | EXTENDS Integers, TLAPS, TLC 13 | 14 | CONSTANTS Acceptors, Values, Quorums 15 | 16 | ASSUME QuorumAssumption == 17 | /\ Quorums \subseteq SUBSET Acceptors 18 | /\ \A Q1, Q2 \in Quorums : Q1 \cap Q2 # {} 19 | 20 | (***************************************************************************) 21 | (* The following lemma is an immediate consequence of the assumption. *) 22 | (***************************************************************************) 23 | LEMMA QuorumNonEmpty == \A Q \in Quorums : Q # {} 24 | BY QuorumAssumption 25 | 26 | Ballots == Nat 27 | 28 | VARIABLES msgs, \* The set of messages that have been sent. 29 | maxBal, \* maxBal[a] is the highest-number ballot acceptor a 30 | \* has participated in. 31 | maxVBal, \* maxVBal[a] is the highest ballot in which a has 32 | maxVal \* voted, and maxVal[a] is the value it voted for 33 | \* in that ballot. 34 | 35 | vars == <<msgs, maxBal, maxVBal, maxVal>> 36 | 37 | Send(m) == msgs' = msgs \cup {m} 38 | 39 | None == CHOOSE v : v \notin Values 40 | 41 | LEMMA NoneNotAValue == None \notin Values 42 | BY NoSetContainsEverything DEF None 43 | 44 | Init == /\ msgs = {} 45 | /\ maxVBal = [a \in Acceptors |-> -1] 46 | /\ maxBal = [a \in Acceptors |-> -1] 47 | /\ maxVal = [a \in Acceptors |-> None] 48 | 49 | (***************************************************************************) 50 | (* Phase 1a: A leader selects a ballot number b and sends a 1a message *) 51 | (* with ballot b to a majority of acceptors. It can do this only if it *) 52 | (* has not already sent a 1a message for ballot b.
*) 53 | (***************************************************************************) 54 | Phase1a(b) == /\ ~ \E m \in msgs : (m.type = "1a") /\ (m.bal = b) 55 | /\ Send([type |-> "1a", bal |-> b]) 56 | /\ UNCHANGED <<maxVBal, maxBal, maxVal>> 57 | 58 | (***************************************************************************) 59 | (* Phase 1b: If an acceptor receives a 1a message with ballot b greater *) 60 | (* than that of any 1a message to which it has already responded, then it *) 61 | (* responds to the request with a promise not to accept any more proposals *) 62 | (* for ballots numbered less than b and with the highest-numbered ballot *) 63 | (* (if any) for which it has voted for a value and the value it voted for *) 64 | (* in that ballot. That promise is made in a 1b message. *) 65 | (***************************************************************************) 66 | Phase1b(a) == 67 | \E m \in msgs : 68 | /\ m.type = "1a" 69 | /\ m.bal > maxBal[a] 70 | /\ Send([type |-> "1b", bal |-> m.bal, maxVBal |-> maxVBal[a], 71 | maxVal |-> maxVal[a], acc |-> a]) 72 | /\ maxBal' = [maxBal EXCEPT ![a] = m.bal] 73 | /\ UNCHANGED <<maxVBal, maxVal>> 74 | 75 | (***************************************************************************) 76 | (* Phase 2a: If the leader receives a response to its 1a message (for *) 77 | (* ballot b) from a quorum of acceptors, then it sends a 2a message to all *) 78 | (* acceptors for a proposal in ballot b with a value v, where v is the *) 79 | (* value of the highest-numbered proposal among the responses, or is any *) 80 | (* value if the responses reported no proposals. The leader can send only *) 81 | (* one 2a message for any ballot.
*) 82 | (***************************************************************************) 83 | Phase2a(b) == 84 | /\ ~ \E m \in msgs : (m.type = "2a") /\ (m.bal = b) 85 | /\ \E v \in Values : 86 | /\ \E Q \in Quorums : 87 | \E S \in SUBSET {m \in msgs : (m.type = "1b") /\ (m.bal = b)} : 88 | /\ \A a \in Q : \E m \in S : m.acc = a 89 | /\ \/ \A m \in S : m.maxVBal = -1 90 | \/ \E c \in 0..(b-1) : 91 | /\ \A m \in S : m.maxVBal =< c 92 | /\ \E m \in S : /\ m.maxVBal = c 93 | /\ m.maxVal = v 94 | /\ Send([type |-> "2a", bal |-> b, val |-> v]) 95 | /\ UNCHANGED <<maxBal, maxVBal, maxVal>> 96 | 97 | (***************************************************************************) 98 | (* Phase 2b: If an acceptor receives a 2a message for a ballot numbered *) 99 | (* b, it votes for the message's value in ballot b unless it has already *) 100 | (* responded to a 1a request for a ballot number greater than b. *) 101 | (* *) 102 | (***************************************************************************) 103 | Phase2b(a) == 104 | \E m \in msgs : 105 | /\ m.type = "2a" 106 | /\ m.bal >= maxBal[a] 107 | /\ Send([type |-> "2b", bal |-> m.bal, val |-> m.val, acc |-> a]) 108 | /\ maxVBal' = [maxVBal EXCEPT ![a] = m.bal] 109 | /\ maxBal' = [maxBal EXCEPT ![a] = m.bal] 110 | /\ maxVal' = [maxVal EXCEPT ![a] = m.val] 111 | 112 | Next == \/ \E b \in Ballots : Phase1a(b) \/ Phase2a(b) 113 | \/ \E a \in Acceptors : Phase1b(a) \/ Phase2b(a) 114 | 115 | Spec == Init /\ [][Next]_vars 116 | ----------------------------------------------------------------------------- 117 | (***************************************************************************) 118 | (* How a value is chosen: *) 119 | (* *) 120 | (* This spec does not contain any actions in which a value is explicitly *) 121 | (* chosen (or a chosen value learned). What it means for a value to be *) 122 | (* chosen is defined by the operator Chosen, where Chosen(v) means that v *) 123 | (* has been chosen.
From this definition, it is obvious how a process *) 124 | (* learns that a value has been chosen from messages of type "2b". *) 125 | (***************************************************************************) 126 | VotedForIn(a, v, b) == \E m \in msgs : /\ m.type = "2b" 127 | /\ m.val = v 128 | /\ m.bal = b 129 | /\ m.acc = a 130 | 131 | ChosenIn(v, b) == \E Q \in Quorums : 132 | \A a \in Q : VotedForIn(a, v, b) 133 | 134 | Chosen(v) == \E b \in Ballots : ChosenIn(v, b) 135 | 136 | (***************************************************************************) 137 | (* The consistency condition that a consensus algorithm must satisfy is *) 138 | (* the invariance of the following state predicate Consistency. *) 139 | (***************************************************************************) 140 | Consistency == \A v1, v2 \in Values : Chosen(v1) /\ Chosen(v2) => (v1 = v2) 141 | ----------------------------------------------------------------------------- 142 | (***************************************************************************) 143 | (* This section of the spec defines the invariant Inv. *) 144 | (***************************************************************************) 145 | Messages == [type : {"1a"}, bal : Ballots] 146 | \cup [type : {"1b"}, bal : Ballots, maxVBal : Ballots \cup {-1}, 147 | maxVal : Values \cup {None}, acc : Acceptors] 148 | \cup [type : {"2a"}, bal : Ballots, val : Values] 149 | \cup [type : {"2b"}, bal : Ballots, val : Values, acc : Acceptors] 150 | 151 | 152 | TypeOK == /\ msgs \in SUBSET Messages 153 | /\ maxVBal \in [Acceptors -> Ballots \cup {-1}] 154 | /\ maxBal \in [Acceptors -> Ballots \cup {-1}] 155 | /\ maxVal \in [Acceptors -> Values \cup {None}] 156 | /\ \A a \in Acceptors : maxBal[a] >= maxVBal[a] 157 | 158 | (***************************************************************************) 159 | (* WontVoteIn(a, b) is a predicate that implies that a has not voted and *) 160 | (* never will vote in ballot b. 
*) 161 | (***************************************************************************) 162 | WontVoteIn(a, b) == /\ \A v \in Values : ~ VotedForIn(a, v, b) 163 | /\ maxBal[a] > b 164 | 165 | (***************************************************************************) 166 | (* The predicate SafeAt(v, b) implies that no value other than perhaps v *) 167 | (* has been or ever will be chosen in any ballot numbered less than b. *) 168 | (***************************************************************************) 169 | SafeAt(v, b) == 170 | \A c \in 0..(b-1) : 171 | \E Q \in Quorums : 172 | \A a \in Q : VotedForIn(a, v, c) \/ WontVoteIn(a, c) 173 | 174 | MsgInv == 175 | \A m \in msgs : 176 | /\ (m.type = "1b") => /\ m.bal =< maxBal[m.acc] 177 | /\ \/ /\ m.maxVal \in Values 178 | /\ m.maxVBal \in Ballots 179 | \* conjunct strengthened 2014/04/02 sm 180 | /\ VotedForIn(m.acc, m.maxVal, m.maxVBal) 181 | \* /\ SafeAt(m.maxVal, m.maxVBal) 182 | \/ /\ m.maxVal = None 183 | /\ m.maxVBal = -1 184 | \** conjunct added 2014/03/29 sm 185 | /\ \A c \in (m.maxVBal+1) .. (m.bal-1) : 186 | ~ \E v \in Values : VotedForIn(m.acc, v, c) 187 | /\ (m.type = "2a") => 188 | /\ SafeAt(m.val, m.bal) 189 | /\ \A ma \in msgs : (ma.type = "2a") /\ (ma.bal = m.bal) 190 | => (ma = m) 191 | /\ (m.type = "2b") => 192 | /\ \E ma \in msgs : /\ ma.type = "2a" 193 | /\ ma.bal = m.bal 194 | /\ ma.val = m.val 195 | /\ m.bal =< maxVBal[m.acc] 196 | 197 | (***************************************************************************) 198 | (* The following two lemmas are simple consequences of the definitions. 
*) 199 | (***************************************************************************) 200 | LEMMA VotedInv == 201 | MsgInv /\ TypeOK => 202 | \A a \in Acceptors, v \in Values, b \in Ballots : 203 | VotedForIn(a, v, b) => SafeAt(v, b) /\ b =< maxVBal[a] 204 | BY DEF VotedForIn, MsgInv, Messages, TypeOK 205 | 206 | LEMMA VotedOnce == 207 | MsgInv => \A a1, a2 \in Acceptors, b \in Ballots, v1, v2 \in Values : 208 | VotedForIn(a1, v1, b) /\ VotedForIn(a2, v2, b) => (v1 = v2) 209 | BY DEF MsgInv, VotedForIn 210 | 211 | AccInv == 212 | \A a \in Acceptors: 213 | /\ (maxVal[a] = None) <=> (maxVBal[a] = -1) 214 | /\ maxVBal[a] =< maxBal[a] 215 | \* conjunct strengthened corresponding to MsgInv 2014/04/02 sm 216 | /\ (maxVBal[a] >= 0) => VotedForIn(a, maxVal[a], maxVBal[a]) \* SafeAt(maxVal[a], maxVBal[a]) 217 | \* conjunct added corresponding to MsgInv 2014/03/29 sm 218 | /\ \A c \in Ballots : c > maxVBal[a] => ~ \E v \in Values : VotedForIn(a, v, c) 219 | 220 | (***************************************************************************) 221 | (* Inv is the complete inductive invariant. *) 222 | (***************************************************************************) 223 | Inv == TypeOK /\ MsgInv /\ AccInv 224 | ----------------------------------------------------------------------------- 225 | (***************************************************************************) 226 | (* The following lemma shows that (the invariant implies that) the *) 227 | (* predicate SafeAt(v, b) is stable, meaning that once it becomes true, it *) 228 | (* remains true throughout the rest of the execution.
*) 229 | (***************************************************************************) 230 | LEMMA SafeAtStable == Inv /\ Next /\ TypeOK' => 231 | \A v \in Values, b \in Ballots: 232 | SafeAt(v, b) => SafeAt(v, b)' 233 | <1> SUFFICES ASSUME Inv, Next, TypeOK', 234 | NEW v \in Values, NEW b \in Ballots, SafeAt(v, b) 235 | PROVE SafeAt(v, b)' 236 | OBVIOUS 237 | <1> USE DEF Send, Inv, Ballots 238 | <1> USE TRUE /\ TRUE 239 | <1>1. ASSUME NEW bb \in Ballots, Phase1a(bb) 240 | PROVE SafeAt(v, b)' 241 | BY <1>1, SMT DEF SafeAt, Phase1a, VotedForIn, WontVoteIn 242 | <1>2. ASSUME NEW a \in Acceptors, Phase1b(a) 243 | PROVE SafeAt(v, b)' 244 | BY <1>2, QuorumAssumption, SMTT(60) DEF TypeOK, SafeAt, WontVoteIn, VotedForIn, Phase1b 245 | <1>3. ASSUME NEW bb \in Ballots, Phase2a(bb) 246 | PROVE SafeAt(v, b)' 247 | BY <1>3, QuorumAssumption, SMT DEF TypeOK, SafeAt, WontVoteIn, VotedForIn, Phase2a 248 | <1>4. ASSUME NEW a \in Acceptors, Phase2b(a) 249 | PROVE SafeAt(v, b)' 250 | <2>1. PICK m \in msgs : Phase2b(a)!(m) 251 | BY <1>4 DEF Phase2b 252 | <2>2 \A aa \in Acceptors, bb \in Ballots, vv \in Values : 253 | VotedForIn(aa, vv, bb) => VotedForIn(aa, vv, bb)' 254 | BY <2>1 DEF TypeOK, VotedForIn 255 | <2>3. \A aa \in Acceptors, bb \in Ballots : maxBal[aa] > bb => maxBal'[aa] > bb 256 | BY <2>1 DEF TypeOK 257 | <2>4. ASSUME NEW aa \in Acceptors, NEW bb \in Ballots, 258 | WontVoteIn(aa, bb), NEW vv \in Values, 259 | VotedForIn(aa, vv, bb)' 260 | PROVE FALSE 261 | <3> DEFINE mm == [type |-> "2b", val |-> vv, bal |-> bb, acc |-> aa] 262 | <3>1. mm \notin msgs 263 | BY <2>4 DEF WontVoteIn, VotedForIn 264 | <3>2. mm \in msgs' 265 | <4>1. PICK m1 \in msgs' : 266 | /\ m1.type = "2b" 267 | /\ m1.val = vv 268 | /\ m1.bal = bb 269 | /\ m1.acc = aa 270 | BY <2>4 DEF VotedForIn 271 | <4>. QED BY <4>1 DEF TypeOK, Messages \* proved by Zenon 272 | <3>3. aa = a /\ m.bal = bb 273 | BY <2>1, <3>1, <3>2 DEF TypeOK 274 | <3>. 
QED 275 | BY <2>1, <2>4, <3>3 DEF Phase2b, WontVoteIn, TypeOK 276 | <2>5 \A aa \in Acceptors, bb \in Ballots : WontVoteIn(aa, bb) => WontVoteIn(aa, bb)' 277 | BY <2>3, <2>4 DEF WontVoteIn 278 | <2> QED 279 | BY <2>2, <2>5, QuorumAssumption DEF SafeAt 280 | 281 | <1>5. QED 282 | BY <1>1, <1>2, <1>3, <1>4 DEF Next 283 | 284 | THEOREM Invariant == Spec => []Inv 285 | <1> USE DEF Ballots 286 | <1>1. Init => Inv 287 | BY DEF Init, Inv, TypeOK, AccInv, MsgInv, VotedForIn 288 | 289 | <1>2. Inv /\ [Next]_vars => Inv' 290 | <2> SUFFICES ASSUME Inv, Next 291 | PROVE Inv' 292 | BY DEF vars, Inv, TypeOK, MsgInv, AccInv, SafeAt, VotedForIn, WontVoteIn 293 | <2> USE DEF Inv 294 | <2>1. TypeOK' 295 | <3>1. ASSUME NEW b \in Ballots, Phase1a(b) PROVE TypeOK' 296 | BY <3>1 DEF TypeOK, Phase1a, Send, Messages 297 | <3>2. ASSUME NEW b \in Ballots, Phase2a(b) PROVE TypeOK' 298 | <4>1. PICK v \in Values : 299 | /\ Send([type |-> "2a", bal |-> b, val |-> v]) 300 | /\ UNCHANGED <<maxBal, maxVBal, maxVal>> 301 | BY <3>2 DEF Phase2a 302 | <4>. QED 303 | BY <4>1 DEF TypeOK, Send, Messages 304 | <3>3. ASSUME NEW a \in Acceptors, Phase1b(a) PROVE TypeOK' 305 | <4>. PICK m \in msgs : Phase1b(a)!(m) 306 | BY <3>3 DEF Phase1b 307 | <4>. QED 308 | BY DEF Send, TypeOK, Messages 309 | <3>4. ASSUME NEW a \in Acceptors, Phase2b(a) PROVE TypeOK' 310 | <4>. PICK m \in msgs : Phase2b(a)!(m) 311 | BY <3>4 DEF Phase2b 312 | <4>. QED 313 | BY DEF Send, TypeOK, Messages 314 | <3>. QED 315 | BY <3>1, <3>2, <3>3, <3>4 DEF Next 316 | <2>2. AccInv' 317 | <3>1. ASSUME NEW b \in Ballots, Phase1a(b) PROVE AccInv' 318 | BY <2>1, <3>1, SafeAtStable DEF AccInv, TypeOK, Phase1a, VotedForIn, Send 319 | <3>2. ASSUME NEW b \in Ballots, Phase2a(b) PROVE AccInv' 320 | BY <2>1, <3>2, SafeAtStable DEF AccInv, TypeOK, Phase2a, VotedForIn, Send 321 | <3>3. ASSUME NEW a \in Acceptors, Phase1b(a) PROVE AccInv' 322 | BY <2>1, <3>3, SafeAtStable DEF AccInv, TypeOK, Phase1b, VotedForIn, Send 323 | <3>4.
ASSUME NEW a \in Acceptors, Phase2b(a) PROVE AccInv' 324 | <4>1. PICK m \in msgs : Phase2b(a)!(m) 325 | BY <3>4 DEF Phase2b 326 | <4>2. \A acc \in Acceptors : 327 | /\ maxVal'[acc] = None <=> maxVBal'[acc] = -1 328 | /\ maxVBal'[acc] =< maxBal'[acc] 329 | BY <2>1, <4>1, NoneNotAValue DEF AccInv, TypeOK, Messages 330 | <4>3. \A aa,vv,bb : VotedForIn(aa,vv,bb)' <=> 331 | VotedForIn(aa,vv,bb) \/ (aa = a /\ vv = maxVal'[a] /\ bb = maxVBal'[a]) 332 | BY <4>1, Isa DEF VotedForIn, Send, TypeOK, Messages 333 | <4>4. ASSUME NEW acc \in Acceptors, maxVBal'[acc] >= 0 334 | PROVE VotedForIn(acc, maxVal[acc], maxVBal[acc])' 335 | BY <4>1, <4>3, <4>4 DEF AccInv, TypeOK 336 | <4>5. ASSUME NEW acc \in Acceptors, NEW c \in Ballots, c > maxVBal'[acc], 337 | NEW v \in Values, VotedForIn(acc, v, c)' 338 | PROVE FALSE 339 | BY <4>1, <4>3, <4>5, <2>1 DEF AccInv, TypeOK 340 | <4>. QED BY <4>2, <4>4, <4>5 DEF AccInv 341 | <3>. QED 342 | BY <3>1, <3>2, <3>3, <3>4 DEF Next 343 | <2>3. MsgInv' 344 | <3>1. ASSUME NEW b \in Ballots, Phase1a(b) 345 | PROVE MsgInv' 346 | <4>1. \A aa,vv,bb : VotedForIn(aa,vv,bb)' <=> VotedForIn(aa,vv,bb) 347 | BY <3>1 DEF Phase1a, Send, VotedForIn 348 | <4>. QED 349 | BY <3>1, <4>1, SafeAtStable, <2>1 DEF Phase1a, MsgInv, TypeOK, Messages, Send 350 | <3>2. ASSUME NEW a \in Acceptors, Phase1b(a) 351 | PROVE MsgInv' 352 | <4>. PICK m \in msgs : Phase1b(a)!(m) 353 | BY <3>2 DEF Phase1b 354 | <4>1. \A aa,vv,bb : VotedForIn(aa,vv,bb)' <=> VotedForIn(aa,vv,bb) 355 | BY DEF Send, VotedForIn 356 | <4>. DEFINE mm == [type |-> "1b", bal |-> m.bal, maxVBal |-> maxVBal[a], 357 | maxVal |-> maxVal[a], acc |-> a] 358 | <4>2. mm.bal =< maxBal'[mm.acc] 359 | BY DEF TypeOK, Messages 360 | <4>3. \/ /\ mm.maxVal \in Values 361 | /\ mm.maxVBal \in Ballots 362 | /\ VotedForIn(mm.acc, mm.maxVal, mm.maxVBal) 363 | \/ /\ mm.maxVal = None 364 | /\ mm.maxVBal = -1 365 | BY DEF TypeOK, AccInv 366 | <4>4. \A c \in (mm.maxVBal+1) .. 
(mm.bal-1) : 367 | ~ \E v \in Values : VotedForIn(mm.acc, v, c) 368 | BY DEF AccInv, TypeOK, Messages 369 | <4>. QED 370 | BY <4>1, <4>2, <4>3, <4>4, SafeAtStable DEF MsgInv, TypeOK, Messages, Send 371 | <3>3. ASSUME NEW b \in Ballots, Phase2a(b) 372 | PROVE MsgInv' 373 | <4>1. ~ \E m \in msgs : (m.type = "2a") /\ (m.bal = b) 374 | BY <3>3 DEF Phase2a 375 | <4>1a. UNCHANGED <<maxBal, maxVBal, maxVal>> 376 | BY <3>3 DEF Phase2a 377 | <4>2. PICK v \in Values : 378 | /\ \E Q \in Quorums : 379 | \E S \in SUBSET {m \in msgs : (m.type = "1b") /\ (m.bal = b)} : 380 | /\ \A a \in Q : \E m \in S : m.acc = a 381 | /\ \/ \A m \in S : m.maxVBal = -1 382 | \/ \E c \in 0..(b-1) : 383 | /\ \A m \in S : m.maxVBal =< c 384 | /\ \E m \in S : /\ m.maxVBal = c 385 | /\ m.maxVal = v 386 | /\ Send([type |-> "2a", bal |-> b, val |-> v]) 387 | BY <3>3 DEF Phase2a 388 | <4>. DEFINE mm == [type |-> "2a", bal |-> b, val |-> v] 389 | <4>3. msgs' = msgs \cup {mm} 390 | BY <4>2 DEF Send 391 | <4>4. \A aa, vv, bb : VotedForIn(aa,vv,bb)' <=> VotedForIn(aa,vv,bb) 392 | BY <4>3 DEF VotedForIn 393 | <4>6. \A m,ma \in msgs' : m.type = "2a" /\ ma.type = "2a" /\ ma.bal = m.bal 394 | => ma = m 395 | BY <4>1, <4>3, Isa DEF MsgInv 396 | <4>10. SafeAt(v,b) 397 | <5>0. PICK Q \in Quorums, 398 | S \in SUBSET {m \in msgs : (m.type = "1b") /\ (m.bal = b)} : 399 | /\ \A a \in Q : \E m \in S : m.acc = a 400 | /\ \/ \A m \in S : m.maxVBal = -1 401 | \/ \E c \in 0..(b-1) : 402 | /\ \A m \in S : m.maxVBal =< c 403 | /\ \E m \in S : /\ m.maxVBal = c 404 | /\ m.maxVal = v 405 | BY <4>2, Zenon 406 | <5>1. CASE \A m \in S : m.maxVBal = -1 407 | \* In that case, no acceptor in Q voted in any ballot less than b, 408 | \* by the last conjunct of MsgInv for type "1b" messages, and that's enough 409 | BY <5>1, <5>0 DEF TypeOK, MsgInv, SafeAt, WontVoteIn 410 | <5>2. ASSUME NEW c \in 0 .. (b-1), 411 | \A m \in S : m.maxVBal =< c, 412 | NEW ma \in S, ma.maxVBal = c, ma.maxVal = v 413 | PROVE SafeAt(v,b) 414 | <6>. SUFFICES ASSUME NEW d \in 0 ..
(b-1) 415 | PROVE \E QQ \in Quorums : \A q \in QQ : 416 | VotedForIn(q,v,d) \/ WontVoteIn(q,d) 417 | BY DEF SafeAt 418 | <6>1. CASE d \in 0 .. (c-1) 419 | \* The "1b" message for v with maxVBal value c must have been safe 420 | \* according to MsgInv for "1b" messages and lemma VotedInv, 421 | \* and that proves the assertion 422 | BY <5>2, <6>1, VotedInv DEF SafeAt, MsgInv, TypeOK, Messages 423 | <6>2. CASE d = c 424 | <7>1. VotedForIn(ma.acc, v, c) 425 | BY <5>2 DEF MsgInv 426 | <7>2. \A q \in Q, w \in Values : VotedForIn(q, w, c) => w = v 427 | BY <7>1, VotedOnce, QuorumAssumption DEF TypeOK, Messages 428 | <7>3. \A q \in Q : maxBal[q] > c 429 | BY <5>0 DEF MsgInv, TypeOK, Messages 430 | <7>. QED 431 | BY <6>2, <7>2, <7>3 DEF WontVoteIn 432 | <6>3. CASE d \in (c+1) .. (b-1) 433 | \* By the last conjunct of MsgInv for type "1b" messages, no acceptor in Q 434 | \* voted at any of these ballots. 435 | BY <6>3, <5>0, <5>2 DEF MsgInv, TypeOK, Messages, WontVoteIn 436 | <6>. QED BY <6>1, <6>2, <6>3 437 | <5>. QED BY <5>0, <5>1, <5>2 438 | <4>11. SafeAt(mm.val,mm.bal)' 439 | BY <4>10, <2>1, SafeAtStable 440 | <4>. QED 441 | \* This proof used to work. 442 | BY <2>1, <4>1a, <4>3, <4>4, <4>6, <4>11, SafeAtStable, Zenon 443 | DEF MsgInv, TypeOK, Messages 444 | \* The following decomposition added by LL on 21 Nov 2014 because 445 | \* Zenon failed on this proof. However, ZenonT(200) worked. 446 | (* 447 | <5> SUFFICES ASSUME NEW m \in msgs' 448 | PROVE MsgInv!(m)' 449 | BY DEF MsgInv 450 | 451 | <5>1. m.type = "1b" 452 | => (/\ m.bal =< maxBal[m.acc] 453 | /\ \/ /\ m.maxVal \in Values 454 | /\ m.maxVBal \in Nat 455 | /\ VotedForIn(m.acc, m.maxVal, m.maxVBal) 456 | \/ /\ m.maxVal = None 457 | /\ m.maxVBal = -1 458 | /\ \A c \in m.maxVBal + 1..m.bal - 1 : 459 | ~(\E v_1 \in Values : VotedForIn(m.acc, v_1, c)))' 460 | BY <2>1, <4>1a, <4>3, <4>4, <4>6, <4>11, SafeAtStable \*, Zenon 461 | DEF MsgInv, TypeOK, Messages 462 | 463 | <5>2. 
m.type = "2a" 464 | => (/\ SafeAt(m.val, m.bal) 465 | /\ \A ma \in msgs : 466 | ma.type = "2a" /\ ma.bal = m.bal => ma = m)' 467 | BY <2>1, <4>1a, <4>3, <4>4, <4>6, <4>11, SafeAtStable \*, Zenon 468 | DEF MsgInv, TypeOK, Messages 469 | 470 | <5>3. m.type = "2b" 471 | => (/\ \E ma \in msgs : 472 | /\ ma.type = "2a" 473 | /\ ma.bal = m.bal 474 | /\ ma.val = m.val 475 | /\ m.bal =< maxVBal[m.acc])' 476 | BY <2>1, <4>1a, <4>3, <4>4, <4>6, <4>11, SafeAtStable\* , Zenon 477 | DEF MsgInv, TypeOK, Messages 478 | 479 | <5>4. QED 480 | BY <5>1, <5>2, <5>3 481 | *) 482 | <3>4. ASSUME NEW a \in Acceptors, Phase2b(a) 483 | PROVE MsgInv' 484 | <4>. PICK m \in msgs : Phase2b(a)!(m) 485 | BY <3>4 DEF Phase2b 486 | <4>1. \A aa, vv, bb : VotedForIn(aa,vv,bb) => VotedForIn(aa,vv,bb)' 487 | BY DEF VotedForIn, Send 488 | <4>2. \A mm \in msgs : mm.type = "1b" 489 | => \A v \in Values, c \in (mm.maxVBal+1) .. (mm.bal-1) : 490 | ~ VotedForIn(mm.acc, v, c) => ~ VotedForIn(mm.acc, v, c)' 491 | BY DEF Send, VotedForIn, MsgInv, TypeOK, Messages 492 | <4>. QED 493 | BY <4>1, <4>2, SafeAtStable, <2>1 DEF MsgInv, Send, TypeOK, Messages 494 | <3>5. QED 495 | BY <3>1, <3>2, <3>3, <3>4 DEF Next 496 | <2>4. QED 497 | BY <2>1, <2>2, <2>3 DEF Inv 498 | 499 | <1>3. QED 500 | BY <1>1, <1>2, PTL DEF Spec 501 | 502 | 503 | THEOREM Consistent == Spec => []Consistency 504 | <1> USE DEF Ballots 505 | 506 | <1>1. Inv => Consistency 507 | <2> SUFFICES ASSUME Inv, 508 | NEW v1 \in Values, NEW v2 \in Values, 509 | NEW b1 \in Ballots, NEW b2 \in Ballots, 510 | ChosenIn(v1, b1), ChosenIn(v2, b2), 511 | b1 =< b2 512 | PROVE v1 = v2 513 | BY DEF Consistency, Chosen 514 | <2>1. CASE b1 = b2 515 | BY <2>1, VotedOnce, QuorumAssumption, SMTT(100) DEF ChosenIn, Inv 516 | (* 517 | <3>1. PICK a1 \in Acceptors : VotedForIn(a1, v1, b1) 518 | BY QuorumAssumption DEF ChosenIn 519 | <3>2. PICK a2 \in Acceptors : VotedForIn(a2, v2, b2) 520 | BY QuorumAssumption DEF ChosenIn 521 | <3>. 
QED 522 | BY <3>1, <3>2, <2>1, VotedOnce DEF Inv 523 | *) 524 | <2>2. CASE b1 < b2 525 | <3>1. SafeAt(v2, b2) 526 | BY VotedInv, QuorumNonEmpty, QuorumAssumption DEF ChosenIn, Inv 527 | <3>2. PICK Q2 \in Quorums : 528 | \A a \in Q2 : VotedForIn(a, v2, b1) \/ WontVoteIn(a, b1) 529 | BY <3>1, <2>2 DEF SafeAt 530 | <3>3. PICK Q1 \in Quorums : \A a \in Q1 : VotedForIn(a, v1, b1) 531 | BY DEF ChosenIn 532 | <3>4. QED 533 | BY <3>2, <3>3, QuorumAssumption, VotedOnce, Z3 DEF WontVoteIn, Inv 534 | <2>3. QED 535 | BY <2>1, <2>2 536 | 537 | <1>2. QED 538 | BY Invariant, <1>1, PTL 539 | 540 | ----------------------------------------------------------------------------- 541 | chosenBar == {v \in Values : Chosen(v)} 542 | 543 | C == INSTANCE Consensus WITH chosen <- chosenBar 544 | 545 | (***************************************************************************) 546 | (* The following theorem asserts that this specification of Paxos refines *) 547 | (* the trivial specification of consensus in module Consensus. *) 548 | (***************************************************************************) 549 | THEOREM Refinement == Spec => C!Spec 550 | <1>1. Init => C!Init 551 | BY QuorumNonEmpty DEF Init, C!Init, chosenBar, Chosen, ChosenIn, VotedForIn 552 | 553 | <1>2. TypeOK' /\ Consistency' /\ [Next]_vars => [C!Next]_chosenBar 554 | <2> SUFFICES ASSUME TypeOK', Consistency', Next, chosenBar' # chosenBar 555 | PROVE C!Next 556 | BY DEF vars, chosenBar, Chosen, ChosenIn, VotedForIn 557 | <2>1. chosenBar \subseteq chosenBar' 558 | BY DEF Send, chosenBar, Chosen, ChosenIn, VotedForIn, Next, Phase1a, Phase1b, Phase2a, Phase2b 559 | <2>2. \A v, w \in chosenBar': v = w 560 | BY DEF Consistency, chosenBar, ChosenIn, TypeOK 561 | <2>3. chosenBar = {} 562 | BY <2>1, <2>2, SetExtensionality 563 | <2>. QED 564 | BY <2>1, <2>2, <2>3 DEF C!Next, chosenBar 565 | 566 | <1>3. 
QED 567 | BY <1>1, <1>2, Invariant, Consistent, PTL DEF Spec, C!Spec, Inv 568 | 569 | ============================================================================= 570 | \* Modification History 571 | \* Last modified Fri Nov 28 10:39:17 PST 2014 by lamport 572 | \* Last modified Sun Nov 23 14:45:09 PST 2014 by lamport 573 | \* Last modified Mon Nov 24 02:03:02 CET 2014 by merz 574 | \* Last modified Sat Nov 22 12:04:19 CET 2014 by merz 575 | \* Last modified Fri Nov 21 17:40:41 PST 2014 by lamport 576 | \* Last modified Tue Mar 18 11:37:57 CET 2014 by doligez 577 | \* Last modified Sat Nov 24 18:53:09 GMT-03:00 2012 by merz 578 | \* Created Sat Nov 17 16:02:06 PST 2012 by lamport 579 | -------------------------------------------------------------------------------- /Paxos/README.md: -------------------------------------------------------------------------------- 1 | ## Dr. TLA+ Series - Paxos (Andrew Helwer) 2 | 3 | Welcome to the inaugural lecture in the Dr. TLA+ Series! 4 | 5 | ### Time 6 | June 22, 2016 - 10-11:30am PDT 7 | 8 | ### Abstract 9 | This lecture will familiarize you with the Paxos Protocol - what it is, what problems it solves, how it works, and why it works that way. The Paxos Protocol was published in 1998 by Leslie Lamport and is a foundational component in the field of distributed systems, solving the difficult and critical problem of consensus in a network of unreliable processes. All of you have, at one time or another, interacted with a system depending on this protocol. This lecture is also an excellent demonstration of TLA+ as a specification language & teaching tool - many of the concepts are tricky to articulate in English but dead simple and unambiguous when read in TLA+! We will also examine the Paxos TLA+ spec as a showcase of how to write a simple, concise, and powerful specification. 10 | 11 | ### Bio 12 | Andrew Helwer is a software engineer in Microsoft Azure, living in Seattle, WA. 
He has a BSc in computer science from the University of Calgary, and was a TA in a recent Microsoft TLA+ course delivered by Leslie Lamport. He is enthusiastic about distributed systems & formal methods, and enjoys writing Wikipedia articles in those fields. 13 | 14 | ### Paper and Spec 15 | (not required, but helpful to take a look before the lecture) 16 | + [Paxos Made Simple](http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf) 17 | + [Paxos TLA+ specification](./Paxos.tla) 18 | 19 | ### Media 20 | + [video](https://www.youtube.com/watch?v=zCaJSrTmUFA) 21 | + [slides](./Paxos.pdf) 22 | 23 | [back to schedule](https://github.com/tlaplus/DrTLAPlus) 24 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ### Dr. TLA+ Series - learn an algorithm and protocol, study a specification 2 | 3 | | | Date | Speaker | Topic | Media | 4 | |:----------:| ------------- |:-------------:| :----:|:----------:| 5 | | | 06.22.2016 | [Andrew Helwer](https://www.linkedin.com/in/ahelwer) | [Paxos](./Paxos/README.md) | [video](https://www.youtube.com/watch?v=zCaJSrTmUFA), [slides](./Paxos/Paxos.pdf) 6 | | | 07.21.2016 | [Jin Li](http://research.microsoft.com/~jinl) | [Raft](./Raft/README.md) | [video](https://www.youtube.com/watch?v=6Kwx8zfGW0Y), [slides](./Raft/Raft.pdf) 7 | | | 08.29.2016 | [Cheng Huang](http://research.microsoft.com/~chengh) | [Fast Paxos](./FastPaxos/README.md) | [video](https://www.youtube.com/watch?v=eW6Zv0X53T4), [slides](./FastPaxos/FastPaxos.pdf) 8 | | | 09.23.2016 | [Rustan Leino](http://research.microsoft.com/~leino) | [Global Snapshots](./GSnapshot/README.md) | [video](https://www.youtube.com/watch?v=ao58xine3jM), [slides](./GSnapshot/GlobalSnapshots.pdf) 9 | | | 11.09.2016 | [Heidi Howard](http://hh360.user.srcf.net/blog/) | [Flexible Paxos](./FlexiblePaxos/README.md) | 
[video](https://www.youtube.com/watch?v=LX-WK8EmoFE), [slides](./FlexiblePaxos/FlexiblePaxos.pdf) 10 | | | 01.20.2017 | [Shuai Mu](http://www.mpaxos.com/) | [Byzantine Paxos](./ByzPaxos/README.md) | [video](https://www.youtube.com/watch?v=XnfAZHkyOy4), [slides](./ByzPaxos/byz_paxos.pdf) 11 | | | 03.01.2018 | Ed Huang | [Verifying Distributed Transaction with TLA+ in TiDB](./TiDB/README.md) | 12 | | | 11.01.2018 | [Murat Demirbas](http://muratbuffalo.blogspot.com) | [Consistency guarantees provided by Cosmos DB](./CosmosDB/README.md) | [video](https://youtu.be/Ej6dlMBvUBI), [slides](./CosmosDB/CosmosDB.pdf) 13 | | | 11.15.2019 | [Saksham Chand](https://www.linkedin.com/in/saksham-chand-b1a19b91/) | [Specification and Verification of Multi-Paxos](./MultiPaxos/README.md) | [video](https://youtu.be/uBQSE4MMWhY), [slides](./MultiPaxos/SakshamChand_MultiPaxos.pdf) 14 | |⇒ | 02.XX.2021 | [Stephan Merz](http://www.loria.fr/~merz/) & [Markus Kuppe](https://lemmster.de) | Termination Detection (EWD840 & EWD998) | 15 | 16 | 17 | #### Each session will focus on a single algorithm/protocol and: 18 | + dive deep into how the algorithm and protocol work; 19 | + illustrate in detail how the TLA+ specification is written; 20 | + share the learnings from writing and studying the specification. 21 | -------------------------------------------------------------------------------- /Raft/README.md: -------------------------------------------------------------------------------- 1 | ## Dr. TLA+ Series - Raft (Jin Li) 2 | 3 | ### Time 4 | July 21, 2016 - 10-11:30am PDT 5 | 6 | ### Abstract 7 | In this talk, we will discuss Raft and its TLA+ spec. Raft is a consensus algorithm for managing a replicated log. It produces a result equivalent to (multi-)Paxos. The design of Raft separates the key elements of a consensus algorithm, such as leader election and log replication, which makes Raft more understandable and implementable. 
Raft has been widely taught and implemented, with a partial list of implementations available at https://raft.github.io/. 8 | 9 | ### Bio 10 | Dr. Jin Li manages the Cloud Computing and Storage group at MSR Technologies/MSR-NExT. His team has made great contributions to Microsoft, with monetary value on the order of hundreds of millions of dollars per annum, through work such as the local reconstruction code (LRC) in Azure and Windows Server, the erasure code used in Lync, Xbox and RemoteFX, the Data Deduplication feature in Windows Server 2012, the high-performance SSD-based key-value store in Bing, and the RemoteFX for WAN feature in Windows 8 and Windows Server 2012. He is an IEEE Fellow. 11 | 12 | ### Paper and Spec 13 | (not required, but helpful to take a look before the lecture) 14 | + [In Search of an Understandable Consensus Algorithm](https://www.usenix.org/conference/atc14/technical-sessions/presentation/ongaro) (Best Paper at USENIX ATC 2014) 15 | + [Raft TLA+ specification](https://github.com/ongardie/raft.tla) 16 | 17 | ### Media 18 | + [video](https://www.youtube.com/watch?v=6Kwx8zfGW0Y) 19 | + [slides](./Raft.pdf) 20 | 21 | [back to schedule](https://github.com/tlaplus/DrTLAPlus) 22 | -------------------------------------------------------------------------------- /Raft/Raft.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tlaplus/DrTLAPlus/b74cff5f3616c7a4beb5416203c292e818e87b5a/Raft/Raft.pdf -------------------------------------------------------------------------------- /TiDB/README.md: -------------------------------------------------------------------------------- 1 | ## Dr. 
TLA+ Series - Verifying Distributed Transaction with TLA+ in TiDB (Ed Huang) 2 | 3 | ### Time 4 | Thursday March 1, 2018 (2am, Beijing Time) 5 | 6 | ### Abstract 7 | TiDB is an open source distributed Hybrid Transactional/Analytical Processing (HTAP) database that empowers users to meet both online transactional and real-time analytical workloads with a single database. The design is inspired by Google Spanner/F1 and is highly compatible with MySQL. For more information, please check out: [http://github.com/pingcap/tidb](http://github.com/pingcap/tidb). 8 | 9 | Under the hood, TiDB is a complex distributed system that includes a storage engine, SQL parser, query optimizer, transaction layer, replication, etc. TiDB uses a highly optimized 2PC (2-phase commit) algorithm to support ACID semantics. The correctness of the algorithm is very important to us. 10 | 11 | TLA+ is a formal specification language that can help us find design problems in critical systems. It is widely used to verify algorithms, such as the distributed consensus algorithm Raft. In TiDB, we also use TLA+ to verify our distributed transaction implementation, which gives us more confidence that we are on the right track when the verification passes. 12 | 13 | In this talk, I will first introduce TiDB, its design, and its key components, and then show you some of our work with TLA+ ( spec: [https://github.com/pingcap/tla-plus](https://github.com/pingcap/tla-plus) ). 14 | 15 | ### Bio 16 | Ed Huang worked for MSRA, NetEase and WandouLabs before co-founding PingCAP. He is a senior infrastructure software engineer, architect and the CTO of PingCAP. Ed is an expert in distributed systems and database development, with rich experience and a unique understanding of distributed storage. As an open-source fanatic and developer, he has developed Codis, a distributed Redis cache solution, and TiDB, an open source distributed HTAP database. 
17 | 18 | [back to schedule](https://github.com/tlaplus/DrTLAPlus) 19 | --------------------------------------------------------------------------------