├── LICENSE ├── README.md ├── amazon ├── README.md ├── serializableSnapshotIsolation.als ├── serializableSnapshotIsolation.tla ├── textbookSnapshotIsolation.als └── textbookSnapshotIsolation.tla └── pluscal ├── alternating_bit ├── AltBitProtocol.tla └── README.md ├── childcare ├── README.md ├── childcare.cfg ├── childcare.tla ├── childcare2.cfg ├── childcare2.tla ├── childcare2_fail.cfg ├── childcare2_fail.tla ├── childcare2_fail.txt └── output.txt └── dining_philosophers ├── README.md ├── dining_deadlock.output.txt ├── dining_deadlock.tla ├── dining_no_deadlock.output.txt └── dining_no_deadlock.tla /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 
29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 
61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "{}" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright {yyyy} {name of copyright owner} 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | Repository for TLA+ and PlusCal programs. 3 | 4 | # Presentation 5 | 6 | Given at [Expert-talks 2017](https://expert-talks.in/) 7 | 8 | Slides at [slideshare](https://www.slideshare.net/SandeepJoshi55/doveryai-no-proveryai-introduction-to-tla) 9 | 10 | 11 | # Other Code samples on github 12 | 1. https://github.com/quux00/PlusCal-Examples 13 | 2. https://github.com/belaban/pluscal 14 | 3. https://github.com/muratdem/PlusCal-examples 15 | 4. https://github.com/duerrfk/skp 16 | 17 | # Discussion forum 18 | 19 | 1. [Google group](https://groups.google.com/forum/#!forum/tlaplus) 20 | 21 | # References 22 | 23 | 1. [Marc Brooker, Exploring TLA+ with two-phase commit](http://brooker.co.za/blog/2013/01/20/two-phase.html) 24 | 2. [Rico, Model checking for the Working Man](https://rix0r.nl/essays/2015/08/25/model-checking-for-the-working-man-mf/) 25 | 3. [Ron Pressler](https://pron.github.io/) 26 | 4. 
[Chris Newcombe, Experience of software engineers using TLA+, PlusCal and TLC](http://tla2012.loria.fr/contributed/newcombe-slides.pdf) 27 | 5. [Brannon Batson, High level specifications: Lessons from Industry](https://www.microsoft.com/en-us/research/publication/high-level-specifications-lessons-from-industry/) 28 | 6. [Hillel Wayne, Learn TLA+](https://www.learntla.com/introduction/) 29 | -------------------------------------------------------------------------------- /amazon/README.md: -------------------------------------------------------------------------------- 1 | 2 | This is a copy of the two sample TLA+ specifications that were posted by Chris Newcombe on the [TLA+ news group](https://groups.google.com/d/msg/tlaplus/UwYW6XqyDvE/t6xwd5jGPYwJ) 3 | -------------------------------------------------------------------------------- /amazon/serializableSnapshotIsolation.tla: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sanjosh/tlaplus/fbd1a2043801ddfbb1462ec47cb1c2984b98e93d/amazon/serializableSnapshotIsolation.tla -------------------------------------------------------------------------------- /amazon/textbookSnapshotIsolation.als: -------------------------------------------------------------------------------- 1 | /* 2 | An Alloy model of textbook snapshot isolation, as described in section 4.2 of 3 | 4 | "A Critique of ANSI SQL Isolation Levels" 5 | http://research.microsoft.com/apps/pubs/default.aspx?id=69541 6 | 7 | This version includes the use of exclusive locks to implement 8 | the 'First Committer Wins' rule (actually the 'First Updater Wins' rule) 9 | 10 | This specification includes various correctness properties, which can be 11 | checked by the Alloy Analyzer for all possible sequences of operations 12 | by a small number of transactions (e.g. 3 or 4) over a small number of 13 | keys (e.g.
2 or 3) 14 | 15 | We include a test for serializability, which the Alloy Analyzer can use to show that 16 | textbook snapshot isolation is NOT serializable, as it allows anomalies 17 | such as write-skew. 18 | 19 | Instructions: 20 | 21 | 1. Download Alloy Analyzer v4 from http://alloy.mit.edu/alloy4/ 22 | I tested with v4.1.10 23 | 24 | 2. Click 'Open' and load this model (text file). 25 | 26 | 3. Click (menu) Options...SAT Solver...MiniSat 27 | 28 | 4. Click 'Execute'. 29 | It is currently set up to check serializability of 3 transactions over 2 keys, 30 | with a maximum of 13 time-points (system states in the execution history). 31 | It will find and show a non-serializable counter-example. 32 | This analysis takes about 2 seconds. Other checks might take hours or days. 33 | 34 | 5. Click 'Show'. Don't worry about the complex graph that appears. 35 | 36 | 6. Click the "Magic Layout" button and "Yes, clear the current settings" 37 | The display will now represent a single state of the system, 38 | with keys and transactions as nodes, 39 | and relationships as labelled arcs. 40 | 41 | 7. Use the "<<" and ">>" arrow buttons at the bottom of the screen to move forwards 42 | and backwards through logical time steps. 43 | Be sure to always start at Time$0 -- sometimes the tool shows a different time first. 44 | 45 | 8. Click "Next" in the toolbar to look at the next counter-example, if there is one. 46 | 'Next' doesn't reset the time, so use << to start at Time$0. 47 | 48 | 9. Try examining different interesting conditions, or verifying some properties. 49 | You'll need to edit this file, in the section marked "Analysis" 50 | E.g. The file is currently set to execute this: 51 | 52 | check { cahill_serializable } for 2 but 13 Time, 3 TxnId, 2 Key 53 | 54 | You might change it to: 55 | 56 | check { cahill_serializable } for 2 but 13 Time, 4 TxnId, 3 Key 57 | 58 | 10.
Learn the Alloy language & tool via the tutorials listed at http://alloy.mit.edu/alloy4/ 59 | The best place to start is probably this one: http://alloy.mit.edu/alloy4/tutorial4/ 60 | 61 | 11. For more depth, read Jackson's book: 62 | http://www.amazon.com/Software-Abstractions-Logic-Language-Analysis/dp/0262101149 63 | */ 64 | 65 | open util/ordering [Time] as TimeOrder 66 | 67 | // This algorithm for snapshot isolation is not based on timestamp ordering 68 | // (of start times of transactions), so we don't need TxnIds to be ordered for that reason. 69 | // However, we do use TxnIds as "version ids" for keys, so we need TxnId to be ordered 70 | // for that reason. 71 | // 72 | // When using TxnIds as version ids, we must constrain the begin() action to only 73 | // start transactions in increasing order of TxnId (rather than in arbitrary order). 74 | // This is necessary in order to ensure that the temporal ordering of (committed) writes matches 75 | // the order of their version ids (writer TxnIds). 76 | // That doesn't lose generality (transactions can still read, write, commit or abort 77 | // in any order). The symmetry reduction also makes visualization slightly easier 78 | // and might reduce the cpu-time to check the model. 79 | 80 | open util/ordering [TxnId] as TxnIdOrder 81 | 82 | sig Time {} 83 | sig Key {} 84 | sig TxnId { 85 | // Sets of transactions that are now or have previously been in this state 86 | // (i.e. these sets monotonically grow over time) 87 | // 88 | // Note: We explicitly record 'begin' steps (in 'started'), so that we know the 89 | // precise lifetime of a transaction; the algorithms for snapshot isolation 90 | // depend on whether transaction lifetimes overlap, i.e. whether they are concurrent. 91 | // If we simply inferred the 'start' time as the time of the first read or write or 92 | // waiting-for-lock etc. then we would fail to model transactions that begin and 93 | // then do nothing for a while.
It's not obvious that it is safe to ignore that 94 | // set of executions. 95 | started: set Time, 96 | committed: set Time, 97 | aborted: set Time, 98 | // These relations grow monotonically over time because they act as 'history 99 | // variables', recording the sequence of transaction operations. 100 | // Part of this history (for active concurrent transactions, plus the most recent 101 | // commit to each key before the oldest active transaction) is required by 102 | // the algorithm that implements Snapshot Isolation. 103 | // The full history is required to test the correctness conditions (e.g. serializability). 104 | // 105 | read: Key -> TxnId -> Time, // ReaderTxnId -> Key -> VersionIdThatWasRead -> Time 106 | written: Key -> Time, // WriterTxnId -> Key -> Time 107 | // These relations model the lock manager, used by the implementation of snapshot isolation. 108 | // These do NOT grow monotonically over time, as locks can be both acquired and released. 109 | // (It is much easier to model 110 | 111 | waitingForXLock: Key->Time, 112 | holdingXLock: Key->Time 113 | } 114 | 115 | // Helper for enabling conditions of public operations 116 | pred txn_started_and_can_do_public_operations[t: Time, txn: TxnId] { 117 | 118 | let time_up_to_and_including_t = (TimeOrder/prevs[t] + t) { 119 | 120 | // Must have been started at or before time t 121 | txn in started.time_up_to_and_including_t 122 | 123 | // ... and not committed or aborted before time t 124 | txn not in (committed + aborted).time_up_to_and_including_t 125 | } 126 | 127 | // And not currently blocked waiting for a lock. 128 | // (If a transaction is blocked waiting for a lock then an *internal* operation 129 | // can become enabled that will allow the transaction to make progress -- 130 | // e.g. the lock might become free and the transaction might be the one that 131 | // acquires the lock. But that operation is not a 'public' operation.) 
132 | 133 | no txn.waitingForXLock.t 134 | } 135 | 136 | // Public operations (actions) 137 | 138 | pred begin[t, t': Time, txn: TxnId] { 139 | 140 | // Enabling condition 141 | // Purely for the purposes of the model, we artificially constrain 142 | // transactions to start in order of increasing TxnId, with this: 143 | // 144 | // started.t = TxnIdOrder/prevs[txn] 145 | // 146 | // This allows us to use TxnId as VersionIds for keys. 147 | // 148 | // The goal of using TxnIds for VersionIds is: 149 | // - make visualization simpler (so the user can anticipate which transaction will start next) 150 | // - reduce model-checking time by avoiding the symmetry of permuted TxnIds. 151 | // - avoid the need for a sig of explicit VersionIds (reduce model complexity and model checking time) 152 | // - avoid the obvious use of Time atoms as version ids, because we want to 'project' the visualization on Time, and that would hide any use of Time atoms as version-ids etc.. 153 | // 154 | // This optimization does not affect the thoroughness of model-checking, 155 | // (i.e. does not lose generality), because nothing else depends on any ordering 156 | // of TxnId. 157 | // 158 | // Correctness argument: 159 | // 160 | // The property we need is that, 161 | // for each key, *committed* version ids are monotonically increasing with time. 162 | // 163 | // When using TxnIds as VersionIds, this becomes: 164 | // for each key, the TxnIds of *committed* writers to that key 165 | // are monotonically increasing with time. 166 | // 167 | // The checks for basic_model_correctness verify that is true for 168 | // all executions; see safe_to_use_txnids_as_key_versionids. 169 | // 170 | // We implement this temporal constraint by these rules: 171 | // a. Constraining transactions to start in increasing order of TxnId 172 | // b. The standard SI FirstCommitterWins rule ensures that if any set of concurrent 173 | // transactions write to the same key, at most one of them can commit. 
174 | // 175 | // As SI *never* does uncommitted reads, the above 2 properties imply that for 176 | // all *committed* writes to a given key, the TxnId of the writer is monotonically 177 | // increasing with time. 178 | // 179 | // Proof; We assume the contrary (that for some key k, the TxnIds of *committed* 180 | // writes to that key are not monotonically increasing in time), and show a 181 | // contradiction. Non-monotonic ordering means that there exist distinct 182 | // transactions Ti and Tj, with Ti < Tj, that both write to the same key and then 183 | // Tj commits before Ti. Rule (a) implies that Tj started after Ti, and by definition 184 | // Tj commits before Ti; therefore Ti and Tj are concurrent transactions that both 185 | // modify the same key, and both commit. But rule (b) says that at most one of 186 | // Ti and Tj can commit; a contradiction. 187 | // 188 | started.t = TxnIdOrder/prevs[txn] 189 | 190 | // Postcondition 191 | started.t' = started.t + txn 192 | // Frame condition 193 | read.t' = read.t 194 | written.t' = written.t 195 | committed.t' = committed.t 196 | aborted.t' = aborted.t 197 | waitingForXLock.t' = waitingForXLock.t 198 | holdingXLock.t' = holdingXLock.t 199 | } 200 | 201 | pred read[t, t': Time, txn: TxnId, k: Key] { 202 | txn_started_and_can_do_public_operations[t, txn] 203 | 204 | // Bernstein's standard simplification to the model: 205 | // no txn reads the same key more than once. 
206 | no txn.read.t[k] 207 | 208 | // note: in SI, reads are not blocked by xlocks 209 | 210 | let versionid_to_read = versionid_of_k_that_would_be_read_by_txn[t, txn, k] { 211 | 212 | some versionid_to_read // enabling condition (forces first operation on each key to be a write) 213 | 214 | read.t' = read.t + txn->k->versionid_to_read 215 | 216 | started.t' = started.t 217 | written.t' = written.t 218 | committed.t' = committed.t 219 | aborted.t' = aborted.t 220 | waitingForXLock.t' = waitingForXLock.t 221 | holdingXLock.t' = holdingXLock.t 222 | } 223 | } 224 | 225 | // Helper. Assumes "one time_of_start(txn)" 226 | fun versionid_of_k_that_would_be_read_by_txn[t: Time, txn: TxnId, k: Key] 227 | : lone TxnId 228 | { 229 | txn in (written.t).k 230 | => 231 | // txn reads the (uncommitted) version that it wrote itself before t 232 | txn 233 | else 234 | // txn reads the latest version that was committed before txn began 235 | TxnIdOrder/max 236 | [ all_versionids_of_k_created_before_t_by_txns_committed_before_t 237 | [time_of_start[txn], k] 238 | & TxnIdOrder/prevs[txn] 239 | ] 240 | } 241 | 242 | fun all_versionids_of_k_created_before_t_by_txns_committed_before_t[t: Time, k: Key] 243 | : set TxnId 244 | { 245 | let all_versionids_of_k_created_before_t = (written.t).k, // set of writer TxnId at t 246 | all_txnids_committed_before_t = committed.t | // set of committed TxnId at t 247 | 248 | all_versionids_of_k_created_before_t & all_txnids_committed_before_t 249 | } 250 | 251 | 252 | pred start_write_may_block[t, t': Time, txn: TxnId, k: Key] { 253 | txn_started_and_can_do_public_operations[t, txn] 254 | 255 | // Bernstein's standard simplification to the model: 256 | // no txn writes to the same key more than once. 257 | k not in txn.written.t 258 | 259 | // Part of First Committer Wins rule: if txn attempts to write to a key that has 260 | // been modified and committed since txn began, then txn cannot possibly 261 | // commit, so we might as well abort txn now.
262 | // 263 | // Alternative: we could just fail the individual write, and allow the transaction 264 | // to proceed. (We could model that by including the above FCW check in the 265 | // enabling-condition, so that Alloy doesn't even attempt to generate behaviors 266 | // that attempt to violate the FCW rule in that way.) 267 | // I choose to not do that, as in the vast majority of cases the transaction 268 | // won't have any realistic alternative to aborting, so we simply model the abort. 269 | 270 | some versions_of_k_committed_since_txn_began_and_before_t[t, txn, k] => 271 | abort[t, t', txn] 272 | else { 273 | // This write is not forbidden by the FCW rule. 274 | 275 | // Do we need to wait for this key's xlock before we can write? 276 | // 277 | // (Note that we know that txn itself cannot already be holding the xlock, 278 | // as our enabling-conditions prevent a transaction from writing 279 | // to a key more than once. But we do the correct test anyway (i.e. don't 280 | // wait for a lock if txn already holds it), in case we ever do choose to model 281 | // transactions that can write to the same key more than once.) 282 | some (holdingXLock.t).k - txn => { 283 | // The key is locked by some other transaction. 284 | // We will need to block to acquire the xlock before we can do the write. 285 | // But blocking on xlocks could cause deadlock, so the following predicate detects 286 | // and prevents that. The following predicate may abort txn or any other transaction 287 | // that would be involved in a potential cycle in the waiting-for-locks graph. 288 | // If this predicate does not abort txn, then when it returns, 289 | // txn will be blocked on the xlock (txn->k will be in waitingForXLock.t'), 290 | // and an internal action, finish_blocked_write[], will later become enabled if the 291 | // lock becomes free. (Note that any number of txns might be waiting for 292 | // the same xlock.
The order of acquisition is intentionally non-deterministic, 293 | // to force Alloy to check all possibilities.) 294 | 295 | write_conflicts_with_xlock[t, t', txn, k] 296 | } 297 | else { 298 | // Key is not locked -- so we can immediately acquire the xlock 299 | // and attempt to do the write. 300 | // (this also does the frame condition apart from waitingForXLock) 301 | do_write[t, t', txn, k] 302 | // remainder of the frame condition 303 | waitingForXLock.t' = waitingForXLock.t 304 | } 305 | } 306 | } 307 | 308 | // Helper for start_write_may_block 309 | fun versions_of_k_committed_since_txn_began_and_before_t[t: Time, txn: TxnId, k: Key] 310 | : set TxnId { 311 | 312 | // written is: WriterTxnId -> Key -> Time 313 | (written.(TimeOrder/prevs[t] - TimeOrder/prevs[time_of_start[txn]])).k 314 | & committed.t 315 | } 316 | 317 | // Helper for start_write_may_block 318 | pred write_conflicts_with_xlock[t, t' : Time, txn: TxnId, k: Key] { 319 | 320 | // Some other transaction is holding the xlock on this key 321 | // (In the current model txn itself cannot be holding the xlock 322 | // as the current model doesn't allow a transaction to write to the 323 | // same key twice.) 324 | // To write to this key, we must acquire the xlock. 325 | // But if waiting for the xlock would cause a deadlock 326 | // then we must instead abort one of the transactions 327 | // involved in the cycle. 328 | 329 | // Standard definition of a deadlock is a cycle in the 'waiting-for-locks' graph. 330 | // The waiting-for-locks graph has nodes that are transactions and 331 | // an edge from T1 to T2 if T2 is currently holding a lock that T1 is waiting for. 332 | // 333 | // Remember: 334 | // waitingForXLock: Txn->Key->Time 335 | // holdingXLock: Txn->Key->Time 336 | // 337 | // The current waiting-for-locks graph is therefore waitingForXLock.t 338 | // dot-joined with the *transpose* of holdingXLock.t. 339 | // i.e.: 340 | // (TxnThatIsWaitingForXLock -> KeyBeingWaitedFor) 341 | // .
(KeyWhoseXLockIsHeld -> TxnThatIsHoldingKeysXlock) 342 | // 343 | // We want to know if a cycle would be caused in the waiting-for-locks graph 344 | // if txn begins to wait for k's xlock. So we add txn->k to waitingForXLock.t 345 | // before we do the dot join. 346 | 347 | let new_waiting_for_held_by = (waitingForXLock.t + txn->k).~(holdingXLock.t) { 348 | 349 | // If txn starting to wait on an xlock would cause a cycle, then 350 | // txn must be involved in that cycle, so we can check by seeing 351 | // if txn can reach itself. 352 | 353 | txn in txn.^new_waiting_for_held_by => { 354 | 355 | // If txn starts waiting for xlock for k, then it will cause a deadlock. 356 | 357 | // Pick a single transaction to abort, 358 | // that would break the potential cycle in the graph. 359 | // We make this a non-deterministic choice from all transactions involved in the 360 | // potential cycle, to ensure that we model-check all possible choices of victim. 361 | // i.e. We don't enshrine a particular policy -- e.g. min write locks. 362 | 363 | some to_abort: txns_involved_in_cycle[new_waiting_for_held_by] { 364 | 365 | txn in to_abort => { 366 | // We've selected txn itself as the victim to avoid deadlock. 367 | abort[t, t', txn] 368 | } 369 | else { 370 | // We've selected some transaction other than txn as the victim to avoid deadlock. 371 | 372 | // Do the abort, and set txn to be waiting for the xlock. 373 | // 374 | // Note: the abort is not guaranteed to release the xlock 375 | // that txn wants. (The abort just guarantees that when txn 376 | // starts waiting for the xlock, that action won't create a cycle in the 377 | // waiting-for-locks graph.) 378 | // And we *don't* check to see if the abort has released the 379 | // xlock that txn wants (to grant the xlock immediately to txn). 380 | // There might be other transactions waiting for the xlock 381 | // and we don't want to starve them. We want to model-check 382 | // all possible combinations of acquisition.
383 | 384 | aborted.t' = aborted.t + to_abort 385 | holdingXLock.t' = holdingXLock.t - (to_abort <: holdingXLock.t) // drop all of to_abort's xlocks 386 | 387 | // txn starts waiting for the lock, 388 | // and to_abort is no longer waiting for any locks 389 | waitingForXLock.t' = (waitingForXLock.t + txn->k) 390 | // ... the aborting transaction might be waiting for some other lock 391 | - (to_abort <: waitingForXLock.t) 392 | 393 | started.t' = started.t 394 | read.t' = read.t 395 | written.t' = written.t 396 | committed.t' = committed.t 397 | } 398 | } 399 | } 400 | else { 401 | // Here we know that txn won't cause a deadlock 402 | // when it starts waiting for k's xlock. 403 | 404 | waitingForXLock.t' = waitingForXLock.t + txn->k 405 | 406 | started.t' = started.t 407 | read.t' = read.t 408 | written.t' = written.t 409 | committed.t' = committed.t 410 | aborted.t' = aborted.t 411 | holdingXLock.t' = holdingXLock.t 412 | } 413 | } 414 | } 415 | 416 | // Returns the set of all transactions that are involved in any cycle in the input graph. 417 | fun txns_involved_in_cycle[waiting_for_held_by_with_cycle : TxnId->TxnId] : set TxnId { 418 | 419 | let txns_in_waiting_for_held_by 420 | = TxnId.waiting_for_held_by_with_cycle + waiting_for_held_by_with_cycle.TxnId | 421 | 422 | {involved_in_cycle: txns_in_waiting_for_held_by | 423 | no txn: TxnId | 424 | txn in txn.^(waiting_for_held_by_with_cycle 425 | - (involved_in_cycle->TxnId + TxnId->involved_in_cycle)) 426 | } 427 | } 428 | 429 | // Helper for writes. 430 | // Record that txn has acquired a lock on k and created a new version of k.
431 | // Does most of the frame-condition but does not constrain waitingForXLock.t 432 | pred do_write[t, t': Time, txn: TxnId, k: Key] { 433 | 434 | // Lock it and write it 435 | holdingXLock.t' = holdingXLock.t + txn->k 436 | written.t' = written.t + txn->k 437 | 438 | started.t' = started.t 439 | read.t' = read.t 440 | committed.t' = committed.t 441 | aborted.t' = aborted.t 442 | } 443 | 444 | 445 | // Internal operation (action) 446 | pred finish_blocked_write[t, t': Time, txn: TxnId, k: Key] { 447 | // Enabling condition: txn is waiting for xlock on k, and that xlock is no longer held. 448 | (k in txn.waitingForXLock.t) and no (holdingXLock.t).k 449 | 450 | // Record that we have acquired the lock and done our write 451 | // (this also does the frame condition apart from waitingForXLock) 452 | do_write[t, t', txn, k] 453 | 454 | // Not waiting for that lock any more 455 | waitingForXLock.t' = waitingForXLock.t - (txn <: waitingForXLock.t) 456 | } 457 | 458 | 459 | pred commit[t, t': Time, txn: TxnId] { 460 | txn_started_and_can_do_public_operations[t, txn] 461 | 462 | committed.t' = committed.t + txn 463 | 464 | // Obviously we also need to drop all xlocks that are held by txn. 465 | // That is done below, as we might need to drop locks held by loser_txns too. 466 | 467 | // We must enforce the First-Committer-Wins rule of Snapshot Isolation; 468 | // If there are one or more transactions that 469 | // are currently waiting for an xlock that is held by txn (the winner), 470 | // then we must abort those loser transactions. 471 | 472 | let keys_xlocked_by_txn = txn.holdingXLock.t, 473 | loser_txns = (waitingForXLock.t).keys_xlocked_by_txn { 474 | 475 | // loser_txns might be empty, or contain any number of transactions.
476 | // The following works in any of those cases: 477 | 478 | // snapshot isolation doesn't have cascading aborts so we don't 479 | // need to look transitively for the set of transactions to abort 480 | aborted.t' = aborted.t + loser_txns 481 | 482 | // But we do need to drop all locks held by the loser transactions 483 | // and txn itself. 484 | // (This might unblock other transactions. That unblocking is modelled by 485 | // other actions becoming enabled if a transaction that is in waitingForXLock 486 | // finds that the lock is no longer held by anyone else.) 487 | holdingXLock.t' = holdingXLock.t - ((loser_txns + txn) <: holdingXLock.t) 488 | 489 | // And of course the aborted loser transactions are no longer waiting for locks. 490 | // And neither is the winner txn. 491 | waitingForXLock.t' = waitingForXLock.t - (loser_txns <: waitingForXLock.t) 492 | 493 | started.t' = started.t 494 | read.t' = read.t 495 | written.t' = written.t 496 | } 497 | } 498 | 499 | pred abort[t, t': Time, txn: TxnId] { 500 | txn_started_and_can_do_public_operations[t, txn] 501 | 502 | aborted.t' = aborted.t + txn 503 | holdingXLock.t' = holdingXLock.t - (txn <: holdingXLock.t) // drop all of txn's xlocks 504 | 505 | started.t' = started.t 506 | read.t' = read.t 507 | written.t' = written.t 508 | committed.t' = committed.t 509 | // as the enabling condition for abort[] includes txn_started_and_can_do_public_operations[t, txn] 510 | // then we know that txn cannot be waiting for xlock. 511 | // But in the more general case, aborting a txn should definitely cancel any waiting-for-xlock state, 512 | // so we do that here.
513 | waitingForXLock.t' = waitingForXLock.t - (txn <: waitingForXLock.t) 514 | } 515 | 516 | pred skip[t,t': Time] { 517 | started.t' = started.t 518 | read.t' = read.t 519 | written.t' = written.t 520 | committed.t' = committed.t 521 | aborted.t' = aborted.t 522 | waitingForXLock.t' = waitingForXLock.t 523 | holdingXLock.t' = holdingXLock.t 524 | } 525 | 526 | 527 | // Facts for the execution-traces idiom 528 | 529 | pred init[t: Time] { 530 | no started.t 531 | no written.t 532 | no committed.t 533 | no read.t 534 | no aborted.t 535 | no waitingForXLock.t 536 | no holdingXLock.t 537 | } 538 | 539 | fact traces { 540 | 541 | init[TimeOrder/first[]] 542 | 543 | all t: Time - TimeOrder/last[] | let t' = TimeOrder/next[t] { 544 | 545 | // We allow (require) exactly one public operation (on a single transaction) 546 | // per time-step. 547 | // However, some algorithm-specific constraints *might* also cause 548 | // a simultaneous operation by one or more other transactions. 549 | // e.g. Attempting to commit a transaction might cause other transactions 550 | // to abort. 551 | // To follow standard literature for analyzing transaction concurrency control 552 | // algorithms, we do desire that at most one read, write or commit 553 | // occur at any point in time. We have assertions that verify that condition. 554 | // If it ever matters that transactions are aborted in the same timestep 555 | // as other operations, we can achieve that by introducing a 'pendingAbort' set, 556 | // and only enable abort[] steps for transactions that are in that set. 
557 | 558 | // We allow (require) exactly one abstract operation per time-step 559 | some actiontxn: TxnId { 560 | 561 | // Public operations 562 | 563 | begin[t,t', actiontxn] 564 | 565 | or ( some actionkey: Key | 566 | read[t,t', actiontxn, actionkey] 567 | or start_write_may_block[t, t', actiontxn, actionkey] 568 | or finish_blocked_write[t, t', actiontxn, actionkey] 569 | ) 570 | 571 | or commit[t, t', actiontxn] 572 | or abort[t, t', actiontxn] 573 | 574 | or skip[t,t'] 575 | } 576 | } 577 | } 578 | 579 | 580 | // 581 | // Analysis 582 | // 583 | // Note that the Alloy Analyzer only executes the first 'run' or 'check' command that it finds 584 | // in the file. We choose the command by commenting-out the ones we don't want to execute. 585 | // 586 | 587 | // Find an interesting instance 588 | //run {} for 2 but 8 Time, 2 TxnId, 1 Key 589 | /* 590 | run { 591 | //all_txns_do_at_least_one_read_or_write[] 592 | at_least_one_txn_waits_for_an_xlock_and_commits[] 593 | some_txn_reads_from_another[] 594 | } for 2 but 14 Time, 3 TxnId, 2 Key 595 | */ 596 | 597 | // Full analysis 598 | // This scope takes 559s with BerkMin 599 | //check { complete_analysis } for 2 but 15 Time, 4 TxnId, 3 Key 600 | 601 | // This scope takes 602 | // Berkmin: 473s 603 | // MinSat: 665s 604 | // ZChaff: 605 | //check { basic_model_correctness } for 2 but 13 Time, 3 TxnId, 2 Key 606 | //check { safe_to_use_txnids_as_key_versionids } 607 | //check { monotonically_growing_txn_state_sets } 608 | //check { begin_is_always_first_action_of_txn } 609 | //check { txn_at_most_one_commit_or_abort } 610 | //check { commit_or_abort_can_only_be_final_action_of_txn } 611 | //check { correctness_of_waitingForXLock } 612 | //check { correctness_of_holdingXLock } 613 | //for 2 but 14 Time, 4 TxnId, 2 Key 614 | 615 | //check { semantics_of_snapshot_isolation } for 2 but 15 Time, 4 TxnId, 3 Key 616 | 617 | // If we intentionally break all_versionids_of_k_created_before_t_by_txns_committed_before_t 618 | //
to allow uncommitted reads, this does detect them 619 | //check { txn_only_reads_from_latest_prior_committed_snapshot_or_itself } for 2 but 5 Time, 2 TxnId, 2 Key 620 | 621 | // This once found a bug in which 622 | // t1 and t2 are concurrent and 623 | // t1 writes to k and commits, and then t2 writes to k and also commits 624 | // (was missing code to abort immediately if we attempt a write to a key that has 625 | // been modified and committed since we started) 626 | //check { first_committer_wins } for 2 but 10 Time, 2 TxnId, 2 Key 627 | 628 | // This once found counter examples, when write[] did not detect & prevent deadlocks. 629 | // This scope takes 303 seconds with BerkMin. 630 | //check { no_deadlock } for 2 but 14 Time, 3 TxnId, 3 Key 631 | // If we remove the deadlock-prevention code in write_conflicts_with_xlock[] 632 | // then this generates a deadlock graph-cycle containing 4 txns 633 | /* 634 | run { 635 | some t: Time | 636 | let waiting_for_held_by = (waitingForXLock.t).~(holdingXLock.t) | 637 | #{txn: TxnId | txn in txn.^waiting_for_held_by} >3 638 | } for 5 but 15 Time, 4 TxnId, 4 Key 639 | */ 640 | 641 | // This intentionally FAILS because not all SI executions are serializable. 642 | // It finds an execution with write-skew. 643 | //check { bernstein_serializable } for 2 but 13 Time, 3 TxnId, 2 Key 644 | check { cahill_serializable } for 2 but 13 Time, 3 TxnId, 2 Key 645 | 646 | // Confirm that our two slightly different phrasings of the serializability condition 647 | // are equivalent for all possible executions (in scope). 648 | //check { cahill_serializable <=> bernstein_serializable } for 2 but 15 Time, 4 TxnId, 3 Key 649 | 650 | // Confirm that our two slightly different phrasings of the serializability condition 651 | // are indeed not logically equivalent (not merely different styles of the same formula.) 
652 | // This finds a counter-example 653 | // T0 begins, writes k and commits, then 654 | // T1 begins, writes k and commits 655 | // cahill_mvsg has T0->T1 because of 656 | // ( // - T1 produces a version of x, and T2 produces a later version of x (this is a ww-dependency); 657 | // bernstein_mvsg is empty because 658 | // not in bernstein_sg **as there is no read involved** and 659 | // "SG(H) has nodes for the committed transaction 660 | // in H and edges Ti -> Tj (i /= j) whenever for some key x, Tj reads x from Ti. 661 | // That is, Ti -> Tj is present iff for some x, rj[xi] (i /= j) is an operation of C(H). 662 | // not in version_order_edges **as there is no read involved** 663 | // and version_order_edges requires an rk 664 | // "for each rk[xj] and wi[xi] in C(H) where i, j, and k are distinct, 665 | // if xi << xj then include Ti -> Tj, 666 | // otherwise include Tk -> Ti. 667 | //check { all t: Time | cahill_mvsg[t] = bernstein_mvsg[t] } for 2 but 7 Time, 2 TxnId, 1 Key 668 | 669 | 670 | // visually check that 'not concurrent[]' forbids all concurrent transactions 671 | //run { no disj t1, t2: TxnId | concurrent[t1,t2] } for 2 but 7 Time, 2 TxnId, 1 Key 672 | 673 | 674 | // 675 | // Analysis assertions and predicates 676 | // 677 | 678 | pred complete_analysis[] { 679 | basic_model_correctness[] 680 | semantics_of_snapshot_isolation[] 681 | } 682 | 683 | 684 | pred basic_model_correctness[] { 685 | safe_to_use_txnids_as_key_versionids[] 686 | monotonically_growing_txn_state_sets[] 687 | begin_is_always_first_action_of_txn[] 688 | txn_at_most_one_commit_or_abort[] 689 | at_most_one_start_read_write_or_commit_per_timestep[] 690 | commit_or_abort_can_only_be_final_action_of_txn[] 691 | correctness_of_waitingForXLock[] 692 | correctness_of_holdingXLock[] 693 | } 694 | 695 | // Verify semantic properties of snapshot isolation 696 | // 697 | // 1. 
A txn may only read from exactly the set of transactions that had been committed 698 | // at the time at which txn starts, or from writes done by txn itself 699 | // 700 | // 2. First Committer Wins: if concurrent transactions write to intersecting sets of keys, 701 | // then at most one of them can commit. 702 | // 703 | // 3. Deadlocks cannot be created. 704 | 705 | pred semantics_of_snapshot_isolation { 706 | txn_only_reads_from_latest_prior_committed_snapshot_or_itself[] 707 | first_committer_wins[] 708 | no_deadlock[] 709 | } 710 | 711 | 712 | pred txn_only_reads_from_latest_prior_committed_snapshot_or_itself[] { 713 | 714 | // read is: ReaderTxnId -> Key -> VersionIdThatWasRead -> Time 715 | 716 | // check all transactions that did at least one read 717 | all txn_doing_read: ((read.Time).TxnId).Key { 718 | 719 | // for every transaction that txn_doing_read actually read from, *excluding* itself 720 | all txn_read_from: (txn_doing_read.read.Time[Key] - txn_doing_read) { 721 | 722 | // the read-from transaction must have committed 723 | // *before* the transaction doing the read even started.
724 | let time_of_read_from_commit = time_of_commit[txn_read_from] { 725 | some time_of_read_from_commit 726 | TimeOrder/lt[time_of_read_from_commit, time_of_start[txn_doing_read]] 727 | } 728 | } 729 | } 730 | } 731 | 732 | 733 | pred first_committer_wins[] { 734 | no disj t1, t2: TxnId | 735 | t1 in committed.Time 736 | and t2 in committed.Time 737 | and concurrent[t1,t2] 738 | and some (t1.written.Time & t2.written.Time) // intersecting sets of keys-written 739 | } 740 | 741 | // true iff both t1 and t2 start and their lifetimes overlap 742 | pred concurrent[t1, t2: TxnId] { 743 | let t1_start = time_of_start[t1], 744 | t1_finalize = time_of_finalize[t1], 745 | t2_start = time_of_start[t2], 746 | t2_finalize = time_of_finalize[t2] { 747 | 748 | some t1_start 749 | && some t2_start 750 | && TimeOrder/lt[t1_start, t2_start] // t1 started before t2 started 751 | => ( no t1_finalize // and ( t1 never finished 752 | or TimeOrder/gt[t1_finalize, t2_start]) // or t1 finished after t2 started) 753 | else // t2 started before t1 started 754 | ( no t2_finalize // and ( t2 never finished 755 | or TimeOrder/gt[t2_finalize, t1_start]) // or t2 finished after t1 started) 756 | } 757 | } 758 | 759 | pred no_deadlock[] { 760 | // waitingForXLock: Txn->Key->Time 761 | // holdingXLock: Txn->Key->Time 762 | no t: Time | 763 | let waiting_for_held_by = (waitingForXLock.t).~(holdingXLock.t) | 764 | some txn: TxnId | txn in txn.^waiting_for_held_by 765 | } 766 | 767 | 768 | // 769 | // Verifying Serializability 770 | // [only here to find NON-serializable instances, until we implement Michael Cahill's algorithm for serializable-SI] 771 | // 772 | // We prove serializability by confirming that the MultiVersionSerializabilityGraph 773 | // is acyclic, for all histories constructed by the algorithm.
774 | 775 | // I have two definitions of the MVSG at hand 776 | // - from Michael Cahill's PhD thesis on serializable snapshot isolation 777 | // - From Phil Bernstein's book (the section on proving that MVTO is serializable) 778 | 779 | // To check that I've implemented them correctly, I define both of them, 780 | // and then assert that they are equivalent (imply each other) at all times in all executions. 781 | 782 | 783 | // From Michael Cahill's PhD thesis: 784 | // 785 | // Verifying that a history is conflict serializable is equivalent to showing that a particular graph is free of 786 | // cycles. The graph that must be cycle-free contains a node for each transaction in the history, and an edge 787 | // between each pair of conflicting transactions. Transactions T1 and T2 are said to conflict (or equivalently, 788 | // to have a dependency) whenever they perform operations whose results reveal something about the order 789 | // of the transactions; in particular when T1 performs an operation, and later T2 performs a conflicting 790 | // operation. Operations O1 and O2 are said to conflict if swapping the order of their execution would 791 | // produce different results (either a query producing a different answer, or updates producing different 792 | // database state). A cycle in this graph implies that there is a set of transactions that cannot be executed 793 | // serially in some order that gives the same results as in the original history. 794 | // ... 795 | // With snapshot isolation, the definitions of the serialization graph become much simpler, as versions of 796 | // an item x are ordered according to the temporal sequence of the transactions that created those versions 797 | // (note that First-Committer-Wins ensures that among two transactions that produce versions of x, one 798 | // will commit before the other starts).
799 | // 800 | // In the MVSG, we put an edge from one committed transaction T1 801 | // to another committed transaction T2 in the following situations: 802 | // 803 | // - T1 produces a version of x, and T2 produces a later version of x (this is a ww-dependency); 804 | // - T1 produces a version of x, and T2 reads this (or a later) version of x (this is a wr-dependency); 805 | // - T1 reads a version of x, and T2 produces a later version of x (this is a rw-dependency, also 806 | // known as an anti-dependency, and is the only case where T1 and T2 can run concurrently). 807 | 808 | pred cahill_serializable { 809 | 810 | all t: Time| 811 | no txn: TxnId | txn in txn.^(cahill_mvsg[t]) 812 | } 813 | 814 | fun cahill_mvsg[t: Time] : TxnId->TxnId { 815 | 816 | {T1: committed.t, T2: committed.t | 817 | 818 | // from one committed transaction T1 to another [distinct] committed transaction T2 819 | T1 != T2 820 | 821 | and some x: Key { 822 | ( // - T1 produces a version of x, and T2 produces a later version of x (this is a ww-dependency); 823 | x in T1.written.t 824 | and x in T2.written.t 825 | and TxnIdOrder/gt[T2, T1] 826 | ) 827 | or ( // - T1 produces a version of x, and T2 reads this (or a later) version of x (this is a wr-dependency); 828 | x in T1.written.t 829 | and some read_versionid: TxnId { 830 | // read is ReaderTxnId -> Key -> VersionIdThatWasRead -> Time 831 | read_versionid in T2.read.t[x] 832 | and TxnIdOrder/gte[read_versionid, T1] 833 | } 834 | ) 835 | or ( // - T1 reads a version of x, and T2 produces a later version of x (this is a rw-dependency, also 836 | // known as an anti-dependency, and is the only case where T1 and T2 can run concurrently). 
837 | some read_versionid: TxnId { 838 | // read is ReaderTxnId -> Key -> VersionIdThatWasRead -> Time 839 | read_versionid in T1.read.t[x] 840 | and x in T2.written.t 841 | and TxnIdOrder/gt[T2, read_versionid] 842 | } 843 | ) 844 | } 845 | } 846 | 847 | } 848 | 849 | 850 | // From Phil Bernstein's book 851 | // 852 | // This is the correctness condition from p152 (chapter 5 section 5.2) of Bernstein's book: 853 | // 854 | // Theorem 5.4: An MV history H is 1SR iff there exists a version order, <<, 855 | // such that MVSG(H, <<) is acyclic. 856 | // 857 | // 858 | // 'version order' is defined as: 859 | // 860 | // // From p151 861 | // Given an MV history H and a data item [key] x, a version order, <, for x in H is 862 | // a total order of versions of x in H. 863 | // A version order, <<, for H is the union of the version orders for all data items. 864 | // For example, a possible version order for H, is x0 << x2, y0 << y1 << y3. 865 | // 866 | // 867 | // The version order is defined (for MVTO) as: 868 | // 869 | // From p152 870 | // Given an MV history H and a version order, <<, the multiversion serialization 871 | // graph for H and <<, MVSG(H, <<), is SG(H) with the following version 872 | // order edges added: for each rk[xj] and wi[xi] in C(H) where i, j, and k are 873 | // distinct, if xi << xj then include Ti -> Tj, otherwise include Tk -> Ti. 874 | // Recall that the nodes of SG(H) and, therefore, of MVSG(H, <<) are the 875 | // committed transactions in H. 876 | // (Note that there is no version order edge if j = k, that is, if a transaction reads 877 | // from itself.) 878 | // 879 | // 880 | // SG(H) is defined as follows: 881 | // 882 | // From p149: 883 | // The serialization graph for an MV history is defined as for a 1V history. 
884 | // 885 | // From p32 (section 2.3, serializability theory for monoversion histories) 886 | // The serialization graph (SG) for H, denoted SG(H), is a directed 887 | // graph whose nodes are the transactions in T that are committed in H and 888 | // whose edges are all Ti -> Tj (i =/ j) such that one of Ti's operations precedes 889 | // and conflicts with one of Tj's operations in H. 890 | // 891 | // Continuing p149 892 | // But since only one kind of conflict is possible in an MV history, SGs are quite 893 | // simple. Let H be an MV history. SG(H) has nodes for the committed transaction 894 | // in H and edges Ti -> Tj (i /= j) whenever for some key x, Tj reads x from Ti. 895 | // That is, Ti -> Tj is present iff for some x, rj[xi] (i /= j) is an operation of C(H). 896 | // 897 | // From p30 898 | // Given a history H, the committed projection of H, denoted C(H), is the history 899 | // obtained from H by deleting all operations that do not belong to transactions 900 | // committed in H. Note that C(H) is a complete history over the set of committed 901 | // transactions in H. If H represents an execution at some point in time, C(H) is the 902 | // only part of the execution we can count on, since active transactions can be 903 | // aborted at any time, for instance, in the event of a system failure. 904 | 905 | pred bernstein_serializable { 906 | 907 | // MVSG[H, <<] is acyclic 908 | // i.e. No node (TxnId) can be reached by starting at that node and following 909 | // a directed path in the graph. 910 | // i.e. No node can be reached from itself in the transitive closure of MVSG[H, <<] 911 | 912 | all t: Time| 913 | no txn: TxnId | txn in txn.^(bernstein_mvsg[t]) 914 | 915 | // TODO: is this faster to check? 916 | // no (iden & ^(mvsg[t])) 917 | // 918 | // (Actually, avoiding the set-comprehensions in version_order_edges[] 919 | // might have more impact on speed. 
920 | } 921 | 922 | fun bernstein_mvsg[t: Time] : TxnId->TxnId { 923 | bernstein_sg[t] + bernstein_version_order_edges[t] 924 | } 925 | 926 | // SG(H) 927 | // "Ti -> Tj is present iff for some x, rj[xi] (i /= j) is an operation of C(H). 928 | // 929 | // We confine the result to C(H) by only considering Ti, Tj, Tk that are in committed.t 930 | fun bernstein_sg[t: Time] : TxnId->TxnId { 931 | 932 | {writer_txn: committed.t, reader_txn: committed.t | 933 | reader_txn != writer_txn // distinct 934 | and writer_txn in reader_txn.read.t[Key] // reader read from writer 935 | } 936 | } 937 | 938 | // "for each rk[xj] and wi[xi] in C(H) where i, j, and k are distinct, 939 | // if xi << xj then include Ti -> Tj, 940 | // otherwise include Tk -> Ti. 941 | // 942 | // We confine the result to C(H) by only considering Ti, Tj, Tk that are in committed.t 943 | fun bernstein_version_order_edges[t: Time] : TxnId->TxnId { 944 | 945 | {Ti: committed.t, Tj: committed.t | 946 | Ti != Tj // Ti and Tj are distinct committed transactions 947 | and some Tk : committed.t | 948 | Tk != Ti // Tk is a committed transaction distinct from Ti and Tj 949 | and Tk != Tj 950 | and some x: Key | 951 | Tj in Tk.read.t[x] // rk[xj] is in C(H) (Tj is in set of transactions that Tk read from) 952 | and x in Ti.written.t // xi exists in C(H) 953 | and x in Tj.written.t // xj exists in C(H) 954 | and TxnIdOrder/lt[Ti, Tj]} // xi << xj (as version order is TxnId order) 955 | + 956 | {Tk: committed.t, Ti: committed.t | 957 | Tk != Ti // Tk and Ti are distinct 958 | and some Tj : committed.t | 959 | Tj != Tk // Tj is a committed transaction distinct from Tk and Ti 960 | and Tj != Ti 961 | and some x: Key | 962 | Tj in Tk.read.t[x] // rk[xj] is in C(H) (Tj is in set of transactions that Tk read from) 963 | and x in Ti.written.t // xi exists in C(H) 964 | and x in Tj.written.t // xj exists in C(H) 965 | and not TxnIdOrder/lt[Ti, Tj]} // NOT xi << xj (as version order is TxnId order) 966 | } 967 | 968 | 969 | // 970 | // Basic model
correctness 971 | // 972 | 973 | pred safe_to_use_txnids_as_key_versionids[] { 974 | // For each key, the versionid order of committed writes matches the temporal order of 975 | // committed writes. 976 | // i.e. No version is ever committed that has a lower version-id than an existing 977 | // committed version. 978 | // The mechanism that enforces this is 979 | // 1. the First-Committer-Wins rule; if more than one concurrent transaction 980 | // writes to a key, at most one can commit. 981 | // 2. The artificial constraint that for this model, transactions always 982 | // begin in order of TxnId. 983 | // Therefore all successful writes (i.e. committed writes) to a key must be done 984 | // by transactions with increasing TxnIds. 985 | // i.e. Transactions commit in TxnId order. 986 | 987 | // ... for all pairs of different transactions that have both written and committed 988 | // ... written is WriterTxnId -> Key -> Time 989 | 990 | all k: Key | 991 | all disj txn1, txn2 : (written.Time).k & committed.Time | 992 | let t1c = time_of_commit[txn1], 993 | t2c = time_of_commit[txn2] | 994 | 995 | TxnIdOrder/lt[txn1,txn2] iff TimeOrder/lt[t1c, t2c] 996 | and TxnIdOrder/gt[txn1,txn2] iff TimeOrder/gt[t1c, t2c] 997 | } 998 | 999 | // This verifies that frame conditions are complete (i.e.
don't accidentally allow spurious changes) 1000 | pred monotonically_growing_txn_state_sets[] { 1001 | monotonically_growing_txn_state_set[started] 1002 | monotonically_growing_txn_state_set[committed] 1003 | monotonically_growing_txn_state_set[aborted] 1004 | all t: Time - TimeOrder/last[] { 1005 | read.t in read.(TimeOrder/next[t]) 1006 | written.t in written.(TimeOrder/next[t]) 1007 | } 1008 | } 1009 | 1010 | pred monotonically_growing_txn_state_set[s: TxnId->Time] { 1011 | all t: Time - TimeOrder/last[] | 1012 | s.t in s.(TimeOrder/next[t]) 1013 | } 1014 | 1015 | // If a transaction starts at all, then start is the first operation in that txn 1016 | pred begin_is_always_first_action_of_txn[] { 1017 | all txn: TxnId | 1018 | some txn.started => 1019 | time_of_start[txn] = TimeOrder/min[times_of_all_events[txn]] 1020 | } 1021 | 1022 | // A transaction can do at most one commit or abort. 1023 | // (can't commit or abort multiple times, and can't both commit and abort) 1024 | pred txn_at_most_one_commit_or_abort[] { 1025 | all txn: TxnId | lone time_of_finalize[txn] 1026 | } 1027 | 1028 | // If a transaction commits or aborts then that commit or abort is the last operation 1029 | // of that transaction 1030 | pred commit_or_abort_can_only_be_final_action_of_txn[] { 1031 | all txn: TxnId | 1032 | let time_of_finalize = time_of_finalize[txn] | 1033 | some time_of_finalize => 1034 | time_of_finalize = TimeOrder/max[times_of_all_events[txn]] 1035 | } 1036 | 1037 | pred correctness_of_waitingForXLock[] { 1038 | // A transaction can only be waiting for one xlock at any point in time 1039 | all txn: TxnId | all t: Time | lone txn.waitingForXLock.t 1040 | 1041 | // A transaction cannot begin to wait for a particular xlock (i.e.
particular key) 1042 | // more than once 1043 | all txn: TxnId | all k: Key | lone k.(start_wait_for_xlock_events[txn]) 1044 | 1045 | // A transaction might wait for different xlocks at different times 1046 | // We can't assert this as it is not true for all executions. 1047 | // I'd like to assert that it is true for some executions (i.e. not prohibited 1048 | // by the model). But the best way to check that is probably to 1049 | // 'run' it to find and visually inspect an instance. 1050 | // some txn: TxnId | #(wait_for_xlock_events[txn]) > 1 1051 | 1052 | // Every time a transaction leaves the waitingForXLock state, 1053 | // it does so either by acquiring that particular lock or aborting 1054 | all txn: TxnId | 1055 | let swle = stop_wait_for_xlock_events[txn] | 1056 | all k: swle.Time | let post_t = k.swle | 1057 | k->post_t in acquire_xlock_events[txn] 1058 | or post_t = time_of_abort[txn] 1059 | 1060 | // Multiple transactions can be waiting for the same lock (and different locks) 1061 | // We can't assert this as it is not true for all executions. 1062 | // I'd like to assert that it is true for some executions (i.e. not prohibited 1063 | // by the model). But the best way to check that is probably to 1064 | // 'run' it to find and visually inspect an instance. 1065 | // some t:Time | some k: Key | waitingForXLock.t[k] > 1 1066 | } 1067 | 1068 | pred at_most_one_start_read_write_or_commit_per_timestep[] { 1069 | // (It's not true that exactly one action happens in every step, because 1070 | // some actions can force other transactions to abort in the same step.) 
1071 | 1072 | // read: ReaderTxnId -> Key -> VersionIdThatWasRead -> Time 1073 | // written: WriterTxnId -> Key -> Time 1074 | 1075 | all t: Time - TimeOrder/first[] { 1076 | let p = TimeOrder/prev[t], 1077 | 1078 | s = started.t - started.p, 1079 | r = read.t - read.p, 1080 | w = written.t - written.p, 1081 | c = committed.t - committed.p { 1082 | 1083 | lone s 1084 | lone r 1085 | lone w 1086 | lone c 1087 | some s => (no r and no w and no c) 1088 | some r => (no s and no w and no c) 1089 | some w => (no s and no r and no c) 1090 | some c => (no s and no r and no w) 1091 | } 1092 | } 1093 | } 1094 | 1095 | pred correctness_of_holdingXLock[] { 1096 | // - no two transactions can hold the same xlock at the same time 1097 | // holdingXLock is HolderTxnId->Key->Time 1098 | all t: Time | all k: Key | lone (holdingXLock.t).k 1099 | 1100 | // If a transaction is finalized then all of the txn's locks are continuously held 1101 | // from time of acquisition until time of finalize (are released in the same transition as finalize). 1102 | // If a transaction is not finalized, then its locks are never released. 1103 | 1104 | all txn: TxnId | 1105 | some time_of_finalize[txn] => { 1106 | all k: (acquire_xlock_events[txn]).Time | 1107 | k.(release_xlock_events[txn]) = time_of_finalize[txn] 1108 | } 1109 | else { 1110 | no release_xlock_events[txn] 1111 | } 1112 | 1113 | // All writes are done while holding the appropriate xlock 1114 | // (This doesn't check that at most one write is done per time-step; 1115 | // that is checked elsewhere.) 1116 | all t: Time - TimeOrder/last[] | let p = TimeOrder/prev[t] | 1117 | all k: Key | 1118 | all txn: TxnId | 1119 | txn in (written.t).k and txn not in (written.p).k 1120 | => txn in (holdingXLock.t).k 1121 | 1122 | // A transaction can hold multiple locks at the same time 1123 | // We can't assert this as it is not true for all executions. 1124 | // I'd like to assert that it is true for some executions (i.e.
not prohibited 1125 | // by the model). But the best way to check that is probably to 1126 | // 'run' it to find and visually inspect an instance. 1127 | // some txn: TxnId | txn.holding_for_xlock.t[Key] > 1 1128 | } 1129 | 1130 | 1131 | // 1132 | // Helpers 1133 | // 1134 | 1135 | // These return an empty set if the transaction never did any event(s) of 1136 | // the specified type 1137 | 1138 | // All 'event times' are the time of the POST state of the event (i.e. when the 1139 | // state change corresponding to the event first showed up). 1140 | 1141 | fun time_of_start[txn: TxnId] : lone Time { 1142 | time_of_simple_event[txn, started] 1143 | } 1144 | 1145 | fun read_events[txn: TxnId] : Key->TxnId->Time { 1146 | {k: Key, versionid: TxnId, post_t: Time - TimeOrder/first[] | 1147 | k->versionid not in txn.read.(TimeOrder/prev[post_t]) 1148 | and k->versionid in txn.read.post_t} 1149 | } 1150 | 1151 | fun start_wait_for_xlock_events[txn: TxnId] : Key->Time { 1152 | {k: Key, post_t: Time - TimeOrder/first[] | 1153 | k not in txn.waitingForXLock.(TimeOrder/prev[post_t]) 1154 | and k in txn.waitingForXLock.post_t} 1155 | } 1156 | 1157 | fun stop_wait_for_xlock_events[txn: TxnId] : Key->Time { 1158 | {k: Key, post_t: Time - TimeOrder/first[] | 1159 | k in txn.waitingForXLock.(TimeOrder/prev[post_t]) 1160 | and k not in txn.waitingForXLock.post_t} 1161 | } 1162 | 1163 | fun acquire_xlock_events[txn: TxnId] : Key->Time { 1164 | {k: Key, post_t: Time - TimeOrder/first[] | 1165 | k not in txn.holdingXLock.(TimeOrder/prev[post_t]) 1166 | and k in txn.holdingXLock.post_t} 1167 | } 1168 | 1169 | fun release_xlock_events[txn: TxnId] : Key->Time { 1170 | {k: Key, post_t: Time - TimeOrder/first[] | 1171 | k in txn.holdingXLock.(TimeOrder/prev[post_t]) 1172 | and k not in txn.holdingXLock.post_t} 1173 | } 1174 | 1175 | fun write_events[txn: TxnId] : Key->Time { 1176 | {k: Key, post_t: Time - TimeOrder/first[] | 1177 | k not in txn.written.(TimeOrder/prev[post_t]) 1178 | and k 
in txn.written.post_t} 1179 | } 1180 | 1181 | fun time_of_commit[txn: TxnId] : lone Time { 1182 | time_of_simple_event[txn, committed] 1183 | } 1184 | 1185 | fun time_of_abort[txn: TxnId] : lone Time { 1186 | time_of_simple_event[txn, aborted] 1187 | } 1188 | 1189 | fun time_of_finalize[txn: TxnId] : lone Time { 1190 | time_of_commit[txn] + time_of_abort[txn] 1191 | } 1192 | 1193 | fun time_of_simple_event[txn: TxnId, r: TxnId->Time] : lone Time { 1194 | {post_t: Time - TimeOrder/first[] | 1195 | txn not in r.(TimeOrder/prev[post_t]) 1196 | and txn in r.post_t} 1197 | } 1198 | 1199 | fun times_of_all_events[txn: TxnId] : set Time { 1200 | time_of_start[txn] 1201 | // read_events returns a set of Key->VersionIdRead->TimeOfEvent 1202 | + ((read_events[txn])[Key])[TxnId] 1203 | // write_events returns a set of Key->TimeOfEvent 1204 | + write_events[txn][Key] 1205 | // *_xlock_events returns a set of Key->TimeOfEvent 1206 | + start_wait_for_xlock_events[txn][Key] 1207 | + stop_wait_for_xlock_events[txn][Key] 1208 | + acquire_xlock_events[txn][Key] 1209 | + release_xlock_events[txn][Key] 1210 | + time_of_finalize[txn] 1211 | } 1212 | 1213 | // 1214 | // Ad-hoc constraints, pulled into other predicates purely to select interesting instances. 
1215 | // 1216 | 1217 | pred all_txns_must_start[] { 1218 | TxnId in started.Time 1219 | } 1220 | pred all_txns_do_at_least_one_read_or_write[] { 1221 | // read is: ReaderTxnId -> KeyThatWasRead -> VersionIdThatWasRead -> Time 1222 | // written is: WriterTxnId -> KeyThatWasWritten -> Time 1223 | (TxnId in (read.Time.TxnId).Key) 1224 | or (TxnId in (written.Time).Key) 1225 | } 1226 | pred some_txn_reads_from_another[] { 1227 | some disj Ti,Tj : TxnId | Ti in (Tj.read.Time)[Key] 1228 | } 1229 | pred no_txn_reads_from_itself[] { 1230 | no Ti : TxnId | Ti in (Ti.read.Time)[Key] 1231 | } 1232 | pred at_least_one_txn_does_a_write[] { 1233 | #written.Time > 0 1234 | } 1235 | pred at_least_one_txn_does_a_read[] { 1236 | #read.Time > 0 1237 | } 1238 | pred if_any_txn_writes_it_does_not_read[] { 1239 | all txn: TxnId | some txn.written => no txn.read 1240 | } 1241 | pred at_least_one_txn_waits_for_an_xlock_and_commits[] { 1242 | some txn: TxnId | 1243 | some start_wait_for_xlock_events[txn] 1244 | and some time_of_commit[txn] 1245 | } 1246 | pred all_txns_commit_or_abort[] { 1247 | TxnId in (committed.Time + aborted.Time) 1248 | } 1249 | pred no_txns_abort[] { 1250 | no aborted.Time 1251 | } 1252 | pred at_least_one_txn_aborts[] { 1253 | #aborted.Time > 0 1254 | } 1255 | 1256 | 1257 | 1258 | // TODO: check that the model is not now over-constrained 1259 | // by changing the algorithm to intentionally violate a correctness property, and confirming that the expected violations are found 1260 | 1261 | -------------------------------------------------------------------------------- /amazon/textbookSnapshotIsolation.tla: -------------------------------------------------------------------------------- 1 | -------------------- MODULE textbookSnapshotIsolation ------------------- 2 | 3 | (* A TLA+ specification of textbook snapshot isolation, as described in section 4.2 of 4 | 5 | "A Critique of ANSI SQL Isolation Levels" 6 | http://research.microsoft.com/apps/pubs/default.aspx?id=69541
7 | 8 | This version includes the use of exclusive locks to implement 9 | the 'First Committer Wins' rule (actually the 'First Updater Wins' rule) 10 | 11 | This specification includes various correctness properties, which can be 12 | checked by the TLC model checker for all possible sequences of operations 13 | by a small number of transactions (e.g. 3 or 4) over a small number of 14 | keys (e.g. 2 or 3) 15 | 16 | We include a test for serializability, which TLC can use to show that 17 | textbook snapshot isolation is NOT serializable, as it allows anomalies 18 | such as write-skew. 19 | 20 | Instructions for testing the specification are below. 21 | 22 | We also show how to use TLC to find 'interesting executions' of the algorithm, 23 | e.g. the one described in: 24 | 25 | "A Read-Only Transaction Anomaly Under Snapshot Isolation" 26 | http://www.cs.umb.edu/~poneil/ROAnom.pdf 27 | *) 28 | 29 | EXTENDS Integers, Sequences, FiniteSets, TLC 30 | 31 | CONSTANTS TxnId, Key 32 | NoLock == CHOOSE x : x \notin (Key \union TxnId) 33 | 34 | (* To check properties of this specification via the TLA toolbox: 35 | 36 | 1. In the "TLC Model Checker" menu, create new model 37 | to specify the values of the "CONSTANTS" defined above. 38 | See instructions below. 39 | 40 | 2. Run TLC, with same number of threads as you have cores. 41 | 42 | Configuring the TLC Model Checker (release v1.4.0 or later) 43 | 44 | "Model Overview" ... "What is the model?" 45 | 46 | Key <- {K1,K2} 47 | Click 'Set of model values' 48 | Click 'Symmetry set' ... click 'Next' 49 | Click 'Leave untyped' ... click 'Finish' 50 | 51 | TxnId <- {T1,T2,T3,T4} 52 | As above, click 'Set of model values', 'Symmetry set', and 'Leave untyped'. 53 | 54 | "Model Overview" ... "What is the behavior spec?" ... 
55 | 56 | Spec 57 | 58 | "How to run TLC" 59 | 60 | Maximum JVM heap size in MB: I tested with 1500, but smaller may be fine 61 | Number of worker threads: Use the number of cpu cores on your machine 62 | 63 | "Model Overview" ... "What to check?" 64 | 65 | - Ensure that the "Deadlock" box is checked 66 | 67 | - For "Invariants", add one or more of the following: 68 | 69 | Should NEVER be violated 70 | 71 | Check basic model correctness: 72 | 73 | TypeInv 74 | WellFormedTransactionsInHistory(history) 75 | CorrectnessOfHoldingXLocks 76 | CorrectnessOfWaitingForXLock 77 | 78 | Check the semantics of Snapshot Isolation are met: 79 | 80 | CorrectReadView 81 | FirstCommitterWins 82 | 83 | Check that Cahill's formulation of the serializability condition (and our encoding of it in TLA+) 84 | is equivalent to our encoding of Bernstein's formulation of the same condition, 85 | in all reachable states. 86 | This should be true even for states that are not serializable. I.e. if one check returns false, 87 | then the other should also. 88 | 89 | CahillSerializable(history) = BernsteinSerializable(history) 90 | 91 | EXPECTED to be violated: 92 | 93 | We know that Snapshot Isolation is not serializable. 94 | 95 | CahillSerializable(history) 96 | 97 | We can use TLC to find "interesting execution histories". 98 | This helps increase confidence that the specification allows all 99 | the behaviors that we want it to allow -- i.e. is not unintentionally over-constrained. 100 | To do so, we check an invariant of the form "the interesting condition is NOT true". 101 | TLC will therefore report an invariant-violation for the first state 102 | it finds in which the interesting condition is true. 
103 | Examples: 104 | 105 | ~ (AtLeastNTxnsHaveCommitted(3) /\ AtLeastNTxnsHaveRead(2) /\ AtLeastNTxnsHaveWritten(2)) 106 | ~ AtLeastNTxnsAreWaitingForLocks(2) 107 | ~ AtLeastNTxnsAbortedDueToReason(1, "forced by First Committer Wins") 108 | ~ AtLeastNTxnsAbortedDueToReason(1, "forced by deadlock-prevention") 109 | *) 110 | 111 | VARIABLES history, (* Global linear history of all transaction operations *) 112 | holdingXLocks, (* Abstraction of a lock-manager *) 113 | waitingForXLock 114 | 115 | allvars == <<history, holdingXLocks, waitingForXLock>> 116 | 117 | (* 118 | * Types of variables (the sets of allowed values) 119 | *) 120 | 121 | (* 122 | * Elements in the global history of transaction operations. 123 | * We maintain the familiar linear "operation history" in a form easily readable by humans. 124 | * Also, our formulation of correctness properties examines this history. 125 | * Finally, we use this history to abstract-away some uninteresting details of the 126 | * algorithm for snapshot isolation (see later). 127 | *) 128 | 129 | AbortReasons == {"voluntary", (* the 'reason' for an abort is only used for debugging the spec in TLC *) 130 | "forced by First Committer Wins", 131 | "forced by deadlock-prevention"} 132 | 133 | BeginEventsT == [op : {"begin"}, txnid : TxnId] 134 | AbortEventsT == [op : {"abort"}, txnid : TxnId, reason : AbortReasons] 135 | CommitEventsT == [op : {"commit"}, txnid : TxnId] 136 | ReadEventsT == [op : {"read"}, txnid : TxnId, key : Key, ver : TxnId] 137 | WriteEventsT == [op : {"write"}, txnid : TxnId, key : Key] 138 | EventsT == BeginEventsT \union ReadEventsT \union WriteEventsT \union CommitEventsT \union AbortEventsT 139 | 140 | (* TLA+ is not statically typed. 141 | It's wise to define a 'type invariant', and have TLC check it.
142 | *) 143 | TypeInv == /\ history \in Seq(EventsT) 144 | (* A transaction may hold independent exclusive locks on any number of keys *) 145 | /\ holdingXLocks \in [TxnId -> SUBSET Key] 146 | (* A transaction can be waiting for at most one exclusive lock *) 147 | /\ waitingForXLock \in [TxnId -> Key \union {NoLock}] 148 | 149 | 150 | (* Generic TLA+ utilities 151 | *) 152 | Range(f) == {f[x] : x \in DOMAIN f} 153 | 154 | 155 | (* Utilities for the history of operations. 156 | These take the history as a parameter (rather than working on the current global history), 157 | so we can use them to examine prefixes or filtered views of the global history. 158 | 159 | A note on the abstraction-level of this specification: 160 | An implementation of snapshot isolation needs to keep track of various metadata 161 | about transactions, e.g. to decide which version of a key should be 162 | read by a transaction, and whether a transaction must 163 | be aborted due to the First Committer Wins rule, etc. 164 | For the purposes of this specification we are not interested in the details of 165 | those mechanisms, so we choose to abstract them heavily. We achieve the abstraction 166 | by allowing transactions to directly examine the global history of events. 167 | (It is possible that a real implementation could do the same, although that 168 | is unlikely to be an efficient solution in practice.) 169 | The abstraction is done via the following operators: 170 | 171 | - CommittedTxns, AbortedTxns and therefore FinalizedTxns 172 | - KeysThatTxnHasDoneOperationOn 173 | - VersionThatWouldBeReadBy (uses CommittedWriteHistoryOfKey) 174 | - WritersCommittedToKeySinceTxnBegan 175 | - VersionIDsOfKeyNewerThanReadByTxn 176 | 177 | There is one mechanism that cannot be abstracted by the conventional type of global linear operation history 178 | -- we need state to record when a transaction is blocked waiting for a lock to be released.
179 | We therefore introduce the 'waitingForXLock' variable to model that mechanism. 180 | 181 | To demonstrate that the level of abstraction is a free choice, we choose to 182 | also introduce a variable 'holdingXLocks', to model which exclusive locks are currently held by each transaction. 183 | i.e. The combination of 'waitingForXLock' and 'holdingXLocks' (plus the code that detects and prevents deadlocks) 184 | gives an abstract representation of a lock manager. 185 | However, the variable 'holdingXLocks' is not actually necessary, as the same information can 186 | be obtained by examining the global operation history to see which keys a transaction has written to. 187 | An earlier version of this specification did exactly that. 188 | The correctness predicate 'CorrectnessOfHoldingXLocks' checks that in every state, 189 | holdingXLocks is consistent with the global operation history. 190 | *) 191 | 192 | 193 | (* We model an execution history as a TLA finite-length Sequence of events. 194 | A TLA Sequence is a function [1..N |-> element]. 195 | As our events don't store a unique timestamp or serial number, converting a history 196 | to a set of events would lose track of events that differ only by position 197 | in history -- e.g. multiple reads or writes of a particular key by the same transaction. 198 | For simplicity, Phil Bernstein's book chooses to forbid such operations, and we do the same 199 | (via enabling conditions on actions in the model). 200 | So it is actually safe to select (into a set) events which meet some criteria. 201 | *) 202 | SelectEvents(h, Test(_)) == {e \in Range(h): Test(e)} 203 | 204 | (* Operators that classify transactions, as of the 'end' of a given global operation history. 205 | 206 | Currently we just deduce the classification of a transaction from the global history of operations. 207 | A real implementation would obviously have internal state for this.
208 | *) 209 | ActiveOrFinalizedTxns(h) == {e.txnid : e \in Range(h)} (* all transactions apart from those that have not yet started *) 210 | NotYetStartedTxns(h) == TxnId \ ActiveOrFinalizedTxns(h) 211 | CommittedTxns(h) == {e.txnid : e \in SelectEvents(h, LAMBDA e : e.op \in {"commit"})} 212 | AbortedTxns(h) == {e.txnid : e \in SelectEvents(h, LAMBDA e : e.op \in {"abort"})} 213 | FinalizedTxns(h) == CommittedTxns(h) \union AbortedTxns(h) 214 | ActiveTxns(h) == ActiveOrFinalizedTxns(h) \ FinalizedTxns(h) 215 | 216 | (* We define the 'start time' of a transaction as the position (index) 217 | of its 'begin' operation in the specified history. 218 | Obviously it is not valid to compare 'start times' that were obtained from 219 | different histories (e.g. different filtered views of the global history). 220 | We mostly use this for finding the end of an interesting prefix of the global history. 221 | *) 222 | StartTime(h, txn) == CHOOSE pos \in 1 .. Len(h) : h[pos] = [op |-> "begin", txnid |-> txn] 223 | 224 | KeysThatTxnHasDoneOperationOn(h, txn, operation) == 225 | LET txn_ops == SelectEvents(h, LAMBDA e : e.txnid = txn /\ e.op = operation) 226 | IN {e.key : e \in txn_ops} 227 | 228 | (* The sequence of committed operations in h. i.e. All operations by aborted or non-finalized transactions are removed *) 229 | CommittedWriteHistoryOfKey(h, key) == 230 | SelectSeq(h, 231 | LAMBDA e : /\ e.op = "write" 232 | /\ e.key = key 233 | /\ e.txnid \in CommittedTxns(h)) 234 | 235 | (* The set of transactions that txn has read from in history h *) 236 | TxnsReadFrom(h, txn) == {op.ver : op \in Range(SelectSeq(h, 237 | LAMBDA e : /\ e.op = "read" 238 | /\ e.txnid = txn))} 239 | (* Returns a set of <<key, version>> pairs *) 240 | KeyVersionsReadByTxn(h, txn) == 241 | {<<op.key, op.ver>> : op \in Range(SelectSeq(h, 242 | LAMBDA e : /\ e.op = "read" 243 | /\ e.txnid = txn))} 244 | 245 | (* Returns -1 if op is not in history.
Otherwise returns an integer in 1..Len(h) *) 246 | IndexOfOpInHistory(h, op) == 247 | IF op \in Range(h) THEN CHOOSE i \in 1..Len(h) : h[i] = op 248 | ELSE -1 249 | 250 | 251 | (* 252 | * Helpers for actions; these are hard-wired to use the spec variables 253 | * current database history, holdingXLocks, and waitingForXlock. 254 | * i.e. They don't work on prefixes of the database history. 255 | *) 256 | 257 | KeysCurrentlyXLockedByActiveTxn(txn) == holdingXLocks[txn] 258 | 259 | KeysCurrentlyXLockedByAnyTxn == UNION Range(holdingXLocks) 260 | 261 | StartedAndCanDoPublicOperation(txn) == 262 | (* Started and not yet finalized *) 263 | /\ txn \in ActiveTxns(history) 264 | (* If txn was waiting for a lock (because it is attempting to write to some key) 265 | then it cannot choose to commit, abort, read or write. 266 | But note that *internal* actions may choose to forcibly abort it, or may grant 267 | the lock and allow the suspended write to proceed. 268 | *) 269 | /\ waitingForXLock[txn] = NoLock 270 | 271 | 272 | WritersCommittedToKeySinceTxnBegan(txn, key) == 273 | (* Note: the write to key might have happened before txn started, 274 | even if the writer committed after txn started. 275 | *) 276 | LET hSinceTxnBegan == SubSeq(history, StartTime(history, txn), Len(history)) 277 | cSinceTxnBegan == CommittedTxns(hSinceTxnBegan) 278 | IN 279 | {t \in cSinceTxnBegan : key \in KeysThatTxnHasDoneOperationOn(history, t, "write")} 280 | 281 | 282 | (* Returns a set with one version id (txn id) 283 | or an empty set, if there was no committed version before txn began, 284 | *) 285 | LatestCommittedVersionOfKeyWhenTxnBegan(txn, key) == 286 | (* Because of the FirstCommitterWins Rule, the latest write in the 287 | committed write history of the key is that latest version. 
288 | *) 289 | LET hBeforeTxnBegan == SubSeq(history, 1, StartTime(history, txn)) 290 | cwhkbtb == CommittedWriteHistoryOfKey(hBeforeTxnBegan, key) 291 | IN 292 | (* The latest committed write is the last in the sequence. *) 293 | IF Len(cwhkbtb) = 0 THEN {} 294 | ELSE {cwhkbtb[Len(cwhkbtb)].txnid} 295 | 296 | (* Evaluates to a set containing one TxnId, or an empty set if there 297 | are no versions of key that can be read by txn 298 | *) 299 | VersionThatWouldBeReadBy(txn, key) == 300 | IF key \in KeysCurrentlyXLockedByActiveTxn(txn) THEN 301 | (* txn reads the (uncommitted) version that it wrote itself. 302 | Note: There can be only one such uncommitted version, as this spec 303 | artificially constrains transactions to do at most one write of 304 | a particular key - i.e. Bernstein's standard simplification. 305 | *) 306 | {txn} 307 | ELSE 308 | (* Txn reads the version that was written by the latest transaction 309 | to commit before txn began. 310 | *) 311 | LatestCommittedVersionOfKeyWhenTxnBegan(txn, key) 312 | 313 | 314 | 315 | (* 316 | * Public actions. This is the public interface of the system. 317 | *) 318 | 319 | Begin(txn) == 320 | /\ txn \notin ActiveOrFinalizedTxns(history) 321 | /\ history' = Append(history, [op |-> "begin", txnid |-> txn]) 322 | /\ UNCHANGED <<holdingXLocks, waitingForXLock>> 323 | 324 | 325 | Commit(txn) == 326 | /\ StartedAndCanDoPublicOperation(txn) 327 | (* Txn is the winner of the FirstCommitterWins rule for any writes it is doing. 328 | So if there are other transactions that are currently waiting for any xlock 329 | that is held by txn, then we must abort those transactions.
330 | *) 331 | /\ LET XLocksHeldByCommittingTxn == 332 | KeysCurrentlyXLockedByActiveTxn(txn) 333 | 334 | LoserTxns == 335 | {blockedTxn \in TxnId : waitingForXLock[blockedTxn] \in XLocksHeldByCommittingTxn} 336 | 337 | AbortOpSeq(Txns) == 338 | LET RECURSIVE BuildAbortOpSeq(_) 339 | BuildAbortOpSeq(RemainingTxns) == 340 | IF RemainingTxns = {} THEN 341 | <<>> 342 | ELSE 343 | LET t == CHOOSE t \in RemainingTxns : TRUE 344 | IN <<[op |-> "abort", txnid |-> t, reason |-> "forced by First Committer Wins"]>> 345 | \o BuildAbortOpSeq(RemainingTxns \ {t}) 346 | IN 347 | BuildAbortOpSeq(Txns) 348 | IN 349 | /\ history' = Append(history, [op |-> "commit", txnid |-> txn]) \o AbortOpSeq(LoserTxns) 350 | /\ holdingXLocks' = [t \in TxnId |-> IF t \in {txn} \union LoserTxns 351 | THEN {} (* drop locks held by transactions that have just committed or aborted *) 352 | ELSE holdingXLocks[t]] 353 | /\ waitingForXLock' = [t \in TxnId |-> IF t \in LoserTxns 354 | THEN NoLock (* aborted transactions cannot now be waiting for a lock *) 355 | ELSE waitingForXLock[t]] 356 | 357 | 358 | ChooseToAbort(txn) == 359 | /\ StartedAndCanDoPublicOperation(txn) 360 | /\ history' = Append(history, [op |-> "abort", txnid |-> txn, reason |-> "voluntary"]) 361 | /\ holdingXLocks' = [holdingXLocks EXCEPT ![txn] = {}] (* drop any locks held by the txn that is aborting *) 362 | /\ UNCHANGED <<waitingForXLock>> (* txn can't be waiting for any locks, because StartedAndCanDoPublicOperation(txn) is true *) 363 | 364 | 365 | Read(txn, key) == 366 | /\ StartedAndCanDoPublicOperation(txn) 367 | /\ key \notin KeysThatTxnHasDoneOperationOn(history, txn, "read") (* Bernstein's simplification: no txn reads the same key more than once *) 368 | /\ LET readVerSet == VersionThatWouldBeReadBy(txn, key) 369 | IN 370 | /\ readVerSet /= {} (* still part of the 'enabling condition' for this action 371 | -- we can only read if there is a version of this key that can be read, 372 | i.e.
we don't model attempts to read keys that have not yet been created. 373 | *) 374 | /\ history' = Append(history, [op |-> "read", txnid |-> txn, 375 | key |-> key, ver |-> CHOOSE ver \in readVerSet : TRUE]) 376 | /\ UNCHANGED <<holdingXLocks, waitingForXLock>> 377 | 378 | 379 | (* The Write action requires some helpers, and TLA+ requires that operators are declared before use, 380 | * so helpers come first. 381 | *) 382 | 383 | HelperWriteCanAcquireXLock(txn, key) == 384 | /\ history' = Append(history, [op |-> "write", txnid |-> txn, key |-> key]) 385 | /\ holdingXLocks' = [holdingXLocks EXCEPT ![txn] = @ \union {key}] (* txn acquires lock on key *) 386 | /\ waitingForXLock' = [waitingForXLock EXCEPT ![txn] = NoLock] 387 | 388 | 389 | HelperWriteConflictsWithXLock(txn, key) == 390 | (* Some other transaction is holding the xlock on this key 391 | (In the current model txn itself cannot be holding the xlock 392 | as the current model doesn't allow a transaction to write to the 393 | same key twice.) 394 | To write to this key, we must acquire the xlock. 395 | But if waiting for the xlock would cause a deadlock 396 | then we must abort one of the transactions involved in the cycle. 397 | If we choose to abort a transaction other than txn itself, 398 | then txn can start waiting for the lock it wants. 399 | *) 400 | LET activeTxns == ActiveTxns(history) 401 | 402 | xlockIsHeldBy == 403 | [k \in Key |-> 404 | LET holder == {t \in activeTxns : k \in KeysCurrentlyXLockedByActiveTxn(t)} 405 | IN IF holder /= {} THEN CHOOSE t \in holder : TRUE 406 | ELSE NoLock] 407 | 408 | proposedWaitingForXLock == [waitingForXLock EXCEPT ![txn] = key] 409 | 410 | newWaitingForXLockHeldByEdges == 411 | {<<from, to>> \in activeTxns \X activeTxns : 412 | \E k \in Key : /\ proposedWaitingForXLock[from] = k 413 | /\ xlockIsHeldBy[k] = to} 414 | 415 | pathThatCyclesFromTxnToTxn == 416 | (* We do eager deadlock-prevention, so we simply forbid the creation 417 | of any cycle in the actual (accepted) waiting-for-locks graph.
418 | Therefore the only possible cycle in the *proposed* waiting-for-locks 419 | graph would be created by the sole new edge; i.e. txn wanting to acquire 420 | the xlock for key. Therefore the only possible cycle begins with txn 421 | and loops back through txn. So we only consider txn as the start point 422 | of the search, and we don't have to worry about cycles between 423 | groups of nodes that don't include txn. 424 | 425 | Also, there can be at most one cycle. If there were more than one cycle 426 | then some node (inc. txn) must have more than one outgoing edge. But it is impossible for any 427 | node to have more than one outgoing edge, as outgoing edges can only be caused by a transaction 428 | waiting for an Xlock. A transaction can only be waiting for at most one lock 429 | at any point in time (and any particular XLock can only be held by at most one other transaction). 430 | Therefore there is at most one outgoing edge from each transaction. 431 | So we only need to look for any one cycle; i.e. we don't need to find the set of all cycles. 432 | So we don't need to do backtracking; if we hit a dead-end then we are done. 433 | Note: We can't use the generic Graphs module in 'Specifying Systems' as 434 | TLC can't enumerate the infinite set-comprehension in the definition of Path(G). 435 | 436 | TODO: could just use FindAllNodesInAnyCycle(_) for this (it's defined later). 437 | *) 438 | LET RECURSIVE extendPath(_) 439 | (* Returns a set of Seq(TxnId) with at most one member. 440 | Any member will begin with txn and end with a different transaction 441 | that would loop back to txn if we continued to follow the cycle. 442 | *) 443 | extendPath(currPath) == 444 | LET from == currPath[Len(currPath)] 445 | outgoingEdges == {<<e_from, e_to>> \in newWaitingForXLockHeldByEdges : e_from = from} 446 | IN IF outgoingEdges = {} 447 | THEN {} (* Done: the first dead-end we hit implies there is no cycle.
*) 448 | ELSE LET edge == CHOOSE <<e_from, e_to>> \in outgoingEdges : TRUE 449 | IN IF edge[2] = txn THEN {currPath} (* Done: This path does loop back to txn. *) 450 | ELSE extendPath(Append(currPath, edge[2])) 451 | IN extendPath(<<txn>>) 452 | IN 453 | IF pathThatCyclesFromTxnToTxn = {} THEN 454 | (* Here, txn won't cause a deadlock when it starts waiting for k's xlock, 455 | so it starts waiting without further ado. 456 | *) 457 | /\ waitingForXLock' = [waitingForXLock EXCEPT ![txn] = key] 458 | /\ UNCHANGED <<history, holdingXLocks>> 459 | ELSE 460 | (* If txn starts waiting for xlock for k, then it will cause a deadlock. 461 | Pick a transaction to abort, that will prevent the potential cycle in the graph. 462 | We make this a non-deterministic choice from all transactions involved in the 463 | potential cycle, so we model-check all possible choices. 464 | i.e. We don't enshrine a particular abort policy (e.g. minimum write locks). 465 | *) 466 | \E to_abort \in Range(CHOOSE anyPathSeq \in pathThatCyclesFromTxnToTxn : TRUE) : 467 | /\ history' = Append(history, [op |-> "abort", txnid |-> to_abort, reason |-> "forced by deadlock-prevention"]) 468 | /\ IF to_abort = txn THEN 469 | (* We've decided to avoid deadlock by aborting the current transaction. 470 | *) 471 | /\ holdingXLocks' = [holdingXLocks EXCEPT ![txn] = {}] (* drop any locks held by the txn that is aborting *) 472 | /\ UNCHANGED <<waitingForXLock>> (* txn can't be waiting for any locks, because StartedAndCanDoPublicOperation(txn) is true *) 473 | ELSE 474 | (* We've decided to avoid deadlock by aborting a transaction other than txn. 475 | We alter waitingForXLock so that txn starts waiting for the lock it wants, 476 | and the to_abort transaction is no longer waiting for any locks, or holding 477 | any locks (because it's been aborted). 478 | 479 | Note: the abort is not guaranteed to release the xlock 480 | that txn wants.
(The abort just guarantees that when txn 481 | starts waiting for the xlock, that action won't create a cycle in the 482 | waiting-for-locks graph.) 483 | 484 | Also we *don't* check to see if the abort has released the 485 | xlock that txn wants (to grant the xlock immediately to txn). 486 | There might be other transactions waiting for the xlock 487 | and we don't want to starve them. We want to model-check 488 | all possible combinations of acquisition. 489 | *) 490 | /\ holdingXLocks' = [holdingXLocks EXCEPT ![to_abort] = {}] (* to_abort may have been holding locks *) 491 | /\ waitingForXLock' = [waitingForXLock EXCEPT ![txn] = key, 492 | ![to_abort] = NoLock] (* to_abort may have been waiting for a lock *) 493 | 494 | 495 | StartWriteMayBlock(txn, key) == 496 | /\ StartedAndCanDoPublicOperation(txn) 497 | /\ key \notin KeysCurrentlyXLockedByActiveTxn(txn) (* Bernstein's simplification *) 498 | (* Part of the First Committer Wins rule: if txn attempts to write to a key that has 499 | been modified and committed since txn began, then txn cannot possibly 500 | commit, so we might as well abort txn now. 501 | Alternative: we could just fail the individual write, and allow the transaction 502 | to proceed. We could model that by including the FCW check in the 503 | enabling-condition, so that TLC doesn't even attempt to generate behaviors 504 | that attempt to violate the FCW rule in that way. 505 | I choose to not do that, as we would not be modelling the application's choice 506 | of how to handle the failed write. In the vast majority of cases the transaction 507 | won't have any realistic alternative other than to abort, so we simply model the abort. 508 | *) 509 | /\ IF WritersCommittedToKeySinceTxnBegan(txn, key) /= {} THEN 510 | (* Abort txn because it lost the First Updater Wins rule.
*) 511 | /\ history' = Append(history, [op |-> "abort", txnid |-> txn, reason |-> "forced by First Committer Wins"]) 512 | /\ holdingXLocks' = [holdingXLocks EXCEPT ![txn] = {}] (* txn may have been holding locks *) 513 | /\ UNCHANGED <<waitingForXLock>> (* txn cannot have been waiting for a lock, as StartedAndCanDoPublicOperation(txn) is true *) 514 | ELSE 515 | IF key \in KeysCurrentlyXLockedByAnyTxn THEN 516 | (* txn needs to wait for some other transaction to release the lock on key. *) 517 | HelperWriteConflictsWithXLock(txn, key) 518 | ELSE 519 | (* No-one is holding key's lock, so txn can lock it immediately. *) 520 | HelperWriteCanAcquireXLock(txn, key) 521 | 522 | 523 | (* 524 | * Internal actions, not part of the public interface of the system. 525 | *) 526 | 527 | (* If txn is blocked waiting for a lock on some key, 528 | it may proceed when that key is no longer locked. 529 | (Note: txn might be forcibly aborted while it is waiting, 530 | before it ever gets here.) 531 | *) 532 | FinishBlockedWrite(txn) == 533 | /\ waitingForXLock[txn] /= NoLock 534 | /\ LET key == waitingForXLock[txn] 535 | IN /\ key \notin KeysCurrentlyXLockedByAnyTxn 536 | /\ HelperWriteCanAcquireXLock(txn, key) 537 | 538 | (* 539 | * End of actions. 540 | *) 541 | 542 | 543 | (* 544 | * Constraint on possible initial states 545 | *) 546 | Init == /\ history = <<>> 547 | /\ holdingXLocks = [txn \in TxnId |-> {}] 548 | /\ waitingForXLock = [txn \in TxnId |-> NoLock] 549 | 550 | 551 | (* We legitimately terminate if all transactions have either 552 | committed or aborted. At all times when that is not the 553 | case then the Next-state action should make progress. 554 | So for liveness we simply assert weak-fairness of Next. 555 | If we don't explicitly model termination like this then 556 | TLC reports a 'deadlock error' for such terminations. 557 | (That is 'deadlock' in the TLA sense, which means that 558 | ENABLED(Next) is false so the system cannot make any more 559 | state transitions.
That has almost nothing to do with 560 | transactional deadlock. We _do_ want to model and check 561 | transactional deadlock because our algorithm has to handle it. 562 | If there was a bug in our algorithm for preventing/resolving 563 | transaction deadlock, then that should be detected by TLC 564 | as a TLA deadlock (inability for the system to make progress). 565 | But other kinds of algorithm bugs or modelling errors could 566 | also cause TLA-type deadlock.) 567 | *) 568 | LegitimateTermination == FinalizedTxns(history) = TxnId 569 | 570 | 571 | (* 572 | * The Next-state action. 573 | * This says that in every state, some transaction can perform one of the named actions, 574 | * or we have reached the LegitimateTermination condition (all transactions have committed or aborted) 575 | *) 576 | Next == \/ \E txn \in TxnId : 577 | 578 | (* Public actions *) 579 | \/ Begin(txn) 580 | \/ Commit(txn) 581 | \/ ChooseToAbort(txn) (* as contrasted with being forced to abort by FCW rule or deadlock prevention *) 582 | \/ \E key \in Key : 583 | \/ Read(txn, key) 584 | \/ StartWriteMayBlock(txn, key) 585 | 586 | (* Internal actions *) 587 | \/ FinishBlockedWrite(txn) 588 | 589 | (* The following disjunct allows infinite 'stuttering steps' (no change in state) 590 | once legitimate termination has been reached (all transactions have committed or aborted). 591 | This allows TLC to distinguish between legitimate termination 592 | vs. inability to make progress due to some bug in the algorithm or TLA+ code. 593 | An example of such a bug in the algorithm would be failure to detect 594 | and prevent transaction deadlock (a cycle in the graph of waiting-for-held-locks). 595 | We want legitimate termination to be treated as a legal state, despite being 596 | a dead-end in the graph of states that TLC is exploring. In particular, we want 597 | TLC to continue model-checking via any other unexplored states left in its queue.
598 | But inability to progress for any other reason should cause TLC to 599 | halt and report an error ('deadlock') in that behavior. 600 | *) 601 | \/ (LegitimateTermination /\ UNCHANGED allvars) 602 | 603 | 604 | (* The formula for the whole specification. 605 | 606 | The assertion of weak fairness on the Next-state action says that 607 | if Next is continuously enabled then infinitely many Next steps 608 | occur. i.e. The system must take another step if Next is enabled. 609 | *) 610 | Spec == Init /\ [][Next]_allvars /\ WF_allvars(Next) 611 | 612 | 613 | (* 614 | * Correctness properties of the algorithm, 615 | * and of the specification (encoding of the algorithm in TLA+) 616 | * 617 | * Most properties are specified as being invariants (true in every reachable state). 618 | * In TLA+ we can write that a property is an invariant by declaring that 619 | * it is a theorem that the specification formula implies that the property is always true. 620 | *) 621 | 622 | THEOREM Spec => []TypeInv 623 | 624 | (* Liveness (progress) properties are written in temporal logic. 625 | * e.g. We assert that Spec implies that eventually, all transactions commit or abort. 626 | *) 627 | THEOREM Spec => <>LegitimateTermination 628 | 629 | (* There are further correctness properties below. I haven't bothered declaring 630 | * them as theorems because we have to manually enter them into the "What to check?" 631 | * part of the "TLC Model Checker" configuration anyway. Declaring them as theorems 632 | * doesn't seem to affect that. 633 | *) 634 | 635 | 636 | (* 637 | * Helpers for correctness properties 638 | *) 639 | 640 | (* Returns a set containing all elements that participate in any cycle (i.e. union of all cycles), 641 | or an empty set if no cycle is found. 642 | TODO: this is stronger than necessary for our current use-case; 643 | we only need to know if there is a cycle, not all of the nodes in all cycles.
644 | *) 645 | FindAllNodesInAnyCycle(edges) == 646 | 647 | LET RECURSIVE findCycleNodes(_, _) (* startNode, visitedSet *) 648 | (* Returns a set containing all elements of some cycle starting at startNode, 649 | or an empty set if no cycle is found. 650 | *) 651 | findCycleNodes(node, visitedSet) == 652 | IF node \in visitedSet THEN 653 | {node} (* found a cycle, which includes node *) 654 | ELSE 655 | LET newVisited == visitedSet \union {node} 656 | neighbors == {to : <<from, to>> \in 657 | {<<from, to>> \in edges : from = node}} 658 | IN (* Explore neighbors *) 659 | UNION {findCycleNodes(neighbor, newVisited) : neighbor \in neighbors} 660 | 661 | startPoints == {from : <<from, to>> \in edges} (* All nodes with an outgoing edge *) 662 | IN 663 | UNION {findCycleNodes(node, {}) : node \in startPoints} 664 | 665 | IsCycle(edges) == FindAllNodesInAnyCycle(edges) /= {} 666 | 667 | (* It's easy to write unit tests for helper operators, 668 | and have TLC check them: 669 | - in "What is the behavior spec?", choose "No Behavior Spec" 670 | - In "Evaluate Constant Expression", enter UnitTests_FindAllNodesInAnyCycle 671 | Example: 672 | *) 673 | UnitTests_FindAllNodesInAnyCycle == 674 | /\ FindAllNodesInAnyCycle({}) = {} 675 | /\ FindAllNodesInAnyCycle({<<"a", "b">>}) = {} 676 | /\ FindAllNodesInAnyCycle({<<"a", "b">>, <<"b", "c">>, <<"c", "d">>}) = {} (* no cycle, more nodes *) 677 | /\ FindAllNodesInAnyCycle({<<"a", "a">>}) = {"a"} (* cycle of length 1 *) 678 | /\ FindAllNodesInAnyCycle({<<"a", "b">>, <<"b", "a">>}) = {"a", "b"} (* cycle of length 2 *) 679 | /\ FindAllNodesInAnyCycle({<<"a", "b">>, <<"b", "c">>, <<"c", "d">>, <<"d", "a">>}) = {"a", "b", "c", "d"} (* cycle of length 4 *) 680 | /\ FindAllNodesInAnyCycle({<<"a", "a">>, <<"b", "b">>}) = {"a", "b"} (* multiple disjoint cycles of length 1*) 681 | /\ FindAllNodesInAnyCycle({<<"a", "d">>, <<"d", "b">>, <<"c", "d">>, <<"d", "c">>}) = {"d", "c"} (* cycles plus some nodes not in any cycle but which join to a cycle *) 682 | /\
FindAllNodesInAnyCycle({<<"a", "b">>, <<"b", "a">>, <<"c", "c">>, <<"d", "c">>}) = {"a", "b", "c"} (* multiple disjoint cycles including length > 1 *) 683 | 684 | (* Sidebar 685 | Another way to test for a cycle in a graph is by computing the transitive closure of 686 | the graph (as done in the Alloy version of this spec). 687 | 688 | Here are a couple of different definitions of Transitive Closure that TLC can evaluate 689 | fairly efficiently. I've verified that these are equivalent for relations up to 1..5 \X 1..5 690 | 691 | (* "If R is a relation--that is a set of ordered pairs--let its support be 692 | the set of all elements that appear in those pairs." 693 | *) 694 | Support(R) == {r[1] : r \in R} \cup {r[2] : r \in R} 695 | 696 | TC_SelfJoin(R) == 697 | LET S == Support(R) 698 | SS == S \X S 699 | RECURSIVE selfJoin(_) 700 | selfJoin(r1) == 701 | LET missingJoinTuples(left,right) == 702 | {<<x, z>> \in SS : 703 | /\ <<x, z>> \notin left 704 | /\ <<x, z>> \notin right 705 | /\ \E y \in S : <<x, y>> \in left /\ <<y, z>> \in right} 706 | mjt == missingJoinTuples(r1, r1) 707 | IN 708 | IF mjt = {} THEN r1 (* have reached least fixpoint, so this must be transitive closure *) 709 | ELSE LET bigger == r1 \union mjt 710 | IN bigger \union selfJoin(bigger) 711 | IN selfJoin(R) 712 | 713 | (* This definition is based on a suggestion by Leslie Lamport *) 714 | TC_ExtendPath(R) == 715 | LET S == Support(R) 716 | SS == S \X S 717 | C[PathLen \in Nat] == 718 | IF PathLen = 0 THEN R 719 | ELSE 720 | LET TCShorterPaths == C[PathLen - 1] 721 | IN {<<x, z>> \in SS : 722 | \E y \in S : /\ <<x, y>> \in TCShorterPaths 723 | /\ <<y, z>> \in TCShorterPaths} 724 | \union TCShorterPaths 725 | IN (* Allowing paths of length Cardinality(S) + 1 allows for paths that are cycles *) 726 | C[Cardinality(S)] 727 | *) 728 | 729 | 730 | (* Returns true iff both t1 and t2 have started and their lifetimes overlap.
*) 731 | AreConcurrent(h, t1, t2) == 732 | LET iT1b == IndexOfOpInHistory(h, [op |-> "begin", txnid |-> t1]) 733 | iT1c == IndexOfOpInHistory(h, [op |-> "commit", txnid |-> t1]) 734 | iT2b == IndexOfOpInHistory(h, [op |-> "begin", txnid |-> t2]) 735 | iT2c == IndexOfOpInHistory(h, [op |-> "commit", txnid |-> t2]) 736 | IN 737 | /\ iT1b /= -1 (* t1 started *) 738 | /\ iT2b /= -1 (* t2 started *) 739 | /\ IF iT1b < iT2b THEN 740 | \/ iT1c = -1 (* t1 never finished *) 741 | \/ iT1c > iT2b (* or t1 finished after t2 started *) 742 | ELSE 743 | \/ iT2c = -1 (* t2 never finished *) 744 | \/ iT2c > iT1b (* or t2 finished after t1 started *) 745 | 746 | 747 | (* 748 | * Correctness properties 749 | *) 750 | 751 | WellFormedTransactionsInHistory(h) == 752 | 753 | /\ h \in Seq(EventsT) (* The relevant part of TypeInv *) 754 | /\ \A txn \in TxnId : 755 | LET th == SelectSeq(h, LAMBDA e : e.txnid = txn) (* just the history for this transaction *) 756 | IN 757 | (* If a txn has any operations, the first, and only the first, must be begin. 758 | *) 759 | /\ LET idxsB == {i \in 1..Len(th) : th[i] = [op |-> "begin", txnid |-> txn]} 760 | IN IF Len(th) = 0 THEN idxsB = {} 761 | ELSE idxsB = {1} 762 | 763 | (* A txn may have at most one commit or abort operation, 764 | and if present it must be the last for that txn. 765 | *) 766 | /\ LET idxsF == {i \in 1..Len(th) : \/ th[i] = [op |-> "commit", txnid |-> txn] 767 | \/ \E r \in AbortReasons : 768 | th[i] = [op |-> "abort", txnid |-> txn, reason |-> r]} 769 | IN idxsF = {} \/ idxsF = {Len(th)} 770 | 771 | (* "Bernstein's simplification" 772 | We choose to restrict the specification to histories in which 773 | each transaction is allowed at most one read and one write to each key. 774 | (N.B. There aren't any other restrictions on reads or writes. 775 | E.g. A transaction may read and/or write to more than one key, 776 | and if it both reads and writes to a key, then the read and write may be in either order.)
777 | *) 778 | /\ \A key \in Key : 779 | /\ LET idxsWK == {i \in 1..Len(th) : th[i] = [op |-> "write", txnid |-> txn, key |-> key]} 780 | IN Cardinality(idxsWK) =< 1 781 | /\ LET idxsRK == {i \in 1..Len(th) : 782 | \E ver \in TxnId : 783 | th[i] = [op |-> "read", txnid |-> txn, key |-> key, ver |-> ver]} 784 | IN Cardinality(idxsRK) =< 1 785 | 786 | 787 | (* It's easy to do unit-tests of correctness conditions: 788 | *) 789 | UnitTest_WellFormedTransactionsInHistory == 790 | (* must begin *) 791 | /\ WellFormedTransactionsInHistory(<<[op |-> "begin", txnid |-> "T_1"]>>) 792 | (* just begin & commit *) 793 | /\ WellFormedTransactionsInHistory(<<[op |-> "begin", txnid |-> "T_1"], [op |-> "commit", txnid |-> "T_1"]>>) 794 | (* begin, readX, writeY, commit *) 795 | /\ WellFormedTransactionsInHistory(<<[op |-> "begin", txnid |-> "T_1"], [op |-> "read", txnid |-> "T_1", key |-> "K_X", ver |-> "T_2"], [op |-> "write", txnid |-> "T_1", key |-> "K_Y"], [op |-> "commit", txnid |-> "T_1"]>>) 796 | (* begin, readX, writeX, abort *) 797 | /\ WellFormedTransactionsInHistory(<<[op |-> "begin", txnid |-> "T_1"], [op |-> "read", txnid |-> "T_1", key |-> "K_X", ver |-> "T_2"], [op |-> "write", txnid |-> "T_1", key |-> "K_X"], [op |-> "abort", txnid |-> "T_1", reason |-> "voluntary"]>>) 798 | (* Negative tests *) 799 | (* begin out of place *) 800 | /\ ~ WellFormedTransactionsInHistory(<<[op |-> "write", txnid |-> "T_1", key |-> "K_X"], [op |-> "begin", txnid |-> "T_1"]>>) 801 | (* multiple begin *) 802 | /\ ~ WellFormedTransactionsInHistory(<<[op |-> "begin", txnid |-> "T_1"], [op |-> "begin", txnid |-> "T_1"], [op |-> "write", txnid |-> "T_1", key |-> "K_X"]>>) 803 | (* commit out of place (after a begin of a different transaction) *) 804 | /\ ~ WellFormedTransactionsInHistory(<<[op |-> "begin", txnid |-> "T_1"], [op |-> "commit", txnid |-> "T_1"], [op |-> "write", txnid |-> "T_1", key |-> "K_X"]>>) 805 | (* abort out of place *) 806 | /\ ~ WellFormedTransactionsInHistory(<<[op |-> 
"begin", txnid |-> "T_1"], [op |-> "abort", txnid |-> "T_1", reason |-> "voluntary"], [op |-> "write", txnid |-> "T_1", key |-> "K_X"]>>) 807 | (* Violation of Bernstein's simplification: multiple writes to same key *) 808 | /\ ~ WellFormedTransactionsInHistory(<<[op |-> "begin", txnid |-> "T_1"], [op |-> "write", txnid |-> "T_1", key |-> "K_X"], [op |-> "write", txnid |-> "T_1", key |-> "K_X"]>>) 809 | (* Violation of Bernstein's simplification: multiple reads of same key *) 810 | /\ ~ WellFormedTransactionsInHistory(<<[op |-> "begin", txnid |-> "T_1"], [op |-> "read", txnid |-> "T_1", key |-> "K_X", ver |-> "T_2"], [op |-> "read", txnid |-> "T_1", key |-> "K_X", ver |-> "T_2"]>>) 811 | 812 | 813 | (* 814 | * Semantics of snapshot isolation. 815 | *) 816 | 817 | (* Snapshot isolation has precisely defined semantics for what versions of keys 818 | a transaction is allowed to read. 819 | *) 820 | CorrectReadView == 821 | 822 | \A txn \in TxnId : 823 | LET itxnb == IndexOfOpInHistory(history, [op |-> "begin", txnid |-> txn]) 824 | IN 825 | (* only committed reads: 826 | all transactions that txn read from (excluding itself) must have committed *before* txn started 827 | *) 828 | /\ \A read_from \in TxnsReadFrom(history, txn) \ {txn} : 829 | LET irfc == IndexOfOpInHistory(history, [op |-> "commit", txnid |-> read_from]) 830 | IN 831 | /\ irfc /= -1 (* read_from has committed *) 832 | /\ irfc < itxnb (* read_from committed before txn began, so txn sees any writes by read_from *) 833 | 834 | (* only up-to-date reads: 835 | for each key-version read by txn, there must be no committed writes between 836 | the write of that version of the key, and the start time of txn (when it chose its read-view). 837 | (This also holds for the case where txn reads a version that it wrote itself.) 
838 | *) 839 | /\ \A <<key, ver>> \in KeyVersionsReadByTxn(history, txn) : 840 | LET iwkv 841 | == IndexOfOpInHistory(history, [op |-> "write", txnid |-> ver, key |-> key]) (* we know this is not -1 *) 842 | 843 | history_between_write_and_txn_began 844 | == SubSeq(history, iwkv + 1, itxnb) 845 | 846 | committed_txns_when_txn_began 847 | == CommittedTxns(SubSeq(history, 1, itxnb)) 848 | 849 | committed_writes_to_key_between_write_that_txn_read_and_when_txn_began 850 | == SelectEvents(history_between_write_and_txn_began, 851 | LAMBDA e : /\ e.op = "write" 852 | /\ e.key = key 853 | /\ e.txnid \in committed_txns_when_txn_began) 854 | IN 855 | {} = committed_writes_to_key_between_write_that_txn_read_and_when_txn_began 856 | 857 | (* For all keys that were both written and read by txn, 858 | - if the read occurred before the write, then txn read the latest committed version at the time that txn began 859 | - if the read occurred after the write, then txn read the version it wrote itself 860 | (We assume that we have correctly implemented Bernstein's simplification, checked elsewhere, 861 | that a transaction can do at most one read and at most one write to each key.) 862 | *) 863 | /\ LET read_key_ver == KeyVersionsReadByTxn(history, txn) 864 | written_key == KeysThatTxnHasDoneOperationOn(history, txn, "write") 865 | key_ver_of_keys_that_txn_also_wrote 866 | == {<<key, ver>> \in read_key_ver : key \in written_key} 867 | IN 868 | /\ \A <<key, ver>> \in key_ver_of_keys_that_txn_also_wrote : 869 | LET iw == IndexOfOpInHistory(history, [op |-> "write", txnid |-> txn, key |-> key]) 870 | ir == IndexOfOpInHistory(history, [op |-> "read", txnid |-> txn, key |-> key, ver |-> ver]) 871 | IN 872 | IF ir < iw THEN {ver} = LatestCommittedVersionOfKeyWhenTxnBegan(txn, key) (* returns a set *) 873 | ELSE ver = txn 874 | 875 | 876 | FirstCommitterWins == 877 | (* There are no committed transactions that were concurrent, and whose write-sets (keys) intersect.
*) 878 | ~ \E t1, t2 \in CommittedTxns(history) : 879 | /\ t1 /= t2 880 | /\ AreConcurrent(history, t1, t2) 881 | /\ KeysThatTxnHasDoneOperationOn(history, t1, "write") 882 | \intersect KeysThatTxnHasDoneOperationOn(history, t2, "write") 883 | /= {} 884 | 885 | (* NoDeadlock == 886 | Absence of deadlock is tested automatically by TLC (unless we disable that test). 887 | The LegitimateTermination condition, plus the weak-fairness of Next action, mean 888 | that TLC correctly does not report deadlock when a behavior cannot be extended because 889 | all transactions have been committed or aborted. 890 | But in all other cases where TLC finds that ENABLED(Next) is false, it will report a deadlock. 891 | *) 892 | 893 | SemanticsOfSnapshotIsolation == 894 | /\ CorrectReadView 895 | /\ FirstCommitterWins 896 | (* /\ NoDeadlock *) (* implicitly tested by TLC, see definition of Next *) 897 | 898 | 899 | (* Some cross-checks that the TLA+ code is correct. 900 | These have nothing to do with checking the algorithm, only its encoding in TLA+. 901 | 902 | E.g. We wish to check that we are correctly abstracting the lock manager, 903 | and not losing or acquiring locks by accident. Such bugs might prevent execution histories 904 | that could reveal bugs in the actual algorithm. 905 | *) 906 | 907 | CorrectnessOfHoldingXLocks == 908 | (* At any time, an XLOCK can be held by at most one transaction *) 909 | /\ \A k \in Key : Cardinality({t \in TxnId : k \in holdingXLocks[t]}) <= 1 910 | 911 | (* We can deduce from the write/commit/abort history of a transaction 912 | which XLOCKS it must hold at any point in time. 913 | Specifically a lock is held from the write of a key (not before) 914 | until the transaction is finalized (not after). 915 | Check that holdingXLocks is equivalent to the locks implied by history. 916 | e.g. This checks that holdingXLocks does not accidentally lose entries.
917 | *) 918 | /\ LET active == ActiveTxns(history) 919 | IN \A t \in TxnId : 920 | IF t \in active THEN holdingXLocks[t] = KeysThatTxnHasDoneOperationOn(history, t, "write") 921 | ELSE holdingXLocks[t] = {} 922 | 923 | (* For all transactions that claim to be holding an XLOCK, the transaction must be active. 924 | (This is checked by the equivalence to the locks implied by the write/commit/abort history.) 925 | *) 926 | /\ \A t \in TxnId : holdingXLocks[t] /= {} => t \in ActiveTxns(history) 927 | 928 | 929 | CorrectnessOfWaitingForXLock == 930 | (* A transaction can only be waiting for one xlock at any point in time 931 | This is checked by TypeInv, as Range(waitingForXLOCK) is key, not SUBSET key. 932 | *) 933 | 934 | (* Only active transactions can be waiting for an XLOCK *) 935 | /\ \A t \in TxnId : waitingForXLock[t] /= NoLock => t \in ActiveTxns(history) 936 | 937 | 938 | (* Serializability 939 | 940 | As the tests for serializability are complex, we reduce the risk of an error by 941 | including two different formulations (by Cahill and Bernstein). 942 | We can check an invariant that says that these are equivalent in all states. 943 | *) 944 | 945 | (* 946 | From Michael Cahill's PhD thesis: 947 | 948 | Verifying that a history is conflict serializable is equivalent to showing that a particular graph is free of 949 | cycles. The graph that must be cycle-free contains a node for each transaction in the history, and an edge 950 | between each pair of conflicting transactions. Transactions T1 and T2 are said to conflict (or equivalently, 951 | to have a dependency) whenever they perform operations whose results reveal something about the order 952 | of the transactions; in particular when T1 performs an operation, and later T2 performs a conflicting 953 | operation. 
Operations O1 and O2 are said to conflict if swapping the order of their execution would 954 | produce different results (either a query producing a different answer, or updates producing different 955 | database state). A cycle in this graph implies that there is a set of transactions that cannot be executed 956 | serially in some order that gives the same results as in the original history. 957 | 958 | This is formalized in Theorem 1, which models each transaction as a log of operations, which is a 959 | list of read or write operations on named data items. The execution history is then an interleaving 960 | of the different logs; each log is a subsequence of the execution history involving the ops of a single 961 | transaction. 962 | 963 | Theorem 1 (Conflict serializability, (Stearns et al., 1976)). Let T = {T1 .. Tm} be a set of transactions 964 | and let E be an execution of these transactions modeled by logs {L1, .. Lm}. E is serializable 965 | if there exists a total ordering of T such that for each pair of conflicting operations Oi and Oj from 966 | distinct transactions Ti, and Tj (respectively), Oi precedes Oj in any log L1,...Lm if and only if Ti 967 | precedes Tj in the total ordering. 968 | 969 | ... 970 | 971 | With snapshot isolation, the definitions of the serialization graph become much simpler, as versions of 972 | an item x are ordered according to the temporal sequence of the [commits of the] transactions that created 973 | those versions (note that First-Committer-Wins ensures that among two transactions that produce 974 | versions of x, one will commit before the other starts). 
975 | 976 | In the MVSG, we put an edge from one committed transaction T1 977 | to another committed transaction T2 in the following situations: 978 | 979 | - T1 produces a version of x, and T2 produces a later version of x (this is a ww-dependency); 980 | - T1 produces a version of x, and T2 reads this (or a later) version of x (this is a wr-dependency); 981 | - T1 reads a version of x, and T2 produces a later version of x (this is a rw-dependency, also 982 | known as an anti-dependency, and is the only case where T1 and T2 can run concurrently). 983 | *) 984 | CahillMVSG(h) == 985 | (* We only consider operations by transactions that have committed in h, 986 | i.e. not operations done by transactions that have already aborted, or have not yet committed. 987 | *) 988 | LET ct == CommittedTxns(h) 989 | ch == SelectSeq(h, LAMBDA e : e.txnid \in CommittedTxns(h)) 990 | IN 991 | (* The following set comprehension is SPECIFIC TO SNAPSHOT ISOLATION, 992 | because it 'knows' that snapshot isolation guarantees certain properties. 993 | We check correctness of snapshot isolation elsewhere (e.g. First Committer Wins rule). 994 | This predicate does not test the correctness of snapshot isolation. 995 | Here we assume that we have correctly implemented snapshot isolation, and then test 996 | the correctness of Cahill's algorithm (extension to snapshot isolation) 997 | that restricts snapshot isolation to only producing serializable execution histories. 998 | The properties we assume of snapshot isolation are: 999 | a. First Committer Wins: 1000 | Two committed transactions that both wrote to at least one common key 1001 | cannot be concurrent. 1002 | Therefore, 1003 | - the version-order of a key is the commit-order of the transactions that wrote to that key. 1004 | - the version-order of a key is also the write-order (as writers cannot be concurrent, 1005 | so writes cannot be logically re-ordered when constructing a serializable ordering). 1006 | b. 
Consistent Read-view: 1007 | If T2 reads a version that T1 wrote (or a 'later' version in the version order of that key), 1008 | then T2 must have started after T1 committed. 1009 | *) 1010 | {<<T1, T2>> \in ct \X ct : 1011 | (* ... from one committed transaction T1 to another [distinct] committed transaction T2 *) 1012 | T1 /= T2 1013 | /\ \E x \in Key : 1014 | LET iT1w == IndexOfOpInHistory(ch, [op |-> "write", txnid |-> T1, key |-> x]) 1015 | iT2w == IndexOfOpInHistory(ch, [op |-> "write", txnid |-> T2, key |-> x]) 1016 | IN (* T1 produces a version of x, and T2 produces a later version of x 1017 | (this is a 'ww-dependency') *) 1018 | /\ iT1w /= -1 (* T1 wrote to x *) 1019 | /\ iT2w /= -1 (* T2 wrote to x *) 1020 | /\ iT1w < iT2w (* T1 committed before T2, which for snapshot isolation means that 1021 | T1's write is before T2's write in the version order for x. 1022 | Note that the First Committer Wins rule guarantees that T1 and T2 1023 | were not concurrent. 1024 | *) 1025 | \/ 1026 | (* T1 produces a version of x, and T2 reads this (or a later) version of x 1027 | (this is a 'wr-dependency'). 1028 | *) 1029 | LET iT1c == IndexOfOpInHistory(ch, [op |-> "commit", txnid |-> T1]) 1030 | iT2b == IndexOfOpInHistory(ch, [op |-> "begin", txnid |-> T2]) 1031 | IN 1032 | /\ iT1w /= -1 (* T1 wrote to x *) 1033 | /\ x \in KeysThatTxnHasDoneOperationOn(ch, T2, "read") (* T2 read some version of x *) 1034 | /\ iT1c < iT2b (* T1 committed before T2 began, so T2 sees any writes by T1 *) 1035 | \/ 1036 | (* T1 reads a version of x, and T2 produces a later version of x 1037 | (this is a 'rw-dependency', also known as an anti-dependency, 1038 | and is the only case where T1 and T2 can run concurrently).
1039 | *) 1040 | LET iT1b == IndexOfOpInHistory(ch, [op |-> "begin", txnid |-> T1]) 1041 | iT2c == IndexOfOpInHistory(ch, [op |-> "commit", txnid |-> T2]) 1042 | IN 1043 | /\ x \in KeysThatTxnHasDoneOperationOn(h, T1, "read") (* T1 read some version of x *) 1044 | /\ iT2w /= -1 (* T2 wrote to x *) 1045 | /\ iT1b < iT2c (* T1 (reader) begins before T2 (writer) commits, so T1's read view does not include T2, so T1 reads an earlier version of x than is written by T2 *) 1046 | } 1047 | 1048 | (* For serializability, the property must hold for every committed prefix of the actual history. 1049 | We ensure that by checking that it is an invariant -- i.e. true in every state 1050 | *) 1051 | CahillSerializable(h) == ~ IsCycle(CahillMVSG(h)) 1052 | 1053 | 1054 | (* 1055 | From Phil Bernstein's book: http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx 1056 | 1057 | This is the correctness condition from p152 (chapter 5 section 5.2): 1058 | 1059 | Theorem 5.4: An MV history H is 1SR iff there exists a version order, <<, 1060 | such that MVSG(H, <<) is acyclic. 1061 | 1062 | 'version order' is defined as: 1063 | 1064 | From p151 1065 | Given an MV history H and a data item [key] x, a version order, <, for x in H is 1066 | a total order of versions of x in H. 1067 | A version order, <<, for H is the union of the version orders for all data items. 1068 | 1069 | The version order is defined (for MVTO) as: 1070 | 1071 | From p152 1072 | Given an MV history H and a version order, <<, the multiversion serialization 1073 | graph for H and <<, MVSG(H, <<), is SG(H) with the following version 1074 | order edges added: for each rk[xj] and wi[xi] in C(H) where i, j, and k are 1075 | distinct, if xi << xj then include Ti -> Tj, otherwise include Tk -> Ti. 1076 | Recall that the nodes of SG(H) and, therefore, of MVSG(H, <<) are the 1077 | committed transactions in H. 1078 | (Note that there is no version order edge if j = k, that is, if a transaction reads 1079 | from itself.) 
1080 | 1081 | SG(H) is defined as follows: 1082 | 1083 | From p149: 1084 | The serialization graph for an MV history is defined as for a 1V history. 1085 | 1086 | From p32 (section 2.3, serializability theory for monoversion histories) 1087 | The serialization graph (SG) for H, denoted SG(H), is a directed 1088 | graph whose nodes are the transactions in T that are committed in H and 1089 | whose edges are all Ti -> Tj (i /= j) such that one of Ti's operations precedes 1090 | and conflicts with one of Tj's operations in H. 1091 | 1092 | Continuing p149 1093 | But since only one kind of conflict is possible in an MV history, SGs are quite 1094 | simple. Let H be an MV history. SG(H) has nodes for the committed transaction 1095 | in H and edges Ti -> Tj (i /= j) whenever for some key x, Tj reads x from Ti. 1096 | That is, Ti -> Tj is present iff for some x, rj[xi] (i /= j) is an operation of C(H). 1097 | 1098 | From p30 1099 | Given a history H, the committed projection of H, denoted C(H), is the history 1100 | obtained from H by deleting all operations that do not belong to transactions 1101 | committed in H. Note that C(H) is a complete history over the set of committed 1102 | transactions in H. If H represents an execution at some point in time, C(H) is the 1103 | only part of the execution we can count on, since active transactions can be 1104 | aborted at any time, for instance, in the event of a system failure. 1105 | *) 1106 | 1107 | (* SG(H) 1108 | "Ti -> Tj is present iff for some x, rj[xi] (i /= j) is an operation of C(H). 
1109 | *) 1110 | BernsteinSG(h) == 1111 | 1112 | LET ct == CommittedTxns(h) 1113 | ch == SelectSeq(h, LAMBDA e : e.txnid \in CommittedTxns(h)) 1114 | IN 1115 | {<<writer_txn, reader_txn>> \in ct \X ct : 1116 | /\ reader_txn /= writer_txn (* distinct *) 1117 | /\ writer_txn \in TxnsReadFrom(h, reader_txn) (* reader_txn read from writer_txn *) 1118 | } 1119 | 1120 | (* "for each rk[xj] and wi[xi] in C(H) where i, j, and k are distinct, 1121 | if xi << xj then include Ti -> Tj, 1122 | otherwise include Tk -> Ti." 1123 | *) 1124 | BernsteinVersionOrderEdges(h) == 1125 | 1126 | LET ct == CommittedTxns(h) 1127 | ch == SelectSeq(h, LAMBDA e : e.txnid \in CommittedTxns(h)) 1128 | IN 1129 | {<<Ti, Tj>> \in ct \X ct : 1130 | /\ Ti /= Tj (* Ti and Tj are distinct committed transactions *) 1131 | /\ \E Tk \in ct : 1132 | /\ Tk /= Ti (* Tk is a committed transaction distinct from Ti and Tj *) 1133 | /\ Tk /= Tj 1134 | /\ \E x \in Key : 1135 | /\ -1 /= IndexOfOpInHistory(ch, [op |-> "read", txnid |-> Tk, key |-> x, ver |-> Tj]) (* rk[xj] is in C(H) *) 1136 | /\ LET idx_xi == IndexOfOpInHistory(ch, [op |-> "write", key |-> x, txnid |-> Ti]) 1137 | idx_xj == IndexOfOpInHistory(ch, [op |-> "write", key |-> x, txnid |-> Tj]) 1138 | IN 1139 | /\ -1 /= idx_xi (* xi exists in C(H) *) 1140 | /\ -1 /= idx_xj (* xj exists in C(H) *) 1141 | /\ (idx_xi < idx_xj) (* xi << xj.
It is valid to compare these indexes, as they come from the same history (ch) *) 1142 | } 1143 | \union 1144 | {<<Tk, Ti>> \in ct \X ct : 1145 | /\ Tk /= Ti (* Tk and Ti are distinct *) 1146 | /\ \E Tj \in ct : 1147 | /\ Tj /= Tk (* Tj is distinct from Tk and Ti *) 1148 | /\ Tj /= Ti 1149 | /\ \E x \in Key : 1150 | /\ -1 /= IndexOfOpInHistory(ch, [op |-> "read", txnid |-> Tk, key |-> x, ver |-> Tj]) (* rk[xj] is in C(H) *) 1151 | /\ LET idx_xi == IndexOfOpInHistory(ch, [op |-> "write", key |-> x, txnid |-> Ti]) 1152 | idx_xj == IndexOfOpInHistory(ch, [op |-> "write", key |-> x, txnid |-> Tj]) 1153 | IN 1154 | /\ -1 /= idx_xi (* xi exists in C(H) *) 1155 | /\ -1 /= idx_xj (* xj exists in C(H) *) 1156 | /\ ~ (idx_xi < idx_xj) (* NOT xi << xj. It is valid to compare these indexes, as they come from the same history (ch) *) 1157 | } 1158 | 1159 | BernsteinMVSG(h) == BernsteinSG(h) \union BernsteinVersionOrderEdges(h) 1160 | 1161 | BernsteinSerializable(h) == ~ IsCycle(BernsteinMVSG(h)) 1162 | 1163 | 1164 | 1165 | (* Predicates used solely to force TLC to find interesting histories, for understanding 1166 | and debugging the algorithm and model. 1167 | For this, we use TLC's ability to check that a predicate is invariant (true in every state). 1168 | TLC reports the first state it finds in which the invariant is false. 1169 | So to find an example of a particular interesting condition, 1170 | we tell TLC to check an invariant of the form 'not MyInterestingCondition', 1171 | and so find an instance of MyInterestingCondition. 1172 | So when telling TLC to 'check' these, remember to prefix with '~'. Can't really get that 1173 | wrong as if you forget the '~' TLC will instantly report the invariant violated, 1174 | as these are usually not true of the initial state.
1175 | *) 1176 | AtLeastNTxnsHaveWritten(N) == Cardinality({txn \in TxnId : KeysThatTxnHasDoneOperationOn(history, txn, "write") /= {}}) >= N 1177 | AtLeastNTxnsHaveRead(N) == Cardinality({txn \in TxnId : KeysThatTxnHasDoneOperationOn(history, txn, "read") /= {}}) >= N 1178 | AtLeastNTxnsHaveCommitted(N) == Cardinality(CommittedTxns(history)) >= N 1179 | AtLeastNTxnsAreWaitingForLocks(N) == Cardinality({txn \in TxnId : waitingForXLock[txn] /= NoLock}) >= N 1180 | AtLeastNTxnsAbortedDueToReason(N, Reason) == 1181 | LET TxnsAbortedDueToReason == {e.txnid : e \in SelectEvents(history, 1182 | LAMBDA e : e.op = "abort" /\ e.reason = Reason)} 1183 | IN Cardinality(TxnsAbortedDueToReason) >= N 1184 | 1185 | 1186 | (* An interesting case: find the 'read-only anomaly' reported in 1187 | "A Read-Only Transaction Anomaly Under Snapshot Isolation" 1188 | [Alan Fekete, Elizabeth O'Neil, and Patrick O'Neil] 1189 | http://www.cs.umb.edu/~poneil/ROAnom.pdf 1190 | 1191 | Find the anomalous history by checking this 'invariant' 1192 | 1193 | ~ ReadOnlyAnomaly(history) 1194 | 1195 | If that invariant is violated then a read-only anomaly has been found. 1196 | The unit test below confirms that the above invariant will indeed be violated 1197 | by the history in Fekete's paper. 1198 | However, at the time of writing (September 2011), TLC fails to find the anomaly 1199 | because of an apparent bug -- TLC stops searching if its Queue Size 1200 | grows larger than 2^31 states. 1201 | From manual experiments, I found that TLC does find the history up to the 1202 | last 4 states, before hitting this limit. 
I filed this bug report: http://bugzilla.tlaplus.net/show_bug.cgi?id=205 1204 | *) 1205 | 1206 | (* Helper, returns <> *) 1207 | KeysReadAndWrittenByTxn(h, txn) == 1208 | <> 1213 | 1214 | HistoryWithoutTxn(h, txn) == 1215 | SelectSeq(h, LAMBDA e : /\ e.txnid /= txn) 1216 | 1217 | ReadOnlyAnomaly(h) == 1218 | (* current history is not serializable *) 1219 | /\ ~ CahillSerializable(h) 1220 | (* and there is a transaction that does some reads and zero writes, 1221 | and when that transaction is entirely removed from the history, 1222 | the resulting history is serializable. *) 1223 | /\ \E txn \in TxnId : 1224 | LET keysReadWritten == KeysReadAndWrittenByTxn(h, txn) 1225 | IN 1226 | /\ Cardinality(keysReadWritten[1]) > 0 1227 | /\ Cardinality(keysReadWritten[2]) = 0 1228 | /\ CahillSerializable(HistoryWithoutTxn(h, txn)) 1229 | 1230 | (* Unit test for ReadOnlyAnomaly. 1231 | This is an encoding of the example history from Fekete's paper: 1232 | 1233 | R2(X0,0) R2(Y0,0) R1(Y0,0) W1(Y1,20) C1 R3(X0,0) R3(Y1,20) C3 W2(X2,-11) C2 1234 | *) 1235 | UnitTests_ReadOnlyAnomaly == 1236 | 1237 | LET h == << 1238 | (* preamble: create keys that are used later *) 1239 | [op |-> "begin", txnid |-> "T_0"], 1240 | [op |-> "write", txnid |-> "T_0", key |-> "K_X"], 1241 | [op |-> "write", txnid |-> "T_0", key |-> "K_Y"], 1242 | [op |-> "commit", txnid |-> "T_0"], 1243 | 1244 | (* the history from the paper *) 1245 | [op |-> "begin", txnid |-> "T_2"], 1246 | (* R2(X0,0) *) [op |-> "read", txnid |-> "T_2", key |-> "K_X", ver |-> "T_0"], 1247 | (* R2(Y0,0) *) [op |-> "read", txnid |-> "T_2", key |-> "K_Y", ver |-> "T_0"], 1248 | 1249 | [op |-> "begin", txnid |-> "T_1"], 1250 | (* R1(Y0,0) *) [op |-> "read", txnid |-> "T_1", key |-> "K_Y", ver |-> "T_0"], 1251 | (* W1(Y1,20) *) [op |-> "write", txnid |-> "T_1", key |-> "K_Y"], 1252 | (* C1 *) [op |-> "commit", txnid |-> "T_1"], 1253 | 1254 | [op |-> "begin", txnid |-> "T_3"], 1255 | (* R3(X0,0) *) [op |-> "read", txnid |-> "T_3", key |-> "K_X",
ver |-> "T_0"], 1256 | (* R3(Y1,20) *) [op |-> "read", txnid |-> "T_3", key |-> "K_Y", ver |-> "T_1"], 1257 | (* C3 *) [op |-> "commit", txnid |-> "T_3"], 1258 | 1259 | (* W2(X2,-11) *) [op |-> "write", txnid |-> "T_2", key |-> "K_X"], 1260 | (* C2 *) [op |-> "commit", txnid |-> "T_2"] 1261 | >> 1262 | IN 1263 | ReadOnlyAnomaly(h) 1264 | 1265 | (* 1266 | As the unit test uses strings to represent transaction identifiers and keys, 1267 | running the test is slightly different from the normal way we check this specification. 1268 | 1269 | 1. Define a new model with 1270 | 1271 | Model Overview ... What is the model? 1272 | Key <- {"K_X", "K_Y"} 1273 | TxnId <- {"T_0", "T_1", "T_2", "T_3"} 1274 | (both normal assignments) 1275 | 1276 | Advanced Options ... Definition Override : 1277 | NoLock <- (a model value) 1278 | 1279 | 2. Check "No behavior spec" under "Model Overview" ... "What is the behavior spec?" 1280 | 1281 | 3. Set "Evaluate Constant Expression" to 1282 | UnitTests_ReadOnlyAnomaly 1283 | 1284 | 4. Run TLC. The output in the "Value" pane should be TRUE. 1285 | 1286 | p.261 of Specifying Systems suggests that this should work (as "Evaluate Constant Expression") 1287 | but the toolbox gives an error about the assume.
1288 | 1289 | ASSUME LET Key == {"K_X", "K_Y"} 1290 | TxnId == {"T_0", "T_1", "T_2", "T_3"} 1291 | IN UnitTests_ReadOnlyAnomaly 1292 | *) 1293 | 1294 | ============================================================================= 1295 | \* Modification History 1296 | \* Last modified Mon Oct 10 09:21:44 PDT 2011 by cnewcom 1297 | \* Created Sun Mar 27 13:01:51 PDT 2011 by cnewcom 1298 | -------------------------------------------------------------------------------- /pluscal/alternating_bit/AltBitProtocol.tla: -------------------------------------------------------------------------------- 1 | --------------------- MODULE AltBitProtocol ---------------------- 2 | EXTENDS Naturals, Sequences, TLC 3 | CONSTANT Msg 4 | 5 | (*********************************************************) 6 | (* AltBitProtocol from The PlusCal Algorithm Language *) 7 | (* paper by Lamport *) 8 | (* original @ https://github.com/quux00/PlusCal-Examples *) 9 | (*********************************************************) 10 | 11 | Remove(i, seq) == [j \in 1..(Len(seq)-1) |-> IF j < i THEN seq[j] ELSE seq[j+1]] 12 | 13 | (* 14 | --algorithm AltBitProtocol { 15 | variables 16 | input = <<>>, output = <<>>, 17 | msgChan = <<>>, ackChan = <<>>, 18 | newChan = <<>>; 19 | 20 | macro Send(m, chan) { 21 | chan := Append(chan, m); 22 | } 23 | 24 | macro Recv(v, chan) { 25 | await chan # <<>>; \* could also do Len(chan) > 0 ?? 
26 | v := Head(chan); 27 | chan := Tail(chan); 28 | } 29 | 30 | process (Sender = "S") 31 | variables next = 1, sbit = 0, ack; 32 | { 33 | s: while (TRUE) { 34 | either with (m \in Msg) { 35 | input := Append(input, m); 36 | 37 | } or { 38 | await next <= Len(input); 39 | Send(<<input[next], sbit>>, msgChan); 40 | 41 | } or { 42 | Recv(ack, ackChan); 43 | if (ack = sbit) { 44 | next := next + 1; 45 | sbit := (sbit + 1) % 2; 46 | }; 47 | }; 48 | print <<"Sender", input>>; 49 | } 50 | }; \* end Sender process block 51 | 52 | process (Receiver = "R") 53 | variables rbit = 1, msg; 54 | { 55 | r: while (TRUE) { 56 | either { 57 | Send(rbit, ackChan); 58 | } or { 59 | Recv(msg, msgChan); 60 | if (msg[2] # rbit) { 61 | rbit := (rbit + 1) % 2; 62 | output := Append(output, msg[1]); 63 | }; 64 | }; 65 | } 66 | }; \* end Receiver process block 67 | 68 | process (LoseMsg = "L") { 69 | l: while (TRUE) { 70 | either with (i \in 1..Len(msgChan)) { 71 | msgChan := Remove(i, msgChan); 72 | } or with (i \in 1..Len(ackChan)) { 73 | ackChan := Remove(i, ackChan); 74 | }; 75 | } 76 | }; \* end LoseMsg process block 77 | 78 | } \* end algorithm 79 | *) 80 | \* BEGIN TRANSLATION 81 | CONSTANT defaultInitValue 82 | VARIABLES input, output, msgChan, ackChan, newChan, next, sbit, ack, rbit, 83 | msg 84 | 85 | vars == << input, output, msgChan, ackChan, newChan, next, sbit, ack, rbit, 86 | msg >> 87 | 88 | ProcSet == {"S"} \cup {"R"} \cup {"L"} 89 | 90 | Init == (* Global variables *) 91 | /\ input = <<>> 92 | /\ output = <<>> 93 | /\ msgChan = <<>> 94 | /\ ackChan = <<>> 95 | /\ newChan = <<>> 96 | (* Process Sender *) 97 | /\ next = 1 98 | /\ sbit = 0 99 | /\ ack = defaultInitValue 100 | (* Process Receiver *) 101 | /\ rbit = 1 102 | /\ msg = defaultInitValue 103 | 104 | Sender == /\ \/ /\ \E m \in Msg: 105 | input' = Append(input, m) 106 | /\ UNCHANGED <<msgChan, ackChan, next, sbit, ack>> 107 | \/ /\ next <= Len(input) 108 | /\ msgChan' = Append(msgChan, (<<input[next], sbit>>)) 109 | /\ UNCHANGED <<input, ackChan, next, sbit, ack>> 110 | \/ /\ ackChan # <<>> 111 | /\ ack' = Head(ackChan) 
112 | /\ ackChan' = Tail(ackChan) 113 | /\ IF ack' = sbit 114 | THEN /\ next' = next + 1 115 | /\ sbit' = (sbit + 1) % 2 116 | ELSE /\ TRUE 117 | /\ UNCHANGED << next, sbit >> 118 | /\ UNCHANGED <<input, msgChan>> 119 | /\ PrintT(<<"Sender", input'>>) 120 | /\ UNCHANGED << output, newChan, rbit, msg >> 121 | 122 | Receiver == /\ \/ /\ ackChan' = Append(ackChan, rbit) 123 | /\ UNCHANGED <<output, msgChan, rbit, msg>> 124 | \/ /\ msgChan # <<>> 125 | /\ msg' = Head(msgChan) 126 | /\ msgChan' = Tail(msgChan) 127 | /\ IF msg'[2] # rbit 128 | THEN /\ rbit' = (rbit + 1) % 2 129 | /\ output' = Append(output, msg'[1]) 130 | ELSE /\ TRUE 131 | /\ UNCHANGED << output, rbit >> 132 | /\ UNCHANGED ackChan 133 | /\ UNCHANGED << input, newChan, next, sbit, ack >> 134 | 135 | LoseMsg == /\ \/ /\ \E i \in 1..Len(msgChan): 136 | msgChan' = Remove(i, msgChan) 137 | /\ UNCHANGED ackChan 138 | \/ /\ \E i \in 1..Len(ackChan): 139 | ackChan' = Remove(i, ackChan) 140 | /\ UNCHANGED msgChan 141 | /\ UNCHANGED << input, output, newChan, next, sbit, ack, rbit, msg >> 142 | 143 | Next == Sender \/ Receiver \/ LoseMsg 144 | 145 | Spec == Init /\ [][Next]_vars 146 | 147 | \* END TRANSLATION 148 | 149 | ================================================================== 150 | -------------------------------------------------------------------------------- /pluscal/alternating_bit/README.md: -------------------------------------------------------------------------------- 1 | 2 | [Alternating Bit Protocol](https://en.wikipedia.org/wiki/Alternating_bit_protocol) 3 | 4 | [original source code](https://github.com/quux00/PlusCal-Examples/blob/master/AltBitProtocol/AltBitProtocol.tla) 5 | -------------------------------------------------------------------------------- /pluscal/childcare/README.md: -------------------------------------------------------------------------------- 1 | 2 | # Problem 3 | 4 | At a child care center, state regulations require that there is always one adult present for every three children. 
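The specs in this directory encode this rule as the operator `IsEnvSafe` and check it with in-line `assert` statements. As a rough sketch only, the same rule could instead be checked as a TLC invariant. The name `SafetyInv` is hypothetical and not part of the original files; `IsEnvSafe`, `num_children`, and `num_adults` are the names used in `childcare2.tla`.

```tla
\* Hypothetical addition to childcare2.tla, placed after END TRANSLATION
\* so that the variables num_children and num_adults are already declared.
\* It states that every reachable state respects the adult-to-children ratio.
SafetyInv == IsEnvSafe(num_children, num_adults)
```

Assuming a matching line `INVARIANT SafetyInv` is added to `childcare2.cfg` (also not in the original), TLC would report any reachable state in which three times the number of adults is smaller than a non-zero number of children.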
5 | 6 | ["The Little Book of Semaphores" by Allen B. Downey Sec 7.2](http://greenteapress.com/wp/semaphores/) 7 | 8 | # How to Run 9 | 10 | ``` 11 | export CLASSPATH=`location of tla2tools.jar`/* 12 | java pcal.trans ./childcare2.tla 13 | java tlc2.TLC ./childcare2.tla 14 | ``` 15 | 16 | 17 | -------------------------------------------------------------------------------- /pluscal/childcare/childcare.cfg: -------------------------------------------------------------------------------- 1 | SPECIFICATION Spec 2 | \* Add statements after this line. 3 | 4 | CONSTANT ADULTS = 2 5 | CONSTANT CHILDREN = 10 6 | -------------------------------------------------------------------------------- /pluscal/childcare/childcare.tla: -------------------------------------------------------------------------------- 1 | ------------------------------- MODULE childcare ------------------------------- 2 | EXTENDS Naturals, Sequences, TLC, Reals 3 | 4 | CONSTANTS CHILDREN, 5 | ADULTS 6 | 7 | ASSUME /\ CHILDREN > 0 8 | /\ ADULTS > 0 9 | 10 | IsEnvSafe(num_children, num_adults) == (num_children = 0) 11 | \/ ((num_adults * 3) >= num_children) 12 | 13 | 14 | (* 15 | --algorithm FairProcess { 16 | 17 | variables num_children = 0, 18 | num_adults = 0 19 | 20 | process (a \in 1..ADULTS) { 21 | p0: 22 | either { \* adults can enter anytime 23 | num_adults := num_adults + 1; 24 | } 25 | or { \* adults exit if insufficient children 26 | await (num_children = 0 \/ num_adults * 3 > num_children); 27 | await (num_adults > 1); \* deadlock if specify gt zero 28 | num_adults := num_adults - 1; 29 | \* print <>; 30 | }; 31 | assert IsEnvSafe(num_children, num_adults); 32 | } 33 | 34 | process (c \in 1..CHILDREN) { 35 | p0: 36 | either { \* children enter if there are sufficient adults 37 | await (num_adults > 0 /\ num_adults * 3 > num_children); 38 | num_children := num_children + 1; 39 | \* print <>; 40 | } 41 | or { \* children can exit anytime 42 | await (num_children > 0); 43 | num_children := 
num_children - 1; 44 | }; 45 | assert IsEnvSafe(num_children, num_adults); 46 | } 47 | } 48 | *) 49 | \* BEGIN TRANSLATION 50 | \* Label p0 of process a at line 22 col 25 changed to p0_ 51 | VARIABLES num_children, num_adults, pc 52 | 53 | vars == << num_children, num_adults, pc >> 54 | 55 | ProcSet == (1..ADULTS) \cup (1..CHILDREN) 56 | 57 | Init == (* Global variables *) 58 | /\ num_children = 0 59 | /\ num_adults = 0 60 | /\ pc = [self \in ProcSet |-> CASE self \in 1..ADULTS -> "p0_" 61 | [] self \in 1..CHILDREN -> "p0"] 62 | 63 | p0_(self) == /\ pc[self] = "p0_" 64 | /\ \/ /\ num_adults' = num_adults + 1 65 | \/ /\ (num_children = 0 \/ num_adults * 3 > num_children) 66 | /\ (num_adults > 1) 67 | /\ num_adults' = num_adults - 1 68 | /\ Assert(IsEnvSafe(num_children, num_adults'), 69 | "Failure of assertion at line 31, column 17.") 70 | /\ pc' = [pc EXCEPT ![self] = "Done"] 71 | /\ UNCHANGED num_children 72 | 73 | a(self) == p0_(self) 74 | 75 | p0(self) == /\ pc[self] = "p0" 76 | /\ \/ /\ (num_adults > 0 /\ num_adults * 3 > num_children) 77 | /\ num_children' = num_children + 1 78 | \/ /\ (num_children > 0) 79 | /\ num_children' = num_children - 1 80 | /\ Assert(IsEnvSafe(num_children', num_adults), 81 | "Failure of assertion at line 45, column 17.") 82 | /\ pc' = [pc EXCEPT ![self] = "Done"] 83 | /\ UNCHANGED num_adults 84 | 85 | c(self) == p0(self) 86 | 87 | Next == (\E self \in 1..ADULTS: a(self)) 88 | \/ (\E self \in 1..CHILDREN: c(self)) 89 | \/ (* Disjunct to prevent deadlock on termination *) 90 | ((\A self \in ProcSet: pc[self] = "Done") /\ UNCHANGED vars) 91 | 92 | Spec == Init /\ [][Next]_vars 93 | 94 | Termination == <>(\A self \in ProcSet: pc[self] = "Done") 95 | 96 | \* END TRANSLATION 97 | 98 | ============================================================================= 99 | \* Modification History 100 | \* Last modified Mon Oct 09 12:50:02 IST 2017 by sandeep 101 | \* Created Mon Oct 09 12:32:27 IST 2017 by sandeep 102 | 
-------------------------------------------------------------------------------- /pluscal/childcare/childcare2.cfg: -------------------------------------------------------------------------------- 1 | SPECIFICATION Spec 2 | \* Add statements after this line. 3 | 4 | CONSTANT ADULTS = 2 5 | CONSTANT CHILDREN = 10 6 | -------------------------------------------------------------------------------- /pluscal/childcare/childcare2.tla: -------------------------------------------------------------------------------- 1 | ------------------------------- MODULE childcare2 ------------------------------- 2 | EXTENDS Naturals, Sequences, TLC, Reals 3 | 4 | CONSTANTS CHILDREN, 5 | ADULTS 6 | 7 | ASSUME /\ CHILDREN > 0 8 | /\ ADULTS > 0 9 | 10 | IsEnvSafe(num_children, num_adults) == (num_children = 0) 11 | \/ ((num_adults * 3) >= num_children) 12 | 13 | 14 | (* 15 | --algorithm FairProcess { 16 | 17 | variables num_children = 0, 18 | num_adults = 0 19 | 20 | process (a \in 1..ADULTS) { 21 | a0: num_adults := num_adults + 1; 22 | a1 : { 23 | await ((num_adults - 1) * 3 >= num_children); 24 | num_adults := num_adults - 1; 25 | assert IsEnvSafe(num_children, num_adults); 26 | }; 27 | goto a0; 28 | } 29 | 30 | process (c \in 1..CHILDREN) { 31 | p0: { \* children enter if there are sufficient adults 32 | await (num_adults * 3 >= num_children + 1); 33 | num_children := num_children + 1; 34 | assert IsEnvSafe(num_children, num_adults); 35 | }; 36 | p1 : num_children := num_children - 1; 37 | goto p0; 38 | } 39 | } 40 | *) 41 | \* BEGIN TRANSLATION 42 | VARIABLES num_children, num_adults, pc 43 | 44 | vars == << num_children, num_adults, pc >> 45 | 46 | ProcSet == (1..ADULTS) \cup (1..CHILDREN) 47 | 48 | Init == (* Global variables *) 49 | /\ num_children = 0 50 | /\ num_adults = 0 51 | /\ pc = [self \in ProcSet |-> CASE self \in 1..ADULTS -> "a0" 52 | [] self \in 1..CHILDREN -> "p0"] 53 | 54 | a0(self) == /\ pc[self] = "a0" 55 | /\ num_adults' = num_adults + 1 56 | /\ pc' = [pc EXCEPT 
![self] = "a1"] 57 | /\ UNCHANGED num_children 58 | 59 | a1(self) == /\ pc[self] = "a1" 60 | /\ (num_children = 0 \/ ((num_adults - 1) * 3 >= num_children)) 61 | /\ num_adults' = num_adults - 1 62 | /\ Assert(IsEnvSafe(num_children, num_adults'), 63 | "Failure of assertion at line 25, column 33.") 64 | /\ pc' = [pc EXCEPT ![self] = "a0"] 65 | /\ UNCHANGED num_children 66 | 67 | a(self) == a0(self) \/ a1(self) 68 | 69 | p0(self) == /\ pc[self] = "p0" 70 | /\ (num_adults * 3 >= num_children + 1) 71 | /\ num_children' = num_children + 1 72 | /\ Assert(IsEnvSafe(num_children', num_adults), 73 | "Failure of assertion at line 34, column 33.") 74 | /\ pc' = [pc EXCEPT ![self] = "p1"] 75 | /\ UNCHANGED num_adults 76 | 77 | p1(self) == /\ pc[self] = "p1" 78 | /\ num_children' = num_children - 1 79 | /\ pc' = [pc EXCEPT ![self] = "p0"] 80 | /\ UNCHANGED num_adults 81 | 82 | c(self) == p0(self) \/ p1(self) 83 | 84 | Next == (\E self \in 1..ADULTS: a(self)) 85 | \/ (\E self \in 1..CHILDREN: c(self)) 86 | \/ (* Disjunct to prevent deadlock on termination *) 87 | ((\A self \in ProcSet: pc[self] = "Done") /\ UNCHANGED vars) 88 | 89 | Spec == Init /\ [][Next]_vars 90 | 91 | Termination == <>(\A self \in ProcSet: pc[self] = "Done") 92 | 93 | \* END TRANSLATION 94 | 95 | ============================================================================= 96 | -------------------------------------------------------------------------------- /pluscal/childcare/childcare2_fail.cfg: -------------------------------------------------------------------------------- 1 | SPECIFICATION Spec 2 | \* Add statements after this line. 
3 | 4 | CONSTANT ADULTS = 2 5 | CONSTANT CHILDREN = 10 6 | -------------------------------------------------------------------------------- /pluscal/childcare/childcare2_fail.tla: -------------------------------------------------------------------------------- 1 | ------------------------------- MODULE childcare2_fail ------------------------------- 2 | EXTENDS Naturals, Sequences, TLC, Reals 3 | 4 | CONSTANTS CHILDREN, 5 | ADULTS 6 | 7 | ASSUME /\ CHILDREN > 0 8 | /\ ADULTS > 0 9 | 10 | IsEnvSafe(num_children, num_adults) == (num_children = 0) 11 | \/ ((num_adults * 3) >= num_children) 12 | 13 | 14 | (* 15 | --algorithm FairProcess { 16 | 17 | variables num_children = 0, 18 | num_adults = 0 19 | 20 | process (a \in 1..ADULTS) { 21 | a0: num_adults := num_adults + 1; 22 | a1 : { 23 | await ((num_adults) * 3 >= num_children); 24 | num_adults := num_adults - 1; 25 | assert IsEnvSafe(num_children, num_adults); 26 | }; 27 | goto a0; 28 | } 29 | 30 | process (c \in 1..CHILDREN) { 31 | p0: { \* children enter if there are sufficient adults 32 | await (num_adults * 3 >= num_children + 1); 33 | num_children := num_children + 1; 34 | assert IsEnvSafe(num_children, num_adults); 35 | }; 36 | p1 : { 37 | num_children := num_children - 1; 38 | }; 39 | goto p0; 40 | } 41 | } 42 | *) 43 | \* BEGIN TRANSLATION 44 | VARIABLES num_children, num_adults, pc 45 | 46 | vars == << num_children, num_adults, pc >> 47 | 48 | ProcSet == (1..ADULTS) \cup (1..CHILDREN) 49 | 50 | Init == (* Global variables *) 51 | /\ num_children = 0 52 | /\ num_adults = 0 53 | /\ pc = [self \in ProcSet |-> CASE self \in 1..ADULTS -> "a0" 54 | [] self \in 1..CHILDREN -> "p0"] 55 | 56 | a0(self) == /\ pc[self] = "a0" 57 | /\ num_adults' = num_adults + 1 58 | /\ pc' = [pc EXCEPT ![self] = "a1"] 59 | /\ UNCHANGED num_children 60 | 61 | a1(self) == /\ pc[self] = "a1" 62 | /\ ((num_adults) * 3 >= num_children) 63 | /\ num_adults' = num_adults - 1 64 | /\ Assert(IsEnvSafe(num_children, 
num_adults'), 65 | "Failure of assertion at line 25, column 33.") 66 | /\ pc' = [pc EXCEPT ![self] = "a0"] 67 | /\ UNCHANGED num_children 68 | 69 | a(self) == a0(self) \/ a1(self) 70 | 71 | p0(self) == /\ pc[self] = "p0" 72 | /\ (num_adults * 3 >= num_children + 1) 73 | /\ num_children' = num_children + 1 74 | /\ Assert(IsEnvSafe(num_children', num_adults), 75 | "Failure of assertion at line 34, column 33.") 76 | /\ pc' = [pc EXCEPT ![self] = "p1"] 77 | /\ UNCHANGED num_adults 78 | 79 | p1(self) == /\ pc[self] = "p1" 80 | /\ num_children' = num_children - 1 81 | /\ pc' = [pc EXCEPT ![self] = "p0"] 82 | /\ UNCHANGED num_adults 83 | 84 | c(self) == p0(self) \/ p1(self) 85 | 86 | Next == (\E self \in 1..ADULTS: a(self)) 87 | \/ (\E self \in 1..CHILDREN: c(self)) 88 | \/ (* Disjunct to prevent deadlock on termination *) 89 | ((\A self \in ProcSet: pc[self] = "Done") /\ UNCHANGED vars) 90 | 91 | Spec == Init /\ [][Next]_vars 92 | 93 | Termination == <>(\A self \in ProcSet: pc[self] = "Done") 94 | 95 | \* END TRANSLATION 96 | 97 | ============================================================================= 98 | -------------------------------------------------------------------------------- /pluscal/childcare/childcare2_fail.txt: -------------------------------------------------------------------------------- 1 | 2 | Error: TLC threw an unexpected exception. 3 | This was probably caused by an error in the spec or model. 4 | See the User Output or TLC Console for clues to what happened. 5 | The exception was a tlc2.tool.EvalException 6 | : Attempted to apply the operator overridden by the Java method 7 | public static tlc2.value.Value tlc2.module.TLC.Assert(tlc2.value.Value,tlc2.value.Value), 8 | but it produced the following error: 9 | The first argument of Assert evaluated to FALSE; the second argument was: 10 | "Failure of assertion at line 25, column 33." 
11 | Error: The behavior up to this point is: 12 | State 1: 13 | /\ num_adults = 0 14 | /\ num_children = 0 15 | /\ pc = <<"a0", "a0", "p0", "p0", "p0", "p0", "p0", "p0", "p0", "p0">> 16 | 17 | State 2: 18 | /\ num_adults = 1 19 | /\ num_children = 0 20 | /\ pc = <<"a1", "a0", "p0", "p0", "p0", "p0", "p0", "p0", "p0", "p0">> 21 | 22 | State 3: 23 | /\ num_adults = 1 24 | /\ num_children = 1 25 | /\ pc = <<"a1", "a0", "p1", "p0", "p0", "p0", "p0", "p0", "p0", "p0">> 26 | 27 | Error: The error occurred when TLC was evaluating the nested 28 | expressions at the following positions: 29 | 0. Line 59, column 16 to line 59, column 30 in child2 30 | 1. Line 60, column 17 to line 60, column 70 in child2 31 | 2. Line 60, column 38 to line 60, column 69 in child2 32 | 3. Line 61, column 16 to line 61, column 43 in child2 33 | 4. Line 62, column 16 to line 63, column 68 in child2 34 | 35 | 36 | 37 states generated, 28 distinct states found, 23 states left on queue. 37 | The depth of the complete state graph search is 4. 38 | Finished in 00s at (2017-10-17 22:56:17) 39 | 40 | -------------------------------------------------------------------------------- /pluscal/childcare/output.txt: -------------------------------------------------------------------------------- 1 | Running in Model-Checking mode with 1 worker. 2 | [...snip...] 3 | Starting... (2017-10-11 07:45:52) 4 | Computing initial states... 5 | Finished computing initial states: 1 distinct state generated. 6 | Model checking completed. No error has been found. 7 | Estimates of the probability that TLC did not check all reachable states 8 | because two distinct states had the same fingerprint: 9 | calculated (optimistic): val = 9.1E-13 10 | based on the actual fingerprints: val = 7.9E-13 11 | 11549 states generated, 1702 distinct states found, 0 states left on queue. 12 | The depth of the complete state graph search is 11. 
13 | Finished in 01s at (2017-10-11 07:45:53) 14 | -------------------------------------------------------------------------------- /pluscal/dining_philosophers/README.md: -------------------------------------------------------------------------------- 1 | 2 | # Problem 3 | 4 | [Dining Philosophers Problem](https://en.wikipedia.org/wiki/Dining_philosophers_problem) 5 | 6 | # How to run 7 | 8 | ``` 9 | export CLASSPATH=`location of tla2tools.jar`/* 10 | 11 | java pcal.trans dining_deadlock.tla 12 | java tlc2.TLC ./dining_deadlock.tla 13 | 14 | java pcal.trans dining_no_deadlock.tla 15 | java tlc2.TLC ./dining_no_deadlock.tla 16 | ``` 17 | 18 | 19 | 20 | -------------------------------------------------------------------------------- /pluscal/dining_philosophers/dining_deadlock.output.txt: -------------------------------------------------------------------------------- 1 | 2 | TLC2 Version 2.09 of 10 March 2017 (rev: 3c9e4be) 3 | Running in Model-Checking mode with 1 worker. 4 | Parsing file dining_deadlock.tla 5 | Parsing file /tmp/Naturals.tla 6 | Parsing file /tmp/TLC.tla 7 | Parsing file /tmp/Sequences.tla 8 | Semantic processing of module Naturals 9 | Semantic processing of module Sequences 10 | Semantic processing of module TLC 11 | Semantic processing of module dining_deadlock 12 | Starting... (2017-10-15 06:18:09) 13 | Computing initial states... 14 | Finished computing initial states: 1 distinct state generated. 15 | Error: Deadlock reached. 16 | Error: The behavior up to this point is: 17 | [...] 18 | 19 | 6677 states generated, 2071 distinct states found, 482 states left on queue. 20 | The depth of the complete state graph search is 12. 
21 | Finished in 00s at (2017-10-15 06:18:09) 22 | 23 | -------------------------------------------------------------------------------- /pluscal/dining_philosophers/dining_deadlock.tla: -------------------------------------------------------------------------------- 1 | ------------------------------- MODULE dining_deadlock ------------------------------- 2 | EXTENDS Naturals, TLC 3 | 4 | \* this solution will Deadlock 5 | 6 | (* 7 | --algorithm DiningPhilosophers { 8 | 9 | variable forks = [i \in 1..5 |-> FALSE]; 10 | 11 | process (ph \in 1..5) 12 | variables left; right; 13 | { 14 | 15 | init: { 16 | right := self; 17 | if (self = 1) left := 5; else left := self - 1; 18 | }; 19 | 20 | wait_first_fork: 21 | { 22 | await(forks[left] = FALSE); 23 | forks[left] := TRUE; 24 | }; 25 | 26 | wait_second_fork: 27 | { 28 | await(forks[right] = FALSE); 29 | forks[right] := TRUE; 30 | }; 31 | 32 | done_eating: 33 | { 34 | forks[right] := FALSE; 35 | \* fake label fl2 otherwise won't compile 36 | fl2: forks[left] := FALSE; 37 | } 38 | 39 | } 40 | 41 | } (* end Dining *) 42 | *) 43 | \* BEGIN TRANSLATION 44 | CONSTANT defaultInitValue 45 | VARIABLES forks, pc, left, right 46 | 47 | vars == << forks, pc, left, right >> 48 | 49 | ProcSet == (1..5) 50 | 51 | Init == (* Global variables *) 52 | /\ forks = [i \in 1..5 |-> FALSE] 53 | (* Process ph *) 54 | /\ left = [self \in 1..5 |-> defaultInitValue] 55 | /\ right = [self \in 1..5 |-> defaultInitValue] 56 | /\ pc = [self \in ProcSet |-> "init"] 57 | 58 | init(self) == /\ pc[self] = "init" 59 | /\ right' = [right EXCEPT ![self] = self] 60 | /\ IF self = 1 61 | THEN /\ left' = [left EXCEPT ![self] = 5] 62 | ELSE /\ left' = [left EXCEPT ![self] = self - 1] 63 | /\ pc' = [pc EXCEPT ![self] = "wait_first_fork"] 64 | /\ forks' = forks 65 | 66 | wait_first_fork(self) == /\ pc[self] = "wait_first_fork" 67 | /\ (forks[left[self]] = FALSE) 68 | /\ forks' = [forks EXCEPT ![left[self]] = TRUE] 69 | /\ pc' = [pc EXCEPT ![self] = 
"wait_second_fork"] 70 | /\ UNCHANGED << left, right >> 71 | 72 | wait_second_fork(self) == /\ pc[self] = "wait_second_fork" 73 | /\ (forks[right[self]] = FALSE) 74 | /\ forks' = [forks EXCEPT ![right[self]] = TRUE] 75 | /\ pc' = [pc EXCEPT ![self] = "done_eating"] 76 | /\ UNCHANGED << left, right >> 77 | 78 | done_eating(self) == /\ pc[self] = "done_eating" 79 | /\ forks' = [forks EXCEPT ![right[self]] = FALSE] 80 | /\ pc' = [pc EXCEPT ![self] = "fl2"] 81 | /\ UNCHANGED << left, right >> 82 | 83 | fl2(self) == /\ pc[self] = "fl2" 84 | /\ forks' = [forks EXCEPT ![left[self]] = FALSE] 85 | /\ pc' = [pc EXCEPT ![self] = "Done"] 86 | /\ UNCHANGED << left, right >> 87 | 88 | ph(self) == init(self) \/ wait_first_fork(self) \/ wait_second_fork(self) 89 | \/ done_eating(self) \/ fl2(self) 90 | 91 | Next == (\E self \in 1..5: ph(self)) 92 | \/ (* Disjunct to prevent deadlock on termination *) 93 | ((\A self \in ProcSet: pc[self] = "Done") /\ UNCHANGED vars) 94 | 95 | Spec == Init /\ [][Next]_vars 96 | 97 | Termination == <>(\A self \in ProcSet: pc[self] = "Done") 98 | 99 | \* END TRANSLATION 100 | 101 | ============================================================================= 102 | -------------------------------------------------------------------------------- /pluscal/dining_philosophers/dining_no_deadlock.output.txt: -------------------------------------------------------------------------------- 1 | 2 | TLC2 Version 2.09 of 10 March 2017 (rev: 3c9e4be) 3 | Running in Model-Checking mode with 1 worker. 4 | Parsing file dining_no_deadlock.tla 5 | Parsing file /tmp/Naturals.tla 6 | Parsing file /tmp/TLC.tla 7 | Parsing file /tmp/Sequences.tla 8 | Semantic processing of module Naturals 9 | Semantic processing of module Sequences 10 | Semantic processing of module TLC 11 | Semantic processing of module dining_no_deadlock 12 | Starting... (2017-10-15 06:22:42) 13 | Computing initial states... 14 | Finished computing initial states: 1 distinct state generated. 
15 | Model checking completed. No error has been found. 16 | Estimates of the probability that TLC did not check all reachable states 17 | because two distinct states had the same fingerprint: 18 | calculated (optimistic): val = 4.3E-12 19 | based on the actual fingerprints: val = 2.5E-12 20 | 19794 states generated, 5619 distinct states found, 0 states left on queue. 21 | The depth of the complete state graph search is 27. 22 | Finished in 01s at (2017-10-15 06:22:43) 23 | 24 | -------------------------------------------------------------------------------- /pluscal/dining_philosophers/dining_no_deadlock.tla: -------------------------------------------------------------------------------- 1 | ------------------------------- MODULE dining_no_deadlock ------------------------------- 2 | EXTENDS Naturals, TLC 3 | 4 | \* Pluscal options (-termination) 5 | 6 | 7 | (* 8 | --algorithm Dining { 9 | 10 | variable forks = [i \in 1..5 |-> FALSE]; 11 | 12 | fair process (ph \in 1..5) 13 | variables tmp; left; right; 14 | { 15 | 16 | init: 17 | right := self; 18 | if (self = 1) left := 5; else left := self - 1; 19 | \* Introduce asymmetry; one philosopher picks right before left 20 | if (self = 3) { 21 | tmp := left; 22 | fake_label1: left := right; 23 | right := tmp; 24 | }; 25 | 26 | wait_one: { 27 | await(forks[left] = FALSE); 28 | forks[left] := TRUE; 29 | }; 30 | 31 | wait_two: { 32 | await(forks[right] = FALSE); 33 | forks[right] := TRUE; 34 | }; 35 | 36 | done_eating: 37 | { 38 | forks[right] := FALSE; 39 | fake_label2: forks[left] := FALSE; 40 | } 41 | 42 | } 43 | 44 | } (* end Dining *) 45 | *) 46 | \* BEGIN TRANSLATION 47 | CONSTANT defaultInitValue 48 | VARIABLES forks, pc, tmp, left, right 49 | 50 | vars == << forks, pc, tmp, left, right >> 51 | 52 | ProcSet == (1..5) 53 | 54 | Init == (* Global variables *) 55 | /\ forks = [i \in 1..5 |-> FALSE] 56 | (* Process ph *) 57 | /\ tmp = [self \in 1..5 |-> defaultInitValue] 58 | /\ left = [self \in 1..5 |-> 
defaultInitValue] 59 | /\ right = [self \in 1..5 |-> defaultInitValue] 60 | /\ pc = [self \in ProcSet |-> "init"] 61 | 62 | init(self) == /\ pc[self] = "init" 63 | /\ right' = [right EXCEPT ![self] = self] 64 | /\ IF self = 1 65 | THEN /\ left' = [left EXCEPT ![self] = 5] 66 | ELSE /\ left' = [left EXCEPT ![self] = self - 1] 67 | /\ IF self = 3 68 | THEN /\ tmp' = [tmp EXCEPT ![self] = left'[self]] 69 | /\ pc' = [pc EXCEPT ![self] = "fake_label1"] 70 | ELSE /\ pc' = [pc EXCEPT ![self] = "wait_one"] 71 | /\ tmp' = tmp 72 | /\ forks' = forks 73 | 74 | fake_label1(self) == /\ pc[self] = "fake_label1" 75 | /\ left' = [left EXCEPT ![self] = right[self]] 76 | /\ right' = [right EXCEPT ![self] = tmp[self]] 77 | /\ pc' = [pc EXCEPT ![self] = "wait_one"] 78 | /\ UNCHANGED << forks, tmp >> 79 | 80 | wait_one(self) == /\ pc[self] = "wait_one" 81 | /\ (forks[left[self]] = FALSE) 82 | /\ forks' = [forks EXCEPT ![left[self]] = TRUE] 83 | /\ pc' = [pc EXCEPT ![self] = "wait_two"] 84 | /\ UNCHANGED << tmp, left, right >> 85 | 86 | wait_two(self) == /\ pc[self] = "wait_two" 87 | /\ (forks[right[self]] = FALSE) 88 | /\ forks' = [forks EXCEPT ![right[self]] = TRUE] 89 | /\ pc' = [pc EXCEPT ![self] = "done_eating"] 90 | /\ UNCHANGED << tmp, left, right >> 91 | 92 | done_eating(self) == /\ pc[self] = "done_eating" 93 | /\ forks' = [forks EXCEPT ![right[self]] = FALSE] 94 | /\ pc' = [pc EXCEPT ![self] = "fake_label2"] 95 | /\ UNCHANGED << tmp, left, right >> 96 | 97 | fake_label2(self) == /\ pc[self] = "fake_label2" 98 | /\ forks' = [forks EXCEPT ![left[self]] = FALSE] 99 | /\ pc' = [pc EXCEPT ![self] = "Done"] 100 | /\ UNCHANGED << tmp, left, right >> 101 | 102 | ph(self) == init(self) \/ fake_label1(self) \/ wait_one(self) 103 | \/ wait_two(self) \/ done_eating(self) \/ fake_label2(self) 104 | 105 | Next == (\E self \in 1..5: ph(self)) 106 | \/ (* Disjunct to prevent deadlock on termination *) 107 | ((\A self \in ProcSet: pc[self] = "Done") /\ UNCHANGED vars) 108 | 109 | Spec == /\ 
Init /\ [][Next]_vars 110 | /\ \A self \in 1..5 : WF_vars(ph(self)) 111 | 112 | Termination == <>(\A self \in ProcSet: pc[self] = "Done") 113 | 114 | \* END TRANSLATION 115 | 116 | ============================================================================= 117 | --------------------------------------------------------------------------------