├── .github └── workflows │ └── main.yml ├── .gitignore ├── CODE_OF_CONDUCT.md ├── Cargo.toml ├── LICENSE-APACHE ├── LICENSE-MIT ├── README.md ├── RELEASES.md ├── examples ├── borrow_check.rs └── graspan1.rs ├── src ├── iteration.rs ├── join.rs ├── lib.rs ├── map.rs ├── merge.rs ├── relation.rs ├── test.rs ├── treefrog.rs └── variable.rs └── triagebot.toml /.github/workflows/main.yml: -------------------------------------------------------------------------------- 1 | 2 | name: CI 3 | 4 | on: 5 | push: 6 | branches: [ master ] 7 | pull_request: 8 | 9 | jobs: 10 | test: 11 | name: Run tests 12 | runs-on: ubuntu-latest 13 | continue-on-error: ${{ matrix.rust == 'nightly' }} 14 | strategy: 15 | matrix: 16 | rust: [beta, nightly] 17 | steps: 18 | - uses: actions/checkout@v2 19 | with: 20 | fetch-depth: 1 21 | 22 | - name: Install rust toolchain 23 | uses: actions-rs/toolchain@v1 24 | with: 25 | toolchain: ${{ matrix.rust }} 26 | profile: minimal 27 | override: true 28 | 29 | - name: Build datafrog 30 | run: cargo build 31 | 32 | - name: Execute tests 33 | run: cargo test 34 | 35 | - name: Check examples 36 | run: cargo check --examples 37 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Generated by Cargo 2 | # will have compiled files and executables 3 | /target/ 4 | 5 | # Remove Cargo.lock from gitignore if creating an executable, leave it for libraries 6 | # More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html 7 | Cargo.lock 8 | 9 | # These are backup files generated by rustfmt 10 | **/*.rs.bk 11 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # The Rust Code of Conduct 2 | 3 | A version of this document [can be found 
online](https://www.rust-lang.org/conduct.html). 4 | 5 | ## Conduct 6 | 7 | **Contact**: [rust-mods@rust-lang.org](mailto:rust-mods@rust-lang.org) 8 | 9 | * We are committed to providing a friendly, safe and welcoming environment for all, regardless of level of experience, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, religion, nationality, or other similar characteristic. 10 | * On IRC, please avoid using overtly sexual nicknames or other nicknames that might detract from a friendly, safe and welcoming environment for all. 11 | * Please be kind and courteous. There's no need to be mean or rude. 12 | * Respect that people have differences of opinion and that every design or implementation choice carries a trade-off and numerous costs. There is seldom a right answer. 13 | * Please keep unstructured critique to a minimum. If you have solid ideas you want to experiment with, make a fork and see how it works. 14 | * We will exclude you from interaction if you insult, demean or harass anyone. That is not welcome behavior. We interpret the term "harassment" as including the definition in the Citizen Code of Conduct; if you have any lack of clarity about what might be included in that concept, please read their definition. In particular, we don't tolerate behavior that excludes people in socially marginalized groups. 15 | * Private harassment is also unacceptable. No matter who you are, if you feel you have been or are being harassed or made uncomfortable by a community member, please contact one of the channel ops or any of the [Rust moderation team][mod_team] immediately. Whether you're a regular contributor or a newcomer, we care about making this community a safe place for you and we've got your back. 16 | * Likewise any spamming, trolling, flaming, baiting or other attention-stealing behavior is not welcome. 
17 | 18 | ## Moderation 19 | 20 | 21 | These are the policies for upholding our community's standards of conduct. If you feel that a thread needs moderation, please contact the [Rust moderation team][mod_team]. 22 | 23 | 1. Remarks that violate the Rust standards of conduct, including hateful, hurtful, oppressive, or exclusionary remarks, are not allowed. (Cursing is allowed, but never targeting another user, and never in a hateful manner.) 24 | 2. Remarks that moderators find inappropriate, whether listed in the code of conduct or not, are also not allowed. 25 | 3. Moderators will first respond to such remarks with a warning. 26 | 4. If the warning is unheeded, the user will be "kicked," i.e., kicked out of the communication channel to cool off. 27 | 5. If the user comes back and continues to make trouble, they will be banned, i.e., indefinitely excluded. 28 | 6. Moderators may choose at their discretion to un-ban the user if it was a first offense and they offer the offended party a genuine apology. 29 | 7. If a moderator bans someone and you think it was unjustified, please take it up with that moderator, or with a different moderator, **in private**. Complaints about bans in-channel are not allowed. 30 | 8. Moderators are held to a higher standard than other community members. If a moderator creates an inappropriate situation, they should expect less leeway than others. 31 | 32 | In the Rust community we strive to go the extra step to look out for each other. Don't just aim to be technically unimpeachable, try to be your best self. In particular, avoid flirting with offensive or sensitive issues, particularly if they're off-topic; this all too often leads to unnecessary fights, hurt feelings, and damaged trust; worse, it can drive people away from the community entirely. 33 | 34 | And if someone takes issue with something you said or did, resist the urge to be defensive. Just stop doing what it was they complained about and apologize. 
Even if you feel you were misinterpreted or unfairly accused, chances are good there was something you could've communicated better — remember that it's your responsibility to make your fellow Rustaceans comfortable. Everyone wants to get along and we are all here first and foremost because we want to talk about cool technology. You will find that people will be eager to assume good intent and forgive as long as you earn their trust. 35 | 36 | The enforcement policies listed above apply to all official Rust venues; including official IRC channels (#rust, #rust-internals, #rust-tools, #rust-libs, #rustc, #rust-beginners, #rust-docs, #rust-community, #rust-lang, and #cargo); GitHub repositories under rust-lang, rust-lang-nursery, and rust-lang-deprecated; and all forums under rust-lang.org (users.rust-lang.org, internals.rust-lang.org). For other projects adopting the Rust Code of Conduct, please contact the maintainers of those projects for enforcement. If you wish to use this code of conduct for your own project, consider explicitly mentioning your moderation policy or making a copy with your own moderation policy so as to avoid confusion. 
37 | 38 | *Adapted from the [Node.js Policy on Trolling](http://blog.izs.me/post/30036893703/policy-on-trolling) as well as the [Contributor Covenant v1.3.0](https://www.contributor-covenant.org/version/1/3/0/).* 39 | 40 | [mod_team]: https://www.rust-lang.org/team.html#Moderation-team 41 | -------------------------------------------------------------------------------- /Cargo.toml: -------------------------------------------------------------------------------- 1 | [package] 2 | name = "datafrog" 3 | version = "2.0.1" 4 | authors = ["Frank McSherry ", "The Rust Project Developers", "Datafrog Developers"] 5 | license = "Apache-2.0/MIT" 6 | description = "Lightweight Datalog engine intended to be embedded in other Rust programs" 7 | readme = "README.md" 8 | keywords = ["datalog", "analysis"] 9 | repository = "https://github.com/rust-lang-nursery/datafrog" 10 | edition = "2018" 11 | 12 | [badges] 13 | is-it-maintained-issue-resolution = { repository = "https://github.com/rust-lang-nursery/datafrog" } 14 | is-it-maintained-open-issues = { repository = "https://github.com/rust-lang-nursery/datafrog" } 15 | 16 | [dev-dependencies] 17 | proptest = "0.8.7" 18 | -------------------------------------------------------------------------------- /LICENSE-APACHE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. 
For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /LICENSE-MIT: -------------------------------------------------------------------------------- 1 | Permission is hereby granted, free of charge, to any 2 | person obtaining a copy of this software and associated 3 | documentation files (the "Software"), to deal in the 4 | Software without restriction, including without 5 | limitation the rights to use, copy, modify, merge, 6 | publish, distribute, sublicense, and/or sell copies of 7 | the Software, and to permit persons to whom the Software 8 | is furnished to do so, subject to the following 9 | conditions: 10 | 11 | The above copyright notice and this permission notice 12 | shall be included in all copies or substantial portions 13 | of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF 16 | ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED 17 | TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A 18 | PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT 19 | SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 20 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 21 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR 22 | IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 23 | DEALINGS IN THE SOFTWARE. 24 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # datafrog 2 | 3 | Datafrog is a lightweight Datalog engine intended to be embedded in other Rust programs. 4 | 5 | Datafrog has no runtime, and relies on you to build and repeatedly apply the update rules. 6 | It tries to help you do this correctly. As an example, here is how you might write a reachability 7 | query using Datafrog (minus the part where we populate the `nodes` and `edges` initial relations). 8 | 9 | ```rust 10 | extern crate datafrog; 11 | use datafrog::Iteration; 12 | 13 | fn main() { 14 | // Prepare initial values, .. 15 | let nodes: Vec<(u32,u32)> = vec![ 16 | // .. 17 | ]; 18 | let edges: Vec<(u32,u32)> = vec![ 19 | // .. 20 | ]; 21 | 22 | // Create a new iteration context, .. 23 | let mut iteration = Iteration::new(); 24 | 25 | // .. some variables, .. 26 | let nodes_var = iteration.variable::<(u32,u32)>("nodes"); 27 | let edges_var = iteration.variable::<(u32,u32)>("edges"); 28 | 29 | // .. load them with some initial values, .. 30 | nodes_var.insert(nodes.into()); 31 | edges_var.insert(edges.into()); 32 | 33 | // .. and then start iterating rules! 34 | while iteration.changed() { 35 | // nodes(a,c) <- nodes(a,b), edges(b,c) 36 | nodes_var.from_join(&nodes_var, &edges_var, |_b, &a, &c| (c,a)); 37 | } 38 | 39 | // extract the final results. 40 | let reachable: Vec<(u32,u32)> = nodes_var.complete(); 41 | } 42 | ``` 43 | 44 | If you'd like to read more about how it works, check out [this blog post](https://github.com/frankmcsherry/blog/blob/master/posts/2018-05-19.md). 
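To make the rule in the example above concrete, here is a minimal sketch of the same fixed-point computation using only the standard library. It is not Datafrog's implementation: the function name `reachable`, the `stable`/`recent` sets, and the sample data are assumptions made for illustration. It only mirrors the idea that each round joins the newly derived tuples against `edges` until nothing new appears.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper (not part of Datafrog): computes the fixed point of
/// `nodes(a,c) <- nodes(a,b), edges(b,c)`.
/// As in the README example, a `nodes` tuple is stored as `(b, a)`:
/// keyed by the reached node `b`, with the origin `a` as the value.
fn reachable(nodes: &[(u32, u32)], edges: &[(u32, u32)]) -> Vec<(u32, u32)> {
    // Tuples already fully processed in earlier rounds.
    let mut stable: BTreeSet<(u32, u32)> = BTreeSet::new();
    // Tuples discovered in the previous round and not yet joined.
    let mut recent: BTreeSet<(u32, u32)> = nodes.iter().copied().collect();

    while !recent.is_empty() {
        // Join only the recent tuples against the (static) edges.
        let mut derived = BTreeSet::new();
        for &(b, a) in &recent {
            for &(src, c) in edges {
                if src == b {
                    derived.insert((c, a));
                }
            }
        }
        // Recent tuples become stable; keep only genuinely new tuples.
        stable.extend(recent.iter().copied());
        recent = derived.difference(&stable).copied().collect();
    }

    stable.into_iter().collect()
}

fn main() {
    // Assumed sample data: start at node 1, edges form a cycle 1 -> 2 -> 3 -> 1.
    let nodes = vec![(1, 1)];
    let edges = vec![(1, 2), (2, 3), (3, 1)];
    println!("{:?}", reachable(&nodes, &edges));
}
```

The loop stops once a round derives no tuple that is not already known, which plays the same role as `iteration.changed()` returning `false` in the Datafrog version.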
45 | 46 | ## Authorship 47 | 48 | Datafrog was initially developed by [Frank McSherry][fmc] and was 49 | later transferred to the rust-lang-nursery organization. Thanks Frank! 50 | 51 | [fmc]: https://github.com/frankmcsherry 52 | -------------------------------------------------------------------------------- /RELEASES.md: -------------------------------------------------------------------------------- 1 | # 2.0.1 2 | 3 | - Work around a rustdoc ICE (#24) 4 | 5 | # 2.0.0 6 | 7 | - Breaking changes: 8 | - leapjoin now takes a tuple of leapers, and not a `&mut` slice: 9 | - `from_leapjoin(&input, &mut [&mut foo.extend_with(...), ..], ..)` becomes 10 | `from_leapjoin(&input, (foo.extend_with(...), ..), ..)` 11 | - if there is only one leaper, no tuple is needed 12 | - `Relation::from` now requires a vector, not an iterator; use 13 | `Relation::from_iter` instead 14 | - Changed the API to permit using `Relation` and `Variable` more interchangeably, 15 | and added a number of operations to construct relations directly, like `Relation::from_join` 16 | - Extended leapfrog triejoin with new operations (`PrefixFilter` and `ValueFilter`) 17 | 18 | # 1.0.0 19 | 20 | - Added leapfrog triejoin (#11). 21 | - Have badges and repo links now! 22 | - Minor performance improvements (#13). 23 | 24 | # 0.1.0 25 | 26 | - Initial release. 27 | -------------------------------------------------------------------------------- /examples/borrow_check.rs: -------------------------------------------------------------------------------- 1 | extern crate datafrog; 2 | use datafrog::Iteration; 3 | 4 | type Region = u32; 5 | type Borrow = u32; 6 | type Point = u32; 7 | 8 | fn main() { 9 | let subset = { 10 | // Create a new iteration context, ... 11 | let mut iteration1 = Iteration::new(); 12 | 13 | // .. some variables, .. 14 | let subset = iteration1.variable::<(Region, Region, Point)>("subset"); 15 | 16 | // different indices for `subset`. 
17 | let subset_r1p = iteration1.variable::<((Region, Point), Region)>("subset_r1p"); 18 | let subset_r2p = iteration1.variable::<((Region, Point), Region)>("subset_r2p"); 19 | let subset_p = iteration1.variable::<(Point, (Region, Region))>("subset_p"); 20 | 21 | // temporaries as we perform a multi-way join. 22 | let subset_1 = iteration1.variable::<((Region, Point), Region)>("subset_1"); 23 | let subset_2 = iteration1.variable::<((Region, Point), Region)>("subset_2"); 24 | 25 | let region_live_at = iteration1.variable::<((Region, Point), ())>("region_live_at"); 26 | let cfg_edge_p = iteration1.variable::<(Point, Point)>("cfg_edge_p"); 27 | 28 | // load initial facts. 29 | subset.insert(Vec::new().into()); 30 | region_live_at.insert(Vec::new().into()); 31 | cfg_edge_p.insert(Vec::new().into()); 32 | 33 | // .. and then start iterating rules! 34 | while iteration1.changed() { 35 | // remap fields to re-index by keys. 36 | subset_r1p.from_map(&subset, |&(r1, r2, p)| ((r1, p), r2)); 37 | subset_r2p.from_map(&subset, |&(r1, r2, p)| ((r2, p), r1)); 38 | subset_p.from_map(&subset, |&(r1, r2, p)| (p, (r1, r2))); 39 | 40 | // R0: subset(R1, R2, P) :- outlives(R1, R2, P). 41 | // Already loaded; outlives is static. 42 | 43 | // R1: subset(R1, R3, P) :- 44 | // subset(R1, R2, P), 45 | // subset(R2, R3, P). 46 | subset.from_join(&subset_r2p, &subset_r1p, |&(_r2, p), &r1, &r3| (r1, r3, p)); 47 | 48 | // R2: subset(R1, R2, Q) :- 49 | // subset(R1, R2, P), 50 | // cfg_edge(P, Q), 51 | // region_live_at(R1, Q), 52 | // region_live_at(R2, Q). 53 | 54 | subset_1.from_join(&subset_p, &cfg_edge_p, |&_p, &(r1, r2), &q| ((r1, q), r2)); 55 | subset_2.from_join(&subset_1, &region_live_at, |&(r1, q), &r2, &()| { 56 | ((r2, q), r1) 57 | }); 58 | subset.from_join(&subset_2, &region_live_at, |&(r2, q), &r1, &()| (r1, r2, q)); 59 | } 60 | 61 | subset_r1p.complete() 62 | }; 63 | 64 | let _requires = { 65 | // Create a new iteration context, ...
66 | let mut iteration2 = Iteration::new(); 67 | 68 | // .. some variables, .. 69 | let requires = iteration2.variable::<(Region, Borrow, Point)>("requires"); 70 | requires.insert(Vec::new().into()); 71 | 72 | let requires_rp = iteration2.variable::<((Region, Point), Borrow)>("requires_rp"); 73 | let requires_bp = iteration2.variable::<((Borrow, Point), Region)>("requires_bp"); 74 | 75 | let requires_1 = iteration2.variable::<(Point, (Borrow, Region))>("requires_1"); 76 | let requires_2 = iteration2.variable::<((Region, Point), Borrow)>("requires_2"); 77 | 78 | let subset_r1p = iteration2.variable::<((Region, Point), Region)>("subset_r1p"); 79 | subset_r1p.insert(subset); 80 | 81 | let killed = Vec::new().into(); 82 | let region_live_at = iteration2.variable::<((Region, Point), ())>("region_live_at"); 83 | let cfg_edge_p = iteration2.variable::<(Point, Point)>("cfg_edge_p"); 84 | 85 | // .. and then start iterating rules! 86 | while iteration2.changed() { 87 | requires_rp.from_map(&requires, |&(r, b, p)| ((r, p), b)); 88 | requires_bp.from_map(&requires, |&(r, b, p)| ((b, p), r)); 89 | 90 | // requires(R, B, P) :- borrow_region(R, B, P). 91 | // Already loaded; borrow_region is static. 92 | 93 | // requires(R2, B, P) :- 94 | // requires(R1, B, P), 95 | // subset(R1, R2, P). 96 | requires.from_join(&requires_rp, &subset_r1p, |&(_r1, p), &b, &r2| (r2, b, p)); 97 | 98 | // requires(R, B, Q) :- 99 | // requires(R, B, P), 100 | // !killed(B, P), 101 | // cfg_edge(P, Q), 102 | // (region_live_at(R, Q); universal_region(R)). 
103 | 104 | requires_1.from_antijoin(&requires_bp, &killed, |&(b, p), &r| (p, (b, r))); 105 | requires_2.from_join(&requires_1, &cfg_edge_p, |&_p, &(b, r), &q| ((r, q), b)); 106 | requires.from_join(&requires_2, &region_live_at, |&(r, q), &b, &()| (r, b, q)); 107 | } 108 | 109 | requires.complete() 110 | }; 111 | 112 | // borrow_live_at(B, P) :- requires(R, B, P), region_live_at(R, P) 113 | 114 | // borrow_live_at(B, P) :- requires(R, B, P), universal_region(R). 115 | } 116 | -------------------------------------------------------------------------------- /examples/graspan1.rs: -------------------------------------------------------------------------------- 1 | extern crate datafrog; 2 | use datafrog::Iteration; 3 | 4 | fn main() { 5 | let timer = ::std::time::Instant::now(); 6 | 7 | // Make space for input data. 8 | let mut nodes = Vec::new(); 9 | let mut edges = Vec::new(); 10 | 11 | // Read input data from a handy file. 12 | use std::fs::File; 13 | use std::io::{BufRead, BufReader}; 14 | 15 | let filename = std::env::args().nth(1).unwrap(); 16 | let file = BufReader::new(File::open(filename).unwrap()); 17 | for readline in file.lines() { 18 | let line = readline.expect("read error"); 19 | if !line.is_empty() && !line.starts_with('#') { 20 | let mut elts = line[..].split_whitespace(); 21 | let src: u32 = elts.next().unwrap().parse().expect("malformed src"); 22 | let dst: u32 = elts.next().unwrap().parse().expect("malformed dst"); 23 | let typ: &str = elts.next().unwrap(); 24 | match typ { 25 | "n" => { 26 | nodes.push((dst, src)); 27 | } 28 | "e" => { 29 | edges.push((src, dst)); 30 | } 31 | unk => panic!("unknown type: {}", unk), 32 | } 33 | } 34 | } 35 | 36 | println!("{:?}\tData loaded", timer.elapsed()); 37 | 38 | // Create a new iteration context, ... 39 | let mut iteration = Iteration::new(); 40 | 41 | // .. some variables, ..
42 | let variable1 = iteration.variable::<(u32, u32)>("nodes"); 43 | let variable2 = iteration.variable::<(u32, u32)>("edges"); 44 | 45 | // .. load them with some initial values, .. 46 | variable1.insert(nodes.into()); 47 | variable2.insert(edges.into()); 48 | 49 | // .. and then start iterating rules! 50 | while iteration.changed() { 51 | // N(a,c) <- N(a,b), E(b,c) 52 | variable1.from_join(&variable1, &variable2, |_b, &a, &c| (c, a)); 53 | } 54 | 55 | let reachable = variable1.complete(); 56 | 57 | println!( 58 | "{:?}\tComputation complete (nodes_final: {})", 59 | timer.elapsed(), 60 | reachable.len() 61 | ); 62 | } 63 | -------------------------------------------------------------------------------- /src/iteration.rs: -------------------------------------------------------------------------------- 1 | use std::io::Write; 2 | 3 | use crate::variable::{Variable, VariableTrait}; 4 | 5 | /// An iterative context for recursive evaluation. 6 | /// 7 | /// An `Iteration` tracks monotonic variables, and monitors their progress. 8 | /// It can inform the user if they have ceased changing, at which point the 9 | /// computation should be done. 10 | #[derive(Default)] 11 | pub struct Iteration { 12 | variables: Vec<Box<dyn VariableTrait>>, 13 | round: u32, 14 | debug_stats: Option<Box<dyn Write>>, 15 | } 16 | 17 | impl Iteration { 18 | /// Create a new iterative context. 19 | pub fn new() -> Self { 20 | Self::default() 21 | } 22 | /// Reports whether any of the monitored variables have changed since 23 | /// the most recent call. 24 | pub fn changed(&mut self) -> bool { 25 | self.round += 1; 26 | 27 | let mut result = false; 28 | for variable in self.variables.iter_mut() { 29 | if variable.changed() { 30 | result = true; 31 | } 32 | 33 | if let Some(ref mut stats_writer) = self.debug_stats { 34 | variable.dump_stats(self.round, stats_writer); 35 | } 36 | } 37 | result 38 | } 39 | /// Creates a new named variable associated with the iterative context.
40 | pub fn variable<Tuple: Ord + 'static>(&mut self, name: &str) -> Variable<Tuple> { 41 | let variable = Variable::new(name); 42 | self.variables.push(Box::new(variable.clone())); 43 | variable 44 | } 45 | /// Creates a new named variable associated with the iterative context. 46 | /// 47 | /// This variable will not be maintained distinctly, and may advertise tuples as 48 | /// recent multiple times (perhaps unboundedly many times). 49 | pub fn variable_indistinct<Tuple: Ord + 'static>(&mut self, name: &str) -> Variable<Tuple> { 50 | let mut variable = Variable::new(name); 51 | variable.distinct = false; 52 | self.variables.push(Box::new(variable.clone())); 53 | variable 54 | } 55 | 56 | /// Set up this Iteration to write debug statistics about each variable, 57 | /// for each round of the computation. 58 | pub fn record_stats_to(&mut self, mut w: Box<dyn Write>) { 59 | // print column names header 60 | writeln!(w, "Variable,Round,Stable count,Recent count") 61 | .expect("Couldn't write debug stats CSV header"); 62 | 63 | self.debug_stats = Some(w); 64 | } 65 | } 66 | -------------------------------------------------------------------------------- /src/join.rs: -------------------------------------------------------------------------------- 1 | //! Join functionality. 2 | 3 | use super::{Relation, Variable}; 4 | use std::cell::Ref; 5 | use std::ops::Deref; 6 | 7 | /// Implements `join`. Note that `input1` must be a variable, but 8 | /// `input2` can be either a variable or a relation. This is necessary 9 | /// because relations have no "recent" tuples, so the fn would be a 10 | /// guaranteed no-op if both arguments were relations. See also 11 | /// `join_into_relation`.
12 | pub(crate) fn join_into<'me, Key: Ord, Val1: Ord, Val2: Ord, Result: Ord>( 13 | input1: &Variable<(Key, Val1)>, 14 | input2: impl JoinInput<'me, (Key, Val2)>, 15 | output: &Variable<Result>, 16 | mut logic: impl FnMut(&Key, &Val1, &Val2) -> Result, 17 | ) { 18 | let mut results = Vec::new(); 19 | let push_result = |k: &Key, v1: &Val1, v2: &Val2| results.push(logic(k, v1, v2)); 20 | 21 | join_delta(input1, input2, push_result); 22 | 23 | output.insert(Relation::from_vec(results)); 24 | } 25 | 26 | pub(crate) fn join_and_filter_into<'me, Key: Ord, Val1: Ord, Val2: Ord, Result: Ord>( 27 | input1: &Variable<(Key, Val1)>, 28 | input2: impl JoinInput<'me, (Key, Val2)>, 29 | output: &Variable<Result>, 30 | mut logic: impl FnMut(&Key, &Val1, &Val2) -> Option<Result>, 31 | ) { 32 | let mut results = Vec::new(); 33 | let push_result = |k: &Key, v1: &Val1, v2: &Val2| { 34 | if let Some(result) = logic(k, v1, v2) { 35 | results.push(result); 36 | } 37 | }; 38 | 39 | join_delta(input1, input2, push_result); 40 | 41 | output.insert(Relation::from_vec(results)); 42 | } 43 | 44 | /// Joins the `recent` tuples of each input with the `stable` tuples of the other, then the 45 | /// `recent` tuples of *both* inputs. 46 | fn join_delta<'me, Key: Ord, Val1: Ord, Val2: Ord>( 47 | input1: &Variable<(Key, Val1)>, 48 | input2: impl JoinInput<'me, (Key, Val2)>, 49 | mut result: impl FnMut(&Key, &Val1, &Val2), 50 | ) { 51 | let recent1 = input1.recent(); 52 | let recent2 = input2.recent(); 53 | 54 | input2.for_each_stable_set(|batch2| { 55 | join_helper(&recent1, &batch2, &mut result); 56 | }); 57 | 58 | input1.for_each_stable_set(|batch1| { 59 | join_helper(&batch1, &recent2, &mut result); 60 | }); 61 | 62 | join_helper(&recent1, &recent2, &mut result); 63 | } 64 | 65 | /// Join, but for two relations.
66 | pub(crate) fn join_into_relation<'me, Key: Ord, Val1: Ord, Val2: Ord, Result: Ord>( 67 | input1: &Relation<(Key, Val1)>, 68 | input2: &Relation<(Key, Val2)>, 69 | mut logic: impl FnMut(&Key, &Val1, &Val2) -> Result, 70 | ) -> Relation<Result> { 71 | let mut results = Vec::new(); 72 | 73 | join_helper(&input1.elements, &input2.elements, |k, v1, v2| { 74 | results.push(logic(k, v1, v2)); 75 | }); 76 | 77 | Relation::from_vec(results) 78 | } 79 | 80 | /// Returns the tuples of `input1` whose key is not present in `input2`, transformed by `logic`. 81 | pub(crate) fn antijoin<Key: Ord, Val: Ord, Result: Ord>( 82 | input1: &Relation<(Key, Val)>, 83 | input2: &Relation<Key>, 84 | mut logic: impl FnMut(&Key, &Val) -> Result, 85 | ) -> Relation<Result> { 86 | let mut tuples2 = &input2[..]; 87 | 88 | let results = input1 89 | .elements 90 | .iter() 91 | .filter(|(ref key, _)| { 92 | tuples2 = gallop(tuples2, |k| k < key); 93 | tuples2.first() != Some(key) 94 | }) 95 | .map(|(ref key, ref val)| logic(key, val)) 96 | .collect::<Vec<_>>(); 97 | 98 | Relation::from_vec(results) 99 | } 100 | 101 | fn join_helper<K: Ord, V1, V2>( 102 | mut slice1: &[(K, V1)], 103 | mut slice2: &[(K, V2)], 104 | mut result: impl FnMut(&K, &V1, &V2), 105 | ) { 106 | while !slice1.is_empty() && !slice2.is_empty() { 107 | use std::cmp::Ordering; 108 | 109 | // If the keys match produce tuples, else advance the smaller key until they might. 110 | match slice1[0].0.cmp(&slice2[0].0) { 111 | Ordering::Less => { 112 | slice1 = gallop(slice1, |x| x.0 < slice2[0].0); 113 | } 114 | Ordering::Equal => { 115 | // Determine the number of matching keys in each slice. 116 | let count1 = slice1.iter().take_while(|x| x.0 == slice1[0].0).count(); 117 | let count2 = slice2.iter().take_while(|x| x.0 == slice2[0].0).count(); 118 | 119 | // Produce results from the cross-product of matches. 120 | for index1 in 0..count1 { 121 | for s2 in slice2[..count2].iter() { 122 | result(&slice1[0].0, &slice1[index1].1, &s2.1); 123 | } 124 | } 125 | 126 | // Advance slices past this key.
127 | slice1 = &slice1[count1..]; 128 | slice2 = &slice2[count2..]; 129 | } 130 | Ordering::Greater => { 131 | slice2 = gallop(slice2, |x| x.0 < slice1[0].0); 132 | } 133 | } 134 | } 135 | } 136 | 137 | pub(crate) fn gallop<T>(mut slice: &[T], mut cmp: impl FnMut(&T) -> bool) -> &[T] { 138 | // if empty slice, or already >= element, return 139 | if !slice.is_empty() && cmp(&slice[0]) { 140 | let mut step = 1; 141 | while step < slice.len() && cmp(&slice[step]) { 142 | slice = &slice[step..]; 143 | step <<= 1; 144 | } 145 | 146 | step >>= 1; 147 | while step > 0 { 148 | if step < slice.len() && cmp(&slice[step]) { 149 | slice = &slice[step..]; 150 | } 151 | step >>= 1; 152 | } 153 | 154 | slice = &slice[1..]; // advance one, as we always stayed < value 155 | } 156 | 157 | slice 158 | } 159 | 160 | /// An input that can be used with `from_join`; either a `Variable` or a `Relation`. 161 | pub trait JoinInput<'me, Tuple: Ord>: Copy { 162 | /// If we are on iteration N of the loop, these are the tuples 163 | /// added on iteration N-1. (For a `Relation`, this is always an 164 | /// empty slice.) 165 | type RecentTuples: Deref<Target = [Tuple]>; 166 | 167 | /// Get the set of recent tuples. 168 | fn recent(self) -> Self::RecentTuples; 169 | 170 | /// Call a function for each set of stable tuples.
171 | fn for_each_stable_set(self, f: impl FnMut(&[Tuple])); 172 | } 173 | 174 | impl<'me, Tuple: Ord> JoinInput<'me, Tuple> for &'me Variable<Tuple> { 175 | type RecentTuples = Ref<'me, [Tuple]>; 176 | 177 | fn recent(self) -> Self::RecentTuples { 178 | Ref::map(self.recent.borrow(), |r| &r.elements[..]) 179 | } 180 | 181 | fn for_each_stable_set(self, mut f: impl FnMut(&[Tuple])) { 182 | for stable in self.stable.borrow().iter() { 183 | f(stable) 184 | } 185 | } 186 | } 187 | 188 | impl<'me, Tuple: Ord> JoinInput<'me, Tuple> for &'me Relation<Tuple> { 189 | type RecentTuples = &'me [Tuple]; 190 | 191 | fn recent(self) -> Self::RecentTuples { 192 | &[] 193 | } 194 | 195 | fn for_each_stable_set(self, mut f: impl FnMut(&[Tuple])) { 196 | f(&self.elements) 197 | } 198 | } 199 | 200 | impl<'me, Tuple: Ord> JoinInput<'me, (Tuple, ())> for &'me Relation<Tuple> { 201 | type RecentTuples = &'me [(Tuple, ())]; 202 | 203 | fn recent(self) -> Self::RecentTuples { 204 | &[] 205 | } 206 | 207 | fn for_each_stable_set(self, mut f: impl FnMut(&[(Tuple, ())])) { 208 | use std::mem; 209 | assert_eq!(mem::size_of::<(Tuple, ())>(), mem::size_of::<Tuple>()); 210 | assert_eq!(mem::align_of::<(Tuple, ())>(), mem::align_of::<Tuple>()); 211 | 212 | // SAFETY: https://rust-lang.github.io/unsafe-code-guidelines/layout/structs-and-tuples.html#structs-with-1-zst-fields 213 | // guarantees that `T` is layout compatible with `(T, ())`, since `()` is a 1-ZST. We use 214 | // `slice::from_raw_parts` because the layout compatibility guarantee does not extend to 215 | // containers like `&[T]`.
216 | let elements: &'me [Tuple] = self.elements.as_slice(); 217 | let len = elements.len(); 218 | 219 | let elements: &'me [(Tuple, ())] = 220 | unsafe { std::slice::from_raw_parts(elements.as_ptr() as *const _, len) }; 221 | 222 | f(elements) 223 | } 224 | } 225 | -------------------------------------------------------------------------------- /src/lib.rs: -------------------------------------------------------------------------------- 1 | //! A lightweight Datalog engine in Rust 2 | //! 3 | //! The intended design is that one has static `Relation` types that are sets 4 | //! of tuples, and `Variable` types that represent monotonically increasing 5 | //! sets of tuples. 6 | //! 7 | //! The types are mostly wrappers around `Vec` indicating sorted-ness, 8 | //! and the intent is that this code can be dropped in the middle of an otherwise 9 | //! normal Rust program, run to completion, and then the results extracted as 10 | //! vectors again. 11 | 12 | #![forbid(missing_docs)] 13 | 14 | mod iteration; 15 | mod join; 16 | mod map; 17 | mod merge; 18 | mod relation; 19 | mod test; 20 | mod treefrog; 21 | mod variable; 22 | 23 | pub use crate::{ 24 | iteration::Iteration, 25 | join::JoinInput, 26 | relation::Relation, 27 | treefrog::{ 28 | extend_anti::ExtendAnti, 29 | extend_with::ExtendWith, 30 | filter_anti::FilterAnti, 31 | filter_with::FilterWith, 32 | filters::{passthrough, PrefixFilter, ValueFilter}, 33 | Leaper, Leapers, RelationLeaper, 34 | }, 35 | variable::Variable, 36 | }; 37 | -------------------------------------------------------------------------------- /src/map.rs: -------------------------------------------------------------------------------- 1 | //! Map functionality. 
2 | 3 | use super::{Relation, Variable}; 4 | 5 | pub(crate) fn map_into<T1: Ord, T2: Ord>( 6 | input: &Variable<T1>, 7 | output: &Variable<T2>, 8 | logic: impl FnMut(&T1) -> T2, 9 | ) { 10 | let results: Vec<T2> = input.recent.borrow().iter().map(logic).collect(); 11 | 12 | output.insert(Relation::from_vec(results)); 13 | } 14 | -------------------------------------------------------------------------------- /src/merge.rs: -------------------------------------------------------------------------------- 1 | //! Subroutines for merging sorted lists efficiently. 2 | 3 | use std::cmp::Ordering; 4 | 5 | /// Merges two sorted lists into a single sorted list, ignoring duplicates. 6 | pub fn merge_unique<T: Ord>(mut a: Vec<T>, mut b: Vec<T>) -> Vec<T> { 7 | // If one of the lists is zero-length, we don't need to do any work. 8 | if a.is_empty() { 9 | return b; 10 | } 11 | if b.is_empty() { 12 | return a; 13 | } 14 | 15 | // Fast path for when all the new elements are after the existing ones. 16 | // 17 | // Cannot panic because we check for empty inputs above. 18 | if *a.last().unwrap() < b[0] { 19 | a.append(&mut b); 20 | return a; 21 | } 22 | if *b.last().unwrap() < a[0] { 23 | b.append(&mut a); 24 | return b; 25 | } 26 | 27 | // Ensure that `out` always has sufficient capacity. 28 | // 29 | // SAFETY: The calls to `push_unchecked` below are safe because of this. 30 | let mut out = Vec::with_capacity(a.len() + b.len()); 31 | 32 | let mut a = a.into_iter(); 33 | let mut b = b.into_iter(); 34 | 35 | // While both inputs have elements remaining, copy the lesser element to the output vector. 36 | while a.len() != 0 && b.len() != 0 { 37 | // SAFETY: The following calls to `get_unchecked` and `next_unchecked` are safe because we 38 | // ensure that `a.len() > 0` and `b.len() > 0` inside the loop. 39 | // 40 | // I was hoping to avoid using "unchecked" operations, but it seems the bounds checks 41 | // don't get optimized away.
Using `ExactSizeIterator::is_empty` instead of checking `len` 42 | // seemed to help, but that method is unstable. 43 | 44 | let a_elem = unsafe { a.as_slice().get_unchecked(0) }; 45 | let b_elem = unsafe { b.as_slice().get_unchecked(0) }; 46 | match a_elem.cmp(b_elem) { 47 | Ordering::Less => unsafe { push_unchecked(&mut out, next_unchecked(&mut a)) }, 48 | Ordering::Greater => unsafe { push_unchecked(&mut out, next_unchecked(&mut b)) }, 49 | Ordering::Equal => unsafe { 50 | push_unchecked(&mut out, next_unchecked(&mut a)); 51 | std::mem::drop(next_unchecked(&mut b)); 52 | }, 53 | } 54 | } 55 | 56 | // Once either `a` or `b` runs out of elements, copy all remaining elements in the other one 57 | // directly to the back of the output list. 58 | // 59 | // This branch is free because we have to check `a.is_empty()` above anyways. 60 | // 61 | // Calling `push_unchecked` in a loop was slightly faster than `out.extend(...)` 62 | // despite the fact that `std::vec::IntoIter` implements `TrustedLen`. 63 | if a.len() != 0 { 64 | for elem in a { 65 | unsafe { 66 | push_unchecked(&mut out, elem); 67 | } 68 | } 69 | } else { 70 | for elem in b { 71 | unsafe { 72 | push_unchecked(&mut out, elem); 73 | } 74 | } 75 | } 76 | 77 | out 78 | } 79 | 80 | /// Pushes `value` to `vec` without checking that the vector has sufficient capacity. 81 | /// 82 | /// If `vec.len() == vec.cap()`, calling this function is UB. 83 | unsafe fn push_unchecked(vec: &mut Vec, value: T) { 84 | let end = vec.as_mut_ptr().add(vec.len()); 85 | std::ptr::write(end, value); 86 | vec.set_len(vec.len() + 1); 87 | } 88 | 89 | /// Equivalent to `iter.next().unwrap()` that is UB to call when `iter` is empty. 
90 | unsafe fn next_unchecked(iter: &mut std::vec::IntoIter) -> T { 91 | match iter.next() { 92 | Some(x) => x, 93 | None => std::hint::unreachable_unchecked(), 94 | } 95 | } 96 | -------------------------------------------------------------------------------- /src/relation.rs: -------------------------------------------------------------------------------- 1 | use std::iter::FromIterator; 2 | 3 | use crate::{ 4 | join, 5 | merge, 6 | treefrog::{self, Leapers}, 7 | }; 8 | 9 | /// A static, ordered list of key-value pairs. 10 | /// 11 | /// A relation represents a fixed set of key-value pairs. In many places in a 12 | /// Datalog computation we want to be sure that certain relations are not able 13 | /// to vary (for example, in antijoins). 14 | #[derive(Clone, Debug, PartialEq, Eq)] 15 | pub struct Relation { 16 | /// Sorted list of distinct tuples. 17 | pub elements: Vec, 18 | } 19 | 20 | impl Relation { 21 | /// Merges two relations into their union. 22 | pub fn merge(self, other: Self) -> Self { 23 | let elements = merge::merge_unique(self.elements, other.elements); 24 | Relation { elements } 25 | } 26 | 27 | /// Creates a `Relation` from the elements of the `iterator`. 28 | /// 29 | /// Same as the `from_iter` method from `std::iter::FromIterator` trait. 30 | pub fn from_iter(iterator: I) -> Self 31 | where 32 | I: IntoIterator, 33 | { 34 | iterator.into_iter().collect() 35 | } 36 | 37 | /// Creates a `Relation` using the `leapjoin` logic; 38 | /// see [`Variable::from_leapjoin`](crate::Variable::from_leapjoin) 39 | pub fn from_leapjoin<'leap, SourceTuple: Ord, Val: Ord + 'leap>( 40 | source: &Relation, 41 | leapers: impl Leapers<'leap, SourceTuple, Val>, 42 | logic: impl FnMut(&SourceTuple, &Val) -> Tuple, 43 | ) -> Self { 44 | treefrog::leapjoin(&source.elements, leapers, logic) 45 | } 46 | 47 | /// Creates a `Relation` by joining the values from `input1` and `input2` and then applying 48 | /// `logic`. 
Like [`Variable::from_join`](crate::Variable::from_join) except for use where 49 | /// the inputs are not varying across iterations. 50 | pub fn from_join( 51 | input1: &Relation<(Key, Val1)>, 52 | input2: &Relation<(Key, Val2)>, 53 | logic: impl FnMut(&Key, &Val1, &Val2) -> Tuple, 54 | ) -> Self { 55 | join::join_into_relation(input1, input2, logic) 56 | } 57 | 58 | /// Creates a `Relation` by removing all values from `input1` that share a key with `input2`, 59 | /// and then transforming the resulting tuples with the `logic` closure. Like 60 | /// [`Variable::from_antijoin`](crate::Variable::from_antijoin) except for use where the 61 | /// inputs are not varying across iterations. 62 | pub fn from_antijoin( 63 | input1: &Relation<(Key, Val1)>, 64 | input2: &Relation, 65 | logic: impl FnMut(&Key, &Val1) -> Tuple, 66 | ) -> Self { 67 | join::antijoin(input1, input2, logic) 68 | } 69 | 70 | /// Construct a new relation by mapping another one. Equivalent to 71 | /// creating an iterator but perhaps more convenient. Analogous to 72 | /// `Variable::from_map`. 73 | pub fn from_map(input: &Relation, logic: impl FnMut(&T2) -> Tuple) -> Self { 74 | input.iter().map(logic).collect() 75 | } 76 | 77 | /// Creates a `Relation` from a vector of tuples. 
78 | pub fn from_vec(mut elements: Vec) -> Self { 79 | elements.sort(); 80 | elements.dedup(); 81 | Relation { elements } 82 | } 83 | } 84 | 85 | impl From> for Relation { 86 | fn from(iterator: Vec) -> Self { 87 | Self::from_vec(iterator) 88 | } 89 | } 90 | 91 | impl FromIterator for Relation { 92 | fn from_iter(iterator: I) -> Self 93 | where 94 | I: IntoIterator, 95 | { 96 | Relation::from_vec(iterator.into_iter().collect()) 97 | } 98 | } 99 | 100 | impl<'tuple, Tuple: 'tuple + Clone + Ord> FromIterator<&'tuple Tuple> for Relation { 101 | fn from_iter(iterator: I) -> Self 102 | where 103 | I: IntoIterator, 104 | { 105 | Relation::from_vec(iterator.into_iter().cloned().collect()) 106 | } 107 | } 108 | 109 | impl std::ops::Deref for Relation { 110 | type Target = [Tuple]; 111 | fn deref(&self) -> &Self::Target { 112 | &self.elements[..] 113 | } 114 | } 115 | -------------------------------------------------------------------------------- /src/test.rs: -------------------------------------------------------------------------------- 1 | #![cfg(test)] 2 | 3 | use crate::Iteration; 4 | use crate::Relation; 5 | use crate::RelationLeaper; 6 | use proptest::prelude::*; 7 | use proptest::{proptest, proptest_helper}; 8 | 9 | fn inputs() -> impl Strategy> { 10 | prop::collection::vec((0_u32..100, 0_u32..100), 1..500) 11 | } 12 | 13 | /// The original way to use datafrog -- computes reachable nodes from a set of edges 14 | fn reachable_with_var_join(edges: &[(u32, u32)]) -> Relation<(u32, u32)> { 15 | let edges: Relation<_> = edges.iter().collect(); 16 | let mut iteration = Iteration::new(); 17 | 18 | let edges_by_successor = iteration.variable::<(u32, u32)>("edges_invert"); 19 | edges_by_successor.extend(edges.iter().map(|&(n1, n2)| (n2, n1))); 20 | 21 | let reachable = iteration.variable::<(u32, u32)>("reachable"); 22 | reachable.insert(edges); 23 | 24 | while iteration.changed() { 25 | // reachable(N1, N3) :- edges(N1, N2), reachable(N2, N3). 
26 | reachable.from_join(&reachable, &edges_by_successor, |&_, &n3, &n1| (n1, n3)); 27 | } 28 | 29 | reachable.complete() 30 | } 31 | 32 | /// Like `reachable_with_var_join`, but using a relation as an input to `from_join` 33 | fn reachable_with_relation_join(edges: &[(u32, u32)]) -> Relation<(u32, u32)> { 34 | let edges: Relation<_> = edges.iter().collect(); 35 | let mut iteration = Iteration::new(); 36 | 37 | // NB. Changed from `reachable_with_var_join`: 38 | let edges_by_successor: Relation<_> = edges.iter().map(|&(n1, n2)| (n2, n1)).collect(); 39 | 40 | let reachable = iteration.variable::<(u32, u32)>("reachable"); 41 | reachable.insert(edges); 42 | 43 | while iteration.changed() { 44 | // reachable(N1, N3) :- edges(N1, N2), reachable(N2, N3). 45 | reachable.from_join(&reachable, &edges_by_successor, |&_, &n3, &n1| (n1, n3)); 46 | } 47 | 48 | reachable.complete() 49 | } 50 | 51 | fn reachable_with_leapfrog(edges: &[(u32, u32)]) -> Relation<(u32, u32)> { 52 | let edges: Relation<_> = edges.iter().collect(); 53 | let mut iteration = Iteration::new(); 54 | 55 | let edges_by_successor: Relation<_> = edges.iter().map(|&(n1, n2)| (n2, n1)).collect(); 56 | 57 | let reachable = iteration.variable::<(u32, u32)>("reachable"); 58 | reachable.insert(edges); 59 | 60 | while iteration.changed() { 61 | // reachable(N1, N3) :- edges(N1, N2), reachable(N2, N3). 62 | reachable.from_leapjoin( 63 | &reachable, 64 | edges_by_successor.extend_with(|&(n2, _)| n2), 65 | |&(_, n3), &n1| (n1, n3), 66 | ); 67 | } 68 | 69 | reachable.complete() 70 | } 71 | 72 | /// Computes a join where the values are summed -- uses iteration 73 | /// variables (the original datafrog technique).
74 | fn sum_join_via_var( 75 | input1_slice: &[(u32, u32)], 76 | input2_slice: &[(u32, u32)], 77 | ) -> Relation<(u32, u32)> { 78 | let mut iteration = Iteration::new(); 79 | 80 | let input1 = iteration.variable::<(u32, u32)>("input1"); 81 | input1.extend(input1_slice); 82 | 83 | let input2 = iteration.variable::<(u32, u32)>("input2"); 84 | input2.extend(input2_slice); 85 | 86 | let output = iteration.variable::<(u32, u32)>("output"); 87 | 88 | while iteration.changed() { 89 | // output(K1, V1 * 100 + V2) :- input1(K1, V1), input2(K1, V2). 90 | output.from_join(&input1, &input2, |&k1, &v1, &v2| (k1, v1 * 100 + v2)); 91 | } 92 | 93 | output.complete() 94 | } 95 | 96 | /// Computes a join where the values are summed -- uses static 97 | /// relations (`Relation::from_join`) rather than iteration variables. 98 | fn sum_join_via_relation( 99 | input1_slice: &[(u32, u32)], 100 | input2_slice: &[(u32, u32)], 101 | ) -> Relation<(u32, u32)> { 102 | let input1: Relation<_> = input1_slice.iter().collect(); 103 | let input2: Relation<_> = input2_slice.iter().collect(); 104 | Relation::from_join(&input1, &input2, |&k1, &v1, &v2| (k1, v1 * 100 + v2)) 105 | } 106 | 107 | proptest!
{ 108 | #[test] 109 | fn reachable_leapfrog_vs_var_join(edges in inputs()) { 110 | let reachable1 = reachable_with_var_join(&edges); 111 | let reachable2 = reachable_with_leapfrog(&edges); 112 | assert_eq!(reachable1.elements, reachable2.elements); 113 | } 114 | 115 | #[test] 116 | fn reachable_rel_join_vs_var_join(edges in inputs()) { 117 | let reachable1 = reachable_with_var_join(&edges); 118 | let reachable2 = reachable_with_relation_join(&edges); 119 | assert_eq!(reachable1.elements, reachable2.elements); 120 | } 121 | 122 | #[test] 123 | fn sum_join_from_var_vs_rel((set1, set2) in (inputs(), inputs())) { 124 | let output1 = sum_join_via_var(&set1, &set2); 125 | let output2 = sum_join_via_relation(&set1, &set2); 126 | assert_eq!(output1.elements, output2.elements); 127 | } 128 | 129 | /// Test the behavior of `filter_with` used on its own in a 130 | /// leapjoin -- effectively it becomes an "intersection" 131 | /// operation. 132 | #[test] 133 | fn filter_with_on_its_own((set1, set2) in (inputs(), inputs())) { 134 | let input1: Relation<(u32, u32)> = set1.iter().collect(); 135 | let input2: Relation<(u32, u32)> = set2.iter().collect(); 136 | let intersection1 = Relation::from_leapjoin( 137 | &input1, 138 | input2.filter_with(|&tuple| tuple), 139 | |&tuple, &()| tuple, 140 | ); 141 | 142 | let intersection2: Relation<(u32, u32)> = input1.elements.iter() 143 | .filter(|t| input2.elements.binary_search(&t).is_ok()) 144 | .collect(); 145 | 146 | assert_eq!(intersection1.elements, intersection2.elements); 147 | } 148 | 149 | /// Test the behavior of `filter_anti` used on its own in a 150 | /// leapjoin -- effectively it becomes a "set minus" operation.
151 | #[test] 152 | fn filter_anti_on_its_own((set1, set2) in (inputs(), inputs())) { 153 | let input1: Relation<(u32, u32)> = set1.iter().collect(); 154 | let input2: Relation<(u32, u32)> = set2.iter().collect(); 155 | 156 | let difference1 = Relation::from_leapjoin( 157 | &input1, 158 | input2.filter_anti(|&tuple| tuple), 159 | |&tuple, &()| tuple, 160 | ); 161 | 162 | let difference2: Relation<(u32, u32)> = input1.elements.iter() 163 | .filter(|t| input2.elements.binary_search(&t).is_err()) 164 | .collect(); 165 | 166 | assert_eq!(difference1.elements, difference2.elements); 167 | } 168 | } 169 | 170 | /// Test that `from_leapjoin` matches against the tuples from an 171 | /// `extend` that precedes first iteration. 172 | /// 173 | /// This was always true, but wasn't immediately obvious to me until I 174 | /// re-read the code more carefully. -nikomatsakis 175 | #[test] 176 | fn leapjoin_from_extend() { 177 | let doubles: Relation<(u32, u32)> = (0..10).map(|i| (i, i * 2)).collect(); 178 | 179 | let mut iteration = Iteration::new(); 180 | 181 | let variable = iteration.variable::<(u32, u32)>("variable"); 182 | variable.extend(Some((2, 2))); 183 | 184 | while iteration.changed() { 185 | variable.from_leapjoin( 186 | &variable, 187 | doubles.extend_with(|&(i, _)| i), 188 | |&(i, _), &j| (i, j), 189 | ); 190 | } 191 | 192 | let variable = variable.complete(); 193 | 194 | assert_eq!(variable.elements, vec![(2, 2), (2, 4)]); 195 | } 196 | 197 | #[test] 198 | fn passthrough_leaper() { 199 | let mut iteration = Iteration::new(); 200 | 201 | let variable = iteration.variable::<(u32, u32)>("variable"); 202 | variable.extend((0..10).map(|i| (i, i))); 203 | 204 | while iteration.changed() { 205 | variable.from_leapjoin( 206 | &variable, 207 | ( 208 | crate::passthrough(), // Without this, the test would fail at runtime. 
209 | crate::PrefixFilter::from(|&(i, _)| i <= 20), 210 | ), 211 | |&(i, j), ()| (2*i, 2*j), 212 | ); 213 | } 214 | 215 | let variable = variable.complete(); 216 | 217 | let mut expected: Vec<_> = (0..10).map(|i| (i, i)).collect(); 218 | expected.extend((10..20).filter_map(|i| (i%2 == 0).then(|| (i, i)))); 219 | expected.extend((20..=40).filter_map(|i| (i%4 == 0).then(|| (i, i)))); 220 | assert_eq!(&*variable, &expected); 221 | } 222 | 223 | #[test] 224 | fn relation_from_antijoin() { 225 | let lhs: Relation<_> = (0 .. 10).map(|x| (x, x)).collect(); 226 | let rhs: Relation<_> = (0 .. 10).filter(|x| x % 2 == 0).collect(); 227 | let expected: Relation<_> = (0 .. 10).filter(|x| x % 2 == 1).map(|x| (x, x)).collect(); 228 | 229 | let result = Relation::from_antijoin(&lhs, &rhs, |a, b| (*a, *b)); 230 | 231 | assert_eq!(result.elements, expected.elements); 232 | } 233 | -------------------------------------------------------------------------------- /src/treefrog.rs: -------------------------------------------------------------------------------- 1 | //! Join functionality. 2 | 3 | use super::Relation; 4 | 5 | /// Performs treefrog leapjoin using a list of leapers. 6 | pub(crate) fn leapjoin<'leap, Tuple: Ord, Val: Ord + 'leap, Result: Ord>( 7 | source: &[Tuple], 8 | mut leapers: impl Leapers<'leap, Tuple, Val>, 9 | mut logic: impl FnMut(&Tuple, &Val) -> Result, 10 | ) -> Relation { 11 | let mut result = Vec::new(); // temp output storage. 12 | let mut values = Vec::new(); // temp value storage. 13 | 14 | for tuple in source { 15 | // Determine which leaper would propose the fewest values. 16 | let mut min_index = usize::max_value(); 17 | let mut min_count = usize::max_value(); 18 | leapers.for_each_count(tuple, |index, count| { 19 | if min_count > count { 20 | min_count = count; 21 | min_index = index; 22 | } 23 | }); 24 | 25 | // We had best have at least one relation restricting values. 
26 | assert!(min_count < usize::max_value()); 27 | 28 | // If there are values to propose: 29 | if min_count > 0 { 30 | // Push the values that `min_index` "proposes" into `values`. 31 | leapers.propose(tuple, min_index, &mut values); 32 | 33 | // Give other leapers a chance to remove values from 34 | // anti-joins or filters. 35 | leapers.intersect(tuple, min_index, &mut values); 36 | 37 | // Push remaining items into result. 38 | for val in values.drain(..) { 39 | result.push(logic(tuple, val)); 40 | } 41 | } 42 | } 43 | 44 | Relation::from_vec(result) 45 | } 46 | 47 | /// Implemented for a tuple of leapers 48 | pub trait Leapers<'leap, Tuple, Val> { 49 | /// Internal method: 50 | fn for_each_count(&mut self, tuple: &Tuple, op: impl FnMut(usize, usize)); 51 | 52 | /// Internal method: 53 | fn propose(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>); 54 | 55 | /// Internal method: 56 | fn intersect(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>); 57 | } 58 | 59 | macro_rules! 
tuple_leapers { 60 | ($($Ty:ident)*) => { 61 | #[allow(unused_assignments, non_snake_case)] 62 | impl<'leap, Tuple, Val, $($Ty),*> Leapers<'leap, Tuple, Val> for ($($Ty,)*) 63 | where 64 | $($Ty: Leaper<'leap, Tuple, Val>,)* 65 | { 66 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) { 67 | let ($($Ty,)*) = self; 68 | let mut index = 0; 69 | $( 70 | let count = $Ty.count(tuple); 71 | op(index, count); 72 | index += 1; 73 | )* 74 | } 75 | 76 | fn propose(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>) { 77 | let ($($Ty,)*) = self; 78 | let mut index = 0; 79 | $( 80 | if min_index == index { 81 | return $Ty.propose(tuple, values); 82 | } 83 | index += 1; 84 | )* 85 | panic!("no match found for min_index={}", min_index); 86 | } 87 | 88 | fn intersect(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>) { 89 | let ($($Ty,)*) = self; 90 | let mut index = 0; 91 | $( 92 | if min_index != index { 93 | $Ty.intersect(tuple, values); 94 | } 95 | index += 1; 96 | )* 97 | } 98 | } 99 | } 100 | } 101 | 102 | tuple_leapers!(A B); 103 | tuple_leapers!(A B C); 104 | tuple_leapers!(A B C D); 105 | tuple_leapers!(A B C D E); 106 | tuple_leapers!(A B C D E F); 107 | tuple_leapers!(A B C D E F G); 108 | 109 | /// Methods to support treefrog leapjoin. 110 | pub trait Leaper<'leap, Tuple, Val> { 111 | /// Estimates the number of proposed values. 112 | fn count(&mut self, prefix: &Tuple) -> usize; 113 | /// Populates `values` with proposed values. 114 | fn propose(&mut self, prefix: &Tuple, values: &mut Vec<&'leap Val>); 115 | /// Restricts `values` to proposed values. 116 | fn intersect(&mut self, prefix: &Tuple, values: &mut Vec<&'leap Val>); 117 | } 118 | 119 | pub(crate) mod filters { 120 | use super::Leaper; 121 | use super::Leapers; 122 | 123 | /// A treefrog leaper that tests each of the tuples from the main 124 | /// input (the "prefix"). 
Use like `PrefixFilter::from(|tuple| 125 | /// ...)`; if the closure returns true, then the tuple is 126 | /// retained, else it will be ignored. This leaper can be used in 127 | /// isolation in which case it just acts like a filter on the 128 | /// input (the "proposed value" will be `()` type). 129 | pub struct PrefixFilter bool> { 130 | phantom: ::std::marker::PhantomData, 131 | predicate: Func, 132 | } 133 | 134 | impl<'leap, Tuple, Func> PrefixFilter 135 | where 136 | Func: Fn(&Tuple) -> bool, 137 | { 138 | /// Creates a new filter based on the prefix 139 | pub fn from(predicate: Func) -> Self { 140 | PrefixFilter { 141 | phantom: ::std::marker::PhantomData, 142 | predicate, 143 | } 144 | } 145 | } 146 | 147 | impl<'leap, Tuple, Val, Func> Leaper<'leap, Tuple, Val> for PrefixFilter 148 | where 149 | Func: Fn(&Tuple) -> bool, 150 | { 151 | /// Estimates the number of proposed values. 152 | fn count(&mut self, prefix: &Tuple) -> usize { 153 | if (self.predicate)(prefix) { 154 | usize::max_value() 155 | } else { 156 | 0 157 | } 158 | } 159 | /// Populates `values` with proposed values. 160 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val>) { 161 | panic!("PrefixFilter::propose(): variable apparently unbound"); 162 | } 163 | /// Restricts `values` to proposed values. 164 | fn intersect(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val>) { 165 | // We can only be here if we returned max_value() above. 
166 | } 167 | } 168 | 169 | impl<'leap, Tuple, Func> Leapers<'leap, Tuple, ()> for PrefixFilter 170 | where 171 | Func: Fn(&Tuple) -> bool, 172 | { 173 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) { 174 | if >::count(self, tuple) == 0 { 175 | op(0, 0) 176 | } else { 177 | // we will "propose" the `()` value if the predicate applies 178 | op(0, 1) 179 | } 180 | } 181 | 182 | fn propose(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) { 183 | assert_eq!(min_index, 0); 184 | values.push(&()); 185 | } 186 | 187 | fn intersect(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) { 188 | assert_eq!(min_index, 0); 189 | assert_eq!(values.len(), 1); 190 | } 191 | } 192 | 193 | pub struct Passthrough { 194 | phantom: ::std::marker::PhantomData, 195 | } 196 | 197 | impl Passthrough { 198 | fn new() -> Self { 199 | Passthrough { 200 | phantom: ::std::marker::PhantomData, 201 | } 202 | } 203 | } 204 | 205 | impl<'leap, Tuple> Leaper<'leap, Tuple, ()> for Passthrough { 206 | /// Estimates the number of proposed values. 207 | fn count(&mut self, _prefix: &Tuple) -> usize { 208 | 1 209 | } 210 | /// Populates `values` with proposed values. 211 | fn propose(&mut self, _prefix: &Tuple, values: &mut Vec<&'leap ()>) { 212 | values.push(&()) 213 | } 214 | /// Restricts `values` to proposed values. 215 | fn intersect(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap ()>) { 216 | // `Passthrough` never removes values (although if we're here it indicates that the user 217 | // didn't need a `Passthrough` in the first place) 218 | } 219 | } 220 | 221 | /// Returns a leaper that proposes a single copy of each tuple from the main input. 222 | /// 223 | /// Use this when you don't need any "extend" leapers in a join, only "filter"s. For example, 224 | /// in the following datalog rule, all terms in the second and third predicate are bound in the 225 | /// first one (the "main input" to our leapjoin). 
226 | /// 227 | /// ```prolog 228 | /// error(loan, point) :- 229 | /// origin_contains_loan_at(origin, loan, point), % main input 230 | /// origin_live_at(origin, point), 231 | /// loan_invalidated_at(loan, point). 232 | /// ``` 233 | /// 234 | /// Without a passthrough leaper, neither the filter for `origin_live_at` nor the one for 235 | /// `loan_invalidated_at` would propose any tuples, and the leapjoin would panic at runtime. 236 | pub fn passthrough<Tuple>() -> Passthrough<Tuple> { 237 | Passthrough::new() 238 | } 239 | 240 | /// A treefrog leaper based on a predicate of prefix and value. 241 | /// Use like `ValueFilter::from(|tuple, value| ...)`. The closure 242 | /// should return true if `value` ought to be retained. The 243 | /// `value` will be a value proposed elsewhere by an `extend_with` 244 | /// leaper. 245 | /// 246 | /// This leaper cannot be used in isolation, it must be combined 247 | /// with other leapers. 248 | pub struct ValueFilter<Tuple, Val, Func: Fn(&Tuple, &Val) -> bool> { 249 | phantom: ::std::marker::PhantomData<(Tuple, Val)>, 250 | predicate: Func, 251 | } 252 | 253 | impl<'leap, Tuple, Val, Func> ValueFilter<Tuple, Val, Func> 254 | where 255 | Func: Fn(&Tuple, &Val) -> bool, 256 | { 257 | /// Creates a new filter based on the prefix and value 258 | pub fn from(predicate: Func) -> Self { 259 | ValueFilter { 260 | phantom: ::std::marker::PhantomData, 261 | predicate, 262 | } 263 | } 264 | } 265 | 266 | impl<'leap, Tuple, Val, Func> Leaper<'leap, Tuple, Val> for ValueFilter<Tuple, Val, Func> 267 | where 268 | Func: Fn(&Tuple, &Val) -> bool, 269 | { 270 | /// Estimates the number of proposed values. 271 | fn count(&mut self, _prefix: &Tuple) -> usize { 272 | usize::max_value() 273 | } 274 | /// Populates `values` with proposed values. 275 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val>) { 276 | panic!("ValueFilter::propose(): variable apparently unbound"); 277 | } 278 | /// Restricts `values` to proposed values.
279 | fn intersect(&mut self, prefix: &Tuple, values: &mut Vec<&'leap Val>) { 280 | values.retain(|val| (self.predicate)(prefix, val)); 281 | } 282 | } 283 | } 284 | 285 | /// Extension method for relations. 286 | pub trait RelationLeaper { 287 | /// Extend with `Val` using the elements of the relation. 288 | fn extend_with<'leap, Tuple: Ord, Func: Fn(&Tuple) -> Key>( 289 | &'leap self, 290 | key_func: Func, 291 | ) -> extend_with::ExtendWith<'leap, Key, Val, Tuple, Func> 292 | where 293 | Key: 'leap, 294 | Val: 'leap; 295 | /// Extend with `Val` using the complement of the relation. 296 | fn extend_anti<'leap, Tuple: Ord, Func: Fn(&Tuple) -> Key>( 297 | &'leap self, 298 | key_func: Func, 299 | ) -> extend_anti::ExtendAnti<'leap, Key, Val, Tuple, Func> 300 | where 301 | Key: 'leap, 302 | Val: 'leap; 303 | /// Extend with any value if tuple is present in relation. 304 | fn filter_with<'leap, Tuple: Ord, Func: Fn(&Tuple) -> (Key, Val)>( 305 | &'leap self, 306 | key_func: Func, 307 | ) -> filter_with::FilterWith<'leap, Key, Val, Tuple, Func> 308 | where 309 | Key: 'leap, 310 | Val: 'leap; 311 | /// Extend with any value if tuple is absent from relation. 
312 | fn filter_anti<'leap, Tuple: Ord, Func: Fn(&Tuple) -> (Key, Val)>( 313 | &'leap self, 314 | key_func: Func, 315 | ) -> filter_anti::FilterAnti<'leap, Key, Val, Tuple, Func> 316 | where 317 | Key: 'leap, 318 | Val: 'leap; 319 | } 320 | 321 | impl RelationLeaper for Relation<(Key, Val)> { 322 | fn extend_with<'leap, Tuple: Ord, Func: Fn(&Tuple) -> Key>( 323 | &'leap self, 324 | key_func: Func, 325 | ) -> extend_with::ExtendWith<'leap, Key, Val, Tuple, Func> 326 | where 327 | Key: 'leap, 328 | Val: 'leap, 329 | { 330 | extend_with::ExtendWith::from(self, key_func) 331 | } 332 | fn extend_anti<'leap, Tuple: Ord, Func: Fn(&Tuple) -> Key>( 333 | &'leap self, 334 | key_func: Func, 335 | ) -> extend_anti::ExtendAnti<'leap, Key, Val, Tuple, Func> 336 | where 337 | Key: 'leap, 338 | Val: 'leap, 339 | { 340 | extend_anti::ExtendAnti::from(self, key_func) 341 | } 342 | fn filter_with<'leap, Tuple: Ord, Func: Fn(&Tuple) -> (Key, Val)>( 343 | &'leap self, 344 | key_func: Func, 345 | ) -> filter_with::FilterWith<'leap, Key, Val, Tuple, Func> 346 | where 347 | Key: 'leap, 348 | Val: 'leap, 349 | { 350 | filter_with::FilterWith::from(self, key_func) 351 | } 352 | fn filter_anti<'leap, Tuple: Ord, Func: Fn(&Tuple) -> (Key, Val)>( 353 | &'leap self, 354 | key_func: Func, 355 | ) -> filter_anti::FilterAnti<'leap, Key, Val, Tuple, Func> 356 | where 357 | Key: 'leap, 358 | Val: 'leap, 359 | { 360 | filter_anti::FilterAnti::from(self, key_func) 361 | } 362 | } 363 | 364 | pub(crate) mod extend_with { 365 | use super::{binary_search, Leaper, Leapers, Relation}; 366 | use crate::join::gallop; 367 | 368 | /// Wraps a Relation as a leaper. 
369 | pub struct ExtendWith<'leap, Key, Val, Tuple, Func> 370 | where 371 | Key: Ord + 'leap, 372 | Val: Ord + 'leap, 373 | Tuple: Ord, 374 | Func: Fn(&Tuple) -> Key, 375 | { 376 | relation: &'leap Relation<(Key, Val)>, 377 | start: usize, 378 | end: usize, 379 | key_func: Func, 380 | old_key: Option, 381 | phantom: ::std::marker::PhantomData, 382 | } 383 | 384 | impl<'leap, Key, Val, Tuple, Func> ExtendWith<'leap, Key, Val, Tuple, Func> 385 | where 386 | Key: Ord + 'leap, 387 | Val: Ord + 'leap, 388 | Tuple: Ord, 389 | Func: Fn(&Tuple) -> Key, 390 | { 391 | /// Constructs a ExtendWith from a relation and key and value function. 392 | pub fn from(relation: &'leap Relation<(Key, Val)>, key_func: Func) -> Self { 393 | ExtendWith { 394 | relation, 395 | start: 0, 396 | end: 0, 397 | key_func, 398 | old_key: None, 399 | phantom: ::std::marker::PhantomData, 400 | } 401 | } 402 | } 403 | 404 | impl<'leap, Key, Val, Tuple, Func> Leaper<'leap, Tuple, Val> 405 | for ExtendWith<'leap, Key, Val, Tuple, Func> 406 | where 407 | Key: Ord + 'leap, 408 | Val: Ord + 'leap, 409 | Tuple: Ord, 410 | Func: Fn(&Tuple) -> Key, 411 | { 412 | fn count(&mut self, prefix: &Tuple) -> usize { 413 | let key = (self.key_func)(prefix); 414 | if self.old_key.as_ref() != Some(&key) { 415 | self.start = binary_search(&self.relation.elements, |x| &x.0 < &key); 416 | let slice1 = &self.relation[self.start..]; 417 | let slice2 = gallop(slice1, |x| &x.0 <= &key); 418 | self.end = self.relation.len() - slice2.len(); 419 | 420 | self.old_key = Some(key); 421 | } 422 | 423 | self.end - self.start 424 | } 425 | fn propose(&mut self, _prefix: &Tuple, values: &mut Vec<&'leap Val>) { 426 | let slice = &self.relation[self.start..self.end]; 427 | values.extend(slice.iter().map(|&(_, ref val)| val)); 428 | } 429 | fn intersect(&mut self, _prefix: &Tuple, values: &mut Vec<&'leap Val>) { 430 | let mut slice = &self.relation[self.start..self.end]; 431 | values.retain(|v| { 432 | slice = gallop(slice, |kv| &kv.1 < 
v); 433 | slice.get(0).map(|kv| &kv.1) == Some(v) 434 | }); 435 | } 436 | } 437 | 438 | impl<'leap, Key, Val, Tuple, Func> Leapers<'leap, Tuple, Val> 439 | for ExtendWith<'leap, Key, Val, Tuple, Func> 440 | where 441 | Key: Ord + 'leap, 442 | Val: Ord + 'leap, 443 | Tuple: Ord, 444 | Func: Fn(&Tuple) -> Key, 445 | { 446 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) { 447 | op(0, self.count(tuple)) 448 | } 449 | 450 | fn propose(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>) { 451 | assert_eq!(min_index, 0); 452 | Leaper::propose(self, tuple, values); 453 | } 454 | 455 | fn intersect(&mut self, _: &Tuple, min_index: usize, _: &mut Vec<&'leap Val>) { 456 | assert_eq!(min_index, 0); 457 | } 458 | } 459 | } 460 | 461 | pub(crate) mod extend_anti { 462 | use std::ops::Range; 463 | 464 | use super::{binary_search, Leaper, Relation}; 465 | use crate::join::gallop; 466 | 467 | /// Wraps a Relation as a leaper. 468 | pub struct ExtendAnti<'leap, Key, Val, Tuple, Func> 469 | where 470 | Key: Ord + 'leap, 471 | Val: Ord + 'leap, 472 | Tuple: Ord, 473 | Func: Fn(&Tuple) -> Key, 474 | { 475 | relation: &'leap Relation<(Key, Val)>, 476 | key_func: Func, 477 | old_key: Option<(Key, Range)>, 478 | phantom: ::std::marker::PhantomData, 479 | } 480 | 481 | impl<'leap, Key, Val, Tuple, Func> ExtendAnti<'leap, Key, Val, Tuple, Func> 482 | where 483 | Key: Ord + 'leap, 484 | Val: Ord + 'leap, 485 | Tuple: Ord, 486 | Func: Fn(&Tuple) -> Key, 487 | { 488 | /// Constructs a ExtendAnti from a relation and key and value function. 
489 | pub fn from(relation: &'leap Relation<(Key, Val)>, key_func: Func) -> Self { 490 | ExtendAnti { 491 | relation, 492 | key_func, 493 | old_key: None, 494 | phantom: ::std::marker::PhantomData, 495 | } 496 | } 497 | } 498 | 499 | impl<'leap, Key: Ord, Val: Ord + 'leap, Tuple: Ord, Func> Leaper<'leap, Tuple, Val> 500 | for ExtendAnti<'leap, Key, Val, Tuple, Func> 501 | where 502 | Key: Ord + 'leap, 503 | Val: Ord + 'leap, 504 | Tuple: Ord, 505 | Func: Fn(&Tuple) -> Key, 506 | { 507 | fn count(&mut self, _prefix: &Tuple) -> usize { 508 | usize::max_value() 509 | } 510 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val>) { 511 | panic!("ExtendAnti::propose(): variable apparently unbound."); 512 | } 513 | fn intersect(&mut self, prefix: &Tuple, values: &mut Vec<&'leap Val>) { 514 | let key = (self.key_func)(prefix); 515 | 516 | let range = match self.old_key.as_ref() { 517 | Some((old, range)) if old == &key => range.clone(), 518 | 519 | _ => { 520 | let start = binary_search(&self.relation.elements, |x| &x.0 < &key); 521 | let slice1 = &self.relation[start..]; 522 | let slice2 = gallop(slice1, |x| &x.0 <= &key); 523 | let range = start..self.relation.len()-slice2.len(); 524 | 525 | self.old_key = Some((key, range.clone())); 526 | 527 | range 528 | } 529 | }; 530 | 531 | let mut slice = &self.relation[range]; 532 | if !slice.is_empty() { 533 | values.retain(|v| { 534 | slice = gallop(slice, |kv| &kv.1 < v); 535 | slice.get(0).map(|kv| &kv.1) != Some(v) 536 | }); 537 | } 538 | } 539 | } 540 | } 541 | 542 | pub(crate) mod filter_with { 543 | 544 | use super::{Leaper, Leapers, Relation}; 545 | 546 | /// Wraps a Relation as a leaper. 
547 | pub struct FilterWith<'leap, Key, Val, Tuple, Func> 548 | where 549 | Key: Ord + 'leap, 550 | Val: Ord + 'leap, 551 | Tuple: Ord, 552 | Func: Fn(&Tuple) -> (Key, Val), 553 | { 554 | relation: &'leap Relation<(Key, Val)>, 555 | key_func: Func, 556 | old_key_val: Option<((Key, Val), bool)>, 557 | phantom: ::std::marker::PhantomData, 558 | } 559 | 560 | impl<'leap, Key, Val, Tuple, Func> FilterWith<'leap, Key, Val, Tuple, Func> 561 | where 562 | Key: Ord + 'leap, 563 | Val: Ord + 'leap, 564 | Tuple: Ord, 565 | Func: Fn(&Tuple) -> (Key, Val), 566 | { 567 | /// Constructs a FilterWith from a relation and key and value function. 568 | pub fn from(relation: &'leap Relation<(Key, Val)>, key_func: Func) -> Self { 569 | FilterWith { 570 | relation, 571 | key_func, 572 | old_key_val: None, 573 | phantom: ::std::marker::PhantomData, 574 | } 575 | } 576 | } 577 | 578 | impl<'leap, Key, Val, Val2, Tuple, Func> Leaper<'leap, Tuple, Val2> 579 | for FilterWith<'leap, Key, Val, Tuple, Func> 580 | where 581 | Key: Ord + 'leap, 582 | Val: Ord + 'leap, 583 | Tuple: Ord, 584 | Func: Fn(&Tuple) -> (Key, Val), 585 | { 586 | fn count(&mut self, prefix: &Tuple) -> usize { 587 | let key_val = (self.key_func)(prefix); 588 | 589 | if let Some((ref old_key_val, is_present)) = self.old_key_val { 590 | if old_key_val == &key_val { 591 | return if is_present { usize::MAX } else { 0 }; 592 | } 593 | } 594 | 595 | let is_present = self.relation.binary_search(&key_val).is_ok(); 596 | self.old_key_val = Some((key_val, is_present)); 597 | if is_present { usize::MAX } else { 0 } 598 | } 599 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val2>) { 600 | panic!("FilterWith::propose(): variable apparently unbound."); 601 | } 602 | fn intersect(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val2>) { 603 | // Only here because we didn't return zero above, right? 
604 | } 605 | } 606 | 607 | impl<'leap, Key, Val, Tuple, Func> Leapers<'leap, Tuple, ()> 608 | for FilterWith<'leap, Key, Val, Tuple, Func> 609 | where 610 | Key: Ord + 'leap, 611 | Val: Ord + 'leap, 612 | Tuple: Ord, 613 | Func: Fn(&Tuple) -> (Key, Val), 614 | { 615 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) { 616 | if >::count(self, tuple) == 0 { 617 | op(0, 0) 618 | } else { 619 | op(0, 1) 620 | } 621 | } 622 | 623 | fn propose(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) { 624 | assert_eq!(min_index, 0); 625 | values.push(&()); 626 | } 627 | 628 | fn intersect(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) { 629 | assert_eq!(min_index, 0); 630 | assert_eq!(values.len(), 1); 631 | } 632 | } 633 | } 634 | 635 | pub(crate) mod filter_anti { 636 | 637 | use super::{Leaper, Leapers, Relation}; 638 | 639 | /// Wraps a Relation as a leaper. 640 | pub struct FilterAnti<'leap, Key, Val, Tuple, Func> 641 | where 642 | Key: Ord + 'leap, 643 | Val: Ord + 'leap, 644 | Tuple: Ord, 645 | Func: Fn(&Tuple) -> (Key, Val), 646 | { 647 | relation: &'leap Relation<(Key, Val)>, 648 | key_func: Func, 649 | old_key_val: Option<((Key, Val), bool)>, 650 | phantom: ::std::marker::PhantomData, 651 | } 652 | 653 | impl<'leap, Key, Val, Tuple, Func> FilterAnti<'leap, Key, Val, Tuple, Func> 654 | where 655 | Key: Ord + 'leap, 656 | Val: Ord + 'leap, 657 | Tuple: Ord, 658 | Func: Fn(&Tuple) -> (Key, Val), 659 | { 660 | /// Constructs a FilterAnti from a relation and key and value function. 
661 | pub fn from(relation: &'leap Relation<(Key, Val)>, key_func: Func) -> Self { 662 | FilterAnti { 663 | relation, 664 | key_func, 665 | old_key_val: None, 666 | phantom: ::std::marker::PhantomData, 667 | } 668 | } 669 | } 670 | 671 | impl<'leap, Key: Ord, Val: Ord + 'leap, Val2, Tuple: Ord, Func> Leaper<'leap, Tuple, Val2> 672 | for FilterAnti<'leap, Key, Val, Tuple, Func> 673 | where 674 | Key: Ord + 'leap, 675 | Val: Ord + 'leap, 676 | Tuple: Ord, 677 | Func: Fn(&Tuple) -> (Key, Val), 678 | { 679 | fn count(&mut self, prefix: &Tuple) -> usize { 680 | let key_val = (self.key_func)(prefix); 681 | 682 | if let Some((ref old_key_val, is_present)) = self.old_key_val { 683 | if old_key_val == &key_val { 684 | return if is_present { 0 } else { usize::MAX }; 685 | } 686 | } 687 | 688 | let is_present = self.relation.binary_search(&key_val).is_ok(); 689 | self.old_key_val = Some((key_val, is_present)); 690 | if is_present { 0 } else { usize::MAX } 691 | } 692 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val2>) { 693 | panic!("FilterAnti::propose(): variable apparently unbound."); 694 | } 695 | fn intersect(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val2>) { 696 | // Only here because we didn't return zero above, right? 
697 | } 698 | } 699 | 700 | impl<'leap, Key, Val, Tuple, Func> Leapers<'leap, Tuple, ()> 701 | for FilterAnti<'leap, Key, Val, Tuple, Func> 702 | where 703 | Key: Ord + 'leap, 704 | Val: Ord + 'leap, 705 | Tuple: Ord, 706 | Func: Fn(&Tuple) -> (Key, Val), 707 | { 708 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) { 709 | if <Self as Leaper<'_, Tuple, ()>>::count(self, tuple) == 0 { 710 | op(0, 0) 711 | } else { 712 | op(0, 1) 713 | } 714 | } 715 | 716 | fn propose(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) { 717 | // We only get here if `tuple` is *not* a member of `self.relation` 718 | assert_eq!(min_index, 0); 719 | values.push(&()); 720 | } 721 | 722 | fn intersect(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) { 723 | // We only get here if `tuple` is *not* a member of `self.relation` 724 | assert_eq!(min_index, 0); 725 | assert_eq!(values.len(), 1); 726 | } 727 | } 728 | } 729 | 730 | /// Returns the lowest index for which `cmp(&vec[i])` returns `false`, assuming `vec` is sorted so 731 | /// that every element satisfying `cmp` comes before every element that does not. 732 | /// 733 | /// By accepting a vector instead of a slice, we can do a small optimization when computing the 734 | /// midpoint. 735 | fn binary_search<T>(vec: &Vec<T>, mut cmp: impl FnMut(&T) -> bool) -> usize { 736 | // The midpoint calculation we use below is only correct for vectors with fewer than `isize::MAX` 737 | // elements. This is always true for vectors of sized types but maybe not for ZSTs? Sorting 738 | // ZSTs doesn't make much sense, so just forbid it here. 739 | assert!(std::mem::size_of::<T>() > 0); 740 | 741 | // We maintain the invariant that the first `lo` elements of `vec` satisfy `cmp`. 742 | // `hi` is maintained at the first element we know does not satisfy `cmp`. 743 | 744 | let mut hi = vec.len(); 745 | let mut lo = 0; 746 | while lo < hi { 747 | // Unlike in the general case, this expression cannot overflow because `Vec` is limited to 748 | // `isize::MAX` capacity and we disallow ZSTs above.
If we needed to support slices or 749 | // vectors of ZSTs, which don't have an upper bound on their size AFAIK, we would need to 750 | // use a slightly less efficient version that cannot overflow: `lo + (hi - lo) / 2`. 751 | let mid = (hi + lo) / 2; 752 | 753 | // LLVM seems to be unable to prove that `mid` is always less than `vec.len()`, so use 754 | // `get_unchecked` to avoid a bounds check since this code is hot. 755 | let el: &T = unsafe { vec.get_unchecked(mid) }; 756 | if cmp(el) { 757 | lo = mid + 1; 758 | } else { 759 | hi = mid; 760 | } 761 | } 762 | lo 763 | } 764 | -------------------------------------------------------------------------------- /src/variable.rs: -------------------------------------------------------------------------------- 1 | use std::cell::RefCell; 2 | use std::io::Write; 3 | use std::iter::FromIterator; 4 | use std::rc::Rc; 5 | 6 | use crate::{ 7 | join::{self, JoinInput}, 8 | map, 9 | relation::Relation, 10 | treefrog::{self, Leapers}, 11 | }; 12 | 13 | /// A type that can report on whether it has changed. 14 | pub(crate) trait VariableTrait { 15 | /// Reports whether the variable has changed since it was last asked. 16 | fn changed(&mut self) -> bool; 17 | 18 | /// Dumps statistics about the variable internals, for debug and profiling purposes. 19 | fn dump_stats(&self, round: u32, w: &mut dyn Write); 20 | } 21 | 22 | /// A monotonically increasing set of `Tuple`s. 23 | /// 24 | /// There are three stages in the lifecycle of a tuple: 25 | /// 26 | /// 1. A tuple is added to `self.to_add`, but is not yet visible externally. 27 | /// 2. Newly added tuples are then promoted to `self.recent` for one iteration. 28 | /// 3. After one iteration, recent tuples are moved to `self.stable` for posterity. 29 | /// 30 | /// Each time `self.changed()` is called, the `recent` relation is folded into `stable`, 31 | /// and the `to_add` relations are merged, potentially deduplicated against `stable`, and 32 | /// then made `recent`.
This way, across calls to `changed()` all added tuples are in 33 | /// `recent` at least once and eventually all are in `stable`. 34 | /// 35 | /// A `Variable` may optionally be instructed not to de-duplicate its tuples, for reasons 36 | /// of performance. Such a variable cannot be relied on to terminate iterative computation, 37 | /// and it is important that any cycle of derivations have at least one de-duplicating 38 | /// variable on it. 39 | pub struct Variable<Tuple: Ord> { 40 | /// Whether the variable's tuples should be kept distinct (de-duplicated). 41 | pub(crate) distinct: bool, 42 | /// A useful name for the variable. 43 | pub(crate) name: String, 44 | /// A list of relations whose union is the set of accepted tuples. 45 | pub stable: Rc<RefCell<Vec<Relation<Tuple>>>>, 46 | /// A list of recent tuples, still to be processed. 47 | pub recent: Rc<RefCell<Relation<Tuple>>>, 48 | /// A list of future tuples, to be introduced. 49 | pub(crate) to_add: Rc<RefCell<Vec<Relation<Tuple>>>>, 50 | } 51 | 52 | impl<Tuple: Ord> Variable<Tuple> { 53 | /// Returns the name used to create this variable. 54 | pub fn name(&self) -> &str { 55 | self.name.as_str() 56 | } 57 | 58 | /// Returns the total number of "stable" tuples in this variable. 59 | pub fn num_stable(&self) -> usize { 60 | self.stable.borrow().iter().map(|x| x.len()).sum() 61 | } 62 | 63 | /// Returns `true` if this variable contains only "stable" tuples. 64 | /// 65 | /// Calling `Iteration::changed()` on such `Variable`s will not change them unless new tuples 66 | /// are added. 67 | pub fn is_stable(&self) -> bool { 68 | self.recent.borrow().is_empty() && self.to_add.borrow().is_empty() 69 | } 70 | } 71 | 72 | // Operator implementations. 73 | impl<Tuple: Ord> Variable<Tuple> { 74 | /// Adds tuples that result from joining `input1` and `input2` -- 75 | /// each of the inputs must be a set of (Key, Value) tuples. Both 76 | /// `input1` and `input2` must have the same type of key (`K`) but 77 | /// they can have distinct value types (`V1` and `V2` 78 | /// respectively).
The `logic` closure will be invoked for each 79 | /// key that appears in both inputs; it is also given the two 80 | /// values, and from those it should construct the resulting 81 | /// value. 82 | /// 83 | /// Note that `input1` must be a variable, but `input2` can be a 84 | /// relation or a variable. Therefore, you cannot join two 85 | /// relations with this method. This is not because the result 86 | /// would be wrong, but because it would be inefficient: the 87 | /// result from such a join cannot vary across iterations (as 88 | /// relations are fixed), so you should prefer to invoke `insert` 89 | /// on a relation created by `Relation::from_join` instead. 90 | /// 91 | /// # Examples 92 | /// 93 | /// This example starts a collection with the pairs (x, x+1) and (x+1, x) for x in 0 .. 10. 94 | /// It then adds pairs (y, z) for which (x, y) and (x, z) are present. Because the initial 95 | /// pairs are symmetric, this should result in all pairs (x, y) for x and y in 0 .. 11. 96 | /// 97 | /// ``` 98 | /// use datafrog::{Iteration, Relation}; 99 | /// 100 | /// let mut iteration = Iteration::new(); 101 | /// let variable = iteration.variable::<(usize, usize)>("source"); 102 | /// variable.extend((0 .. 10).map(|x| (x, x + 1))); 103 | /// variable.extend((0 .. 10).map(|x| (x + 1, x))); 104 | /// 105 | /// while iteration.changed() { 106 | /// variable.from_join(&variable, &variable, |&key, &val1, &val2| (val1, val2)); 107 | /// } 108 | /// 109 | /// let result = variable.complete(); 110 | /// assert_eq!(result.len(), 121); 111 | /// ``` 112 | pub fn from_join<'me, K: Ord, V1: Ord, V2: Ord>( 113 | &self, 114 | input1: &'me Variable<(K, V1)>, 115 | input2: impl JoinInput<'me, (K, V2)>, 116 | logic: impl FnMut(&K, &V1, &V2) -> Tuple, 117 | ) { 118 | join::join_into(input1, input2, self, logic) 119 | } 120 | 121 | /// Same as [`Variable::from_join`], but lets you ignore some of the resulting tuples. 
122 | /// 123 | /// # Examples 124 | /// 125 | /// This is the same example from `Variable::from_join`, but it filters any tuples where the 126 | /// absolute difference is greater than 3. As a result, it generates all pairs (x, y) for x and 127 | /// y in 0 .. 11 such that |x - y| <= 3. 128 | /// 129 | /// ``` 130 | /// use datafrog::{Iteration, Relation}; 131 | /// 132 | /// let mut iteration = Iteration::new(); 133 | /// let variable = iteration.variable::<(isize, isize)>("source"); 134 | /// variable.extend((0 .. 10).map(|x| (x, x + 1))); 135 | /// variable.extend((0 .. 10).map(|x| (x + 1, x))); 136 | /// 137 | /// while iteration.changed() { 138 | /// variable.from_join_filtered(&variable, &variable, |&key, &val1, &val2| { 139 | /// ((val1 - val2).abs() <= 3).then(|| (val1, val2)) 140 | /// }); 141 | /// } 142 | /// 143 | /// let result = variable.complete(); 144 | /// 145 | /// let mut expected_cnt = 0; 146 | /// for i in 0i32..11 { 147 | /// for j in 0i32..11 { 148 | /// if (i - j).abs() <= 3 { 149 | /// expected_cnt += 1; 150 | /// } 151 | /// } 152 | /// } 153 | /// 154 | /// assert_eq!(result.len(), expected_cnt); 155 | /// ``` 156 | pub fn from_join_filtered<'me, K: Ord, V1: Ord, V2: Ord>( 157 | &self, 158 | input1: &'me Variable<(K, V1)>, 159 | input2: impl JoinInput<'me, (K, V2)>, 160 | logic: impl FnMut(&K, &V1, &V2) -> Option, 161 | ) { 162 | join::join_and_filter_into(input1, input2, self, logic) 163 | } 164 | 165 | /// Adds tuples from `input1` whose key is not present in `input2`. 166 | /// 167 | /// Note that `input1` must be a variable: if you have a relation 168 | /// instead, you can use `Relation::from_antijoin` and then 169 | /// `Variable::insert`. Note that the result will not vary during 170 | /// the iteration. 171 | /// 172 | /// # Examples 173 | /// 174 | /// This example starts a collection with the pairs (x, x+1) for x in 0 .. 10. It then 175 | /// adds any pairs (x+1,x) for which x is not a multiple of three. 
That excludes four 176 | /// pairs (for 0, 3, 6, and 9) which should leave us with 16 total pairs. 177 | /// 178 | /// ``` 179 | /// use datafrog::{Iteration, Relation}; 180 | /// 181 | /// let mut iteration = Iteration::new(); 182 | /// let variable = iteration.variable::<(usize, usize)>("source"); 183 | /// variable.extend((0 .. 10).map(|x| (x, x + 1))); 184 | /// 185 | /// let relation: Relation<_> = (0 .. 10).filter(|x| x % 3 == 0).collect(); 186 | /// 187 | /// while iteration.changed() { 188 | /// variable.from_antijoin(&variable, &relation, |&key, &val| (val, key)); 189 | /// } 190 | /// 191 | /// let result = variable.complete(); 192 | /// assert_eq!(result.len(), 16); 193 | /// ``` 194 | pub fn from_antijoin( 195 | &self, 196 | input1: &Variable<(K, V)>, 197 | input2: &Relation, 198 | logic: impl FnMut(&K, &V) -> Tuple, 199 | ) { 200 | self.insert(join::antijoin(&input1.recent.borrow(), input2, logic)) 201 | } 202 | 203 | /// Adds tuples that result from mapping `input`. 204 | /// 205 | /// # Examples 206 | /// 207 | /// This example starts a collection with the pairs (x, x) for x in 0 .. 10. It then 208 | /// repeatedly adds any pairs (x, z) for (x, y) in the collection, where z is the Collatz 209 | /// step for y: it is y/2 if y is even, and 3*y + 1 if y is odd. This produces all of the 210 | /// pairs (x, y) where x visits y as part of its Collatz journey. 211 | /// 212 | /// ``` 213 | /// use datafrog::{Iteration, Relation}; 214 | /// 215 | /// let mut iteration = Iteration::new(); 216 | /// let variable = iteration.variable::<(usize, usize)>("source"); 217 | /// variable.extend((0 .. 
10).map(|x| (x, x))); 218 | /// 219 | /// while iteration.changed() { 220 | /// variable.from_map(&variable, |&(key, val)| 221 | /// if val % 2 == 0 { 222 | /// (key, val/2) 223 | /// } 224 | /// else { 225 | /// (key, 3*val + 1) 226 | /// }); 227 | /// } 228 | /// 229 | /// let result = variable.complete(); 230 | /// assert_eq!(result.len(), 74); 231 | /// ``` 232 | pub fn from_map(&self, input: &Variable, logic: impl FnMut(&T2) -> Tuple) { 233 | map::map_into(input, self, logic) 234 | } 235 | 236 | /// Adds tuples that result from combining `source` with the 237 | /// relations given in `leapers`. This operation is very flexible 238 | /// and can be used to do a combination of joins and anti-joins. 239 | /// The main limitation is that the things being combined must 240 | /// consist of one dynamic variable (`source`) and then several 241 | /// fixed relations (`leapers`). 242 | /// 243 | /// The idea is as follows: 244 | /// 245 | /// - You will be inserting new tuples that result from joining (and anti-joining) 246 | /// some dynamic variable `source` of source tuples (`SourceTuple`) 247 | /// with some set of values (of type `Val`). 248 | /// - You provide these values by combining `source` with a set of leapers 249 | /// `leapers`, each of which is derived from a fixed relation. The `leapers` 250 | /// should be either a single leaper (of suitable type) or else a tuple of leapers. 251 | /// You can create a leaper in one of two ways: 252 | /// - Extension: In this case, you have a relation of type `(K, Val)` for some 253 | /// type `K`. You provide a closure that maps from `SourceTuple` to the key 254 | /// `K`. If you use `relation.extend_with`, then any `Val` values the 255 | /// relation provides will be added to the set of values; if you use 256 | /// `extend_anti`, then the `Val` values will be removed. 
257 | /// - Filtering: In this case, you have a relation of type `K` for some 258 | /// type `K` and you provide a closure that maps from `SourceTuple` to 259 | /// the key `K`. Filters don't provide values but they remove source 260 | /// tuples. 261 | /// - Finally, you get a callback `logic` that accepts each `(SourceTuple, Val)` 262 | /// that was successfully joined (and not filtered) and which maps to the 263 | /// type of this variable. 264 | pub fn from_leapjoin<'leap, SourceTuple: Ord, Val: Ord + 'leap>( 265 | &self, 266 | source: &Variable, 267 | leapers: impl Leapers<'leap, SourceTuple, Val>, 268 | logic: impl FnMut(&SourceTuple, &Val) -> Tuple, 269 | ) { 270 | self.insert(treefrog::leapjoin(&source.recent.borrow(), leapers, logic)); 271 | } 272 | } 273 | 274 | impl Clone for Variable { 275 | fn clone(&self) -> Self { 276 | Variable { 277 | distinct: self.distinct, 278 | name: self.name.clone(), 279 | stable: self.stable.clone(), 280 | recent: self.recent.clone(), 281 | to_add: self.to_add.clone(), 282 | } 283 | } 284 | } 285 | 286 | impl Variable { 287 | pub(crate) fn new(name: &str) -> Self { 288 | Variable { 289 | distinct: true, 290 | name: name.to_string(), 291 | stable: Rc::new(RefCell::new(Vec::new())), 292 | recent: Rc::new(RefCell::new(Vec::new().into())), 293 | to_add: Rc::new(RefCell::new(Vec::new())), 294 | } 295 | } 296 | 297 | /// Inserts a relation into the variable. 298 | /// 299 | /// This is most commonly used to load initial values into a variable. 300 | /// it is not obvious that it should be commonly used otherwise, but 301 | /// it should not be harmful. 302 | pub fn insert(&self, relation: Relation) { 303 | if !relation.is_empty() { 304 | self.to_add.borrow_mut().push(relation); 305 | } 306 | } 307 | 308 | /// Extend the variable with values from the iterator. 309 | /// 310 | /// This is most commonly used to load initial values into a variable. 
311 | /// it is not obvious that it should be commonly used otherwise, but 312 | /// it should not be harmful. 313 | pub fn extend(&self, iterator: impl IntoIterator) 314 | where 315 | Relation: FromIterator, 316 | { 317 | self.insert(iterator.into_iter().collect()); 318 | } 319 | 320 | /// Consumes the variable and returns a relation. 321 | /// 322 | /// This method removes the ability for the variable to develop, and 323 | /// flattens all internal tuples down to one relation. The method 324 | /// asserts that iteration has completed, in that `self.recent` and 325 | /// `self.to_add` should both be empty. 326 | pub fn complete(self) -> Relation { 327 | assert!(self.is_stable()); 328 | let mut result: Relation = Vec::new().into(); 329 | while let Some(batch) = self.stable.borrow_mut().pop() { 330 | result = result.merge(batch); 331 | } 332 | result 333 | } 334 | } 335 | 336 | impl VariableTrait for Variable { 337 | fn changed(&mut self) -> bool { 338 | // 1. Merge self.recent into self.stable. 339 | if !self.recent.borrow().is_empty() { 340 | let mut recent = 341 | ::std::mem::replace(&mut (*self.recent.borrow_mut()), Vec::new().into()); 342 | while self 343 | .stable 344 | .borrow() 345 | .last() 346 | .map(|x| x.len() <= 2 * recent.len()) 347 | == Some(true) 348 | { 349 | let last = self.stable.borrow_mut().pop().unwrap(); 350 | recent = recent.merge(last); 351 | } 352 | self.stable.borrow_mut().push(recent); 353 | } 354 | 355 | // 2. Move self.to_add into self.recent. 356 | let to_add = self.to_add.borrow_mut().pop(); 357 | if let Some(mut to_add) = to_add { 358 | while let Some(to_add_more) = self.to_add.borrow_mut().pop() { 359 | to_add = to_add.merge(to_add_more); 360 | } 361 | // 2b. Restrict `to_add` to tuples not in `self.stable`. 362 | if self.distinct { 363 | for batch in self.stable.borrow().iter() { 364 | let mut slice = &batch[..]; 365 | // Only gallop if the slice is relatively large. 
366 | if slice.len() > 4 * to_add.elements.len() { 367 | to_add.elements.retain(|x| { 368 | slice = join::gallop(slice, |y| y < x); 369 | slice.is_empty() || &slice[0] != x 370 | }); 371 | } else { 372 | to_add.elements.retain(|x| { 373 | while !slice.is_empty() && &slice[0] < x { 374 | slice = &slice[1..]; 375 | } 376 | slice.is_empty() || &slice[0] != x 377 | }); 378 | } 379 | } 380 | } 381 | *self.recent.borrow_mut() = to_add; 382 | } 383 | 384 | // let mut total = 0; 385 | // for tuple in self.stable.borrow().iter() { 386 | // total += tuple.len(); 387 | // } 388 | 389 | // println!("Variable\t{}\t{}\t{}", self.name, total, self.recent.borrow().len()); 390 | 391 | !self.recent.borrow().is_empty() 392 | } 393 | 394 | fn dump_stats(&self, round: u32, w: &mut dyn Write) { 395 | let mut stable_count = 0; 396 | for tuple in self.stable.borrow().iter() { 397 | stable_count += tuple.len(); 398 | } 399 | 400 | writeln!( 401 | w, 402 | "{:?},{},{},{}", 403 | self.name, 404 | round, 405 | stable_count, 406 | self.recent.borrow().len() 407 | ) 408 | .unwrap_or_else(|e| { 409 | panic!( 410 | "Couldn't write stats for variable {}, round {}: {}", 411 | self.name, round, e 412 | ) 413 | }); 414 | } 415 | } 416 | 417 | // impl Drop for Variable { 418 | // fn drop(&mut self) { 419 | // let mut total = 0; 420 | // for batch in self.stable.borrow().iter() { 421 | // total += batch.len(); 422 | // } 423 | // println!("FINAL: {:?}\t{:?}", self.name, total); 424 | // } 425 | // } 426 | -------------------------------------------------------------------------------- /triagebot.toml: -------------------------------------------------------------------------------- 1 | [assign] 2 | --------------------------------------------------------------------------------
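Note on `src/treefrog.rs`: the `ExtendWith` leaper shown earlier finds the run of tuples that share a key, proposes the values in that run, and intersects a sorted candidate list against it by only ever moving forward through the slice. The following is a minimal standalone sketch of that idea, not the crate's code. It assumes the relation is a slice sorted by `(key, value)` and uses the standard library's `partition_point` in place of the crate's `binary_search` and `gallop` helpers; the names `key_range`, `propose`, and `intersect` are hypothetical free functions chosen for illustration.

```rust
use std::ops::Range;

/// Half-open index range of the tuples in `sorted` whose key equals `key`.
/// Assumption: `sorted` is ordered by (key, value).
fn key_range<K: Ord, V>(sorted: &[(K, V)], key: &K) -> Range<usize> {
    // First index whose key is not less than `key`.
    let start = sorted.partition_point(|(k, _)| k < key);
    // First index after `start` whose key is greater than `key`.
    let end = start + sorted[start..].partition_point(|(k, _)| k <= key);
    start..end
}

/// Collects references to every value stored under `key`.
fn propose<'a, K: Ord, V>(sorted: &'a [(K, V)], key: &K) -> Vec<&'a V> {
    sorted[key_range(sorted, key)].iter().map(|(_, v)| v).collect()
}

/// Keeps only the candidates that also appear under `key` in `sorted`.
/// Assumption: `values` is sorted ascending, so `slice` never has to back up.
fn intersect<K: Ord, V: Ord>(sorted: &[(K, V)], key: &K, values: &mut Vec<&V>) {
    let mut slice = &sorted[key_range(sorted, key)];
    values.retain(|v| {
        // Skip every entry whose value is smaller than the candidate.
        let skip = slice.partition_point(|(_, x)| x < *v);
        slice = &slice[skip..];
        slice.first().map(|(_, x)| x) == Some(*v)
    });
}

fn main() {
    let edges = vec![(1, 10), (1, 20), (1, 30), (2, 20), (3, 5)];
    let other = vec![(7, 20), (7, 30), (7, 40)];

    // Propose from one relation, then restrict with the other.
    let mut values = propose(&edges, &1);
    assert_eq!(values, vec![&10, &20, &30]);

    intersect(&other, &7, &mut values);
    assert_eq!(values, vec![&20, &30]);
}
```

The anti variant (`ExtendAnti`) follows the same walk but keeps a candidate when the lookup fails, which in this sketch would amount to replacing `==` with `!=` in `intersect`.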