├── .github
│   └── workflows
│       └── main.yml
├── .gitignore
├── CODE_OF_CONDUCT.md
├── Cargo.toml
├── LICENSE-APACHE
├── LICENSE-MIT
├── README.md
├── RELEASES.md
├── examples
│   ├── borrow_check.rs
│   └── graspan1.rs
├── src
│   ├── iteration.rs
│   ├── join.rs
│   ├── lib.rs
│   ├── map.rs
│   ├── merge.rs
│   ├── relation.rs
│   ├── test.rs
│   ├── treefrog.rs
│   └── variable.rs
└── triagebot.toml
/.github/workflows/main.yml:
--------------------------------------------------------------------------------
1 |
2 | name: CI
3 |
4 | on:
5 |   push:
6 |     branches: [ master ]
7 |   pull_request:
8 |
9 | jobs:
10 |   test:
11 |     name: Run tests
12 |     runs-on: ubuntu-latest
13 |     continue-on-error: ${{ matrix.rust == 'nightly' }}
14 |     strategy:
15 |       matrix:
16 |         rust: [beta, nightly]
17 |     steps:
18 |       - uses: actions/checkout@v2
19 |         with:
20 |           fetch-depth: 1
21 |
22 |       - name: Install rust toolchain
23 |         uses: actions-rs/toolchain@v1
24 |         with:
25 |           toolchain: ${{ matrix.rust }}
26 |           profile: minimal
27 |           override: true
28 |
29 |       - name: Build datafrog
30 |         run: cargo build
31 |
32 |       - name: Execute tests
33 |         run: cargo test
34 |
35 |       - name: Check examples
36 |         run: cargo check --examples
37 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Generated by Cargo
2 | # will have compiled files and executables
3 | /target/
4 |
5 | # Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
6 | # More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
7 | Cargo.lock
8 |
9 | # These are backup files generated by rustfmt
10 | **/*.rs.bk
11 |
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | # The Rust Code of Conduct
2 |
3 | A version of this document [can be found online](https://www.rust-lang.org/conduct.html).
4 |
5 | ## Conduct
6 |
7 | **Contact**: [rust-mods@rust-lang.org](mailto:rust-mods@rust-lang.org)
8 |
9 | * We are committed to providing a friendly, safe and welcoming environment for all, regardless of level of experience, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, religion, nationality, or other similar characteristic.
10 | * On IRC, please avoid using overtly sexual nicknames or other nicknames that might detract from a friendly, safe and welcoming environment for all.
11 | * Please be kind and courteous. There's no need to be mean or rude.
12 | * Respect that people have differences of opinion and that every design or implementation choice carries a trade-off and numerous costs. There is seldom a right answer.
13 | * Please keep unstructured critique to a minimum. If you have solid ideas you want to experiment with, make a fork and see how it works.
14 | * We will exclude you from interaction if you insult, demean or harass anyone. That is not welcome behavior. We interpret the term "harassment" as including the definition in the Citizen Code of Conduct; if you have any lack of clarity about what might be included in that concept, please read their definition. In particular, we don't tolerate behavior that excludes people in socially marginalized groups.
15 | * Private harassment is also unacceptable. No matter who you are, if you feel you have been or are being harassed or made uncomfortable by a community member, please contact one of the channel ops or any of the [Rust moderation team][mod_team] immediately. Whether you're a regular contributor or a newcomer, we care about making this community a safe place for you and we've got your back.
16 | * Likewise any spamming, trolling, flaming, baiting or other attention-stealing behavior is not welcome.
17 |
18 | ## Moderation
19 |
20 |
21 | These are the policies for upholding our community's standards of conduct. If you feel that a thread needs moderation, please contact the [Rust moderation team][mod_team].
22 |
23 | 1. Remarks that violate the Rust standards of conduct, including hateful, hurtful, oppressive, or exclusionary remarks, are not allowed. (Cursing is allowed, but never targeting another user, and never in a hateful manner.)
24 | 2. Remarks that moderators find inappropriate, whether listed in the code of conduct or not, are also not allowed.
25 | 3. Moderators will first respond to such remarks with a warning.
26 | 4. If the warning is unheeded, the user will be "kicked," i.e., kicked out of the communication channel to cool off.
27 | 5. If the user comes back and continues to make trouble, they will be banned, i.e., indefinitely excluded.
28 | 6. Moderators may choose at their discretion to un-ban the user if it was a first offense and they offer the offended party a genuine apology.
29 | 7. If a moderator bans someone and you think it was unjustified, please take it up with that moderator, or with a different moderator, **in private**. Complaints about bans in-channel are not allowed.
30 | 8. Moderators are held to a higher standard than other community members. If a moderator creates an inappropriate situation, they should expect less leeway than others.
31 |
32 | In the Rust community we strive to go the extra step to look out for each other. Don't just aim to be technically unimpeachable, try to be your best self. In particular, avoid flirting with offensive or sensitive issues, particularly if they're off-topic; this all too often leads to unnecessary fights, hurt feelings, and damaged trust; worse, it can drive people away from the community entirely.
33 |
34 | And if someone takes issue with something you said or did, resist the urge to be defensive. Just stop doing what it was they complained about and apologize. Even if you feel you were misinterpreted or unfairly accused, chances are good there was something you could've communicated better — remember that it's your responsibility to make your fellow Rustaceans comfortable. Everyone wants to get along and we are all here first and foremost because we want to talk about cool technology. You will find that people will be eager to assume good intent and forgive as long as you earn their trust.
35 |
36 | The enforcement policies listed above apply to all official Rust venues; including official IRC channels (#rust, #rust-internals, #rust-tools, #rust-libs, #rustc, #rust-beginners, #rust-docs, #rust-community, #rust-lang, and #cargo); GitHub repositories under rust-lang, rust-lang-nursery, and rust-lang-deprecated; and all forums under rust-lang.org (users.rust-lang.org, internals.rust-lang.org). For other projects adopting the Rust Code of Conduct, please contact the maintainers of those projects for enforcement. If you wish to use this code of conduct for your own project, consider explicitly mentioning your moderation policy or making a copy with your own moderation policy so as to avoid confusion.
37 |
38 | *Adapted from the [Node.js Policy on Trolling](http://blog.izs.me/post/30036893703/policy-on-trolling) as well as the [Contributor Covenant v1.3.0](https://www.contributor-covenant.org/version/1/3/0/).*
39 |
40 | [mod_team]: https://www.rust-lang.org/team.html#Moderation-team
41 |
--------------------------------------------------------------------------------
/Cargo.toml:
--------------------------------------------------------------------------------
1 | [package]
2 | name = "datafrog"
3 | version = "2.0.1"
4 | authors = ["Frank McSherry ", "The Rust Project Developers", "Datafrog Developers"]
5 | license = "Apache-2.0/MIT"
6 | description = "Lightweight Datalog engine intended to be embedded in other Rust programs"
7 | readme = "README.md"
8 | keywords = ["datalog", "analysis"]
9 | repository = "https://github.com/rust-lang-nursery/datafrog"
10 | edition = "2018"
11 |
12 | [badges]
13 | is-it-maintained-issue-resolution = { repository = "https://github.com/rust-lang-nursery/datafrog" }
14 | is-it-maintained-open-issues = { repository = "https://github.com/rust-lang-nursery/datafrog" }
15 |
16 | [dev-dependencies]
17 | proptest = "0.8.7"
18 |
--------------------------------------------------------------------------------
/LICENSE-APACHE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "[]"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright [yyyy] [name of copyright owner]
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/LICENSE-MIT:
--------------------------------------------------------------------------------
1 | Permission is hereby granted, free of charge, to any
2 | person obtaining a copy of this software and associated
3 | documentation files (the "Software"), to deal in the
4 | Software without restriction, including without
5 | limitation the rights to use, copy, modify, merge,
6 | publish, distribute, sublicense, and/or sell copies of
7 | the Software, and to permit persons to whom the Software
8 | is furnished to do so, subject to the following
9 | conditions:
10 |
11 | The above copyright notice and this permission notice
12 | shall be included in all copies or substantial portions
13 | of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF
16 | ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
17 | TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
18 | PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
19 | SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
20 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
22 | IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
23 | DEALINGS IN THE SOFTWARE.
24 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # datafrog
2 |
3 | Datafrog is a lightweight Datalog engine intended to be embedded in other Rust programs.
4 |
5 | Datafrog has no runtime, and relies on you to build and repeatedly apply the update rules.
6 | It tries to help you do this correctly. As an example, here is how you might write a reachability
7 | query using Datafrog (minus the part where we populate the `nodes` and `edges` initial relations).
8 |
9 | ```rust
10 | extern crate datafrog;
11 | use datafrog::Iteration;
12 |
13 | fn main() {
14 | // Prepare initial values, ..
15 | let nodes: Vec<(u32,u32)> = vec![
16 | // ..
17 | ];
18 | let edges: Vec<(u32,u32)> = vec![
19 | // ..
20 | ];
21 |
22 | // Create a new iteration context, ..
23 | let mut iteration = Iteration::new();
24 |
25 | // .. some variables, ..
26 | let nodes_var = iteration.variable::<(u32,u32)>("nodes");
27 | let edges_var = iteration.variable::<(u32,u32)>("edges");
28 |
29 | // .. load them with some initial values, ..
30 | nodes_var.insert(nodes.into());
31 | edges_var.insert(edges.into());
32 |
33 | // .. and then start iterating rules!
34 | while iteration.changed() {
35 | // nodes(a,c) <- nodes(a,b), edges(b,c)
36 | nodes_var.from_join(&nodes_var, &edges_var, |_b, &a, &c| (c,a));
37 | }
38 |
39 | // extract the final results.
40 | let reachable: Vec<(u32,u32)> = nodes_var.complete();
41 | }
42 | ```
43 |
44 | If you'd like to read more about how it works, check out [this blog post](https://github.com/frankmcsherry/blog/blob/master/posts/2018-05-19.md).
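
To make the "build and repeatedly apply the update rules" idea concrete, here is a minimal, std-only sketch of the kind of fixed-point loop the example above drives by hand. It is not Datafrog's implementation: the `reachable` function, the `stable`/`recent` sets, and the nested-loop join are all assumptions chosen for brevity. It evaluates the same rule as the example, with `nodes` tuples keyed by their first field.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper (not part of Datafrog): evaluates
/// `nodes(c, a) <- nodes(b, a), edges(b, c)` until no new tuples appear.
fn reachable(nodes: &[(u32, u32)], edges: &[(u32, u32)]) -> Vec<(u32, u32)> {
    // Tuples that have already been joined against `edges`.
    let mut stable: BTreeSet<(u32, u32)> = BTreeSet::new();
    // Tuples discovered in the previous round and not yet joined.
    let mut recent: BTreeSet<(u32, u32)> = nodes.iter().copied().collect();

    while !recent.is_empty() {
        // Join only the recent tuples with the edges (naive nested loop).
        let mut derived = BTreeSet::new();
        for &(b, a) in &recent {
            for &(src, c) in edges {
                if src == b {
                    derived.insert((c, a));
                }
            }
        }
        // Promote this round's tuples, then keep only the new ones.
        stable.extend(recent.iter().copied());
        recent = derived.difference(&stable).copied().collect();
    }

    // A BTreeSet iterates in sorted order, so the result is sorted and deduplicated.
    stable.into_iter().collect()
}
```

For example, `reachable(&[(1, 1)], &[(1, 2), (2, 3), (3, 1)])` returns `[(1, 1), (2, 1), (3, 1)]`: the loop stops in the round where the only derived tuple, `(1, 1)`, is already known.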
45 |
46 | ## Authorship
47 |
48 | Datafrog was initially developed by [Frank McSherry][fmc] and was
49 | later transferred to the rust-lang-nursery organization. Thanks Frank!
50 |
51 | [fmc]: https://github.com/frankmcsherry
52 |
--------------------------------------------------------------------------------
/RELEASES.md:
--------------------------------------------------------------------------------
1 | # 2.0.1
2 |
3 | - Work around a rustdoc ICE (#24)
4 |
5 | # 2.0.0
6 |
7 | - Breaking changes:
8 | - leapjoin now takes a tuple of leapers, and not a `&mut` slice:
9 | - `from_leapjoin(&input, &mut [&mut foo.extend_with(...), ..], ..)` becomes
10 | `from_leapjoin(&input, (foo.extend_with(...), ..), ..)`
11 | - if there is only one leaper, no tuple is needed
12 | - `Relation::from` now requires a vector, not an iterator; use
13 | `Relation::from_iter` instead
14 | - Changed the API to permit using `Relation` and `Variable` more interchangeably,
15 | and added a number of operations to construct relations directly, like `Relation::from_join`
16 | - Extended leapfrog triejoin with new operations (`PrefixFilter` and `ValueFilter`)
17 |
18 | # 1.0.0
19 |
20 | - Added leapfrog triejoin (#11).
21 | - Have badges and repo links now!
22 | - Minor performance improvements (#13).
23 |
24 | # 0.1.0
25 |
26 | - Initial release.
27 |
--------------------------------------------------------------------------------
/examples/borrow_check.rs:
--------------------------------------------------------------------------------
1 | extern crate datafrog;
2 | use datafrog::Iteration;
3 |
4 | type Region = u32;
5 | type Borrow = u32;
6 | type Point = u32;
7 |
8 | fn main() {
9 | let subset = {
10 | // Create a new iteration context, ...
11 | let mut iteration1 = Iteration::new();
12 |
13 | // .. some variables, ..
14 | let subset = iteration1.variable::<(Region, Region, Point)>("subset");
15 |
16 | // different indices for `subset`.
17 | let subset_r1p = iteration1.variable::<((Region, Point), Region)>("subset_r1p");
18 | let subset_r2p = iteration1.variable::<((Region, Point), Region)>("subset_r2p");
19 | let subset_p = iteration1.variable::<(Point, (Region, Region))>("subset_p");
20 |
21 | // temporaries as we perform a multi-way join.
22 | let subset_1 = iteration1.variable::<((Region, Point), Region)>("subset_1");
23 | let subset_2 = iteration1.variable::<((Region, Point), Region)>("subset_2");
24 |
25 | let region_live_at = iteration1.variable::<((Region, Point), ())>("region_live_at");
26 | let cfg_edge_p = iteration1.variable::<(Point, Point)>("cfg_edge_p");
27 |
28 | // load initial facts.
29 | subset.insert(Vec::new().into());
30 | region_live_at.insert(Vec::new().into());
31 | cfg_edge_p.insert(Vec::new().into());
32 |
33 | // .. and then start iterating rules!
34 | while iteration1.changed() {
35 | // remap fields to re-index by keys.
36 | subset_r1p.from_map(&subset, |&(r1, r2, p)| ((r1, p), r2));
37 | subset_r2p.from_map(&subset, |&(r1, r2, p)| ((r2, p), r1));
38 | subset_p.from_map(&subset, |&(r1, r2, p)| (p, (r1, r2)));
39 |
40 | // R0: subset(R1, R2, P) :- outlives(R1, R2, P).
41 | // Already loaded; outlives is static.
42 |
43 | // R1: subset(R1, R3, P) :-
44 | // subset(R1, R2, P),
45 | // subset(R2, R3, P).
46 | subset.from_join(&subset_r2p, &subset_r1p, |&(_r2, p), &r1, &r3| (r1, r3, p));
47 |
48 | // R2: subset(R1, R2, Q) :-
49 | // subset(R1, R2, P),
50 | // cfg_edge(P, Q),
51 | // region_live_at(R1, Q),
52 | // region_live_at(R2, Q).
53 |
54 | subset_1.from_join(&subset_p, &cfg_edge_p, |&_p, &(r1, r2), &q| ((r1, q), r2));
55 | subset_2.from_join(&subset_1, &region_live_at, |&(r1, q), &r2, &()| {
56 | ((r2, q), r1)
57 | });
58 | subset.from_join(&subset_2, &region_live_at, |&(r2, q), &r1, &()| (r1, r2, q));
59 | }
60 |
61 | subset_r1p.complete()
62 | };
63 |
64 | let _requires = {
65 | // Create a new iteration context, ...
66 | let mut iteration2 = Iteration::new();
67 |
68 | // .. some variables, ..
69 | let requires = iteration2.variable::<(Region, Borrow, Point)>("requires");
70 | requires.insert(Vec::new().into());
71 |
72 | let requires_rp = iteration2.variable::<((Region, Point), Borrow)>("requires_rp");
73 | let requires_bp = iteration2.variable::<((Borrow, Point), Region)>("requires_bp");
74 |
75 | let requires_1 = iteration2.variable::<(Point, (Borrow, Region))>("requires_1");
76 | let requires_2 = iteration2.variable::<((Region, Point), Borrow)>("requires_2");
77 |
78 | let subset_r1p = iteration2.variable::<((Region, Point), Region)>("subset_r1p");
79 | subset_r1p.insert(subset);
80 |
81 | let killed = Vec::new().into();
82 | let region_live_at = iteration2.variable::<((Region, Point), ())>("region_live_at");
83 | let cfg_edge_p = iteration2.variable::<(Point, Point)>("cfg_edge_p");
84 |
85 | // .. and then start iterating rules!
86 | while iteration2.changed() {
87 | requires_rp.from_map(&requires, |&(r, b, p)| ((r, p), b));
88 | requires_bp.from_map(&requires, |&(r, b, p)| ((b, p), r));
89 |
90 | // requires(R, B, P) :- borrow_region(R, B, P).
91 | // Already loaded; borrow_region is static.
92 |
93 | // requires(R2, B, P) :-
94 | // requires(R1, B, P),
95 | // subset(R1, R2, P).
96 | requires.from_join(&requires_rp, &subset_r1p, |&(_r1, p), &b, &r2| (r2, b, p));
97 |
98 | // requires(R, B, Q) :-
99 | // requires(R, B, P),
100 | // !killed(B, P),
101 | // cfg_edge(P, Q),
102 | // (region_live_at(R, Q); universal_region(R)).
103 |
104 | requires_1.from_antijoin(&requires_bp, &killed, |&(b, p), &r| (p, (b, r)));
105 | requires_2.from_join(&requires_1, &cfg_edge_p, |&_p, &(b, r), &q| ((r, q), b));
106 | requires.from_join(&requires_2, &region_live_at, |&(r, q), &b, &()| (r, b, q));
107 | }
108 |
109 | requires.complete()
110 | };
111 |
112 | // borrow_live_at(B, P) :- requires(R, B, P), region_live_at(R, P)
113 |
114 | // borrow_live_at(B, P) :- requires(R, B, P), universal_region(R).
115 | }
116 |
--------------------------------------------------------------------------------
/examples/graspan1.rs:
--------------------------------------------------------------------------------
1 | extern crate datafrog;
2 | use datafrog::Iteration;
3 |
4 | fn main() {
5 | let timer = ::std::time::Instant::now();
6 |
7 | // Make space for input data.
8 | let mut nodes = Vec::new();
9 | let mut edges = Vec::new();
10 |
11 | // Read input data from a handy file.
12 | use std::fs::File;
13 | use std::io::{BufRead, BufReader};
14 |
15 | let filename = std::env::args().nth(1).unwrap();
16 | let file = BufReader::new(File::open(filename).unwrap());
17 | for readline in file.lines() {
18 | let line = readline.expect("read error");
19 | if !line.is_empty() && !line.starts_with('#') {
20 | let mut elts = line[..].split_whitespace();
21 | let src: u32 = elts.next().unwrap().parse().expect("malformed src");
22 | let dst: u32 = elts.next().unwrap().parse().expect("malformed dst");
23 | let typ: &str = elts.next().unwrap();
24 | match typ {
25 | "n" => {
26 | nodes.push((dst, src));
27 | }
28 | "e" => {
29 | edges.push((src, dst));
30 | }
31 | unk => panic!("unknown type: {}", unk),
32 | }
33 | }
34 | }
35 |
36 | println!("{:?}\tData loaded", timer.elapsed());
37 |
38 | // Create a new iteration context, ...
39 | let mut iteration = Iteration::new();
40 |
41 | // .. some variables, ..
42 | let variable1 = iteration.variable::<(u32, u32)>("nodes");
43 | let variable2 = iteration.variable::<(u32, u32)>("edges");
44 |
45 | // .. load them with some initial values, ..
46 | variable1.insert(nodes.into());
47 | variable2.insert(edges.into());
48 |
49 | // .. and then start iterating rules!
50 | while iteration.changed() {
51 | // N(a,c) <- N(a,b), E(b,c)
52 | variable1.from_join(&variable1, &variable2, |_b, &a, &c| (c, a));
53 | }
54 |
55 | let reachable = variable1.complete();
56 |
57 | println!(
58 | "{:?}\tComputation complete (nodes_final: {})",
59 | timer.elapsed(),
60 | reachable.len()
61 | );
62 | }
63 |
--------------------------------------------------------------------------------
/src/iteration.rs:
--------------------------------------------------------------------------------
1 | use std::io::Write;
2 |
3 | use crate::variable::{Variable, VariableTrait};
4 |
5 | /// An iterative context for recursive evaluation.
6 | ///
7 | /// An `Iteration` tracks monotonic variables, and monitors their progress.
8 | /// It can inform the user if they have ceased changing, at which point the
9 | /// computation should be done.
10 | #[derive(Default)]
11 | pub struct Iteration {
12 | variables: Vec<Box<dyn VariableTrait>>,
13 | round: u32,
14 | debug_stats: Option<Box<dyn Write>>,
15 | }
16 |
17 | impl Iteration {
18 | /// Create a new iterative context.
19 | pub fn new() -> Self {
20 | Self::default()
21 | }
22 | /// Reports whether any of the monitored variables have changed since
23 | /// the most recent call.
24 | pub fn changed(&mut self) -> bool {
25 | self.round += 1;
26 |
27 | let mut result = false;
28 | for variable in self.variables.iter_mut() {
29 | if variable.changed() {
30 | result = true;
31 | }
32 |
33 | if let Some(ref mut stats_writer) = self.debug_stats {
34 | variable.dump_stats(self.round, stats_writer);
35 | }
36 | }
37 | result
38 | }
39 | /// Creates a new named variable associated with the iterative context.
40 | pub fn variable<Tuple: Ord + 'static>(&mut self, name: &str) -> Variable<Tuple> {
41 | let variable = Variable::new(name);
42 | self.variables.push(Box::new(variable.clone()));
43 | variable
44 | }
45 | /// Creates a new named variable associated with the iterative context.
46 | ///
47 | /// This variable will not be maintained distinctly, and may advertise tuples as
48 | /// recent multiple times (perhaps unboundedly many times).
49 | pub fn variable_indistinct<Tuple: Ord + 'static>(&mut self, name: &str) -> Variable<Tuple> {
50 | let mut variable = Variable::new(name);
51 | variable.distinct = false;
52 | self.variables.push(Box::new(variable.clone()));
53 | variable
54 | }
55 |
56 | /// Set up this Iteration to write debug statistics about each variable,
57 | /// for each round of the computation.
58 | pub fn record_stats_to(&mut self, mut w: Box<dyn Write>) {
59 | // print column names header
60 | writeln!(w, "Variable,Round,Stable count,Recent count")
61 | .expect("Couldn't write debug stats CSV header");
62 |
63 | self.debug_stats = Some(w);
64 | }
65 | }
66 |
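The `while iteration.changed()` pattern can be illustrated without the crate. The block below is a minimal standalone sketch of semi-naive evaluation, assuming a simplified model in which `stable` and `recent` are plain `BTreeSet`s; the function name `reachable` and this two-set layout are illustrative assumptions, not datafrog's actual internals.

```rust
use std::collections::BTreeSet;

/// Computes the transitive closure of `edges` by semi-naive evaluation.
///
/// `stable` holds tuples that have already been joined; `recent` holds the
/// tuples discovered in the previous round. Only `recent` is joined each round.
/// (Hypothetical helper for illustration; not part of datafrog.)
fn reachable(edges: &[(u32, u32)]) -> BTreeSet<(u32, u32)> {
    let mut stable: BTreeSet<(u32, u32)> = BTreeSet::new();
    let mut recent: BTreeSet<(u32, u32)> = edges.iter().copied().collect();

    // Loop until a round yields no new tuples, the analogue of
    // `changed()` returning false.
    while !recent.is_empty() {
        let mut to_add = BTreeSet::new();
        // reachable(a, c) <- reachable(a, b), edge(b, c), using only recent tuples.
        for &(a, b) in &recent {
            for &(b2, c) in edges {
                if b == b2 {
                    to_add.insert((a, c));
                }
            }
        }
        // Promote `recent` to `stable`, then keep only the genuinely new
        // tuples as the next round's `recent`.
        stable.extend(recent.iter().copied());
        recent = to_add.difference(&stable).copied().collect();
    }
    stable
}
```

Because every round either adds a new tuple or ends the loop, the computation terminates once the set stops growing, which is the condition the `Iteration` docs describe as the variables having "ceased changing".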
--------------------------------------------------------------------------------
/src/join.rs:
--------------------------------------------------------------------------------
1 | //! Join functionality.
2 |
3 | use super::{Relation, Variable};
4 | use std::cell::Ref;
5 | use std::ops::Deref;
6 |
7 | /// Implements `join`. Note that `input1` must be a variable, but
8 | /// `input2` can be either a variable or a relation. This is necessary
9 | /// because relations have no "recent" tuples, so the fn would be a
10 | /// guaranteed no-op if both arguments were relations. See also
11 | /// `join_into_relation`.
12 | pub(crate) fn join_into<'me, Key: Ord, Val1: Ord, Val2: Ord, Result: Ord>(
13 | input1: &Variable<(Key, Val1)>,
14 | input2: impl JoinInput<'me, (Key, Val2)>,
15 | output: &Variable<Result>,
16 | mut logic: impl FnMut(&Key, &Val1, &Val2) -> Result,
17 | ) {
18 | let mut results = Vec::new();
19 | let push_result = |k: &Key, v1: &Val1, v2: &Val2| results.push(logic(k, v1, v2));
20 |
21 | join_delta(input1, input2, push_result);
22 |
23 | output.insert(Relation::from_vec(results));
24 | }
25 |
26 | pub(crate) fn join_and_filter_into<'me, Key: Ord, Val1: Ord, Val2: Ord, Result: Ord>(
27 | input1: &Variable<(Key, Val1)>,
28 | input2: impl JoinInput<'me, (Key, Val2)>,
29 | output: &Variable<Result>,
30 | mut logic: impl FnMut(&Key, &Val1, &Val2) -> Option<Result>,
31 | ) {
32 | let mut results = Vec::new();
33 | let push_result = |k: &Key, v1: &Val1, v2: &Val2| {
34 | if let Some(result) = logic(k, v1, v2) {
35 | results.push(result);
36 | }
37 | };
38 |
39 | join_delta(input1, input2, push_result);
40 |
41 | output.insert(Relation::from_vec(results));
42 | }
43 |
44 | /// Joins the `recent` tuples of each input with the `stable` tuples of the other, then the
45 | /// `recent` tuples of *both* inputs.
46 | fn join_delta<'me, Key: Ord, Val1: Ord, Val2: Ord>(
47 | input1: &Variable<(Key, Val1)>,
48 | input2: impl JoinInput<'me, (Key, Val2)>,
49 | mut result: impl FnMut(&Key, &Val1, &Val2),
50 | ) {
51 | let recent1 = input1.recent();
52 | let recent2 = input2.recent();
53 |
54 | input2.for_each_stable_set(|batch2| {
55 | join_helper(&recent1, &batch2, &mut result);
56 | });
57 |
58 | input1.for_each_stable_set(|batch1| {
59 | join_helper(&batch1, &recent2, &mut result);
60 | });
61 |
62 | join_helper(&recent1, &recent2, &mut result);
63 | }
64 |
65 | /// Join, but for two relations.
66 | pub(crate) fn join_into_relation<'me, Key: Ord, Val1: Ord, Val2: Ord, Result: Ord>(
67 | input1: &Relation<(Key, Val1)>,
68 | input2: &Relation<(Key, Val2)>,
69 | mut logic: impl FnMut(&Key, &Val1, &Val2) -> Result,
70 | ) -> Relation<Result> {
71 | let mut results = Vec::new();
72 |
73 | join_helper(&input1.elements, &input2.elements, |k, v1, v2| {
74 | results.push(logic(k, v1, v2));
75 | });
76 |
77 | Relation::from_vec(results)
78 | }
79 |
80 | /// Returns the tuples of `input1` whose key is not present in `input2`, transformed by `logic`.
81 | pub(crate) fn antijoin<Key: Ord, Val: Ord, Result: Ord>(
82 | input1: &Relation<(Key, Val)>,
83 | input2: &Relation<Key>,
84 | mut logic: impl FnMut(&Key, &Val) -> Result,
85 | ) -> Relation<Result> {
86 | let mut tuples2 = &input2[..];
87 |
88 | let results = input1
89 | .elements
90 | .iter()
91 | .filter(|(ref key, _)| {
92 | tuples2 = gallop(tuples2, |k| k < key);
93 | tuples2.first() != Some(key)
94 | })
95 | .map(|(ref key, ref val)| logic(key, val))
96 | .collect::<Vec<_>>();
97 |
98 | Relation::from_vec(results)
99 | }
100 |
101 | fn join_helper<K: Ord, V1, V2>(
102 | mut slice1: &[(K, V1)],
103 | mut slice2: &[(K, V2)],
104 | mut result: impl FnMut(&K, &V1, &V2),
105 | ) {
106 | while !slice1.is_empty() && !slice2.is_empty() {
107 | use std::cmp::Ordering;
108 |
109 | // If the keys match produce tuples, else advance the smaller key until they might.
110 | match slice1[0].0.cmp(&slice2[0].0) {
111 | Ordering::Less => {
112 | slice1 = gallop(slice1, |x| x.0 < slice2[0].0);
113 | }
114 | Ordering::Equal => {
115 | // Determine the number of matching keys in each slice.
116 | let count1 = slice1.iter().take_while(|x| x.0 == slice1[0].0).count();
117 | let count2 = slice2.iter().take_while(|x| x.0 == slice2[0].0).count();
118 |
119 | // Produce results from the cross-product of matches.
120 | for index1 in 0..count1 {
121 | for s2 in slice2[..count2].iter() {
122 | result(&slice1[0].0, &slice1[index1].1, &s2.1);
123 | }
124 | }
125 |
126 | // Advance slices past this key.
127 | slice1 = &slice1[count1..];
128 | slice2 = &slice2[count2..];
129 | }
130 | Ordering::Greater => {
131 | slice2 = gallop(slice2, |x| x.0 < slice1[0].0);
132 | }
133 | }
134 | }
135 | }
136 |
137 | pub(crate) fn gallop<T>(mut slice: &[T], mut cmp: impl FnMut(&T) -> bool) -> &[T] {
138 | // if empty slice, or already >= element, return
139 | if !slice.is_empty() && cmp(&slice[0]) {
140 | let mut step = 1;
141 | while step < slice.len() && cmp(&slice[step]) {
142 | slice = &slice[step..];
143 | step <<= 1;
144 | }
145 |
146 | step >>= 1;
147 | while step > 0 {
148 | if step < slice.len() && cmp(&slice[step]) {
149 | slice = &slice[step..];
150 | }
151 | step >>= 1;
152 | }
153 |
154 | slice = &slice[1..]; // advance one, as we always stayed < value
155 | }
156 |
157 | slice
158 | }
159 |
160 | /// An input that can be used with `from_join`; either a `Variable` or a `Relation`.
161 | pub trait JoinInput<'me, Tuple: Ord>: Copy {
162 | /// If we are on iteration N of the loop, these are the tuples
163 | /// added on iteration N-1. (For a `Relation`, this is always an
164 | /// empty slice.)
165 | type RecentTuples: Deref<Target = [Tuple]>;
166 |
167 | /// Get the set of recent tuples.
168 | fn recent(self) -> Self::RecentTuples;
169 |
170 | /// Call a function for each set of stable tuples.
171 | fn for_each_stable_set(self, f: impl FnMut(&[Tuple]));
172 | }
173 |
174 | impl<'me, Tuple: Ord> JoinInput<'me, Tuple> for &'me Variable<Tuple> {
175 | type RecentTuples = Ref<'me, [Tuple]>;
176 |
177 | fn recent(self) -> Self::RecentTuples {
178 | Ref::map(self.recent.borrow(), |r| &r.elements[..])
179 | }
180 |
181 | fn for_each_stable_set(self, mut f: impl FnMut(&[Tuple])) {
182 | for stable in self.stable.borrow().iter() {
183 | f(stable)
184 | }
185 | }
186 | }
187 |
188 | impl<'me, Tuple: Ord> JoinInput<'me, Tuple> for &'me Relation<Tuple> {
189 | type RecentTuples = &'me [Tuple];
190 |
191 | fn recent(self) -> Self::RecentTuples {
192 | &[]
193 | }
194 |
195 | fn for_each_stable_set(self, mut f: impl FnMut(&[Tuple])) {
196 | f(&self.elements)
197 | }
198 | }
199 |
200 | impl<'me, Tuple: Ord> JoinInput<'me, (Tuple, ())> for &'me Relation<Tuple> {
201 | type RecentTuples = &'me [(Tuple, ())];
202 |
203 | fn recent(self) -> Self::RecentTuples {
204 | &[]
205 | }
206 |
207 | fn for_each_stable_set(self, mut f: impl FnMut(&[(Tuple, ())])) {
208 | use std::mem;
209 | assert_eq!(mem::size_of::<(Tuple, ())>(), mem::size_of::<Tuple>());
210 | assert_eq!(mem::align_of::<(Tuple, ())>(), mem::align_of::<Tuple>());
211 |
212 | // SAFETY: https://rust-lang.github.io/unsafe-code-guidelines/layout/structs-and-tuples.html#structs-with-1-zst-fields
213 | // guarantees that `T` is layout compatible with `(T, ())`, since `()` is a 1-ZST. We use
214 | // `slice::from_raw_parts` because the layout compatibility guarantee does not extend to
215 | // containers like `&[T]`.
216 | let elements: &'me [Tuple] = self.elements.as_slice();
217 | let len = elements.len();
218 |
219 | let elements: &'me [(Tuple, ())] =
220 | unsafe { std::slice::from_raw_parts(elements.as_ptr() as *const _, len) };
221 |
222 | f(elements)
223 | }
224 | }
225 |
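The interplay between `gallop` and `join_helper` is easier to follow on concrete data. The block below is a standalone sketch: `gallop` repeats the logic shown in this file, while `merge_join` is a hypothetical, simplified stand-in for `join_helper` that is fixed to `(u32, char)` tuples and returns a `Vec` instead of invoking a callback.

```rust
/// Skips the prefix of the sorted `slice` for which `less` holds, using
/// exponentially growing steps followed by shrinking steps.
fn gallop<T>(mut slice: &[T], mut less: impl FnMut(&T) -> bool) -> &[T] {
    if !slice.is_empty() && less(&slice[0]) {
        let mut step = 1;
        // Double the step while the probed element still satisfies `less`.
        while step < slice.len() && less(&slice[step]) {
            slice = &slice[step..];
            step <<= 1;
        }
        // Halve the step to close in on the last element satisfying `less`.
        step >>= 1;
        while step > 0 {
            if step < slice.len() && less(&slice[step]) {
                slice = &slice[step..];
            }
            step >>= 1;
        }
        // `slice[0]` still satisfies `less`, so advance past it.
        slice = &slice[1..];
    }
    slice
}

/// Joins two key-sorted slices on their first field (illustrative helper,
/// not part of datafrog's API).
fn merge_join(left: &[(u32, char)], right: &[(u32, char)]) -> Vec<(u32, char, char)> {
    let (mut l, mut r) = (left, right);
    let mut out = Vec::new();
    while !l.is_empty() && !r.is_empty() {
        let (lk, rk) = (l[0].0, r[0].0);
        if lk < rk {
            l = gallop(l, |x| x.0 < rk);
        } else if lk > rk {
            r = gallop(r, |x| x.0 < lk);
        } else {
            // Emit the cross-product of the runs that share this key.
            let n1 = l.iter().take_while(|x| x.0 == lk).count();
            let n2 = r.iter().take_while(|x| x.0 == lk).count();
            for a in &l[..n1] {
                for b in &r[..n2] {
                    out.push((lk, a.1, b.1));
                }
            }
            l = &l[n1..];
            r = &r[n2..];
        }
    }
    out
}
```

Galloping pays off when one input has long runs of keys that are absent from the other, since those runs are skipped in a logarithmic number of comparisons rather than one element at a time.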
--------------------------------------------------------------------------------
/src/lib.rs:
--------------------------------------------------------------------------------
1 | //! A lightweight Datalog engine in Rust
2 | //!
3 | //! The intended design is that one has static `Relation` types that are sets
4 | //! of tuples, and `Variable` types that represent monotonically increasing
5 | //! sets of tuples.
6 | //!
7 | //! The types are mostly wrappers around `Vec` indicating sorted-ness,
8 | //! and the intent is that this code can be dropped in the middle of an otherwise
9 | //! normal Rust program, run to completion, and then the results extracted as
10 | //! vectors again.
11 |
12 | #![forbid(missing_docs)]
13 |
14 | mod iteration;
15 | mod join;
16 | mod map;
17 | mod merge;
18 | mod relation;
19 | mod test;
20 | mod treefrog;
21 | mod variable;
22 |
23 | pub use crate::{
24 | iteration::Iteration,
25 | join::JoinInput,
26 | relation::Relation,
27 | treefrog::{
28 | extend_anti::ExtendAnti,
29 | extend_with::ExtendWith,
30 | filter_anti::FilterAnti,
31 | filter_with::FilterWith,
32 | filters::{passthrough, PrefixFilter, ValueFilter},
33 | Leaper, Leapers, RelationLeaper,
34 | },
35 | variable::Variable,
36 | };
37 |
--------------------------------------------------------------------------------
/src/map.rs:
--------------------------------------------------------------------------------
1 | //! Map functionality.
2 |
3 | use super::{Relation, Variable};
4 |
5 | pub(crate) fn map_into<T1: Ord, T2: Ord>(
6 | input: &Variable<T1>,
7 | output: &Variable<T2>,
8 | logic: impl FnMut(&T1) -> T2,
9 | ) {
10 | let results: Vec<T2> = input.recent.borrow().iter().map(logic).collect();
11 |
12 | output.insert(Relation::from_vec(results));
13 | }
14 |
--------------------------------------------------------------------------------
/src/merge.rs:
--------------------------------------------------------------------------------
1 | //! Subroutines for merging sorted lists efficiently.
2 |
3 | use std::cmp::Ordering;
4 |
5 | /// Merges two sorted lists into a single sorted list, ignoring duplicates.
6 | pub fn merge_unique<T: Ord>(mut a: Vec<T>, mut b: Vec<T>) -> Vec<T> {
7 | // If one of the lists is zero-length, we don't need to do any work.
8 | if a.is_empty() {
9 | return b;
10 | }
11 | if b.is_empty() {
12 | return a;
13 | }
14 |
15 | // Fast path for when all the new elements are after the existing ones.
16 | //
17 | // Cannot panic because we check for empty inputs above.
18 | if *a.last().unwrap() < b[0] {
19 | a.append(&mut b);
20 | return a;
21 | }
22 | if *b.last().unwrap() < a[0] {
23 | b.append(&mut a);
24 | return b;
25 | }
26 |
27 | // Ensure that `out` always has sufficient capacity.
28 | //
29 | // SAFETY: The calls to `push_unchecked` below are safe because of this.
30 | let mut out = Vec::with_capacity(a.len() + b.len());
31 |
32 | let mut a = a.into_iter();
33 | let mut b = b.into_iter();
34 |
35 | // While both inputs have elements remaining, copy the lesser element to the output vector.
36 | while a.len() != 0 && b.len() != 0 {
37 | // SAFETY: The following calls to `get_unchecked` and `next_unchecked` are safe because we
38 | // ensure that `a.len() > 0` and `b.len() > 0` inside the loop.
39 | //
40 | // I was hoping to avoid using "unchecked" operations, but it seems the bounds checks
41 | // don't get optimized away. Using `ExactSizeIterator::is_empty` instead of checking `len`
42 | // seemed to help, but that method is unstable.
43 |
44 | let a_elem = unsafe { a.as_slice().get_unchecked(0) };
45 | let b_elem = unsafe { b.as_slice().get_unchecked(0) };
46 | match a_elem.cmp(b_elem) {
47 | Ordering::Less => unsafe { push_unchecked(&mut out, next_unchecked(&mut a)) },
48 | Ordering::Greater => unsafe { push_unchecked(&mut out, next_unchecked(&mut b)) },
49 | Ordering::Equal => unsafe {
50 | push_unchecked(&mut out, next_unchecked(&mut a));
51 | std::mem::drop(next_unchecked(&mut b));
52 | },
53 | }
54 | }
55 |
56 | // Once either `a` or `b` runs out of elements, copy all remaining elements in the other one
57 | // directly to the back of the output list.
58 | //
59 | // This branch is free because we have to check `a.is_empty()` above anyways.
60 | //
61 | // Calling `push_unchecked` in a loop was slightly faster than `out.extend(...)`
62 | // despite the fact that `std::vec::IntoIter` implements `TrustedLen`.
63 | if a.len() != 0 {
64 | for elem in a {
65 | unsafe {
66 | push_unchecked(&mut out, elem);
67 | }
68 | }
69 | } else {
70 | for elem in b {
71 | unsafe {
72 | push_unchecked(&mut out, elem);
73 | }
74 | }
75 | }
76 |
77 | out
78 | }
79 |
80 | /// Pushes `value` to `vec` without checking that the vector has sufficient capacity.
81 | ///
82 | /// If `vec.len() == vec.capacity()`, calling this function is UB.
83 | unsafe fn push_unchecked<T>(vec: &mut Vec<T>, value: T) {
84 | let end = vec.as_mut_ptr().add(vec.len());
85 | std::ptr::write(end, value);
86 | vec.set_len(vec.len() + 1);
87 | }
88 |
89 | /// Equivalent to `iter.next().unwrap()` that is UB to call when `iter` is empty.
90 | unsafe fn next_unchecked<T>(iter: &mut std::vec::IntoIter<T>) -> T {
91 | match iter.next() {
92 | Some(x) => x,
93 | None => std::hint::unreachable_unchecked(),
94 | }
95 | }
96 |
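For comparison, the same merge can be written without `unsafe`. The block below is a hedged sketch of a safe equivalent built on `Peekable`; the name `merge_unique_safe` is hypothetical, and no claim is made about its speed relative to the unchecked version above.

```rust
use std::cmp::Ordering;

/// Merges two sorted, duplicate-free vectors into one sorted vector,
/// keeping a single copy of any element present in both inputs.
/// (Illustrative safe variant; not part of datafrog.)
fn merge_unique_safe<T: Ord>(a: Vec<T>, b: Vec<T>) -> Vec<T> {
    let mut out = Vec::with_capacity(a.len() + b.len());
    let mut a = a.into_iter().peekable();
    let mut b = b.into_iter().peekable();

    loop {
        // Compare the two heads; stop as soon as either input is exhausted.
        let order = match (a.peek(), b.peek()) {
            (Some(x), Some(y)) => x.cmp(y),
            _ => break,
        };
        match order {
            Ordering::Less => out.extend(a.next()),
            Ordering::Greater => out.extend(b.next()),
            Ordering::Equal => {
                out.extend(a.next());
                b.next(); // drop the duplicate from `b`
            }
        }
    }

    // At most one of these still holds elements; append the remainder.
    out.extend(a);
    out.extend(b);
    out
}
```

The result matches `merge_unique` only when both inputs are already sorted and deduplicated, which is the invariant `Relation` maintains for its `elements`.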
--------------------------------------------------------------------------------
/src/relation.rs:
--------------------------------------------------------------------------------
1 | use std::iter::FromIterator;
2 |
3 | use crate::{
4 | join,
5 | merge,
6 | treefrog::{self, Leapers},
7 | };
8 |
9 | /// A static, ordered list of key-value pairs.
10 | ///
11 | /// A relation represents a fixed set of key-value pairs. In many places in a
12 | /// Datalog computation we want to be sure that certain relations are not able
13 | /// to vary (for example, in antijoins).
14 | #[derive(Clone, Debug, PartialEq, Eq)]
15 | pub struct Relation<Tuple: Ord> {
16 | /// Sorted list of distinct tuples.
17 | pub elements: Vec<Tuple>,
18 | }
19 |
20 | impl<Tuple: Ord> Relation<Tuple> {
21 | /// Merges two relations into their union.
22 | pub fn merge(self, other: Self) -> Self {
23 | let elements = merge::merge_unique(self.elements, other.elements);
24 | Relation { elements }
25 | }
26 |
27 | /// Creates a `Relation` from the elements of the `iterator`.
28 | ///
29 | /// Same as the `from_iter` method from `std::iter::FromIterator` trait.
30 | pub fn from_iter<I>(iterator: I) -> Self
31 | where
32 | I: IntoIterator<Item = Tuple>,
33 | {
34 | iterator.into_iter().collect()
35 | }
36 |
37 | /// Creates a `Relation` using the `leapjoin` logic;
38 | /// see [`Variable::from_leapjoin`](crate::Variable::from_leapjoin)
39 | pub fn from_leapjoin<'leap, SourceTuple: Ord, Val: Ord + 'leap>(
40 | source: &Relation<SourceTuple>,
41 | leapers: impl Leapers<'leap, SourceTuple, Val>,
42 | logic: impl FnMut(&SourceTuple, &Val) -> Tuple,
43 | ) -> Self {
44 | treefrog::leapjoin(&source.elements, leapers, logic)
45 | }
46 |
47 | /// Creates a `Relation` by joining the values from `input1` and `input2` and then applying
48 | /// `logic`. Like [`Variable::from_join`](crate::Variable::from_join) except for use where
49 | /// the inputs are not varying across iterations.
50 | pub fn from_join<Key: Ord, Val1: Ord, Val2: Ord>(
51 | input1: &Relation<(Key, Val1)>,
52 | input2: &Relation<(Key, Val2)>,
53 | logic: impl FnMut(&Key, &Val1, &Val2) -> Tuple,
54 | ) -> Self {
55 | join::join_into_relation(input1, input2, logic)
56 | }
57 |
58 | /// Creates a `Relation` by removing all values from `input1` that share a key with `input2`,
59 | /// and then transforming the resulting tuples with the `logic` closure. Like
60 | /// [`Variable::from_antijoin`](crate::Variable::from_antijoin) except for use where the
61 | /// inputs are not varying across iterations.
62 | pub fn from_antijoin<Key: Ord, Val1: Ord>(
63 | input1: &Relation<(Key, Val1)>,
64 | input2: &Relation<Key>,
65 | logic: impl FnMut(&Key, &Val1) -> Tuple,
66 | ) -> Self {
67 | join::antijoin(input1, input2, logic)
68 | }
69 |
70 | /// Construct a new relation by mapping another one. Equivalent to
71 | /// creating an iterator but perhaps more convenient. Analogous to
72 | /// `Variable::from_map`.
73 | pub fn from_map<T2: Ord>(input: &Relation<T2>, logic: impl FnMut(&T2) -> Tuple) -> Self {
74 | input.iter().map(logic).collect()
75 | }
76 |
77 | /// Creates a `Relation` from a vector of tuples.
78 | pub fn from_vec(mut elements: Vec<Tuple>) -> Self {
79 | elements.sort();
80 | elements.dedup();
81 | Relation { elements }
82 | }
83 | }
84 |
85 | impl<Tuple: Ord> From<Vec<Tuple>> for Relation<Tuple> {
86 | fn from(iterator: Vec<Tuple>) -> Self {
87 | Self::from_vec(iterator)
88 | }
89 | }
90 |
91 | impl<Tuple: Ord> FromIterator<Tuple> for Relation<Tuple> {
92 | fn from_iter<I>(iterator: I) -> Self
93 | where
94 | I: IntoIterator<Item = Tuple>,
95 | {
96 | Relation::from_vec(iterator.into_iter().collect())
97 | }
98 | }
99 |
100 | impl<'tuple, Tuple: 'tuple + Clone + Ord> FromIterator<&'tuple Tuple> for Relation<Tuple> {
101 | fn from_iter<I>(iterator: I) -> Self
102 | where
103 | I: IntoIterator<Item = &'tuple Tuple>,
104 | {
105 | Relation::from_vec(iterator.into_iter().cloned().collect())
106 | }
107 | }
108 |
109 | impl<Tuple: Ord> std::ops::Deref for Relation<Tuple> {
110 | type Target = [Tuple];
111 | fn deref(&self) -> &Self::Target {
112 | &self.elements[..]
113 | }
114 | }
115 |
--------------------------------------------------------------------------------
/src/test.rs:
--------------------------------------------------------------------------------
1 | #![cfg(test)]
2 |
3 | use crate::Iteration;
4 | use crate::Relation;
5 | use crate::RelationLeaper;
6 | use proptest::prelude::*;
7 | use proptest::{proptest, proptest_helper};
8 |
9 | fn inputs() -> impl Strategy<Value = Vec<(u32, u32)>> {
10 | prop::collection::vec((0_u32..100, 0_u32..100), 1..500)
11 | }
12 |
13 | /// The original way to use datafrog -- computes reachable nodes from a set of edges
14 | fn reachable_with_var_join(edges: &[(u32, u32)]) -> Relation<(u32, u32)> {
15 | let edges: Relation<_> = edges.iter().collect();
16 | let mut iteration = Iteration::new();
17 |
18 | let edges_by_successor = iteration.variable::<(u32, u32)>("edges_invert");
19 | edges_by_successor.extend(edges.iter().map(|&(n1, n2)| (n2, n1)));
20 |
21 | let reachable = iteration.variable::<(u32, u32)>("reachable");
22 | reachable.insert(edges);
23 |
24 | while iteration.changed() {
25 | // reachable(N1, N3) :- edges(N1, N2), reachable(N2, N3).
26 | reachable.from_join(&reachable, &edges_by_successor, |&_, &n3, &n1| (n1, n3));
27 | }
28 |
29 | reachable.complete()
30 | }
31 |
32 | /// Like `reachable_with_var_join`, but using a relation as an input to `from_join`
33 | fn reachable_with_relation_join(edges: &[(u32, u32)]) -> Relation<(u32, u32)> {
34 | let edges: Relation<_> = edges.iter().collect();
35 | let mut iteration = Iteration::new();
36 |
37 | // NB. Changed from `reachable_with_var_join`:
38 | let edges_by_successor: Relation<_> = edges.iter().map(|&(n1, n2)| (n2, n1)).collect();
39 |
40 | let reachable = iteration.variable::<(u32, u32)>("reachable");
41 | reachable.insert(edges);
42 |
43 | while iteration.changed() {
44 | // reachable(N1, N3) :- edges(N1, N2), reachable(N2, N3).
45 | reachable.from_join(&reachable, &edges_by_successor, |&_, &n3, &n1| (n1, n3));
46 | }
47 |
48 | reachable.complete()
49 | }
50 |
51 | fn reachable_with_leapfrog(edges: &[(u32, u32)]) -> Relation<(u32, u32)> {
52 | let edges: Relation<_> = edges.iter().collect();
53 | let mut iteration = Iteration::new();
54 |
55 | let edges_by_successor: Relation<_> = edges.iter().map(|&(n1, n2)| (n2, n1)).collect();
56 |
57 | let reachable = iteration.variable::<(u32, u32)>("reachable");
58 | reachable.insert(edges);
59 |
60 | while iteration.changed() {
61 | // reachable(N1, N3) :- edges(N1, N2), reachable(N2, N3).
62 | reachable.from_leapjoin(
63 | &reachable,
64 | edges_by_successor.extend_with(|&(n2, _)| n2),
65 | |&(_, n3), &n1| (n1, n3),
66 | );
67 | }
68 |
69 | reachable.complete()
70 | }
71 |
72 | /// Computes a join where the values are summed -- uses iteration
73 | /// variables (the original datafrog technique).
74 | fn sum_join_via_var(
75 | input1_slice: &[(u32, u32)],
76 | input2_slice: &[(u32, u32)],
77 | ) -> Relation<(u32, u32)> {
78 | let mut iteration = Iteration::new();
79 |
80 | let input1 = iteration.variable::<(u32, u32)>("input1");
81 | input1.extend(input1_slice);
82 |
83 | let input2 = iteration.variable::<(u32, u32)>("input2");
84 | input2.extend(input2_slice);
85 |
86 | let output = iteration.variable::<(u32, u32)>("output");
87 |
88 | while iteration.changed() {
89 | // output(K1, V1 * 100 + V2) :- input1(K1, V1), input2(K1, V2).
90 | output.from_join(&input1, &input2, |&k1, &v1, &v2| (k1, v1 * 100 + v2));
91 | }
92 |
93 | output.complete()
94 | }
95 |
96 | /// Computes the same join as `sum_join_via_var`, but directly on
97 | /// relations via `Relation::from_join` (no iteration variables).
98 | fn sum_join_via_relation(
99 | input1_slice: &[(u32, u32)],
100 | input2_slice: &[(u32, u32)],
101 | ) -> Relation<(u32, u32)> {
102 | let input1: Relation<_> = input1_slice.iter().collect();
103 | let input2: Relation<_> = input2_slice.iter().collect();
104 | Relation::from_join(&input1, &input2, |&k1, &v1, &v2| (k1, v1 * 100 + v2))
105 | }
106 |
107 | proptest! {
108 | #[test]
109 | fn reachable_leapfrog_vs_var_join(edges in inputs()) {
110 | let reachable1 = reachable_with_var_join(&edges);
111 | let reachable2 = reachable_with_leapfrog(&edges);
112 | assert_eq!(reachable1.elements, reachable2.elements);
113 | }
114 |
115 | #[test]
116 | fn reachable_rel_join_vs_var_join(edges in inputs()) {
117 | let reachable1 = reachable_with_var_join(&edges);
118 | let reachable2 = reachable_with_relation_join(&edges);
119 | assert_eq!(reachable1.elements, reachable2.elements);
120 | }
121 |
122 | #[test]
123 | fn sum_join_from_var_vs_rel((set1, set2) in (inputs(), inputs())) {
124 | let output1 = sum_join_via_var(&set1, &set2);
125 | let output2 = sum_join_via_relation(&set1, &set2);
126 | assert_eq!(output1.elements, output2.elements);
127 | }
128 |
129 | /// Test the behavior of `filter_with` used on its own in a
130 | /// leapjoin -- effectively it becomes an "intersection"
131 | /// operation.
132 | #[test]
133 | fn filter_with_on_its_own((set1, set2) in (inputs(), inputs())) {
134 | let input1: Relation<(u32, u32)> = set1.iter().collect();
135 | let input2: Relation<(u32, u32)> = set2.iter().collect();
136 | let intersection1 = Relation::from_leapjoin(
137 | &input1,
138 | input2.filter_with(|&tuple| tuple),
139 | |&tuple, &()| tuple,
140 | );
141 |
142 | let intersection2: Relation<(u32, u32)> = input1.elements.iter()
143 | .filter(|t| input2.elements.binary_search(&t).is_ok())
144 | .collect();
145 |
146 | assert_eq!(intersection1.elements, intersection2.elements);
147 | }
148 |
149 | /// Test the behavior of `filter_anti` used on its own in a
150 | /// leapjoin -- effectively it becomes a "set minus" operation.
151 | #[test]
152 | fn filter_anti_on_its_own((set1, set2) in (inputs(), inputs())) {
153 | let input1: Relation<(u32, u32)> = set1.iter().collect();
154 | let input2: Relation<(u32, u32)> = set2.iter().collect();
155 |
156 | let difference1 = Relation::from_leapjoin(
157 | &input1,
158 | input2.filter_anti(|&tuple| tuple),
159 | |&tuple, &()| tuple,
160 | );
161 |
162 | let difference2: Relation<(u32, u32)> = input1.elements.iter()
163 | .filter(|t| input2.elements.binary_search(&t).is_err())
164 | .collect();
165 |
166 | assert_eq!(difference1.elements, difference2.elements);
167 | }
168 | }
169 |
170 | /// Test that `from_leapjoin` matches against the tuples from an
171 | /// `extend` that precedes first iteration.
172 | ///
173 | /// This was always true, but wasn't immediately obvious to me until I
174 | /// re-read the code more carefully. -nikomatsakis
175 | #[test]
176 | fn leapjoin_from_extend() {
177 | let doubles: Relation<(u32, u32)> = (0..10).map(|i| (i, i * 2)).collect();
178 |
179 | let mut iteration = Iteration::new();
180 |
181 | let variable = iteration.variable::<(u32, u32)>("variable");
182 | variable.extend(Some((2, 2)));
183 |
184 | while iteration.changed() {
185 | variable.from_leapjoin(
186 | &variable,
187 | doubles.extend_with(|&(i, _)| i),
188 | |&(i, _), &j| (i, j),
189 | );
190 | }
191 |
192 | let variable = variable.complete();
193 |
194 | assert_eq!(variable.elements, vec![(2, 2), (2, 4)]);
195 | }
196 |
197 | #[test]
198 | fn passthrough_leaper() {
199 | let mut iteration = Iteration::new();
200 |
201 | let variable = iteration.variable::<(u32, u32)>("variable");
202 | variable.extend((0..10).map(|i| (i, i)));
203 |
204 | while iteration.changed() {
205 | variable.from_leapjoin(
206 | &variable,
207 | (
208 | crate::passthrough(), // Without this, the test would fail at runtime.
209 | crate::PrefixFilter::from(|&(i, _)| i <= 20),
210 | ),
211 | |&(i, j), ()| (2*i, 2*j),
212 | );
213 | }
214 |
215 | let variable = variable.complete();
216 |
217 | let mut expected: Vec<_> = (0..10).map(|i| (i, i)).collect();
218 | expected.extend((10..20).filter_map(|i| (i%2 == 0).then(|| (i, i))));
219 | expected.extend((20..=40).filter_map(|i| (i%4 == 0).then(|| (i, i))));
220 | assert_eq!(&*variable, &expected);
221 | }
222 |
223 | #[test]
224 | fn relation_from_antijoin() {
225 | let lhs: Relation<_> = (0 .. 10).map(|x| (x, x)).collect();
226 | let rhs: Relation<_> = (0 .. 10).filter(|x| x % 2 == 0).collect();
227 | let expected: Relation<_> = (0 .. 10).filter(|x| x % 2 == 1).map(|x| (x, x)).collect();
228 |
229 | let result = Relation::from_antijoin(&lhs, &rhs, |a, b| (*a, *b));
230 |
231 | assert_eq!(result.elements, expected.elements);
232 | }
233 |
--------------------------------------------------------------------------------
/src/treefrog.rs:
--------------------------------------------------------------------------------
1 | //! Join functionality.
2 |
3 | use super::Relation;
4 |
5 | /// Performs treefrog leapjoin using a list of leapers.
6 | pub(crate) fn leapjoin<'leap, Tuple: Ord, Val: Ord + 'leap, Result: Ord>(
7 | source: &[Tuple],
8 | mut leapers: impl Leapers<'leap, Tuple, Val>,
9 | mut logic: impl FnMut(&Tuple, &Val) -> Result,
10 | ) -> Relation<Result> {
11 | let mut result = Vec::new(); // temp output storage.
12 | let mut values = Vec::new(); // temp value storage.
13 |
14 | for tuple in source {
15 | // Determine which leaper would propose the fewest values.
16 | let mut min_index = usize::max_value();
17 | let mut min_count = usize::max_value();
18 | leapers.for_each_count(tuple, |index, count| {
19 | if min_count > count {
20 | min_count = count;
21 | min_index = index;
22 | }
23 | });
24 |
25 | // We had best have at least one relation restricting values.
26 | assert!(min_count < usize::max_value());
27 |
28 | // If there are values to propose:
29 | if min_count > 0 {
30 | // Push the values that `min_index` "proposes" into `values`.
31 | leapers.propose(tuple, min_index, &mut values);
32 |
33 | // Give other leapers a chance to remove values from
34 | // anti-joins or filters.
35 | leapers.intersect(tuple, min_index, &mut values);
36 |
37 | // Push remaining items into result.
38 | for val in values.drain(..) {
39 | result.push(logic(tuple, val));
40 | }
41 | }
42 | }
43 |
44 | Relation::from_vec(result)
45 | }
46 |
47 | /// Implemented for a tuple of leapers
48 | pub trait Leapers<'leap, Tuple, Val> {
49 | /// Internal method:
50 | fn for_each_count(&mut self, tuple: &Tuple, op: impl FnMut(usize, usize));
51 |
52 | /// Internal method:
53 | fn propose(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>);
54 |
55 | /// Internal method:
56 | fn intersect(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>);
57 | }
58 |
59 | macro_rules! tuple_leapers {
60 | ($($Ty:ident)*) => {
61 | #[allow(unused_assignments, non_snake_case)]
62 | impl<'leap, Tuple, Val, $($Ty),*> Leapers<'leap, Tuple, Val> for ($($Ty,)*)
63 | where
64 | $($Ty: Leaper<'leap, Tuple, Val>,)*
65 | {
66 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) {
67 | let ($($Ty,)*) = self;
68 | let mut index = 0;
69 | $(
70 | let count = $Ty.count(tuple);
71 | op(index, count);
72 | index += 1;
73 | )*
74 | }
75 |
76 | fn propose(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>) {
77 | let ($($Ty,)*) = self;
78 | let mut index = 0;
79 | $(
80 | if min_index == index {
81 | return $Ty.propose(tuple, values);
82 | }
83 | index += 1;
84 | )*
85 | panic!("no match found for min_index={}", min_index);
86 | }
87 |
88 | fn intersect(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>) {
89 | let ($($Ty,)*) = self;
90 | let mut index = 0;
91 | $(
92 | if min_index != index {
93 | $Ty.intersect(tuple, values);
94 | }
95 | index += 1;
96 | )*
97 | }
98 | }
99 | }
100 | }
101 |
102 | tuple_leapers!(A B);
103 | tuple_leapers!(A B C);
104 | tuple_leapers!(A B C D);
105 | tuple_leapers!(A B C D E);
106 | tuple_leapers!(A B C D E F);
107 | tuple_leapers!(A B C D E F G);
108 |
109 | /// Methods to support treefrog leapjoin.
110 | pub trait Leaper<'leap, Tuple, Val> {
111 | /// Estimates the number of proposed values.
112 | fn count(&mut self, prefix: &Tuple) -> usize;
113 | /// Populates `values` with proposed values.
114 | fn propose(&mut self, prefix: &Tuple, values: &mut Vec<&'leap Val>);
115 | /// Restricts `values` to proposed values.
116 | fn intersect(&mut self, prefix: &Tuple, values: &mut Vec<&'leap Val>);
117 | }
118 |
119 | pub(crate) mod filters {
120 | use super::Leaper;
121 | use super::Leapers;
122 |
123 | /// A treefrog leaper that tests each of the tuples from the main
124 | /// input (the "prefix"). Use like `PrefixFilter::from(|tuple|
125 | /// ...)`; if the closure returns true, then the tuple is
126 | /// retained, else it will be ignored. This leaper can be used in
127 | /// isolation in which case it just acts like a filter on the
128 | /// input (the "proposed value" will be `()` type).
129 | pub struct PrefixFilter<Tuple, Func: Fn(&Tuple) -> bool> {
130 | phantom: ::std::marker::PhantomData<Tuple>,
131 | predicate: Func,
132 | }
133 |
134 | impl<'leap, Tuple, Func> PrefixFilter<Tuple, Func>
135 | where
136 | Func: Fn(&Tuple) -> bool,
137 | {
138 | /// Creates a new filter based on the prefix
139 | pub fn from(predicate: Func) -> Self {
140 | PrefixFilter {
141 | phantom: ::std::marker::PhantomData,
142 | predicate,
143 | }
144 | }
145 | }
146 |
147 | impl<'leap, Tuple, Val, Func> Leaper<'leap, Tuple, Val> for PrefixFilter<Tuple, Func>
148 | where
149 | Func: Fn(&Tuple) -> bool,
150 | {
151 | /// Estimates the number of proposed values.
152 | fn count(&mut self, prefix: &Tuple) -> usize {
153 | if (self.predicate)(prefix) {
154 | usize::max_value()
155 | } else {
156 | 0
157 | }
158 | }
159 | /// Populates `values` with proposed values.
160 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val>) {
161 | panic!("PrefixFilter::propose(): variable apparently unbound");
162 | }
163 | /// Restricts `values` to proposed values.
164 | fn intersect(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val>) {
165 | // We can only be here if we returned max_value() above.
166 | }
167 | }
168 |
169 | impl<'leap, Tuple, Func> Leapers<'leap, Tuple, ()> for PrefixFilter<Tuple, Func>
170 | where
171 | Func: Fn(&Tuple) -> bool,
172 | {
173 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) {
174 | if <Self as Leaper<'_, Tuple, ()>>::count(self, tuple) == 0 {
175 | op(0, 0)
176 | } else {
177 | // we will "propose" the `()` value if the predicate applies
178 | op(0, 1)
179 | }
180 | }
181 |
182 | fn propose(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) {
183 | assert_eq!(min_index, 0);
184 | values.push(&());
185 | }
186 |
187 | fn intersect(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) {
188 | assert_eq!(min_index, 0);
189 | assert_eq!(values.len(), 1);
190 | }
191 | }
192 |
193 | pub struct Passthrough<Tuple> {
194 | phantom: ::std::marker::PhantomData<Tuple>,
195 | }
196 |
197 | impl<Tuple> Passthrough<Tuple> {
198 | fn new() -> Self {
199 | Passthrough {
200 | phantom: ::std::marker::PhantomData,
201 | }
202 | }
203 | }
204 |
205 | impl<'leap, Tuple> Leaper<'leap, Tuple, ()> for Passthrough<Tuple> {
206 | /// Estimates the number of proposed values.
207 | fn count(&mut self, _prefix: &Tuple) -> usize {
208 | 1
209 | }
210 | /// Populates `values` with proposed values.
211 | fn propose(&mut self, _prefix: &Tuple, values: &mut Vec<&'leap ()>) {
212 | values.push(&())
213 | }
214 | /// Restricts `values` to proposed values.
215 | fn intersect(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap ()>) {
216 | // `Passthrough` never removes values (although if we're here it indicates that the user
217 | // didn't need a `Passthrough` in the first place)
218 | }
219 | }
220 |
221 | /// Returns a leaper that proposes a single copy of each tuple from the main input.
222 | ///
223 | /// Use this when you don't need any "extend" leapers in a join, only "filter"s. For example,
224 | /// in the following datalog rule, all terms in the second and third predicate are bound in the
225 | /// first one (the "main input" to our leapjoin).
226 | ///
227 | /// ```prolog
228 | /// error(loan, point) :-
229 | /// origin_contains_loan_at(origin, loan, point), % main input
230 | /// origin_live_at(origin, point),
231 | /// loan_invalidated_at(loan, point).
232 | /// ```
233 | ///
234 | /// Without a passthrough leaper, neither the filter for `origin_live_at` nor the one for
235 | /// `loan_invalidated_at` would propose any tuples, and the leapjoin would panic at runtime.
236 | pub fn passthrough<Tuple>() -> Passthrough<Tuple> {
237 | Passthrough::new()
238 | }
239 |
240 | /// A treefrog leaper based on a predicate of prefix and value.
241 | /// Use like `ValueFilter::from(|tuple, value| ...)`. The closure
242 | /// should return true if `value` ought to be retained. The
243 | /// `value` will be a value proposed elsewhere by an `extend_with`
244 | /// leaper.
245 | ///
246 | /// This leaper cannot be used in isolation, it must be combined
247 | /// with other leapers.
248 | pub struct ValueFilter<Tuple, Val, Func: Fn(&Tuple, &Val) -> bool> {
249 | phantom: ::std::marker::PhantomData<(Tuple, Val)>,
250 | predicate: Func,
251 | }
252 |
253 | impl<'leap, Tuple, Val, Func> ValueFilter<Tuple, Val, Func>
254 | where
255 | Func: Fn(&Tuple, &Val) -> bool,
256 | {
257 | /// Creates a new filter based on the prefix and the proposed value.
258 | pub fn from(predicate: Func) -> Self {
259 | ValueFilter {
260 | phantom: ::std::marker::PhantomData,
261 | predicate,
262 | }
263 | }
264 | }
265 |
266 | impl<'leap, Tuple, Val, Func> Leaper<'leap, Tuple, Val> for ValueFilter<Tuple, Val, Func>
267 | where
268 | Func: Fn(&Tuple, &Val) -> bool,
269 | {
270 | /// Estimates the number of proposed values.
271 | fn count(&mut self, _prefix: &Tuple) -> usize {
272 | usize::max_value()
273 | }
274 | /// Populates `values` with proposed values.
275 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val>) {
276 | panic!("ValueFilter::propose(): variable apparently unbound");
277 | }
278 | /// Restricts `values` to proposed values.
279 | fn intersect(&mut self, prefix: &Tuple, values: &mut Vec<&'leap Val>) {
280 | values.retain(|val| (self.predicate)(prefix, val));
281 | }
282 | }
283 | }
284 |
285 | /// Extension method for relations.
286 | pub trait RelationLeaper<Key: Ord, Val: Ord> {
287 | /// Extend with `Val` using the elements of the relation.
288 | fn extend_with<'leap, Tuple: Ord, Func: Fn(&Tuple) -> Key>(
289 | &'leap self,
290 | key_func: Func,
291 | ) -> extend_with::ExtendWith<'leap, Key, Val, Tuple, Func>
292 | where
293 | Key: 'leap,
294 | Val: 'leap;
295 | /// Extend with `Val` using the complement of the relation.
296 | fn extend_anti<'leap, Tuple: Ord, Func: Fn(&Tuple) -> Key>(
297 | &'leap self,
298 | key_func: Func,
299 | ) -> extend_anti::ExtendAnti<'leap, Key, Val, Tuple, Func>
300 | where
301 | Key: 'leap,
302 | Val: 'leap;
303 | /// Extend with any value if tuple is present in relation.
304 | fn filter_with<'leap, Tuple: Ord, Func: Fn(&Tuple) -> (Key, Val)>(
305 | &'leap self,
306 | key_func: Func,
307 | ) -> filter_with::FilterWith<'leap, Key, Val, Tuple, Func>
308 | where
309 | Key: 'leap,
310 | Val: 'leap;
311 | /// Extend with any value if tuple is absent from relation.
312 | fn filter_anti<'leap, Tuple: Ord, Func: Fn(&Tuple) -> (Key, Val)>(
313 | &'leap self,
314 | key_func: Func,
315 | ) -> filter_anti::FilterAnti<'leap, Key, Val, Tuple, Func>
316 | where
317 | Key: 'leap,
318 | Val: 'leap;
319 | }
320 |
321 | impl<Key: Ord, Val: Ord> RelationLeaper<Key, Val> for Relation<(Key, Val)> {
322 | fn extend_with<'leap, Tuple: Ord, Func: Fn(&Tuple) -> Key>(
323 | &'leap self,
324 | key_func: Func,
325 | ) -> extend_with::ExtendWith<'leap, Key, Val, Tuple, Func>
326 | where
327 | Key: 'leap,
328 | Val: 'leap,
329 | {
330 | extend_with::ExtendWith::from(self, key_func)
331 | }
332 | fn extend_anti<'leap, Tuple: Ord, Func: Fn(&Tuple) -> Key>(
333 | &'leap self,
334 | key_func: Func,
335 | ) -> extend_anti::ExtendAnti<'leap, Key, Val, Tuple, Func>
336 | where
337 | Key: 'leap,
338 | Val: 'leap,
339 | {
340 | extend_anti::ExtendAnti::from(self, key_func)
341 | }
342 | fn filter_with<'leap, Tuple: Ord, Func: Fn(&Tuple) -> (Key, Val)>(
343 | &'leap self,
344 | key_func: Func,
345 | ) -> filter_with::FilterWith<'leap, Key, Val, Tuple, Func>
346 | where
347 | Key: 'leap,
348 | Val: 'leap,
349 | {
350 | filter_with::FilterWith::from(self, key_func)
351 | }
352 | fn filter_anti<'leap, Tuple: Ord, Func: Fn(&Tuple) -> (Key, Val)>(
353 | &'leap self,
354 | key_func: Func,
355 | ) -> filter_anti::FilterAnti<'leap, Key, Val, Tuple, Func>
356 | where
357 | Key: 'leap,
358 | Val: 'leap,
359 | {
360 | filter_anti::FilterAnti::from(self, key_func)
361 | }
362 | }
363 |
364 | pub(crate) mod extend_with {
365 | use super::{binary_search, Leaper, Leapers, Relation};
366 | use crate::join::gallop;
367 |
368 | /// Wraps a Relation as a leaper.
369 | pub struct ExtendWith<'leap, Key, Val, Tuple, Func>
370 | where
371 | Key: Ord + 'leap,
372 | Val: Ord + 'leap,
373 | Tuple: Ord,
374 | Func: Fn(&Tuple) -> Key,
375 | {
376 | relation: &'leap Relation<(Key, Val)>,
377 | start: usize,
378 | end: usize,
379 | key_func: Func,
380 | old_key: Option<Key>,
381 | phantom: ::std::marker::PhantomData<Tuple>,
382 | }
383 |
384 | impl<'leap, Key, Val, Tuple, Func> ExtendWith<'leap, Key, Val, Tuple, Func>
385 | where
386 | Key: Ord + 'leap,
387 | Val: Ord + 'leap,
388 | Tuple: Ord,
389 | Func: Fn(&Tuple) -> Key,
390 | {
391 | /// Constructs an `ExtendWith` from a relation and a key function.
392 | pub fn from(relation: &'leap Relation<(Key, Val)>, key_func: Func) -> Self {
393 | ExtendWith {
394 | relation,
395 | start: 0,
396 | end: 0,
397 | key_func,
398 | old_key: None,
399 | phantom: ::std::marker::PhantomData,
400 | }
401 | }
402 | }
403 |
404 | impl<'leap, Key, Val, Tuple, Func> Leaper<'leap, Tuple, Val>
405 | for ExtendWith<'leap, Key, Val, Tuple, Func>
406 | where
407 | Key: Ord + 'leap,
408 | Val: Ord + 'leap,
409 | Tuple: Ord,
410 | Func: Fn(&Tuple) -> Key,
411 | {
412 | fn count(&mut self, prefix: &Tuple) -> usize {
413 | let key = (self.key_func)(prefix);
414 | if self.old_key.as_ref() != Some(&key) {
415 | self.start = binary_search(&self.relation.elements, |x| &x.0 < &key);
416 | let slice1 = &self.relation[self.start..];
417 | let slice2 = gallop(slice1, |x| &x.0 <= &key);
418 | self.end = self.relation.len() - slice2.len();
419 |
420 | self.old_key = Some(key);
421 | }
422 |
423 | self.end - self.start
424 | }
425 | fn propose(&mut self, _prefix: &Tuple, values: &mut Vec<&'leap Val>) {
426 | let slice = &self.relation[self.start..self.end];
427 | values.extend(slice.iter().map(|&(_, ref val)| val));
428 | }
429 | fn intersect(&mut self, _prefix: &Tuple, values: &mut Vec<&'leap Val>) {
430 | let mut slice = &self.relation[self.start..self.end];
431 | values.retain(|v| {
432 | slice = gallop(slice, |kv| &kv.1 < v);
433 | slice.get(0).map(|kv| &kv.1) == Some(v)
434 | });
435 | }
436 | }
437 |
438 | impl<'leap, Key, Val, Tuple, Func> Leapers<'leap, Tuple, Val>
439 | for ExtendWith<'leap, Key, Val, Tuple, Func>
440 | where
441 | Key: Ord + 'leap,
442 | Val: Ord + 'leap,
443 | Tuple: Ord,
444 | Func: Fn(&Tuple) -> Key,
445 | {
446 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) {
447 | op(0, self.count(tuple))
448 | }
449 |
450 | fn propose(&mut self, tuple: &Tuple, min_index: usize, values: &mut Vec<&'leap Val>) {
451 | assert_eq!(min_index, 0);
452 | Leaper::propose(self, tuple, values);
453 | }
454 |
455 | fn intersect(&mut self, _: &Tuple, min_index: usize, _: &mut Vec<&'leap Val>) {
456 | assert_eq!(min_index, 0);
457 | }
458 | }
459 | }
460 |
461 | pub(crate) mod extend_anti {
462 | use std::ops::Range;
463 |
464 | use super::{binary_search, Leaper, Relation};
465 | use crate::join::gallop;
466 |
467 | /// Wraps a Relation as a leaper.
468 | pub struct ExtendAnti<'leap, Key, Val, Tuple, Func>
469 | where
470 | Key: Ord + 'leap,
471 | Val: Ord + 'leap,
472 | Tuple: Ord,
473 | Func: Fn(&Tuple) -> Key,
474 | {
475 | relation: &'leap Relation<(Key, Val)>,
476 | key_func: Func,
477 | old_key: Option<(Key, Range<usize>)>,
478 | phantom: ::std::marker::PhantomData<Tuple>,
479 | }
480 |
481 | impl<'leap, Key, Val, Tuple, Func> ExtendAnti<'leap, Key, Val, Tuple, Func>
482 | where
483 | Key: Ord + 'leap,
484 | Val: Ord + 'leap,
485 | Tuple: Ord,
486 | Func: Fn(&Tuple) -> Key,
487 | {
488 | /// Constructs an `ExtendAnti` from a relation and a key function.
489 | pub fn from(relation: &'leap Relation<(Key, Val)>, key_func: Func) -> Self {
490 | ExtendAnti {
491 | relation,
492 | key_func,
493 | old_key: None,
494 | phantom: ::std::marker::PhantomData,
495 | }
496 | }
497 | }
498 |
499 | impl<'leap, Key: Ord, Val: Ord + 'leap, Tuple: Ord, Func> Leaper<'leap, Tuple, Val>
500 | for ExtendAnti<'leap, Key, Val, Tuple, Func>
501 | where
502 | Key: Ord + 'leap,
503 | Val: Ord + 'leap,
504 | Tuple: Ord,
505 | Func: Fn(&Tuple) -> Key,
506 | {
507 | fn count(&mut self, _prefix: &Tuple) -> usize {
508 | usize::max_value()
509 | }
510 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val>) {
511 | panic!("ExtendAnti::propose(): variable apparently unbound.");
512 | }
513 | fn intersect(&mut self, prefix: &Tuple, values: &mut Vec<&'leap Val>) {
514 | let key = (self.key_func)(prefix);
515 |
516 | let range = match self.old_key.as_ref() {
517 | Some((old, range)) if old == &key => range.clone(),
518 |
519 | _ => {
520 | let start = binary_search(&self.relation.elements, |x| &x.0 < &key);
521 | let slice1 = &self.relation[start..];
522 | let slice2 = gallop(slice1, |x| &x.0 <= &key);
523 | let range = start..self.relation.len()-slice2.len();
524 |
525 | self.old_key = Some((key, range.clone()));
526 |
527 | range
528 | }
529 | };
530 |
531 | let mut slice = &self.relation[range];
532 | if !slice.is_empty() {
533 | values.retain(|v| {
534 | slice = gallop(slice, |kv| &kv.1 < v);
535 | slice.get(0).map(|kv| &kv.1) != Some(v)
536 | });
537 | }
538 | }
539 | }
540 | }
541 |
542 | pub(crate) mod filter_with {
543 |
544 | use super::{Leaper, Leapers, Relation};
545 |
546 | /// Wraps a Relation as a leaper.
547 | pub struct FilterWith<'leap, Key, Val, Tuple, Func>
548 | where
549 | Key: Ord + 'leap,
550 | Val: Ord + 'leap,
551 | Tuple: Ord,
552 | Func: Fn(&Tuple) -> (Key, Val),
553 | {
554 | relation: &'leap Relation<(Key, Val)>,
555 | key_func: Func,
556 | old_key_val: Option<((Key, Val), bool)>,
557 | phantom: ::std::marker::PhantomData<Tuple>,
558 | }
559 |
560 | impl<'leap, Key, Val, Tuple, Func> FilterWith<'leap, Key, Val, Tuple, Func>
561 | where
562 | Key: Ord + 'leap,
563 | Val: Ord + 'leap,
564 | Tuple: Ord,
565 | Func: Fn(&Tuple) -> (Key, Val),
566 | {
567 | /// Constructs a `FilterWith` from a relation and a function mapping each tuple to a `(Key, Val)` pair.
568 | pub fn from(relation: &'leap Relation<(Key, Val)>, key_func: Func) -> Self {
569 | FilterWith {
570 | relation,
571 | key_func,
572 | old_key_val: None,
573 | phantom: ::std::marker::PhantomData,
574 | }
575 | }
576 | }
577 |
578 | impl<'leap, Key, Val, Val2, Tuple, Func> Leaper<'leap, Tuple, Val2>
579 | for FilterWith<'leap, Key, Val, Tuple, Func>
580 | where
581 | Key: Ord + 'leap,
582 | Val: Ord + 'leap,
583 | Tuple: Ord,
584 | Func: Fn(&Tuple) -> (Key, Val),
585 | {
586 | fn count(&mut self, prefix: &Tuple) -> usize {
587 | let key_val = (self.key_func)(prefix);
588 |
589 | if let Some((ref old_key_val, is_present)) = self.old_key_val {
590 | if old_key_val == &key_val {
591 | return if is_present { usize::MAX } else { 0 };
592 | }
593 | }
594 |
595 | let is_present = self.relation.binary_search(&key_val).is_ok();
596 | self.old_key_val = Some((key_val, is_present));
597 | if is_present { usize::MAX } else { 0 }
598 | }
599 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val2>) {
600 | panic!("FilterWith::propose(): variable apparently unbound.");
601 | }
602 | fn intersect(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val2>) {
603 | // Only here because we didn't return zero above, right?
604 | }
605 | }
606 |
607 | impl<'leap, Key, Val, Tuple, Func> Leapers<'leap, Tuple, ()>
608 | for FilterWith<'leap, Key, Val, Tuple, Func>
609 | where
610 | Key: Ord + 'leap,
611 | Val: Ord + 'leap,
612 | Tuple: Ord,
613 | Func: Fn(&Tuple) -> (Key, Val),
614 | {
615 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) {
616 | if <Self as Leaper<'_, Tuple, ()>>::count(self, tuple) == 0 {
617 | op(0, 0)
618 | } else {
619 | op(0, 1)
620 | }
621 | }
622 |
623 | fn propose(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) {
624 | assert_eq!(min_index, 0);
625 | values.push(&());
626 | }
627 |
628 | fn intersect(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) {
629 | assert_eq!(min_index, 0);
630 | assert_eq!(values.len(), 1);
631 | }
632 | }
633 | }
634 |
635 | pub(crate) mod filter_anti {
636 |
637 | use super::{Leaper, Leapers, Relation};
638 |
639 | /// Wraps a Relation as a leaper.
640 | pub struct FilterAnti<'leap, Key, Val, Tuple, Func>
641 | where
642 | Key: Ord + 'leap,
643 | Val: Ord + 'leap,
644 | Tuple: Ord,
645 | Func: Fn(&Tuple) -> (Key, Val),
646 | {
647 | relation: &'leap Relation<(Key, Val)>,
648 | key_func: Func,
649 | old_key_val: Option<((Key, Val), bool)>,
650 | phantom: ::std::marker::PhantomData<Tuple>,
651 | }
652 |
653 | impl<'leap, Key, Val, Tuple, Func> FilterAnti<'leap, Key, Val, Tuple, Func>
654 | where
655 | Key: Ord + 'leap,
656 | Val: Ord + 'leap,
657 | Tuple: Ord,
658 | Func: Fn(&Tuple) -> (Key, Val),
659 | {
660 | /// Constructs a `FilterAnti` from a relation and a function mapping each tuple to a `(Key, Val)` pair.
661 | pub fn from(relation: &'leap Relation<(Key, Val)>, key_func: Func) -> Self {
662 | FilterAnti {
663 | relation,
664 | key_func,
665 | old_key_val: None,
666 | phantom: ::std::marker::PhantomData,
667 | }
668 | }
669 | }
670 |
671 | impl<'leap, Key: Ord, Val: Ord + 'leap, Val2, Tuple: Ord, Func> Leaper<'leap, Tuple, Val2>
672 | for FilterAnti<'leap, Key, Val, Tuple, Func>
673 | where
674 | Key: Ord + 'leap,
675 | Val: Ord + 'leap,
676 | Tuple: Ord,
677 | Func: Fn(&Tuple) -> (Key, Val),
678 | {
679 | fn count(&mut self, prefix: &Tuple) -> usize {
680 | let key_val = (self.key_func)(prefix);
681 |
682 | if let Some((ref old_key_val, is_present)) = self.old_key_val {
683 | if old_key_val == &key_val {
684 | return if is_present { 0 } else { usize::MAX };
685 | }
686 | }
687 |
688 | let is_present = self.relation.binary_search(&key_val).is_ok();
689 | self.old_key_val = Some((key_val, is_present));
690 | if is_present { 0 } else { usize::MAX }
691 | }
692 | fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val2>) {
693 | panic!("FilterAnti::propose(): variable apparently unbound.");
694 | }
695 | fn intersect(&mut self, _prefix: &Tuple, _values: &mut Vec<&'leap Val2>) {
696 | // Only here because we didn't return zero above, right?
697 | }
698 | }
699 |
700 | impl<'leap, Key, Val, Tuple, Func> Leapers<'leap, Tuple, ()>
701 | for FilterAnti<'leap, Key, Val, Tuple, Func>
702 | where
703 | Key: Ord + 'leap,
704 | Val: Ord + 'leap,
705 | Tuple: Ord,
706 | Func: Fn(&Tuple) -> (Key, Val),
707 | {
708 | fn for_each_count(&mut self, tuple: &Tuple, mut op: impl FnMut(usize, usize)) {
709 | if <Self as Leaper<'_, Tuple, ()>>::count(self, tuple) == 0 {
710 | op(0, 0)
711 | } else {
712 | op(0, 1)
713 | }
714 | }
715 |
716 | fn propose(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) {
717 | // We only get here if `tuple` is *not* a member of `self.relation`
718 | assert_eq!(min_index, 0);
719 | values.push(&());
720 | }
721 |
722 | fn intersect(&mut self, _: &Tuple, min_index: usize, values: &mut Vec<&'leap ()>) {
723 | // We only get here if `tuple` is not a member of `self.relation`
724 | assert_eq!(min_index, 0);
725 | assert_eq!(values.len(), 1);
726 | }
727 | }
728 | }
729 |
730 | /// Returns the lowest index for which `cmp(&vec[i])` returns `false` (equivalently, the number
731 | /// of leading elements for which it returns `true`), assuming `vec` is in sorted order.
732 | ///
733 | /// By accepting a vector instead of a slice, we can do a small optimization when computing the
734 | /// midpoint.
735 | fn binary_search<T>(vec: &Vec<T>, mut cmp: impl FnMut(&T) -> bool) -> usize {
736 | // The midpoint calculation we use below is only correct for vectors with less than `isize::MAX`
737 | // elements. This is always true for vectors of sized types but maybe not for ZSTs? Sorting
738 | // ZSTs doesn't make much sense, so just forbid it here.
739 | assert!(std::mem::size_of::<T>() > 0);
740 |
741 | // we maintain the invariant that the first `lo` elements of `vec` satisfy `cmp`.
742 | // `hi` is maintained at the first element we know does not satisfy `cmp`.
743 |
744 | let mut hi = vec.len();
745 | let mut lo = 0;
746 | while lo < hi {
747 | // Unlike in the general case, this expression cannot overflow because `Vec` is limited to
748 | // `isize::MAX` capacity and we disallow ZSTs above. If we needed to support slices or
749 | // vectors of ZSTs, which don't have an upper bound on their size AFAIK, we would need to
750 | // use a slightly less efficient version that cannot overflow: `lo + (hi - lo) / 2`.
751 | let mid = (hi + lo) / 2;
752 |
753 | // LLVM seems to be unable to prove that `mid` is always less than `vec.len()`, so use
754 | // `get_unchecked` to avoid a bounds check since this code is hot.
755 | let el: &T = unsafe { vec.get_unchecked(mid) };
756 | if cmp(el) {
757 | lo = mid + 1;
758 | } else {
759 | hi = mid;
760 | }
761 | }
762 | lo
763 | }
764 |
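
The `binary_search` helper above returns the count of leading elements that satisfy a predicate, and `ExtendWith::count` uses it to find where the run of entries for one key starts. The following is a minimal standalone sketch of that idea, not the crate's code. The names `lower_bound` and `key_range` are hypothetical. It works on plain slices with checked indexing and an overflow-safe midpoint rather than `get_unchecked`, and it finds the end of the run with a second binary search instead of `gallop`.

```rust
use std::ops::Range;

/// Returns the number of leading elements of `items` that satisfy `pred`,
/// assuming `pred` holds for a prefix of `items` and fails for the rest.
/// (Hypothetical helper; the standard library's `partition_point` is equivalent.)
fn lower_bound<T>(items: &[T], mut pred: impl FnMut(&T) -> bool) -> usize {
    let mut lo = 0;
    let mut hi = items.len();
    while lo < hi {
        // Overflow-safe midpoint, so this also works for slices of any length.
        let mid = lo + (hi - lo) / 2;
        if pred(&items[mid]) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    lo
}

/// Half-open index range of the entries in sorted `pairs` whose key equals `key`.
/// (Hypothetical helper mirroring the start/end computation in `ExtendWith::count`.)
fn key_range<K: Ord, V>(pairs: &[(K, V)], key: &K) -> Range<usize> {
    // First index whose key is not smaller than `key`.
    let start = lower_bound(pairs, |kv| &kv.0 < key);
    // Within the remainder, count the entries whose key is still `<= key`.
    let end = start + lower_bound(&pairs[start..], |kv| &kv.0 <= key);
    start..end
}
```

For the sorted input `[(1, 'a'), (2, 'b'), (2, 'c'), (4, 'd')]`, `key_range` yields `1..3` for key `2` and the empty range `3..3` for key `3`, so the length of the range is the number of values an extend-style leaper would propose.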
--------------------------------------------------------------------------------
/src/variable.rs:
--------------------------------------------------------------------------------
1 | use std::cell::RefCell;
2 | use std::io::Write;
3 | use std::iter::FromIterator;
4 | use std::rc::Rc;
5 |
6 | use crate::{
7 | join::{self, JoinInput},
8 | map,
9 | relation::Relation,
10 | treefrog::{self, Leapers},
11 | };
12 |
13 | /// A type that can report on whether it has changed.
14 | pub(crate) trait VariableTrait {
15 | /// Reports whether the variable has changed since it was last asked.
16 | fn changed(&mut self) -> bool;
17 |
18 | /// Dumps statistics about the variable internals, for debug and profiling purposes.
19 | fn dump_stats(&self, round: u32, w: &mut dyn Write);
20 | }
21 |
22 | /// A monotonically increasing set of `Tuple`s.
23 | ///
24 | /// There are three stages in the lifecycle of a tuple:
25 | ///
26 | /// 1. A tuple is added to `self.to_add`, but is not yet visible externally.
27 | /// 2. Newly added tuples are then promoted to `self.recent` for one iteration.
28 | /// 3. After one iteration, recent tuples are moved to `self.stable` for posterity.
29 | ///
30 | /// Each time `self.changed()` is called, the `recent` relation is folded into `stable`,
31 | /// and the `to_add` relations are merged, potentially deduplicated against `stable`, and
32 | /// then made `recent`. This way, across calls to `changed()` all added tuples are in
33 | /// `recent` at least once and eventually all are in `stable`.
34 | ///
35 | /// A `Variable` may optionally be instructed not to de-duplicate its tuples, for reasons
36 | /// of performance. Such a variable cannot be relied on to terminate iterative computation,
37 | /// and it is important that any cycle of derivations have at least one de-duplicating
38 | /// variable on it.
39 | pub struct Variable<Tuple: Ord> {
40 | /// Should the variable be maintained distinctly.
41 | pub(crate) distinct: bool,
42 | /// A useful name for the variable.
43 | pub(crate) name: String,
44 | /// A list of relations whose union are the accepted tuples.
45 | pub stable: Rc<RefCell<Vec<Relation<Tuple>>>>,
46 | /// A list of recent tuples, still to be processed.
47 | pub recent: Rc<RefCell<Relation<Tuple>>>,
48 | /// A list of future tuples, to be introduced.
49 | pub(crate) to_add: Rc<RefCell<Vec<Relation<Tuple>>>>,
50 | }
51 |
52 | impl<Tuple: Ord> Variable<Tuple> {
53 | /// Returns the name used to create this variable.
54 | pub fn name(&self) -> &str {
55 | self.name.as_str()
56 | }
57 |
58 | /// Returns the total number of "stable" tuples in this variable.
59 | pub fn num_stable(&self) -> usize {
60 | self.stable.borrow().iter().map(|x| x.len()).sum()
61 | }
62 |
63 | /// Returns `true` if this variable contains only "stable" tuples.
64 | ///
65 | /// Calling `Iteration::changed()` on such `Variables` will not change them unless new tuples
66 | /// are added.
67 | pub fn is_stable(&self) -> bool {
68 | self.recent.borrow().is_empty() && self.to_add.borrow().is_empty()
69 | }
70 | }
71 |
72 | // Operator implementations.
73 | impl<Tuple: Ord> Variable<Tuple> {
74 | /// Adds tuples that result from joining `input1` and `input2` --
75 | /// each of the inputs must be a set of (Key, Value) tuples. Both
76 | /// `input1` and `input2` must have the same type of key (`K`) but
77 | /// they can have distinct value types (`V1` and `V2`
78 | /// respectively). The `logic` closure will be invoked for each
79 | /// key that appears in both inputs; it is also given the two
80 | /// values, and from those it should construct the resulting
81 | /// value.
82 | ///
83 | /// Note that `input1` must be a variable, but `input2` can be a
84 | /// relation or a variable. Therefore, you cannot join two
85 | /// relations with this method. This is not because the result
86 | /// would be wrong, but because it would be inefficient: the
87 | /// result from such a join cannot vary across iterations (as
88 | /// relations are fixed), so you should prefer to invoke `insert`
89 | /// on a relation created by `Relation::from_join` instead.
90 | ///
91 | /// # Examples
92 | ///
93 | /// This example starts a collection with the pairs (x, x+1) and (x+1, x) for x in 0 .. 10.
94 | /// It then adds pairs (y, z) for which (x, y) and (x, z) are present. Because the initial
95 | /// pairs are symmetric, this should result in all pairs (x, y) for x and y in 0 .. 11.
96 | ///
97 | /// ```
98 | /// use datafrog::{Iteration, Relation};
99 | ///
100 | /// let mut iteration = Iteration::new();
101 | /// let variable = iteration.variable::<(usize, usize)>("source");
102 | /// variable.extend((0 .. 10).map(|x| (x, x + 1)));
103 | /// variable.extend((0 .. 10).map(|x| (x + 1, x)));
104 | ///
105 | /// while iteration.changed() {
106 | /// variable.from_join(&variable, &variable, |&key, &val1, &val2| (val1, val2));
107 | /// }
108 | ///
109 | /// let result = variable.complete();
110 | /// assert_eq!(result.len(), 121);
111 | /// ```
112 | pub fn from_join<'me, K: Ord, V1: Ord, V2: Ord>(
113 | &self,
114 | input1: &'me Variable<(K, V1)>,
115 | input2: impl JoinInput<'me, (K, V2)>,
116 | logic: impl FnMut(&K, &V1, &V2) -> Tuple,
117 | ) {
118 | join::join_into(input1, input2, self, logic)
119 | }
120 |
121 | /// Same as [`Variable::from_join`], but lets you ignore some of the resulting tuples.
122 | ///
123 | /// # Examples
124 | ///
125 | /// This is the same example from `Variable::from_join`, but it filters any tuples where the
126 | /// absolute difference is greater than 3. As a result, it generates all pairs (x, y) for x and
127 | /// y in 0 .. 11 such that |x - y| <= 3.
128 | ///
129 | /// ```
130 | /// use datafrog::{Iteration, Relation};
131 | ///
132 | /// let mut iteration = Iteration::new();
133 | /// let variable = iteration.variable::<(isize, isize)>("source");
134 | /// variable.extend((0 .. 10).map(|x| (x, x + 1)));
135 | /// variable.extend((0 .. 10).map(|x| (x + 1, x)));
136 | ///
137 | /// while iteration.changed() {
138 | /// variable.from_join_filtered(&variable, &variable, |&key, &val1, &val2| {
139 | /// ((val1 - val2).abs() <= 3).then(|| (val1, val2))
140 | /// });
141 | /// }
142 | ///
143 | /// let result = variable.complete();
144 | ///
145 | /// let mut expected_cnt = 0;
146 | /// for i in 0i32..11 {
147 | /// for j in 0i32..11 {
148 | /// if (i - j).abs() <= 3 {
149 | /// expected_cnt += 1;
150 | /// }
151 | /// }
152 | /// }
153 | ///
154 | /// assert_eq!(result.len(), expected_cnt);
155 | /// ```
156 | pub fn from_join_filtered<'me, K: Ord, V1: Ord, V2: Ord>(
157 | &self,
158 | input1: &'me Variable<(K, V1)>,
159 | input2: impl JoinInput<'me, (K, V2)>,
160 | logic: impl FnMut(&K, &V1, &V2) -> Option<Tuple>,
161 | ) {
162 | join::join_and_filter_into(input1, input2, self, logic)
163 | }
164 |
165 | /// Adds tuples from `input1` whose key is not present in `input2`.
166 | ///
167 | /// Note that `input1` must be a variable: if you have a relation
168 | /// instead, you can use `Relation::from_antijoin` and then
169 | /// `Variable::insert`. Note that the result will not vary during
170 | /// the iteration.
171 | ///
172 | /// # Examples
173 | ///
174 | /// This example starts a collection with the pairs (x, x+1) for x in 0 .. 10. It then
175 | /// adds any pairs (x+1,x) for which x is not a multiple of three. That excludes four
176 | /// pairs (for 0, 3, 6, and 9) which should leave us with 16 total pairs.
177 | ///
178 | /// ```
179 | /// use datafrog::{Iteration, Relation};
180 | ///
181 | /// let mut iteration = Iteration::new();
182 | /// let variable = iteration.variable::<(usize, usize)>("source");
183 | /// variable.extend((0 .. 10).map(|x| (x, x + 1)));
184 | ///
185 | /// let relation: Relation<_> = (0 .. 10).filter(|x| x % 3 == 0).collect();
186 | ///
187 | /// while iteration.changed() {
188 | /// variable.from_antijoin(&variable, &relation, |&key, &val| (val, key));
189 | /// }
190 | ///
191 | /// let result = variable.complete();
192 | /// assert_eq!(result.len(), 16);
193 | /// ```
194 | pub fn from_antijoin<K: Ord, V: Ord>(
195 | &self,
196 | input1: &Variable<(K, V)>,
197 | input2: &Relation<K>,
198 | logic: impl FnMut(&K, &V) -> Tuple,
199 | ) {
200 | self.insert(join::antijoin(&input1.recent.borrow(), input2, logic))
201 | }
202 |
203 | /// Adds tuples that result from mapping `input`.
204 | ///
205 | /// # Examples
206 | ///
207 | /// This example starts a collection with the pairs (x, x) for x in 0 .. 10. It then
208 | /// repeatedly adds any pairs (x, z) for (x, y) in the collection, where z is the Collatz
209 | /// step for y: it is y/2 if y is even, and 3*y + 1 if y is odd. This produces all of the
210 | /// pairs (x, y) where x visits y as part of its Collatz journey.
211 | ///
212 | /// ```
213 | /// use datafrog::{Iteration, Relation};
214 | ///
215 | /// let mut iteration = Iteration::new();
216 | /// let variable = iteration.variable::<(usize, usize)>("source");
217 | /// variable.extend((0 .. 10).map(|x| (x, x)));
218 | ///
219 | /// while iteration.changed() {
220 | /// variable.from_map(&variable, |&(key, val)|
221 | /// if val % 2 == 0 {
222 | /// (key, val/2)
223 | /// }
224 | /// else {
225 | /// (key, 3*val + 1)
226 | /// });
227 | /// }
228 | ///
229 | /// let result = variable.complete();
230 | /// assert_eq!(result.len(), 74);
231 | /// ```
232 | pub fn from_map<T2: Ord>(&self, input: &Variable<T2>, logic: impl FnMut(&T2) -> Tuple) {
233 | map::map_into(input, self, logic)
234 | }
235 |
236 | /// Adds tuples that result from combining `source` with the
237 | /// relations given in `leapers`. This operation is very flexible
238 | /// and can be used to do a combination of joins and anti-joins.
239 | /// The main limitation is that the things being combined must
240 | /// consist of one dynamic variable (`source`) and then several
241 | /// fixed relations (`leapers`).
242 | ///
243 | /// The idea is as follows:
244 | ///
245 | /// - You will be inserting new tuples that result from joining (and anti-joining)
246 | /// some dynamic variable `source` of source tuples (`SourceTuple`)
247 | /// with some set of values (of type `Val`).
248 | /// - You provide these values by combining `source` with a set of leapers
249 | /// `leapers`, each of which is derived from a fixed relation. The `leapers`
250 | /// should be either a single leaper (of suitable type) or else a tuple of leapers.
251 | /// You can create a leaper in one of two ways:
252 | /// - Extension: In this case, you have a relation of type `(K, Val)` for some
253 | /// type `K`. You provide a closure that maps from `SourceTuple` to the key
254 | /// `K`. If you use `relation.extend_with`, then any `Val` values the
255 | /// relation provides will be added to the set of values; if you use
256 | /// `extend_anti`, then the `Val` values will be removed.
257 | /// - Filtering: In this case, you have a relation of type `K` for some
258 | /// type `K` and you provide a closure that maps from `SourceTuple` to
259 | /// the key `K`. Filters don't provide values but they remove source
260 | /// tuples.
261 | /// - Finally, you get a callback `logic` that accepts each `(SourceTuple, Val)`
262 | /// that was successfully joined (and not filtered) and which maps to the
263 | /// type of this variable.
264 | pub fn from_leapjoin<'leap, SourceTuple: Ord, Val: Ord + 'leap>(
265 | &self,
266 | source: &Variable<SourceTuple>,
267 | leapers: impl Leapers<'leap, SourceTuple, Val>,
268 | logic: impl FnMut(&SourceTuple, &Val) -> Tuple,
269 | ) {
270 | self.insert(treefrog::leapjoin(&source.recent.borrow(), leapers, logic));
271 | }
272 | }
273 |
274 | impl<Tuple: Ord> Clone for Variable<Tuple> {
275 | fn clone(&self) -> Self {
276 | Variable {
277 | distinct: self.distinct,
278 | name: self.name.clone(),
279 | stable: self.stable.clone(),
280 | recent: self.recent.clone(),
281 | to_add: self.to_add.clone(),
282 | }
283 | }
284 | }
285 |
286 | impl<Tuple: Ord> Variable<Tuple> {
287 | pub(crate) fn new(name: &str) -> Self {
288 | Variable {
289 | distinct: true,
290 | name: name.to_string(),
291 | stable: Rc::new(RefCell::new(Vec::new())),
292 | recent: Rc::new(RefCell::new(Vec::new().into())),
293 | to_add: Rc::new(RefCell::new(Vec::new())),
294 | }
295 | }
296 |
297 | /// Inserts a relation into the variable.
298 | ///
299 | /// This is most commonly used to load initial values into a variable.
300 | /// It is not obvious that it should be commonly used otherwise, but
301 | /// it should not be harmful.
302 | pub fn insert(&self, relation: Relation<Tuple>) {
303 | if !relation.is_empty() {
304 | self.to_add.borrow_mut().push(relation);
305 | }
306 | }
307 |
308 | /// Extend the variable with values from the iterator.
309 | ///
310 | /// This is most commonly used to load initial values into a variable.
311 | /// It is not obvious that it should be commonly used otherwise, but
312 | /// it should not be harmful.
313 | pub fn extend<T>(&self, iterator: impl IntoIterator<Item = T>)
314 | where
315 | Relation<Tuple>: FromIterator<T>,
316 | {
317 | self.insert(iterator.into_iter().collect());
318 | }
319 |
320 | /// Consumes the variable and returns a relation.
321 | ///
322 | /// This method removes the ability for the variable to develop, and
323 | /// flattens all internal tuples down to one relation. The method
324 | /// asserts that iteration has completed, in that `self.recent` and
325 | /// `self.to_add` should both be empty.
326 | pub fn complete(self) -> Relation<Tuple> {
327 | assert!(self.is_stable());
328 | let mut result: Relation<Tuple> = Vec::new().into();
329 | while let Some(batch) = self.stable.borrow_mut().pop() {
330 | result = result.merge(batch);
331 | }
332 | result
333 | }
334 | }
335 |
336 | impl<Tuple: Ord> VariableTrait for Variable<Tuple> {
337 | fn changed(&mut self) -> bool {
338 | // 1. Merge self.recent into self.stable.
339 | if !self.recent.borrow().is_empty() {
340 | let mut recent =
341 | ::std::mem::replace(&mut (*self.recent.borrow_mut()), Vec::new().into());
342 | while self
343 | .stable
344 | .borrow()
345 | .last()
346 | .map(|x| x.len() <= 2 * recent.len())
347 | == Some(true)
348 | {
349 | let last = self.stable.borrow_mut().pop().unwrap();
350 | recent = recent.merge(last);
351 | }
352 | self.stable.borrow_mut().push(recent);
353 | }
354 |
355 | // 2. Move self.to_add into self.recent.
356 | let to_add = self.to_add.borrow_mut().pop();
357 | if let Some(mut to_add) = to_add {
358 | while let Some(to_add_more) = self.to_add.borrow_mut().pop() {
359 | to_add = to_add.merge(to_add_more);
360 | }
361 | // 2b. Restrict `to_add` to tuples not in `self.stable`.
362 | if self.distinct {
363 | for batch in self.stable.borrow().iter() {
364 | let mut slice = &batch[..];
365 | // Only gallop if the slice is relatively large.
366 | if slice.len() > 4 * to_add.elements.len() {
367 | to_add.elements.retain(|x| {
368 | slice = join::gallop(slice, |y| y < x);
369 | slice.is_empty() || &slice[0] != x
370 | });
371 | } else {
372 | to_add.elements.retain(|x| {
373 | while !slice.is_empty() && &slice[0] < x {
374 | slice = &slice[1..];
375 | }
376 | slice.is_empty() || &slice[0] != x
377 | });
378 | }
379 | }
380 | }
381 | *self.recent.borrow_mut() = to_add;
382 | }
383 |
384 | // let mut total = 0;
385 | // for tuple in self.stable.borrow().iter() {
386 | // total += tuple.len();
387 | // }
388 |
389 | // println!("Variable\t{}\t{}\t{}", self.name, total, self.recent.borrow().len());
390 |
391 | !self.recent.borrow().is_empty()
392 | }
393 |
394 | fn dump_stats(&self, round: u32, w: &mut dyn Write) {
395 | let mut stable_count = 0;
396 | for tuple in self.stable.borrow().iter() {
397 | stable_count += tuple.len();
398 | }
399 |
400 | writeln!(
401 | w,
402 | "{:?},{},{},{}",
403 | self.name,
404 | round,
405 | stable_count,
406 | self.recent.borrow().len()
407 | )
408 | .unwrap_or_else(|e| {
409 | panic!(
410 | "Couldn't write stats for variable {}, round {}: {}",
411 | self.name, round, e
412 | )
413 | });
414 | }
415 | }
416 |
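Step 1 of `changed` keeps the batches in `self.stable` roughly geometric in size: before `recent` is pushed, it absorbs every trailing batch whose length is at most twice its own. The following is a minimal standalone sketch of that policy, not the crate's code. It assumes plain `Vec<T>` batches instead of `Relation`, and `merge_sorted` and `push_stable` are hypothetical names; `merge_sorted` is a simple sort-and-dedup stand-in for `Relation::merge`.

```rust
/// Hypothetical stand-in for `Relation::merge`: unions two sorted,
/// deduplicated vectors into one sorted, deduplicated vector.
fn merge_sorted<T: Ord>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    a.extend(b);
    a.sort();
    a.dedup();
    a
}

/// Pushes `recent` onto `stable`, first absorbing every trailing batch
/// whose length is at most twice the length of the growing `recent`.
fn push_stable<T: Ord>(stable: &mut Vec<Vec<T>>, mut recent: Vec<T>) {
    while stable
        .last()
        .map_or(false, |last| last.len() <= 2 * recent.len())
    {
        // The trailing batch is small relative to `recent`: fold it in.
        let last = stable.pop().unwrap();
        recent = merge_sorted(recent, last);
    }
    stable.push(recent);
}
```

Under this policy a large early batch is left alone while small later batches are folded together, so each tuple is re-merged only when it sits in a batch no larger than twice the incoming one.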
417 | // impl Drop for Variable {
418 | // fn drop(&mut self) {
419 | // let mut total = 0;
420 | // for batch in self.stable.borrow().iter() {
421 | // total += batch.len();
422 | // }
423 | // println!("FINAL: {:?}\t{:?}", self.name, total);
424 | // }
425 | // }
426 |
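Step 2b of `changed` removes from `to_add` every tuple already present in a stable batch, relying on both sides being sorted so the batch is scanned only forward. The sketch below isolates the linear-scan branch, assuming plain `Vec<T>` and `&[T]` in place of `Relation`; `retain_new` is a hypothetical name, and the galloping branch (`join::gallop`) is not reproduced here.

```rust
/// Removes from `to_add` every element that also occurs in `batch`.
/// Both inputs must be sorted ascending; `batch` is only scanned forward.
fn retain_new<T: Ord>(to_add: &mut Vec<T>, batch: &[T]) {
    let mut slice = batch;
    to_add.retain(|x| {
        // Skip batch elements strictly smaller than `x`.
        while !slice.is_empty() && &slice[0] < x {
            slice = &slice[1..];
        }
        // Keep `x` only if the batch holds no equal element.
        slice.is_empty() || &slice[0] != x
    });
}
```

Because `slice` only ever shrinks from the front, one pass over `to_add` costs at most one pass over `batch`. The listing switches to galloping when the batch is more than four times longer than `to_add`.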
--------------------------------------------------------------------------------
/triagebot.toml:
--------------------------------------------------------------------------------
1 | [assign]
2 |
--------------------------------------------------------------------------------