├── .gitignore
├── Cargo.toml
├── README.md
├── examples
    └── walkthrough.rs
└── src
    ├── event.rs
    └── lib.rs


/.gitignore:
--------------------------------------------------------------------------------
/target
Cargo.lock

--------------------------------------------------------------------------------
/Cargo.toml:
--------------------------------------------------------------------------------
[package]
name = "dip"
version = "0.1.0"
authors = ["theotherphil"]
edition = "2018"

[dependencies]

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Dip

A toy incremental computation framework, intended as an executable introduction to the approach used by [salsa]. The basic setup is introduced in the [salsa book]:

>The key idea of salsa is that you define your program as a set of queries. Every query is used like a function `K -> V` that maps from
>some key of type `K` to a value of type `V`. Queries come in two basic varieties:
>
>* Inputs: the base inputs to your system. You can change these whenever you like.
>* Functions: pure functions (no side effects) that transform your inputs into other values. The results of queries are memoized to
>  avoid recomputing them a lot. When you make changes to the inputs, we'll figure out (fairly intelligently) when we can re-use these
>  memoized values and when we have to recompute them.

This library implements enough of the memoization strategy from salsa to hopefully give a useful introduction to the approach used, without having to worry about all the other details that would be required in a real framework.
In particular, we make (at least) the following simplifications:
* Salsa queries can specify their own key and value types, but Dip uses a concrete enum `Key` for all query keys, and a type alias `Value = i32` for outputs.
* Salsa is thread-safe and supports query cancellation. Dip always runs queries to completion on a single thread.
* Salsa supports a range of caching and cache eviction policies. Dip caches all query outputs and never evicts anything.
* Salsa works hard to give good performance. Dip does not.
* Salsa uses procedural macros to provide a user-friendly API. Dip requires the user to do a lot of manual plumbing themselves.

The best starting point is probably to run the `walkthrough` example.

```
cargo run --example walkthrough
```

This dumps a fairly detailed trace from a series of query executions to the terminal, along with some explanatory notes.

The implementation lives entirely within `src/lib.rs`, except for some code in `src/event.rs` that is used solely for logging. `src/lib.rs` is intended to make sense when read from top to bottom.

Example output from a query evaluation (taken from the output of running the example above):

Query one_year_fee(17)
|  Existing memo: (value: 100, verified_at: 3, changed_at: 3, dependencies: {(discount_age_limit, ()), (base_fee, ())})
|  Checking inputs to see if any have changed since revision 3, when this memo was last verified
|  |  Query discount_age_limit()
|  |  |  Existing memo: (value: 16, verified_at: 3, changed_at: 3, dependencies: {})
|  |  |  Memo is valid as this is an input query
|  |  |  Updating stored memo to: (value: 16, verified_at: 4, changed_at: 3, dependencies: {})
|  |  Dependency discount_age_limit() last changed at revision 3
|  |  Query base_fee()
|  |  |  Existing memo: (value: 100, verified_at: 3, changed_at: 1, dependencies: {})
|  |  |  Memo is valid as this is an input query
|  |  |  Updating stored memo to: (value: 100, verified_at: 4, changed_at: 1, dependencies: {})
|  |  Dependency base_fee() last changed at revision 1
|  Memo is valid as no inputs have changed
|  Updating stored memo to: (value: 100, verified_at: 4, changed_at: 3, dependencies: {(discount_age_limit, ()), (base_fee, ())})

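The check that this trace walks through can be sketched in isolation. The following is a minimal, hypothetical model rather than Dip's actual code: the `Stamps` type and the functions `memo_is_valid`, `reverify` and `replay_trace` are names invented for this illustration, and the sketch assumes that each dependency's `changed_at` is already known, whereas Dip may have to recurse to find it. It replays the revision numbers from the trace: a memo stays valid when no dependency has a `changed_at` later than the memo's `verified_at`, and re-verifying a memo only moves `verified_at` forward.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the two revision stamps Dip keeps on each memo.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Stamps {
    verified_at: usize,
    changed_at: usize,
}

/// A memo is still valid if no dependency changed after the memo was last verified.
/// Assumption made for this sketch: a dependency with no recorded stamps counts as changed.
fn memo_is_valid(memo: Stamps, dependencies: &[&str], stamps: &HashMap<&str, Stamps>) -> bool {
    dependencies.iter().all(|dep| {
        stamps
            .get(*dep)
            .map_or(false, |s| s.changed_at <= memo.verified_at)
    })
}

/// Re-verifying a valid memo updates `verified_at` and leaves `changed_at` alone.
fn reverify(memo: Stamps, current_revision: usize) -> Stamps {
    Stamps {
        verified_at: current_revision,
        ..memo
    }
}

/// Replays the numbers from the trace: one_year_fee(17) at database revision 4.
fn replay_trace() -> (bool, Stamps) {
    let mut stamps = HashMap::new();
    stamps.insert("discount_age_limit", Stamps { verified_at: 4, changed_at: 3 });
    stamps.insert("base_fee", Stamps { verified_at: 4, changed_at: 1 });

    // The memo for one_year_fee(17) was last verified at revision 3.
    let memo = Stamps { verified_at: 3, changed_at: 3 };
    let valid = memo_is_valid(memo, &["discount_age_limit", "base_fee"], &stamps);
    (valid, reverify(memo, 4))
}
```

Under these assumptions `replay_trace()` reports the memo as valid and yields stamps of `verified_at: 4, changed_at: 3`, which matches the last line of the trace. Keeping `changed_at` separate from `verified_at` is what lets a query that depends on `one_year_fee(17)` see that its value did not change, even though it was re-checked at a later revision.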
[salsa]: https://github.com/salsa-rs/salsa
[salsa book]: https://salsa-rs.github.io/salsa/how_salsa_works.html

--------------------------------------------------------------------------------
/examples/walkthrough.rs:
--------------------------------------------------------------------------------
//! In our very contrived example you own a company providing training services and need to quote a
//! subscription fee to your customers.
//!
//! The calculation is very simple: you have a fixed yearly base fee, but thanks to government
//! funding can provide a discounted price to school-aged customers.

// `salsa` generates a user-friendly API using procedural macros, but the users of `dip::Database`
// have to do most of the plumbing themselves.
//
// In our example we hide this plumbing from the "end-user" by exporting only a use-case-specific
// `CostsDatabase` trait from this nested module.
mod implementation {
    // The meaning of these types is explained in src/lib.rs, which is best read from top to bottom.
    use dip::{Database, Key, QueryId, Value};
    use std::collections::HashMap;

    // All dip queries have signature (&mut Database, Key) -> Value, but we define some type
    // aliases to make the example code easier to follow.
    type Dollars = i32;
    type Years = i32;

    // Identifiers for the queries (inputs or derived) used in our example.
    // We hide these from the end-user behind the `CostsDatabase` trait below.
    const BASE_FEE: QueryId = "base_fee";
    const DISCOUNT_AGE_LIMIT: QueryId = "discount_age_limit";
    const DISCOUNT_AMOUNT: QueryId = "discount_amount";
    const ONE_YEAR_FEE: QueryId = "one_year_fee";
    const TWO_YEAR_FEE: QueryId = "two_year_fee";

    // This trait allows for more ergonomic-looking code in this example's `main` function, but
    // has no special significance - we could equally well have written this example with
    // freestanding functions or by using the methods on `dip::Database` directly from `main`.
    //
    // The meaning of the various methods is explained in the `impl` block below.
    pub trait CostsDatabase {
        // Setting inputs
        fn set_discount_age_limit(&mut self, age_limit: Years);
        fn set_base_fee(&mut self, base_fee: Dollars);
        fn set_discount_amount(&mut self, discount_amount: Dollars);

        // Reading inputs
        fn discount_age_limit(&mut self) -> Years;
        fn base_fee(&mut self) -> Dollars;
        fn discount_amount(&mut self) -> Dollars;

        // Derived queries
        fn one_year_fee(&mut self, current_age: Years) -> Dollars;
        fn two_year_fee(&mut self, current_age: Years) -> Dollars;
    }

    impl CostsDatabase for Database {
        // The ids DISCOUNT_AGE_LIMIT, BASE_FEE and DISCOUNT_AMOUNT are registered as input ids
        // in the call to `dip::Database::new` in the `create_database` function below.
        //
        // Input queries take an `Into<Key>` as input. In our example they have no logical inputs,
        // so we use `()`.
        fn set_discount_age_limit(&mut self, age_limit: Years) {
            self.set(DISCOUNT_AGE_LIMIT, (), age_limit);
        }
        fn set_base_fee(&mut self, base_fee: Dollars) {
            self.set(BASE_FEE, (), base_fee);
        }
        fn set_discount_amount(&mut self, discount_amount: Dollars) {
            self.set(DISCOUNT_AMOUNT, (), discount_amount);
        }

        // The API for querying inputs is identical to non-input queries.
        fn discount_age_limit(&mut self) -> Years {
            self.get(DISCOUNT_AGE_LIMIT, ())
        }
        fn base_fee(&mut self) -> Dollars {
            self.get(BASE_FEE, ())
        }
        fn discount_amount(&mut self) -> Dollars {
            self.get(DISCOUNT_AMOUNT, ())
        }

        // Compute the one year membership fee for someone of the given age.
        fn one_year_fee(&mut self, current_age: Years) -> Dollars {
            self.get(ONE_YEAR_FEE, current_age)
        }

        // Compute the two year membership fee for someone of the given age.
        fn two_year_fee(&mut self, current_age: Years) -> Dollars {
            self.get(TWO_YEAR_FEE, current_age)
        }
    }

    // See comments in `create_database`.
    fn one_year_fee_query(db: &mut Database, current_age: Key) -> Dollars {
        let current_age: Years = current_age.into();

        // Customers receive a discount if they're <= the discount age limit.
        if current_age <= db.discount_age_limit() {
            db.base_fee() - db.discount_amount()
        } else {
            db.base_fee()
        }
    }

    // See comments in `create_database`.
    fn two_year_fee_query(db: &mut Database, current_age: Key) -> Dollars {
        let current_age: Years = current_age.into();

        // Compute the fees for this year and next year and add them (no loyalty discounts here).
        //
        // This is equal to `2 * one_year_fee` _unless_ you're currently at the age limit for a
        // young person's discount.
        let fee_this_year = db.one_year_fee(current_age);
        let fee_next_year = db.one_year_fee(current_age + 1);
        fee_this_year + fee_next_year
    }

    pub fn create_database() -> impl CostsDatabase {
        // Unlike in salsa, our users need to wire up all the queries themselves.
        //
        // First, we define the set of input ids. These are queries whose values must be provided
        // directly by the user.
        let input_ids = vec![BASE_FEE, DISCOUNT_AGE_LIMIT, DISCOUNT_AMOUNT];

        // Dependency tracking and memoisation is defined in terms of QueryIds. If dip determines
        // that it needs to (re)compute some value then it needs to be able to look up the
        // appropriate query function from its id. This lookup is provided directly in the
        // constructor to `Database`.
        let mut query_functions =
            HashMap::<QueryId, Box<dyn Fn(&mut Database, Key) -> Value>>::new();

        // Note that we only need to register functions for derived queries - no user-provided code
        // is executed when reading input queries as we just read their cached values directly.
        query_functions.insert(ONE_YEAR_FEE, Box::new(one_year_fee_query));
        query_functions.insert(TWO_YEAR_FEE, Box::new(two_year_fee_query));

        // Return a configured database and hide the plumbing from the end-users behind a trait.
        Database::new(input_ids, query_functions)
    }
}

use implementation::{create_database, CostsDatabase};

fn note(message: &str) {
    println!("\n\n****");
    for line in message.lines() {
        println!("** {}", line.trim());
    }
    println!("**");
    println!();
}

fn main() {
    let mut db = create_database();

    note(
        r#"Contrived setup: you own a company that provides training services, and need to quote
        a subscription fee to potential customers.

        The calculation is very simple: you have a fixed yearly base fee, but thanks to government funding
        can provide a discounted price to school-aged customers.

        The Database used has three inputs:
        * base_fee()
        * discount_amount()
        * discount_age_limit()

        And two derived queries:
        * one_year_fee(age: Years) -> Dollars
        * two_year_fee(age: Years) -> Dollars

        Pseudo-code for the two derived queries:
        * one_year_fee(age) = if age <= discount_age_limit { base_fee - discount_amount } else { base_fee }
        * two_year_fee(age) = one_year_fee(age) + one_year_fee(age + 1)

        In the execution below we:
        * Set values for all of the database inputs
        * Run the derived queries for a few inputs, noting where existing results are being reused
        * Change some of the input values, rerun some derived queries and note where and why cached values require recalculation

        Output without leading '*'s is from Dip - the Database type emits Events and these are written to the terminal."#,
    );

    note(r#"Before we can query fees we need to set the input values."#);
    db.set_base_fee(100);
    db.set_discount_amount(30);
    db.set_discount_age_limit(16);

    note(
        r#"16 is the maximum age for a young person's discount, so the one year fee for a 16 year old is base_fee - discount_amount."#,
    );
    assert_eq!(db.one_year_fee(16), 70);

    note(
        r#"17 is greater than the maximum age for a young person's discount, so the one year fee for a 17 year old is base_fee."#,
    );
    assert_eq!(db.one_year_fee(17), 100);

    note(
        r#"To compute the two year fee for a 17 year old we need to know the one year fee for a 17 year old and the one year fee for
        an 18 year old.
        We have already computed the first of these, so will re-use the cached value for one_year_fee(17) and
        compute one_year_fee(18)."#,
    );
    assert_eq!(db.two_year_fee(17), 200);

    note(r#"Update the discount provided to people under the discount age limit."#);
    db.set_discount_amount(40);

    note(
        r#"The memo for one_year_fee(17) is out of date, as the database revision has increased since it was last verified.
        However, as neither the age limit threshold nor the base fees have changed its value is still valid."#,
    );
    assert_eq!(db.one_year_fee(17), 100);

    note(
        r#"As 16 <= discount_age_limit we will spot that one of the inputs to one_year_fee(16) has changed and have to recompute."#,
    );
    assert_eq!(db.one_year_fee(16), 60);

    note(
        r#"Government funding criteria have changed - we can now also provide discounts to 17 year olds."#,
    );
    db.set_discount_age_limit(17);

    note(
        r#"Both one_year_fee(17) and one_year_fee(18) query the age limit, so both have potentially changed - we will need
        to rerun queries to tell. The value of one_year_fee(18) does not change, but the value of one_year_fee(17) does
        and so two_year_fee(17) also needs to be recomputed."#,
    );
    assert_eq!(db.two_year_fee(17), 160);
}

--------------------------------------------------------------------------------
/src/event.rs:
--------------------------------------------------------------------------------
//! An `Event` type, and helper functions to log these to the console.
//! These are solely for debugging and tracing purposes - they do not affect query evaluation.

use crate::{Key, Memo, Slot, Value};
use std::fmt::Write;

/// `Database` is currently hardcoded to use `EventLogger` to log these events to the console.
#[derive(Clone)]
pub(crate) enum Event {
    Set(Slot, Value, usize),
    Get(Slot),
    StartedQueryEvaluation,
    CompletedQueryEvaluation,
    StoreMemo(Option<Memo>, Memo),
    ReadMemo(Option<Memo>),
    MemoForInputQuery,
    MemoVerifiedAtCurrentRevision,
    ValueComparison(Value, Value, usize),
    StartedInputChecks(usize),
    CompletedInputChecks(bool),
    ChangedAt(Slot, usize),
    PushActiveQuery,
    PopActiveQuery,
}

/// Logs `Events` to the console.
pub(crate) struct EventLogger {
    indent: usize,
}

/// Helper macro used in `EventLogger` to make it slightly less verbose to log indented lines.
macro_rules! log {
    ($self:expr, $($arg:tt)+) => {{
        print!("{}", Self::TAB.repeat($self.indent));
        println!($($arg)+)
    }}
}

impl EventLogger {
    pub(crate) fn new() -> EventLogger {
        EventLogger { indent: 0 }
    }

    /// Logs an `Event` to the console.
    pub(crate) fn log_event(&mut self, event: &Event) {
        match event {
            Event::Set(slot, value, revision) => {
                log!(
                    self,
                    "Setting ({}, {}) to {}",
                    slot.id,
                    print_key(&slot.key),
                    value
                );
                log!(self, "Global revision is now {}", revision);
            }
            Event::Get(slot) => {
                log!(self, "Query {}", print_slot_as_function_call(slot));
            }
            Event::StartedQueryEvaluation => {
                log!(self, "Running query function");
                self.indent += 1;
            }
            Event::CompletedQueryEvaluation => {
                self.indent -= 1;
            }
            Event::StoreMemo(old_memo, memo) => {
                if old_memo.is_some() {
                    log!(self, "Updating stored memo to: {}", print_memo(memo))
                } else {
                    log!(self, "Storing memo: {}", print_memo(memo))
                }
            }
            Event::ReadMemo(memo) => {
                match memo {
                    Some(memo) => log!(self, "Existing memo: {}", print_memo(memo)),
                    None => log!(self, "No memo currently exists"),
                };
            }
            Event::ValueComparison(old_value,
                new_value, current_revision) => {
                let result = match old_value == new_value {
                    true => format!(
                        "New value {} is the same as the memo value, so not updating changed_at",
                        new_value
                    ),
                    false => format!(
                        "New value {} != memo value {}, so updating changed_at to {}",
                        new_value, old_value, current_revision
                    ),
                };
                log!(self, "{}", result);
            }
            Event::StartedInputChecks(verified_at) => {
                log!(
                    self,
                    "Checking inputs to see if any have changed since revision {}, when this memo was last verified",
                    verified_at
                );
                self.indent += 1;
            }
            Event::CompletedInputChecks(any_inputs_have_changed) => {
                self.indent -= 1;
                let result = match any_inputs_have_changed {
                    false => "valid as no inputs have changed",
                    true => "invalid as an input has changed",
                };
                log!(self, "Memo is {}", result)
            }
            Event::MemoForInputQuery => {
                log!(self, "Memo is valid as this is an input query");
            }
            Event::MemoVerifiedAtCurrentRevision => {
                log!(
                    self,
                    "Memo is valid as it was verified at the current revision"
                );
            }
            Event::ChangedAt(slot, changed_at) => {
                log!(
                    self,
                    "Dependency {} last changed at revision {}",
                    print_slot_as_function_call(slot),
                    changed_at,
                );
            }
            Event::PushActiveQuery => {
                self.push();
            }
            Event::PopActiveQuery => {
                self.pop();
            }
        };
    }

    fn push(&mut self) {
        self.indent += 1;
    }

    fn pop(&mut self) {
        self.indent -= 1;
    }

    const TAB: &'static str = "| ";
}

fn print_memo(memo: &Memo) -> String {
    let mut dependencies = String::new();
    write!(&mut dependencies, "{{").unwrap();
    let mut first = true;
    for dependency in &memo.dependencies {
        if !first {
            write!(&mut dependencies, ", ").unwrap();
        }
        write!(
            &mut dependencies,
            "({}, {})",
            dependency.id,
            print_key(&dependency.key)
        )
        .unwrap();
        first = false;
    }
    write!(&mut dependencies, "}}").unwrap();
    format!(
        "(value: {}, verified_at: {}, changed_at: {}, dependencies: {})",
        memo.value, memo.verified_at, memo.changed_at, dependencies
    )
}

fn print_slot_as_function_call(slot: &Slot) -> String {
    let v = match slot.key {
        Key::Void => "".to_string(),
        Key::Int(x) => x.to_string(),
    };
    format!("{}({})", slot.id, v)
}

fn print_key(key: &Key) -> String {
    let v = match key {
        Key::Void => "()".to_string(),
        Key::Int(x) => x.to_string(),
    };
    format!("{}", v)
}

--------------------------------------------------------------------------------
/src/lib.rs:
--------------------------------------------------------------------------------
//! This file contains the whole framework implementation, except for some logging code in event.rs.
//! It is intended to be readable from top to bottom.

use std::fmt::Debug;
use std::{
    collections::{HashMap, HashSet},
    hash::Hash,
};

// The `event` module contains logging code only - it can safely be ignored when reading this file.
pub mod event;
use event::{Event, EventLogger};

// Salsa supports custom key and value types for queries.
// Dip does not - all keys must be of type `Key`, and all outputs must be of type `Value`.
pub type Value = i32;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Key {
    Void,
    Int(i32),
}

// Some From/Into impls for `Key`, as a minor concession to user ergonomics.
impl From<()> for Key {
    fn from(_: ()) -> Self {
        Key::Void
    }
}
impl From<i32> for Key {
    fn from(x: i32) -> Self {
        Key::Int(x)
    }
}
impl From<Key> for () {
    fn from(key: Key) -> () {
        match key {
            Key::Void => (),
            _ => panic!("Key type mismatch"),
        }
    }
}
impl From<Key> for i32 {
    fn from(key: Key) -> i32 {
        match key {
            Key::Int(x) => x,
            _ => panic!("Key type mismatch"),
        }
    }
}

/// A `Database` needs to know about all possible queries at the point where it is constructed.
///
/// Queries come in two varieties:
/// * Input queries, whose values are set explicitly by the user.
/// * Derived queries, whose values are computed from other queries.
///
/// Both kinds of queries are identified by `QueryIds`.
pub type QueryId = &'static str;

/// A `Slot` identifies a location in which to cache a query result.
/// Every query takes a `Key` as input, and to uniquely identify a query evaluation
/// you need to know both the id of the query and the inputs used.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Slot {
    id: QueryId,
    key: Key,
}

impl Slot {
    fn new(id: QueryId, key: Key) -> Self {
        Self { id, key }
    }
}

/// The output of a query, together with the information needed to work out whether its value is still valid.
#[derive(Debug, Clone)]
struct Memo {
    /// The output of the query.
    value: Value,
    /// When the user sets the value for an input query the database revision increases.
    ///
    /// This field tells us the most recent revision at which we validated the contents of this memo.
    /// If `verified_at == db.revision` then we know that the `Memo` is valid.
    verified_at: usize,
    /// The last revision at which the value of the memo in this slot changed.
    /// When rerunning a query
    /// we only update `changed_at` if the output value has actually changed.
    changed_at: usize,
    /// The other queries (and keys used when calling those queries) we used to compute this value.
    ///
    /// The value of a memo is guaranteed to be valid if none of its dependencies have changed since it
    /// was last verified (as queries are required to be pure functions).
    ///
    /// If the values of any dependencies have changed since this memo was verified then the value in
    /// this memo is no longer valid and we need to recompute it to see if its value has changed.
    dependencies: HashSet<Slot>,
}

/// A query output, together with the latest revision at which the output of this query changed.
struct StampedValue {
    value: Value,
    changed_at: usize,
}

impl StampedValue {
    fn new(value: Value, changed_at: usize) -> Self {
        Self { value, changed_at }
    }
}

/// Where everything happens.
///
/// A `Database` tracks the dependencies between queries, caches results, and contains
/// the logic to determine when cached results need to be recomputed.
pub struct Database {
    /// The ids of input queries, i.e. those whose values will be set directly by the user
    /// rather than computed from the values of other queries.
    input_ids: Vec<QueryId>,
    /// The functions used to compute the values for derived queries.
    query_functions: HashMap<QueryId, Box<dyn Fn(&mut Database, Key) -> Value>>,
    /// Cached query results, for both input and derived queries.
    storage: HashMap<Slot, Memo>,
    /// The database revision is updated every time the user sets a value for an input query.
    revision: usize,
    /// When running queries (or when checking whether a cached result is still valid), the
    /// database will evaluate other queries.
    ///
    /// When evaluating a query we add this call (i.e. the (id, key) pair) to the top (i.e.
    /// last)
    /// element in the `active_queries` stack to record the dependency, and then push a fresh
    /// hash set onto the stack for the newly active query.
    active_queries: Vec<HashSet<Slot>>,
    /// Logs information about query execution to the console.
    /// Run `cargo run --example walkthrough` to see example output.
    logger: EventLogger,
}

// A helper macro to reduce the verbosity of event logging inside methods in `Database`.
// You can safely ignore this macro, as well as all uses of it inside `Database`.
macro_rules! event {
    ($self:expr, $event:path) => {{
        let event = $event;
        $self.logger.log_event(&event)
    }};
    ($self:expr, $event:path, $($arg:expr),*) => {{
        let event = $event($($arg.clone()),*);
        $self.logger.log_event(&event)
    }}
}

impl Database {
    /// `Database` needs to know about all the queries that it will be executing at construction.
    pub fn new(
        input_ids: Vec<QueryId>,
        query_functions: HashMap<QueryId, Box<dyn Fn(&mut Database, Key) -> Value>>,
    ) -> Database {
        Database {
            input_ids,
            query_functions,
            storage: HashMap::new(),
            revision: 0,
            active_queries: vec![],
            logger: EventLogger::new(),
        }
    }

    /// Sets the user-provided value for an input query.
    ///
    /// The `Into<Key>` bound is just to make this slightly more ergonomic - users can pass
    /// `()` or an `i32` rather than needing to wrap these in a `Key` themselves.
    pub fn set<K: Into<Key>>(&mut self, id: QueryId, key: K, value: Value) {
        assert!(
            self.input_ids.contains(&id),
            "{} is not a valid input id",
            id
        );

        // Storage is indexed by slots - a query call is identified by a query id
        // and a key. Note that input queries also take a key, but a key of Key::Void
        // may be used for (input or derived) queries which logically take no key values.
        let slot = Slot::new(id, key.into());

        // As all query functions are pure, the only way for database state to change is
        // in response to this method being called. Each time an input is set we update
        // the database revision.
        self.revision = self.revision + 1;

        event!(self, Event::Set, slot, value, self.revision);

        // If a memo exists and the new value is the same as the old value then don't
        // update `changed_at`.
        let changed_at = self.read_memo(slot)
            .filter(|m| m.value == value)
            .map(|m| m.changed_at)
            .unwrap_or(self.revision);

        // Input queries do not depend on any other queries, so their dependency sets are
        // always empty.
        let memo = Memo {
            value,
            verified_at: self.revision,
            changed_at,
            dependencies: HashSet::new(),
        };

        // Helper method that stores the memo in `self.storage` and emits an Event reporting this.
        self.store_memo(slot, memo);
    }

    /// Computes or looks up the value for a query. This method is used for both input and derived queries.
    pub fn get<K: Into<Key>>(&mut self, id: QueryId, key: K) -> Value {
        self.get_with_timestamp(Slot::new(id, key.into())).value
    }

    /// Computes or looks up the value for a query and returns the value along with the database revision
    /// at which this value last changed.
    fn get_with_timestamp(&mut self, slot: Slot) -> StampedValue {
        event!(self, Event::Get, slot);

        // If we called into this method as part of computing or validating the output for another query
        // then record this call as a dependency of the parent query.
        //
        // When we store a `Memo` with the output of a query we read its dependencies from `active_queries`
        // and store them in the memo.
        if let Some(active) = self.active_queries.last_mut() {
            active.insert(slot);
        }

        // Make this the currently active query.
        self.push_active_query();

        // This `read` method could be inlined here. The only reason for not doing this is to remove the
        // need to call `pop_active_query` at each early return location from that method.
        let result = self.read(slot);

        // Remove the top element of `active_queries` now that we're done with it.
        self.pop_active_query();

        result
    }

    /// The body of `get_with_timestamp` after recording this query as a dependency of the parent query (if any)
    /// and pushing a new entry onto the active query stack.
    fn read(&mut self, slot: Slot) -> StampedValue {
        // Helper method that queries `self.storage` for a memo in this slot and emits an Event reporting this.
        let memo = self.read_memo(slot);

        if self.is_input_query(slot.id) {
            // If this is an input query then we require the user to have provided a value via `.set(..)`.
            let memo = memo.expect("attempting to query an input slot that has not been set");

            event!(self, Event::MemoForInputQuery);

            // If this is the first read of this input at the current revision then update the memo to reflect this.
            // Note that memoised values for inputs are always valid - they can't be invalidated by changes to the
            // values of any other queries.
            //
            // Aside, short version:
            // We could have chosen to handle input queries in any of several other basically equivalent ways.
            //
            // Aside, longer version:
            // If you're wondering why we care about `verified_at` for inputs when we've just stated that input
            // `Memo`s are always valid, the answer is that it doesn't really matter either way.
            //
            // The only significance of updating `verified_at` here is that it avoids the recursive call into
            // `get_with_timestamp` inside the `has_changed_since` method below. This has no effect on
            // the set of query functions that get run, but saves a bit of pushing to and popping from the active
            // query stack. We could also have chosen to special case inputs inside `has_changed_since`, or to
            // update `verified_at` for all inputs whenever a new value is set for _any_ input query, or made the
            // field optional and omitted it for inputs, or chosen from a range of yet other possibilities, without
            // changing the algorithm or calculations performed in any material way.
            //
            // We could also have noted that `any_inputs_have_changed` as defined below would always be false
            // for inputs and that special casing inputs is not strictly necessary. But handling
            // inputs separately seemed slightly clearer.
            //
            if memo.verified_at != self.revision {
                let new_memo = Memo {
                    verified_at: self.revision,
                    ..memo
                };
                self.store_memo(slot, new_memo);
            }

            return StampedValue::new(memo.value, memo.changed_at);
        }

        // If we have a memo and this isn't an input query then we need to check if the memoized value is still valid.
        if let Some(memo) = memo.clone() {
            // If we've verified the memo already at this revision then it must be usable.
            if memo.verified_at == self.revision {
                event!(self, Event::MemoVerifiedAtCurrentRevision);
                return StampedValue::new(memo.value, memo.changed_at);
            }

            // Otherwise, we need to check the dependencies of the memo to see if any of their values have changed
            // since the memo was last verified.
297 | event!(self, Event::StartedInputChecks, memo.verified_at); 298 | 299 | let any_inputs_have_changed = memo 300 | .dependencies 301 | .iter() 302 | .any(|&input| self.has_changed_since(input, memo.verified_at)); 303 | 304 | event!(self, Event::CompletedInputChecks, any_inputs_have_changed); 305 | 306 | // If the values used when computing this memo have not changed then the memo is still valid 307 | // and we can update the memo's `verified_at` field and return from this method. 308 | // 309 | // Otherwise we fall through to the code after this block that recomputes the memo using its query 310 | // function. 311 | if !any_inputs_have_changed { 312 | let new_memo = Memo { 313 | verified_at: self.revision, 314 | ..memo 315 | }; 316 | self.store_memo(slot, new_memo); 317 | return StampedValue::new(memo.value, memo.changed_at); 318 | } 319 | } 320 | 321 | // If we got to this point then either we don't have a memoised value or it's out of date. 322 | // In either case we need to evaluate the query function. 323 | let new_value = self.run_query_function(slot); 324 | 325 | // Some logging. 326 | if let Some(memo) = memo.clone() { 327 | event!(self, Event::ValueComparison, memo.value, new_value, self.revision); 328 | } 329 | 330 | // If we had a memo before and the query's value hasn't actually changed then 331 | // we don't update `changed_at`. 332 | let changed_at = memo 333 | .filter(|m| m.value == new_value) 334 | .map(|m| m.changed_at) 335 | .unwrap_or(self.revision); 336 | 337 | // Store the new memo, recording its dependencies by reading from the top element of `active_queries`. 338 | let memo = Memo { 339 | value: new_value, 340 | verified_at: self.revision, 341 | changed_at, 342 | dependencies: self.active_queries.last().unwrap().clone(), 343 | }; 344 | 345 | self.store_memo(slot, memo); 346 | StampedValue::new(new_value, changed_at) 347 | } 348 | 349 | /// Checks whether the output for a query has changed since the specified revision.
350 | /// 351 | /// If we have an up to date memo for this (query, key) pair then we can use the `changed_at` field 352 | /// from the memo. Otherwise, we need to recurse into `get_with_timestamp` to get a `StampedValue` 353 | /// for this slot. 354 | /// 355 | /// (Reminder of the state of the call stack if that happens: 356 | /// get_with_timestamp(query_one) 357 | /// -> read(query_one) 358 | /// -> has_changed_since(query_that_query_one_depends_on) 359 | /// -> get_with_timestamp(query_that_query_one_depends_on) 360 | /// -> ... 361 | /// ) 362 | fn has_changed_since(&mut self, slot: Slot, revision: usize) -> bool { 363 | let changed_at = { 364 | // If we _did_ have a mechanism for removing cached values then we would return self.revision here if no memo existed. 365 | let memo = self.storage.get(&slot).expect( 366 | "previously queried values always exist as we never remove anything from our cache", 367 | ); 368 | 369 | // If we've verified the memo this revision then we can trust its changed_at field. 370 | if memo.verified_at == self.revision { 371 | memo.changed_at 372 | // If we've not verified the memo this revision then we need to recurse. 373 | } else { 374 | self.get_with_timestamp(slot).changed_at 375 | } 376 | }; 377 | event!(self, Event::ChangedAt, slot, changed_at); 378 | changed_at > revision 379 | } 380 | 381 | /// Find the query function with id `slot.id` and run it. 382 | /// Recall that query functions have signature `fn(&mut Database, Key) -> Value`. 383 | /// See `one_year_fee_query` in examples/walkthrough.rs for an example.
384 | fn run_query_function(&mut self, slot: Slot) -> Value { 385 | event!(self, Event::StartedQueryEvaluation); 386 | let query = self 387 | .query_functions 388 | .get(slot.id) 389 | .expect("Missing query function") 390 | .clone(); 391 | let new_value = query(self, slot.key); 392 | event!(self, Event::CompletedQueryEvaluation); 393 | new_value 394 | } 395 | 396 | fn push_active_query(&mut self) { 397 | event!(self, Event::PushActiveQuery); 398 | self.active_queries.push(HashSet::new()); 399 | } 400 | 401 | fn pop_active_query(&mut self) -> Option<HashSet<Slot>> { 402 | event!(self, Event::PopActiveQuery); 403 | self.active_queries.pop() 404 | } 405 | 406 | fn store_memo(&mut self, slot: Slot, memo: Memo) { 407 | // The read of the existing memo value here is solely to let us generate more helpful logs. 408 | let old_memo = self.storage.get(&slot).cloned(); 409 | event!(self, Event::StoreMemo, old_memo, memo); 410 | self.storage.insert(slot, memo); 411 | } 412 | 413 | fn read_memo(&mut self, slot: Slot) -> Option<Memo> { 414 | let value = self.storage.get(&slot).cloned(); 415 | event!(self, Event::ReadMemo, value); 416 | value 417 | } 418 | 419 | fn is_input_query(&self, id: QueryId) -> bool { 420 | self.input_ids.contains(&id) 421 | } 422 | } 423 | --------------------------------------------------------------------------------
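The `changed_at` rule at the end of `read` (keep the old timestamp when the recomputed value is unchanged) is what lets dependent queries skip recomputation. The following is a minimal standalone sketch of that rule, not code from this crate: the `Memo` here is a simplified stand-in that omits `dependencies`, and `refresh_memo` and `changed_since` are hypothetical helper names introduced only for illustration.

```rust
/// Simplified stand-in for the `Memo` in `src/lib.rs` (assumption: `dependencies` omitted,
/// values are `i32` as described in the README).
#[derive(Clone, Debug, PartialEq)]
struct Memo {
    value: i32,
    verified_at: usize,
    changed_at: usize,
}

/// Hypothetical helper: builds the memo to store after re-running a query function
/// at `revision`, given the previous memo (if any) and the freshly computed value.
fn refresh_memo(old: Option<&Memo>, new_value: i32, revision: usize) -> Memo {
    // Keep the old `changed_at` only when the recomputed value is identical;
    // otherwise the value changed (or is new) at the current revision.
    let changed_at = old
        .filter(|m| m.value == new_value)
        .map(|m| m.changed_at)
        .unwrap_or(revision);

    Memo {
        value: new_value,
        verified_at: revision,
        changed_at,
    }
}

/// Hypothetical helper mirroring the final comparison in `has_changed_since`,
/// for a memo that has already been verified at the current revision.
fn changed_since(memo: &Memo, revision: usize) -> bool {
    memo.changed_at > revision
}
```

Under these assumptions, recomputing a query at revision 2 and getting the same value as at revision 1 leaves `changed_at` at 1, so a dependent memo last verified at revision 1 sees `changed_since(..) == false` and can be reused without running its own query function.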