├── README.md ├── SERVER_SETUP.md ├── batch-request-schema.jsonc ├── batch-response-schema.jsonc ├── client-view-schema.jsonc ├── contributing.md ├── design.md ├── diagram.png ├── faq.md └── skateboard.png /README.md: -------------------------------------------------------------------------------- 1 | # Archived! 2 | 3 | The Replicache repo has moved to [https://github.com/rocicorp/replicache](https://github.com/rocicorp/replicache). 4 | 5 | Bye! 6 | -------------------------------------------------------------------------------- /SERVER_SETUP.md: -------------------------------------------------------------------------------- 1 | # Replicache Server Setup 2 | 3 | This document walks you through adding [Replicache](https://replicache.dev) support to an existing web service. 4 | 5 | Questions? Comments? We'd love to help you evaluate Replicache — [Join us on Slack](https://slack.replicache.dev/). 6 | You can also refer to our fully-functional [TODO sample application](https://github.com/rocicorp/replicache-sample-todo). For information about contributing, see our [contributing guide](contributing.md). 7 | 8 | **Note:** This document assumes you already know what Replicache is, why you might need it, and broadly how it works. If that's not true, see the [Replicache homepage](https://replicache.dev) for an overview, or the [design document](design.md) for a detailed deep-dive. 9 | 10 | # Overview 11 | 12 | ![Picture of Replicache Architecture](diagram.png) 13 | 14 | Replicache is a per-user cache that sits between your backend and client. To integrate Replicache, you will make changes to both your backend and your client. 15 | 16 | This document is about the server-side setup. To learn how to build a Replicache client, see [Replicache JavaScript SDK](https://github.com/rocicorp/replicache-sdk-js). 

### Step 1: Downstream Sync

Implement a *Client View* endpoint on your service that returns the data that should be available locally on the client for each user. This endpoint should return the *entire* view every time it is requested.

Replicache will frequently query this endpoint and calculate a diff to send to each client.

For example, the [sample TODO app](https://github.com/rocicorp/replicache-sample-todo) returns a Client View like this:

```jsonc
{
  "lastMutationID": 0,
  "clientView": {
    "/todo/1": {
      "title": "Take out the trash",
      "description": "Don't forget to pull it to the curb.",
      "done": false,
    },
    "/todo/2": {
      "title": "Pick up milk",
      "description": "2%",
      "done": true,
    },
    ...
  }
}
```

The key/value pairs you return are up to you; Replicache just makes sure they get to the client.

See [client-view-schema.jsonc](./client-view-schema.jsonc) for the schema of the Client View response.

#### Client View Authentication

Most applications return a Client View that is specific to the calling user. Replicache supports sending user credentials through the standard `Authorization` HTTP header. If authorization fails, the Client View should return HTTP 401 (Unauthorized). The Replicache client will prompt user code to reauthenticate in that case.

Note that there is nothing Replicache-specific about Client View authentication. Presumably your service already authenticates users so that they can't fetch each other's data.

#### Errors

Any response other than HTTP 200 with a valid JSON Client View, or HTTP 401, is treated as an error by the Diff Server. In that case the Client View response is ignored and the app is sent the last known state instead.
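
The following is a minimal sketch of the Client View logic described in Step 1, written in TypeScript with no web framework. Everything in it is hypothetical: the `handleClientView` function, the in-memory lookup tables, and the token `token-alice` are stand-ins for your own auth layer and datastore. How the client's ID reaches your endpoint is not covered in this section, so the sketch simply takes it as an argument.

```typescript
// Hypothetical types and data; none of these names come from Replicache itself.
type JSONValue =
  | null | boolean | number | string
  | JSONValue[]
  | { [key: string]: JSONValue };

interface ClientViewResponse {
  lastMutationID: number;
  clientView: { [key: string]: JSONValue };
}

interface Todo {
  id: number;
  title: string;
  description: string;
  done: boolean;
}

interface HandlerResult {
  status: 200 | 401;
  body?: ClientViewResponse;
}

// Stand-ins for a real auth layer and datastore (assumptions for this sketch).
const userByAuthToken: { [token: string]: string | undefined } = {
  "token-alice": "alice",
};
const todosByUser: { [user: string]: Todo[] | undefined } = {
  alice: [
    { id: 1, title: "Take out the trash", description: "Don't forget to pull it to the curb.", done: false },
    { id: 2, title: "Pick up milk", description: "2%", done: true },
  ],
};
const lastMutationIDByClient: { [clientID: string]: number | undefined } = {};

// Returns the *entire* view for the authenticated user, or 401 if auth fails.
function handleClientView(authorization: string | undefined, clientID: string): HandlerResult {
  const user = authorization === undefined ? undefined : userByAuthToken[authorization];
  if (user === undefined) {
    return { status: 401 };
  }
  const clientView: { [key: string]: JSONValue } = {};
  for (const todo of todosByUser[user] ?? []) {
    clientView["/todo/" + todo.id] = {
      title: todo.title,
      description: todo.description,
      done: todo.done,
    };
  }
  return {
    status: 200,
    body: { lastMutationID: lastMutationIDByClient[clientID] ?? 0, clientView: clientView },
  };
}
```

The handler never tries to work out what changed since the last request; per the text above, computing the diff is Replicache's job.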

### Step 2: Test Downstream Sync

The request to your Client View goes through the Diff Server, which turns the full Client View your service returns into a smaller delta to be applied to whatever state (if any) your client already has. If your service is publicly accessible, you can use the Diff Server service we provide. If not (for example, your Client View is running on `localhost`), you can run the Diff Server yourself.

Note that running the Diff Server yourself is convenient for evaluating Replicache, but in order to launch to users "for real", your Client View will need to be publicly accessible, and you will need to create a Replicache account as outlined below.

#### If your Client View is publicly accessible...

[Sign up for a Replicache account](https://serve.replicache.dev/signup) if you have not already. Note your `Account ID`. You will need to pass it to the Diff Server when manually `curl`ing requests below and in the `diffServerAuth` field of the [ReplicacheOptions passed to the Replicache constructor](https://replicache-sdk-js.now.sh/interfaces/replicacheoptions.html) when instantiating your JS client.

#### Else if your Client View is NOT publicly accessible...

Download the Diff Server:

```bash
# For OSX
curl -o diffs -L https://github.com/rocicorp/diff-server/releases/latest/download/diffs-osx
chmod u+x diffs

# For Linux
curl -o diffs -L https://github.com/rocicorp/diff-server/releases/latest/download/diffs-linux
chmod u+x diffs
```

Run it:

```bash
# Begins serving on localhost:7001
./diffs --disable-auth=true --db=/tmp/replicache-db --account-db=/tmp/replicache-adb serve
```

The command line above disables the Diff Server auth check so you can pass any number for the `Account ID` in the next step.

#### Pull from the Diff Server

To *pull* from the Diff Server, provide the variables listed below and `curl` the request.

```bash
# The URL of your Client View, e.g. http://yourservice.com/replicache-client-view.
CLIENT_VIEW=
# The value of the Authorization HTTP header to send in the Client View request to your
# service, identifying the user to your service.
CLIENT_VIEW_AUTH=
# If you are using the Diff Server service, https://serve.replicache.dev/pull.
# If you are running the Diff Server yourself, its address, e.g. http://localhost:7001/pull.
DIFF_SERVER=
# The Account ID assigned during the signup step above if using the Diff Server service,
# otherwise any integer if you are running the Diff Server yourself, e.g., 123.
ACCOUNT_ID=
# The body is passed as an unquoted heredoc so that the shell expands the variables above.
# (Inside a single-quoted string they would be sent literally.)
curl -H "Authorization: $ACCOUNT_ID" "$DIFF_SERVER" --data-binary @- << EOF
{
  "clientID": "c1",
  "clientViewAuth": "$CLIENT_VIEW_AUTH",
  "clientViewURL": "$CLIENT_VIEW",
  "baseStateID": "00000000000000000000000000000000",
  "checksum": "00000000",
  "lastMutationID": 0,
  "version": 3
}
EOF
```

Take note of the returned `stateID` and `checksum`. Then make a change to the user's Client View in your service and pull again, this time specifying the `baseStateID` and `checksum` like so:

```bash
BASE_STATE_ID=
CHECKSUM=
# Use the same values for the following variables that you used above.
CLIENT_VIEW=
CLIENT_VIEW_AUTH=
DIFF_SERVER=
ACCOUNT_ID=
curl -H "Authorization: $ACCOUNT_ID" "$DIFF_SERVER" --data-binary @- << EOF
{
  "clientID": "c1",
  "clientViewAuth": "$CLIENT_VIEW_AUTH",
  "clientViewURL": "$CLIENT_VIEW",
  "baseStateID": "$BASE_STATE_ID",
  "checksum": "$CHECKSUM",
  "lastMutationID": 0,
  "version": 3
}
EOF
```

You'll get a response that includes only the diff!
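
To build intuition for what "only the diff" means, here is a conceptual TypeScript sketch of a key-level diff between two Client Views. It is an illustration only: the `diffClientViews` function and the `PatchOp` shape are invented for this example and are not the Diff Server's actual algorithm or wire format.

```typescript
type JSONValue =
  | null | boolean | number | string
  | JSONValue[]
  | { [key: string]: JSONValue };

type ClientView = { [key: string]: JSONValue };

// Hypothetical patch representation, for illustration only.
type PatchOp =
  | { op: "del"; key: string }
  | { op: "put"; key: string; value: JSONValue };

function has(view: ClientView, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(view, key);
}

// Computes the operations that turn `prev` into `next`.
function diffClientViews(prev: ClientView, next: ClientView): PatchOp[] {
  const patch: PatchOp[] = [];
  // Keys that disappeared are deleted.
  for (const key of Object.keys(prev)) {
    if (!has(next, key)) {
      patch.push({ op: "del", key: key });
    }
  }
  // Keys that are new or whose value changed are (re)written.
  // Comparing JSON.stringify output is a simplification: it is sensitive
  // to object key order, which a real implementation would normalize.
  for (const key of Object.keys(next)) {
    if (!has(prev, key) || JSON.stringify(prev[key]) !== JSON.stringify(next[key])) {
      patch.push({ op: "put", key: key, value: next[key] });
    }
  }
  return patch;
}
```

With this sketch, adding one todo to a view of a thousand todos yields a one-element patch, which is why the second pull above returns so much less than the first.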

### Step 3: Mutation ID Storage

Next up: Writes.

Replicache identifies each change (or *Mutation*) that originates on the Replicache Client with a *MutationID*.

Your service must store the last MutationID it has processed for each client, and return it to Replicache in the Client View response. This allows Replicache to know when it can discard the speculative version of that change from the client.

Storing the last MutationIDs can be done in a number of ways, but typically they are stored in the same datastore as your user data. You must update the last MutationID for a client transactionally as part of each mutation.

If you use, e.g., Postgres for your user data, you might store Replicache MutationIDs in a table like:

**ReplicacheMutationIDs**

| Column | Type |
| --- | --- |
| ClientID | CHAR(32) |
| LastMutationID | INT64 |
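
The requirement that the last MutationID be updated transactionally with each mutation can be sketched as follows. This is an in-memory stand-in, not a real database: `InMemoryStore`, `transact`, and `applyMutation` are hypothetical names, and with Postgres the same effect would come from putting both writes inside a single database transaction.

```typescript
interface Todo {
  text: string;
  complete: boolean;
}

interface State {
  todos: { [id: string]: Todo | undefined };
  // Mirrors the table above: ClientID -> LastMutationID.
  lastMutationIDs: { [clientID: string]: number | undefined };
}

// Hypothetical in-memory store with all-or-nothing transactions.
class InMemoryStore {
  private state: State = { todos: {}, lastMutationIDs: {} };

  // Runs fn against a copy of the state and commits the copy only if fn
  // returns normally. If fn throws, the copy is discarded.
  transact(fn: (draft: State) => void): void {
    const draft: State = {
      todos: { ...this.state.todos },
      lastMutationIDs: { ...this.state.lastMutationIDs },
    };
    fn(draft);
    this.state = draft;
  }

  lastMutationID(clientID: string): number {
    return this.state.lastMutationIDs[clientID] ?? 0;
  }

  todo(id: string): Todo | undefined {
    return this.state.todos[id];
  }
}

// Applies a mutation's effect and records its ID in the same transaction,
// so a reader can never observe one without the other.
function applyMutation(
  store: InMemoryStore,
  clientID: string,
  mutationID: number,
  effect: (draft: State) => void,
): void {
  store.transact((draft) => {
    effect(draft);
    draft.lastMutationIDs[clientID] = mutationID;
  });
}
```

If `effect` fails partway through, neither the user data nor the stored MutationID changes, so the Client View never reports a mutation as processed when its effects are missing.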
178 | 179 | ### Step 4: Upstream Sync 180 | 181 | Replicache implements upstream sync by queuing calls to your service on the client-side and uploading them in batches. Your batch endpoint can be any URL, and when you set your 182 | client up you will provide the URL via the `batchURL` parameter in the Replicache client 183 | constructor. 184 | 185 | Here is an example batch request to our TODO example app backend. 186 | 187 | ```json 188 | { 189 | "clientID": "CB94867E-94B7-48F3-A3C1-287871E1F7FD", 190 | "mutations": [ 191 | { 192 | "id": 7, 193 | "name": "todoCreate", 194 | "args": { 195 | "id": "AE2E880D-C4BD-473A-B5E0-29A4A9965EE9", 196 | "text": "Take out the trash", 197 | "order": 0.5, 198 | "complete": false 199 | } 200 | }, 201 | { 202 | "id": 8, 203 | "name": "todoUpdate", 204 | "args": { 205 | "id": "AE2E880D-C4BD-473A-B5E0-29A4A9965EE9", 206 | "complete": true 207 | } 208 | }, 209 | ... 210 | ] 211 | } 212 | ``` 213 | 214 | The response from the batch endpoint is **completely optional** — Replicache doesn't use it for anything. 215 | 216 | However, we recommend returning information about failed requests in the response. Replicache prints these to the developer console for debugging purposes. Here is an example response: 217 | 218 | 219 | ```json 220 | { 221 | "mutationInfos": [ 222 | { 223 | "id": 8, 224 | "error": "Invalid POST data: syntax error: ..." 225 | }, 226 | { 227 | "id": 9, 228 | "error": "Backend unavailable" 229 | } 230 | ] 231 | } 232 | ``` 233 | 234 | See [batch-request-schema.jsonc](./batch-request-schema.jsonc) for the detailed schema of the payload Replicache sends to your batch endpoint. See [batch-response-schema.jsonc](./batch-response-schema.jsonc) for the response schema. 235 | 236 | #### Implementing the Batch Endpoint 237 | 238 | Conceptually, the batch endpoint receives an ordered batch of mutation requests and applies them in sequence, reporting any errors back to the client. 
There are some subtleties to be aware of, though:

* Replicache can send mutations that have already been processed. This is in fact common when the network is spotty. This is why you need to [store the last processed mutation ID](#step-3-mutation-id-storage): you **MUST** skip mutations you have already seen.
* Generally, mutations for a given client **SHOULD** be processed serially and in order to achieve [causal consistency](https://jepsen.io/consistency/models/causal). However, if you have special knowledge that pairs of mutations are commutative, you can process them in parallel.
* Each mutation **MUST** eventually be acknowledged by your service, by updating the stored `lastMutationID` value for the client and returning it in the client view.
* If a mutation can't be processed temporarily (e.g., some server-side resource is temporarily unavailable), simply return early from the batch without updating `lastMutationID`. Replicache will retry the mutation later.
* If a mutation can't be processed permanently (e.g., the request is invalid), mark the mutation processed by updating the stored `lastMutationID`, then continue with other mutations.
* You **MUST** update `lastMutationID` atomically with handling the mutation, otherwise the state reported to the client can be inconsistent.

A sample batch endpoint for Go is available in our [TODO sample app](https://github.com/rocicorp/replicache-sample-todo/blob/master/serve/handlers/batch/batch.go).
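
The rules above can be sketched in a few dozen lines of TypeScript. This is a simplified, single-process illustration and not the sample app's implementation: `processBatch`, the `Outcome` type, and the in-memory `Store` are hypothetical, and a real service would wrap each mutation's effects and the `lastMutationID` update in one database transaction.

```typescript
interface Mutation { id: number; name: string; args: { [key: string]: unknown }; }
interface BatchRequest { clientID: string; mutations: Mutation[]; }
interface MutationInfo { id: number; error: string; }
interface BatchResponse { mutationInfos: MutationInfo[]; }

interface Todo { text: string; complete: boolean; }
interface Store {
  todos: { [id: string]: Todo | undefined };
  lastMutationIDs: { [clientID: string]: number | undefined };
}

// Hypothetical result type: handlers report failures instead of throwing.
type Outcome =
  | { kind: "ok" }
  | { kind: "temporary"; error: string }
  | { kind: "permanent"; error: string };

// A handler edits a draft copy of the todos; it must replace entries
// rather than mutate them in place, because the copy is shallow.
type Handler = (todos: Store["todos"], args: Mutation["args"]) => Outcome;

function processBatch(
  store: Store,
  handlers: { [name: string]: Handler | undefined },
  request: BatchRequest,
): BatchResponse {
  const mutationInfos: MutationInfo[] = [];
  for (const mutation of request.mutations) {
    const last = store.lastMutationIDs[request.clientID] ?? 0;
    if (mutation.id <= last) {
      continue; // Already processed: MUST be skipped.
    }
    const handler = handlers[mutation.name];
    const draft = { ...store.todos };
    const outcome: Outcome = handler === undefined
      ? { kind: "permanent", error: "unknown mutation: " + mutation.name }
      : handler(draft, mutation.args);
    if (outcome.kind === "temporary") {
      // Return early without acknowledging; Replicache will retry later.
      mutationInfos.push({ id: mutation.id, error: outcome.error });
      break;
    }
    if (outcome.kind === "permanent") {
      // Discard the draft, but still mark the mutation as processed below.
      mutationInfos.push({ id: mutation.id, error: outcome.error });
    } else {
      store.todos = draft;
    }
    // Acknowledge together with the effects (one transaction in a real datastore).
    store.lastMutationIDs[request.clientID] = mutation.id;
  }
  return { mutationInfos: mutationInfos };
}

// Example handler, loosely modeled on the `todoCreate` mutation shown earlier.
const handlers: { [name: string]: Handler | undefined } = {
  todoCreate: (todos, args): Outcome => {
    const id = args.id;
    const text = args.text;
    if (typeof id !== "string" || typeof text !== "string") {
      return { kind: "permanent", error: "invalid todoCreate args" };
    }
    todos[id] = { text: text, complete: args.complete === true };
    return { kind: "ok" };
  },
};
```

Reporting failures as values rather than exceptions keeps the temporary and permanent cases explicit at the one place where the difference matters: whether `lastMutationID` advances.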
248 | 249 | ### Step 5: Example 250 | 251 | Here's a bash transcript demonstrating a series of requests Replicache might make against our [sample TODO app](https://github.com/rocicorp/replicache-sample-todo): 252 | 253 | ```bash 254 | BATCH=https://replicache-sample-todo.now.sh/serve/replicache-batch 255 | CLIENT_VIEW=https://replicache-sample-todo.now.sh/serve/replicache-client-view 256 | DIFF_SERVER=https://serve.replicache.dev/pull 257 | ACCOUNT_ID=1 # The Replicache TODO sample account. 258 | 259 | NEW_USER_EMAIL=$RANDOM@example.com 260 | 261 | # Create a new user 262 | curl -d "{\"email\":\"$NEW_USER_EMAIL\"}" https://replicache-sample-todo.now.sh/serve/login 263 | 264 | CLIENT_VIEW_AUTH= 265 | CLIENT_ID=$RANDOM 266 | LIST_ID=$RANDOM 267 | TODO_ID=$RANDOM 268 | 269 | # Create a first list and todo 270 | curl -H "Authorization: $CLIENT_VIEW_AUTH" 'https://replicache-sample-todo.now.sh/serve/replicache-batch' --data-binary @- << EOF 271 | { 272 | "clientID": "$CLIENT_ID", 273 | "mutations": [ 274 | { 275 | "id": 1, 276 | "name": "createList", 277 | "args": { 278 | "id": $LIST_ID 279 | } 280 | }, 281 | { 282 | "id": 2, 283 | "name": "createTodo", 284 | "args": { 285 | "id": $TODO_ID, 286 | "listID": $LIST_ID, 287 | "text": "Walk the dog", 288 | "complete": false 289 | } 290 | } 291 | ] 292 | } 293 | EOF 294 | 295 | # If successful, you should see an empty response e.g., {"mutationInfos":[]} 296 | 297 | # Do an initial pull from diff-server 298 | curl -H "Authorization: $ACCOUNT_ID" $DIFF_SERVER --data-binary @- << EOF 299 | { 300 | "clientID": "$CLIENT_ID", 301 | "clientViewAuth": "$CLIENT_VIEW_AUTH", 302 | "clientViewURL": "$CLIENT_VIEW", 303 | "baseStateID": "00000000000000000000000000000000", 304 | "checksum": "00000000", 305 | "lastMutationID": 0, 306 | "version": 3 307 | } 308 | EOF 309 | 310 | BASE_STATE_ID= 311 | CHECKSUM= 312 | LAST_MUTATION_ID= 313 | 314 | # Create a second TODO 315 | # Do this one via the classic REST API, circumventing Replicache 
entirely 316 | TODO_ID=$RANDOM 317 | curl -H "Authorization: $CLIENT_VIEW_AUTH" https://replicache-sample-todo.now.sh/serve/todo-create --data-binary @- << EOF 318 | { 319 | "id": $TODO_ID, 320 | "listID": $LIST_ID, 321 | "text": "Take out the trash", 322 | "complete": false 323 | } 324 | EOF 325 | 326 | # Do an incremental pull from diff-server 327 | # Note that only the second todo is returned 328 | curl -H "Authorization: $ACCOUNT_ID" $DIFF_SERVER --data-binary @- << EOF 329 | { 330 | "clientID": "$CLIENT_ID", 331 | "clientViewAuth": "$CLIENT_VIEW_AUTH", 332 | "clientViewURL": "$CLIENT_VIEW", 333 | "baseStateID": "$BASE_STATE_ID", 334 | "checksum": "$CHECKSUM", 335 | "lastMutationID": $LAST_MUTATION_ID, 336 | "version": 3 337 | } 338 | EOF 339 | ``` 340 | 341 | ### Step 6: 🎉🎉 342 | 343 | Woo! You're done with the backend integration. What's next?? 344 | 345 | - [Build the Client UI](https://github.com/rocicorp/replicache-sdk-js) 346 | - [Check out the full version of this sample](https://github.com/rocicorp/replicache-sample-todo) 347 | - [Check out the richer React/Babel/GCal sample](https://github.com/rocicorp/replicache-sdk-js/tree/master/sample/cal) 348 | -------------------------------------------------------------------------------- /batch-request-schema.jsonc: -------------------------------------------------------------------------------- 1 | { 2 | "type": "object", 3 | "properties": { 4 | "clientID": { 5 | "type": "string", 6 | "minLength": 1 7 | }, 8 | "mutations": { 9 | "type": "array", 10 | "items": { 11 | "type": "object", 12 | "properties": { 13 | "id": { 14 | "type": "integer", 15 | "minimum": 1 16 | }, 17 | "name": { 18 | "type": "string", 19 | "minLength": 1 20 | }, 21 | "args": {}, 22 | }, 23 | "required": ["id", "name", "args"] 24 | } 25 | } 26 | }, 27 | "required": ["clientID", "mutations"] 28 | } 29 | -------------------------------------------------------------------------------- /batch-response-schema.jsonc: 
--------------------------------------------------------------------------------
{
  "type": "object",
  "properties": {
    "mutationInfos": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "id": {
            "type": "integer",
            "minimum": 0,
          },
          "error": {
            "type": "string"
          }
        },
        "required": ["id"]
      }
    }
  },
  "required": ["mutationInfos"]
}

--------------------------------------------------------------------------------
/client-view-schema.jsonc:
--------------------------------------------------------------------------------
{
  "type": "object",
  "properties": {
    // The last Replicache Mutation ID that your service has processed. See Step 4 and 5 for more information.
    "lastMutationID": {"type": "integer", "minimum": 0},
    // A map of UTF8 string keys to JSON values. Any JSON type is legal for each value.
    // This is the data that will be available on the client side.
    "clientView": {"type": "object"}
  },
  "required": ["lastMutationID", "clientView"]
}

--------------------------------------------------------------------------------
/contributing.md:
--------------------------------------------------------------------------------
## Replicache Contributing Guide

We welcome contributions, questions, and feedback from the community. Here's a short guide to how we work together.

### Bug Reports & Discussions

* File all Replicache issues in the Replicache repo https://github.com/rocicorp/replicache/issues.
  * This simplifies our view of what's in flight and doesn't require anyone to understand how our repos are organized.
* Join our [Slack channel](https://join.slack.com/t/rocicorp/shared_invite/zt-dcez2xsi-nAhW1Lt~32Y3~~y54pMV0g) for realtime help or discussion.
11 | 12 | ### Making changes 13 | 14 | * We subscribe heavily to the practice of [talk, then code](https://dave.cheney.net/2019/02/18/talk-then-code). 15 | * Foundational, tricky, or wide-reaching changes should be discussed ahead of time on an issue in order to maximize the chances that they land well. Typically this involves a discussion of requirements and constraints and a design sketch showing major interfaces and logic. ([example](https://github.com/rocicorp/replicache/issues/27), [example](https://github.com/rocicorp/replicache/issues/30)) 16 | * Code review 17 | * Rocicorp partners prefer to get async after-merge code reviews. 18 | * Reviewer should review within 3 days. 19 | * Reviewee should respond within 7 days. 20 | * We sometimes use [tags with these meanings](https://news.ycombinator.com/item?id=23027988) in code reviews 21 | 22 | ### Legal 23 | 24 | All contributors must sign a contributor agreement, either for an individual or corporation, before a pull request can be accepted. 25 | 26 | ### Style (General) 27 | 28 | * If possible, do not intentionally crash processes (eg, by `panic`ing on unexpected conditions). 29 | * For our client-side software, this is bad practice because we frequently run in-process with our users and we will crash them too! 30 | * For our server-side software, this is bad practice because we will lose any other requests that were also in-flight on that server. 31 | * There are other types of software in which crashing early and often may be more appropriate, but for consistency and code reuse reasons we generally avoid crashing everywhere. 32 | * We use three log levels. Please employ them in a way that's consistent with the following semantics: 33 | * **ERROR**: something truly unexpected has happened **and a developer should go look**. 
    * Examples that might be ERRORs:
      * an important invariant has been violated
      * stored data is corrupt
    * Examples that probably are *not* ERRORs:
      * an http request failed
      * couldn't parse user input
      * a thing that can time out timed out, or a thing that can fail failed
  * **INFO**: information that is immediately useful to the developer, or an important change of state. Info logs should not be spammy.
    * Examples that might be INFOs:
      * "Server listening on port 1234..."
      * updated foo to a new version
    * Examples that probably are *not* INFOs:
      * server received a request
      * successfully completed a periodic task that is not an important change of state (eg, logs were rotated)
  * **DEBUG**: verbose information about what's happening that might be useful to a developer.
    * Examples that probably are DEBUGs:
      * request and response content
      * some process is starting
* We do not use a warning level because warnings are typically not actionable and are mostly ignored. We prefer the developer to take a position: does the thing rise to the level that a developer should go do something about it or not? We do not use a trace level because we haven't yet found a use for it, and extra log levels are just confusing.

### Style (Go-specific)

* Prefer consumer-defined interfaces to introduce seams for testing (as opposed to, say, variables that point to an implementation, functions that take functions, etc.)
* Code must be gofmt'd but does *not* have to lint
* There are a lot of ways to initialize variables in Go. For consistency, we default to literal-style initialization (e.g., `Foo{}` or `map[string]string{}`) because it's a few chars shorter. We use `make` or `new` when necessary, e.g., to create a slice with a specific capacity or to create a channel.

--------------------------------------------------------------------------------
/design.md:
--------------------------------------------------------------------------------
# Spinner-Free Applications

"[Offline-First](https://www.google.com/search?q=offline+first)" describes a client/server architecture where the application reads and writes to a local database on the device, and synchronizes with servers asynchronously whenever there is connectivity.

These applications are highly desired by product teams and users because they are so much more responsive and reliable than applications that are directly dependent upon servers. By storing data in a local database, offline-first applications are instantaneously responsive and reliable in any network conditions.

Additionally, offline-first applications typically update live, in real time, when something changes server-side, without the user having to refresh. Since they are already continuously synchronizing, realtime updates are just a matter of updating the UI when new data arrives.

Unfortunately, offline-first applications are also really hard to build. Many previous companies and open source projects have sought to provide an easy framework for building offline-first applications, but for a variety of reasons none have succeeded.

# Introducing Replicache

[Replicache](https://replicache.dev) dramatically reduces the difficulty of building offline-first applications. Replicache's goals are:

1. Providing a truly offline-first programming model that is natural and easy to reason about
1. Maximizing compatibility with existing application infrastructure and patterns, minimizing the work to integrate

The key features that drive Replicache's increased usability:

* **Easy Integration**: Replicache runs alongside your existing application infrastructure.
You keep your existing server-side stack and client-side frameworks. Replicache doesn't take ownership of data, and is not the source of truth. Its only job is to provide bidirectional sync between your clients and your servers. This makes it easy to adopt: you can try it for just a small piece of functionality, or a small slice of users, while leaving the rest of your application the same.
* **The Client View**: To use Replicache, developers define a *Client View*, which is the data Replicache keeps cached on a specific device. Developers must arrange to return a delta from some previous version of the Client View to the current one when requested, but developers do *not* have to worry about any local changes the client may have applied. Replicache ensures that any local mutations are correctly ordered with respect to the canonical server state.
* **Transactional Conflict Resolution**: Conflicts are an unavoidable part of offline-first systems, but contrary to popular belief they don't need to be exceptionally painful. Replicache makes conflict resolution significantly easier by capturing the *intent* of changes and then asking developers to replay that intended change later. See [Conflicts](#conflicts) for more.
* **Causal+ Consistency**: [Consistency guarantees](https://jepsen.io/consistency) make distributed systems easier to reason about and prevent confusing user-visible data anomalies. When properly integrated with your backend, Replicache provides for [Causal+ Consistency](https://jepsen.io/consistency/models/causal) across the entire system. This means that transactions are guaranteed to be applied *atomically*, in the *same order*, *across all clients*. Further, all clients will see an order of transactions that is compatible with *causal history*. Basically: all clients will end up seeing the same thing, and you're not going to have any weirdly reordered or dropped messages.
We have worked with independent Distributed Systems expert Kyle Kingsbury of Jepsen to validate these properties of our design. See [Jepsen on Replicache](https://replicache.dev/jepsen.html). 33 | 34 | # System Overview 35 | 36 | Replicache is a cache that runs inside the browser and synchronizes with a web service. The web service typically already exists when Replicache is added and it could be as simple as a document database or could be a massive distributed system -- Replicache doesn't care. In this document, we refer to the web service as the *Data Layer*. An application uses an instance of the *Replicache Client* to read from and write to the local cache, and the client synchronizes with the data layer in the background. 37 | 38 | ![Diagram](./diagram.png) 39 | 40 | ## Data Model 41 | 42 | Replicache synchronizes updates to per-user *state* across an arbitrary number of Replicache clients. The state is a sorted map of key/value pairs. Keys are strings, values are JSON. The canonical state fetched from the data layer is known as the *Client View*: the client's view of the user's data in the data layer. 43 | 44 | ## The Big Picture 45 | 46 | The Replicache Client maintains a local cache of the user's state against which the application runs read and write transactions (often referred to as *mutations*). Both read and write transactions run immediately against the local state and mutations are additionally queued as *pending* application on the server. In the background the client *syncs*, pushing pending mutations to the Data Layer, and pulling updated state from it. Mutations flow upstream in push and state changes flow downstream in pull. 47 | 48 | A key feature that makes Replicache flexible and easy to adopt is that Replicache does not take ownership of the data on the server. The Data Layer owns the data, is the source of truth, and typically requires only a few small changes to work with Replicache. 
Processes that Replicache knows nothing about can mutate state in the Data Layer and Replicache Clients will converge on the Data Layer's canonical state and correctly apply client changes on top of it.

# Detailed Design

## Replicache Client

The Replicache Client maintains:

* The ClientID, a unique identifier for this client
* The LastMutationID. Write transactions originating on a client are uniquely identified and ordered by an ordinal which increases sequentially. This ordinal serves as an idempotency token for the Data Layer, and is used to determine which transactions the server has applied.
* The Cookie returned along with the Client View in the most recent pull. The cookie is returned to the data layer in the next pull to be used to compute a diff from the state the client has to that which the server has.
* A persistent, versioned, transactional, deterministically iterable key/value store that keeps the user's state
  * Persistent meaning that the state of the store persists across browser sessions
  * Versioned meaning that we can go back to any previous version and can _fork_ from a version, apply transactions to it, and atomically reveal the new version (like git branch and merge)
  * Transactional meaning that we can read and write many keys atomically

The client-side of the application using Replicache provides:
* Mutators: A *mutator* is a named function that implements a write transaction. The application invokes mutators to do its work, and they read from and write to the local cache.

The server-side (data layer) of the application provides:
* The Push endpoint: the push endpoint accepts pending mutation invocations from the client and applies them to the canonical state on the server. The push endpoint has a server-side implementation of each client-side mutator.
* The Pull endpoint: the pull endpoint returns the latest state to the client, typically in the form of a patch to the data the client already has.

### Commits

Within the Replicache client, each version of the user's state is represented as a _commit_ which has an immutable view of the user's state.

Commits come in two flavors, those from the client and those from the server:
* *Local commits* represent a change made by a mutator executing locally against the client's cache. The set of local commits that are not yet known to be applied in the Data Layer are known as *pending* commits. Local commits include the *mutator name* and *arguments* that caused them, so that the mutator may be replayed later on top of new snapshot commits from the server if necessary.
* *Snapshot commits* represent a state update pulled from the server. They carry a *cookie*, which the Data Layer can use to calculate the delta for the next pull.

### API Sketch

This API sketch is in TypeScript, for JavaScript bindings. A similar API would exist for every client environment we support.

```ts
class Replicache implements ReadTransaction {
  constructor({
    pushURL: string,
    pushAuth: string,
    pullURL: string,
    pullAuth: string,
  });

  // Registers a mutator, which is used to make changes to the data.
  register<Return extends JSONValue, Args extends JSONValue>(
    name: string,
    mutatorImpl: MutatorImpl<Return, Args>,
  ): Mutator<Return, Args>;

  // Subscribe to changes to the underlying data. Every time the underlying data changes onData is called.
  // The function is also called once the first time the subscription is added.
  subscribe<R>(body: (tx: ReadTransaction) => Promise<R>, onData: (result: R) => void): void;
};

// A Replicache "mutator" function is just a normal JS function that accepts any JSON value, makes changes
// to Replicache, and returns a JSON value. Users can invoke mutators themselves, via the return value
// from register(). Also Replicache will itself invoke these functions during sync as part of conflict
// resolution.
type MutatorImpl<Return extends JSONValue, Args extends JSONValue> = (
  tx: WriteTransaction,
  args: Args,
) => Promise<Return>;

interface ReadTransaction {
  get(key: string): Promise<JSONValue | undefined>;
  has(key: string): Promise<boolean>;
  scan(startAt: string): Promise<[string, JSONValue][]>;
};

interface WriteTransaction extends ReadTransaction {
  del(key: string): Promise<boolean>;
  get(key: string): Promise<JSONValue | undefined>;
  has(key: string): Promise<boolean>;
  put(key: string, value: JSONValue): Promise<void>;
};
```

## Data Layer

We expect the data layer to typically be a familiar REST/GraphQL web service, but it could be anything that provides transactional storage. In order to integrate Replicache, the Data Layer must:
1. maintain a mapping from ClientID to LastMutationID (used by Push and Pull)
1. implement the Pull endpoint from which the client fetches a user's Client View and its LastMutationID
1. implement the Push endpoint which executes a batch of mutators pushed upstream by the client

### Generality

As mentioned, the Data Layer could be a simple document database or a complicated distributed system. All Replicache cares about is that it runs transactions and returns the user's data as JSON in the Client View. Beyond transactional semantics, Replicache takes no opinion on where or how the Data Layer stores its bits. User data might be scattered across several systems under the hood, or assembled on the fly.

# Data Flow

Data flows from the client up to the Data Layer, and back down from the Data Layer to the Client. Mutations are pushed upstream while state updates are pulled downstream. Either of these processes can stop or stall indefinitely without affecting correctness.

The client tracks state changes in a git-like fashion. The Replicache client has a *main* branch of commits and keeps a *head* commit pointer representing the current state of the local key-value database. Transactions run against the state in the head commit. The head commit can change in two ways:
1. write transactions (mutations): when the app runs a mutator that changes the database, the change goes into a pending commit on top of the current head. This new pending commit becomes the new head.
1. pull: when a new state update is pulled from the server, Replicache will:
   1. fork a new branch (the *sync branch*) from the most recent snapshot
   1. add a new snapshot with the new state update to the sync branch; the branch now has state identical to the server
   1. compute the set of mutations to replay on the sync branch by filtering out all pending commits on main that have already been applied by the server. That is, find all pending commits on main whose MutationID is greater than the LastMutationID of the new snapshot.
   1. for each mutation to replay, in order, apply it on the sync branch; this extends the sync branch with a pending commit for each mutation not yet seen by the server
   1. make the sync branch main by setting head of main to the head of the sync branch

## Syncing

There are two parts to sync: push and pull.

To push, the client invokes the Data Layer's Push endpoint, passing all its pending mutations. The Data Layer executes the pending mutations serially. When the Data Layer executes a mutation, it sets the client's LastMutationID to match the mutation's ID as part of the same transaction. If a MutationID is less than or equal to the client's LastMutationID, or is greater than LastMutationID + 1, the mutation is ignored.
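
The Push rule above can be sketched as follows. This is a minimal in-memory illustration, not the Data Layer API: `Mutation`, `Store`, `applyPush`, and the `pushMutators` table are hypothetical names, LastMutationID is assumed to start at 0, and a real Data Layer would run each mutation and the LastMutationID update inside one database transaction.

```typescript
// Hypothetical shapes for illustration; not part of Replicache.
type Mutation = { id: number; name: string; args: number };

type Store = {
  data: Map<string, number>;
  lastMutationIDs: Map<string, number>; // ClientID -> LastMutationID
};

// Hypothetical server-side mutators, keyed by mutator name.
const pushMutators: Record<string, (data: Map<string, number>, args: number) => void> = {
  add: (data, args) => {
    data.set("counter", (data.get("counter") ?? 0) + args);
  },
};

// Applies a pushed batch serially. A mutation is applied only when its ID is
// exactly LastMutationID + 1; anything else is ignored. Returns the applied IDs.
function applyPush(store: Store, clientID: string, batch: Mutation[]): number[] {
  const applied: number[] = [];
  for (const m of batch) {
    const last = store.lastMutationIDs.get(clientID) ?? 0; // assumed initial value
    if (m.id !== last + 1) continue; // already applied, or a gap in the sequence
    // In a real Data Layer these two writes would share one transaction.
    // Sketch choice: an unknown mutator name has no effect but is still acknowledged.
    pushMutators[m.name]?.(store.data, m.args);
    store.lastMutationIDs.set(clientID, m.id);
    applied.push(m.id);
  }
  return applied;
}
```

Because duplicates and out-of-sequence IDs are skipped rather than rejected, a client can safely resend its whole pending list on every push.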

To pull, the client invokes the Data Layer's Pull endpoint, passing the cookie from its most recent snapshot commit. The Data Layer uses the cookie to compute a delta to the Client View, and returns it along with the LastMutationID for this client. The client applies the delta from Pull as described above: it forks from the previous snapshot commit, applies any local mutations that are still pending (those with MutationIDs greater than the LastMutationID indicated along with the Client View patch), and reveals the new state by setting head of main to the end of the new branch. The client can now forget about all pending mutations that have been confirmed, that is, all pending mutations with MutationIDs less than or equal to the LastMutationID of the most recent snapshot.

## Mutations outside the client

There is nothing in the design that requires that changes to user data must come through Replicache. In fact, we expect there is great utility in mutating the user's state outside of clients, e.g. in batch jobs or in response to changes in other users' clients. So long as all transactions that mutate the user's data run at a proper isolation level, leave the database in a valid state, and are correctly reflected by Pull, Replicache will faithfully converge all clients to the new state.

## Push endpoint

By design, Replicache places a minimum of constraints on the Data Layer's Push endpoint. For correctness, it must execute mutations in order and ensure that the LastMutationID is updated transactionally along with any effects. Beyond that, Replicache imposes no requirements. For example, the Push endpoint need not be synchronous; it could accept a batch of mutations, enqueue them for execution elsewhere, and return.
Similarly, the Push endpoint need not be consistent with the ClientView endpoint; as long as a mutation's effects and the change to LastMutationID are revealed to the ClientView endpoint atomically, the ClientView can lag or flap without affecting correctness.

## Conflicts

Conflicts are an unavoidable part of disconnected systems, but they don't need to be exceptionally painful.

A common initial approach to conflict resolution is to attempt to merge the *effects* of divergent forks. This doesn't work well because if all you have are the effects of two forks, it can be difficult or impossible to reason about what the correct merge is.

Imagine a simple database consisting of only a single integer set to the value `1`. A client goes offline for a while and, through a series of changes, ends up with the value `2`. Meanwhile the server value goes through a series of changes and ends at `0`:

```
... - 1 - ... 0 <- server
       \ ... 2 <- offline client
```

What is the correct resolution? We can't possibly know without more information about what the *intent* of those changes was. Were they adding? Setting? Multiplying? Clearing? In real-life applications with complex data models, many developers, and many versions of the application live at once, this problem is much worse.

A better strategy is to capture the *intent* of changes. Replicache embraces this idea by recording, alongside each change, the name of the function that created the change along with the arguments it was passed. Later, when we need to rebase forks, we *replay* one fork atop the other by re-running the series of transaction functions against the newest state. The transaction functions have arbitrary logic and can realize the intended change differently depending on the state they are running against.
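
To make replay concrete, here is a minimal sketch of rebasing pending mutations onto a new snapshot, using the single-integer database above. The names `Pending`, `Snapshot`, `rebase`, and the `add`/`set` mutators are hypothetical and chosen only for illustration; real mutators would run against a `WriteTransaction` rather than a bare number.

```typescript
// Hypothetical shapes for illustration; not part of Replicache.
type Pending = { id: number; name: string; args: number };
type Snapshot = { value: number; lastMutationID: number };

// Hypothetical mutators over a single-integer database. Each one captures
// intent (add vs. set), so it can be re-run against any starting value.
const intMutators: Record<string, (value: number, args: number) => number> = {
  add: (value, args) => value + args,
  set: (_value, args) => args,
};

// Replays, in ID order, every pending mutation the server has not yet applied
// (id > snapshot.lastMutationID) on top of the snapshot's state.
function rebase(snapshot: Snapshot, pending: Pending[]): number {
  return pending
    .filter((m) => m.id > snapshot.lastMutationID)
    .sort((a, b) => a.id - b.id)
    .reduce((value, m) => {
      const mutator = intMutators[m.name];
      // Sketch choice: an unknown mutator name leaves the value unchanged.
      return mutator ? mutator(value, m.args) : value;
    }, snapshot.value);
}
```

With a server snapshot of `0` at LastMutationID 1 and pending mutations `add(1)` with ID 1 (already confirmed) and `add(1)` with ID 2, the replayed state is `1`. Had the unconfirmed mutation been `set(2)`, the result would be `2`. The same recorded history yields different outcomes depending on the intent each mutator encodes.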

For example, a transaction that reserves an hour on a user's calendar could keep a status for the reservation in the user's data. The transaction might successfully reserve the hour when running locally for the first time, setting the status to RESERVED. Later, if still pending, the transaction might be replayed on top of a state where that hour is unavailable. In this case the transaction might update the status to UNAVAILABLE. Later, during Push, when played against the Data Layer, the transaction will settle on one value or the other, and the client will converge on the value in the Data Layer. App code can rely on subscriptions to keep the UI in sync with the reservation status, or to trigger a notification to the user or some other kind of follow-up, such as trying the next available slot.

We believe the Replicache model for dealing with conflicts (defensively written, programmatic transaction logic that is replayed atop the latest state) leads to fewer actual conflicts in practice. Our experience is that it preserves expressiveness of the data model and is far easier to reason about than other general models for avoiding or minimizing conflicts.

# Constraints

**Data size** Although there is no limit to the amount of data that can be synced by Replicache, for some applications, syncing all data the user has access to is impractical. For these use cases, we advise users to maintain per-client state encoding the extent of the data that should be synced. Initially the extent can be relatively small, but as the user moves through the app, the extent can be widened. For example, if the app is a game, the extent might initially be the first level. But as the user progresses through the game, the extent widens one level at a time. It is also possible to purge data from Replicache using the same mechanism, if managing max cache size is a concern.

A second concern with data size is that it might be infeasible to complete large state update downloads on unreliable or slow connections. We can imagine a variety of potential solutions to this problem, but for simplicity's sake we are punting on the problem for now. (The size constraint above helps here as well.)

**Blobs** Any truly offline-first system must have first-class bidirectional support for binary assets, aka blobs (e.g., profile pictures). In some cases these assets should be managed transactionally along with the user's data: either you get all the data and all the blobs it references or you get none of it. In any case, there is presently no special support for blobs in Replicache. Users who need blobs are advised to base64 encode them as JSON strings in the user data. We plan to address this shortcoming in the future.

**Duplicate transaction logic** You have to implement transactions twice, once in the mobile app and once in the Data Layer. Bummer. We can imagine potential solutions to this problem, but it's not clear if the benefit would be worth the cost, or widely usable. It is also expected that client-side transactions will be significantly simpler, as they are by nature *speculative*; the canonical answer comes from the server-side implementation.

--------------------------------------------------------------------------------
/diagram.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rocicorp/replicache-old/81c9b730362be621deecb7843d40c95840673cb6/diagram.png
--------------------------------------------------------------------------------
/faq.md:
--------------------------------------------------------------------------------
# Frequently Asked Questions

## Isn't it slow to serve the entire Client View constantly?

No question it's slower than serving only the changes.

However, it's also dramatically easier to implement. Tracking the precise per-client changes is a huge project and architectural change for most existing systems. It's also easy to get wrong. If you get it wrong, it's difficult to *know* that it's wrong: the clients just end up with the wrong state.

Serving the entire client view is, in contrast, much closer to how existing application stacks typically work. It's stateless. You can change the schema of your client view at any time and it just works.

And remember that you're not serving the entire client view over the internet to the device. You're serving it to another backend service, which calculates a diff to send to the client.

You can of course use any standard caching techniques on the backend to make it faster to serve the client view.

## How does the client know when to sync? Does it poll?

By default, Replicache polls every 60 seconds. This is nice for development because it gets you up and running fast.

For production, we recommend that you set up some kind of push channel and send a "poke" over that channel to tell the client when it might be a good time to sync.

## What if I don’t have a dedicated backend? I use serverless functions for my backend

No problem. You can implement the integration points as serverless functions. Our samples are all implemented this way.

--------------------------------------------------------------------------------
/skateboard.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rocicorp/replicache-old/81c9b730362be621deecb7843d40c95840673cb6/skateboard.png
--------------------------------------------------------------------------------