├── .vscode
└── settings.json
├── README.md
├── SUMMARY.md
├── code
├── Root.js
└── html.js
├── imgs
├── bootstrapping-overview.png
├── cached-transactions.png
├── change-snapshot.png
├── count-of-models.png
├── covering-partial-index-values.png
├── direct-referencies.png
├── introduction.png
├── lastsyncid.png
├── linear-databases.png
├── meta-meta.png
├── meta-persistence.png
├── model-partial-store-db.png
├── model-property-lookup.png
├── model-registry.png
├── model-store-class.png
├── model-store-db.png
├── models.png
├── modified-properties.png
├── object-stores.png
├── partial-index-stores.png
├── partial-index-values.png
├── references.png
├── search-for-symbols.png
├── title-image.png
├── transaction-overview.png
├── transaction-queues.png
└── transient-partial-index-keys.png
└── reverse-lse.excalidraw
/.vscode/settings.json:
--------------------------------------------------------------------------------
1 | {
2 | "cSpell.words": [
3 | "Tuomas",
4 | ]
5 | }
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | > [!IMPORTANT]
2 | > This is a pretty awesome (and correct) write-up of our sync engine. ([link](https://github.com/wzhudev/reverse-linear-sync-engine/discussions/2#discussioncomment-12892194))
3 | >
4 | > ...probably the best documentation that exists - internally or externally. ([link](https://x.com/artman/status/1927808159139111007))
5 | >
6 | > -- Tuomas Artman (Co-founder and CTO of Linear)
7 |
8 | > [!IMPORTANT]
9 | >
10 | > **Disclaimer**: This research is conducted solely for learning purposes. Readers should not use the findings to develop software that competes with Linear or attempt to use the information provided to disrupt or compromise Linear’s systems. If the Linear team requests, I will be more than happy to remove this repository.
11 |
12 | > [!IMPORTANT]
13 | > Check out the [SUMMARY](./SUMMARY.md)
14 | >
15 | > My friends found this too long, so I wrote a summary highlighting the key points—making it a 10-minute read. If you're only interested in the main ideas or want to skip the implementation details, just read the summary.
16 |
17 | 
18 |
19 | # Reverse Engineering Linear's Sync Engine: A Detailed Study
20 |
21 | **Join in [discussion](https://github.com/wzhudev/reverse-linear-sync-engine/discussions/2).**
22 |
23 | I work on collaborative software, focusing on rich text editors and spreadsheets. **Collaboration engines**, also known as **data sync engines**, play a pivotal role in enhancing user experience in these applications. They enable real-time, simultaneous edits on the same file while offering features like offline availability and file history. Typically, engineers use **[Operational Transformation (OT)](https://en.wikipedia.org/wiki/Operational_transformation)** or **[Conflict-free Replicated Data Types (CRDTs)](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type)** to build sync engines. While these technologies are effective for editors and spreadsheets, they may not be ideal for other types of applications. Here's why.
24 |
25 | OT is widely adopted but notorious for its complexity. This complexity stems from the need to account for diverse data models and operation sets across different applications, which requires significant effort to implement correct operations and transformation functions. While OT excels at synchronizing edits, preserving user intent, and handling conflicts, its complexity often makes it overkill for simpler use cases—such as managing user information or file metadata—where a straightforward **last-writer-wins** approach might suffice.
26 |
27 | CRDTs, on the other hand, appear more user-friendly. They offer built-in support for fundamental data structures (e.g., texts, lists, maps, counters), reducing the workload for developers. However, CRDTs often introduce metadata overhead and become challenging to manage in scenarios involving partial syncing or permission controls—such as when users can only access a subset of files. These issues arise because CRDTs are primarily designed for decentralized systems, while most modern applications still depend on centralized servers. Although I am personally an advocate of CRDTs, they often fall short for some use cases.
28 |
29 | What I look for in a sync engine includes:
30 |
31 | 1. **Support for arbitrary data models**: Making it adaptable to a wide range of scenarios.
32 | 2. **Rich features**: It should support partial syncing, enforce permission control, and include features like undo/redo, offline availability, and edit history.
33 | 3. **Great developer experience**: Ideally, it should allow model definitions in an ORM-like manner. Developers should not need to be experts in sync engines to build collaborative applications.
34 |
35 | [Linear](https://linear.app)'s Linear Sync Engine (LSE) provides an elegant solution to all the aforementioned requirements. Moreover, it offers an intuitive API that abstracts away the underlying complexity, making feature development significantly simpler. For instance, updating the title of an issue can be as straightforward as:
36 |
37 | ```jsx
38 | issue.title = "New Title";
39 | issue.save();
40 | ```
41 |
42 | I believe LSE is exactly what I've been looking for, so I decided to reverse-engineer its frontend code to understand how it works. Additionally, I'm documenting my findings to help others who are as interested as I am.
43 |
44 | > [!NOTE]
45 | > Good References
46 | >
47 | > This [gist](https://gist.github.com/pesterhazy/3e039677f2e314cb77ffe3497ebca07b#gistcomment-5184039) introduces some off-the-shelf solutions, such as ElectricSQL and ZeroSync (which, BTW, I am also very curious about), for general-purpose synchronization. You might want to check them out as well.
48 |
49 | In this post, we will explore how LSE:
50 |
51 | - Defines models, properties, and references.
52 | - Uses **MobX** to make models observable.
53 | - Performs **bootstrapping**.
54 | - Builds and populates a local database (IndexedDB).
55 | - Hydrates lazily-loaded data.
56 | - Syncs clients with the server.
57 | - Handles undo and redo.
58 |
59 | To help you better understand how the Linear Sync Engine (LSE) works at the code level, I've uploaded a version of Linear's (uglified) code with detailed comments. These annotations provide additional insights that may not be covered in this post. Since the identifiers' names are obfuscated, I've done my best to infer their possible original names. At the end of the post, you'll also find a table mapping abbreviated terms to their full forms.
60 |
61 | For the best experience, I recommend cloning the repository and viewing the code in your favorite editor. This allows you to refer to the code alongside the text for a more seamless reading experience. Personally, I suggest using VS Code because its TypeScript language service handles large files exceptionally well. Additionally, I'll include callouts at the beginning of each section to highlight relevant code snippets. You can easily jump to these by searching for symbols using the shortcut Ctrl + Shift + O (or Meta + Shift + O on macOS).
62 |
63 | 
64 |
65 | I am not affiliated with the Linear team, nor have I consulted them while writing this article. As a result, there may be inaccuracies or discrepancies with the actual implementation. However, I've made every effort—especially by watching relevant talks and comparing LSE to well-studied operational transformation (OT) approaches—to ensure that my description of the LSE approach is as accurate as possible. I hope it serves as a valuable reference for building a similar collaborative engine. If you spot any errors or misleading information, please submit an issue or a pull request to help me correct it. Your feedback is greatly appreciated!
66 |
67 | That said, I may inevitably fall victim to the [curse of knowledge](https://en.wikipedia.org/wiki/Curse_of_knowledge). If anything is unclear, the fault is mine, and I'd be more than happy to provide further explanations. Feel free to open an issue, and I'll gladly add more details—or even diagrams—to make the article easier to understand.
68 |
69 | With that out of the way, let's dive in!
70 |
71 | ## Introduction
72 |
73 | If you haven't yet watched Tuomas' [two](https://www.youtube.com/watch?v=WxK11RsLqp4&t=2175s) [talks](https://linear.app/blog/scaling-the-linear-sync-engine), a [podcast](https://www.devtools.fm/episode/61), and a [presentation at Local First Conf](https://www.youtube.com/watch?v=VLgmjzERT08) about LSE, I highly recommend checking them out before proceeding. These resources provide valuable context. In short, here are the core concepts behind LSE:
74 |
75 | 
76 |
77 | **Model**
78 |
79 | Entities such as `Issue`, `Team`, `Organization`, and `Comment` are referred to as **models** in LSE. These models possess **properties** and **references** to other models, many of which are observable (via **MobX**) to automatically update views when changes occur. In essence, models and properties include **metadata** that dictate how they behave in LSE.
80 |
81 | Models can be loaded from either the **local database** (IndexedDB) or the server. Some models support **partial loading** and can be loaded on demand, either from the local database or by fetching additional data from the server. Once loaded, models are stored in an **Object Pool**, which serves as a large map for retrieving models by their **UUIDs**.
82 |
83 | Models can be **hydrated** lazily, meaning their properties can be loaded only when accessed. This mechanism is particularly useful for improving performance by loading only the necessary data.
84 |
85 | Operations—such as additions, deletions, updates, and archiving—on models, their properties, and references are encapsulated as **transactions**. These transactions are sent to the server, executed there, and then broadcast as **delta packets** to all connected clients. This ensures data consistency across multiple clients.
86 |
87 | **Transaction**
88 |
89 | Operations sent to the server are packaged as **transactions**. These transactions are intended to execute **exclusively** on the server and are designed to be **reversible** on the client in case of failure. If the client loses its connection to the server, transactions are temporarily **cached** in IndexedDB and automatically resent once the connection is reestablished.
90 |
91 | Transactions are associated with a **sync id**, which is a monotonically increasing number that ensures the correct order of operations. This number is crucial for maintaining consistency across all clients.
92 |
93 | Additionally, transactions play a key role in supporting **undo** and **redo** operations, enabling seamless changes and corrections in real-time collaborative workflows.
94 |
95 | **Delta packets**
96 |
97 | Once transactions are executed, the server broadcasts **delta packets** to all clients—including the client that initiated the transaction—to update the models. A delta packet contains several **sync action**s, and each action is associated with a **sync id** as well. This mechanism prevents clients from missing updates and helps identify any missing packets if discrepancies occur.
98 |
99 | The **delta packets** may differ from the original transactions sent by the client, as the server might perform **side effects** during execution (e.g., generating history).
100 |
101 | ---
102 |
103 | In the following chapters, we will explore these concepts in detail, along with the corresponding modules that manage them. We'll begin with the **"Model"**.
104 |
105 | ## Chapter 1: Defining Models and Metadata
106 |
107 | First and foremost, we need to figure out how models are defined in LSE.
108 |
109 | ### `ModelRegistry`
110 |
111 | > [!NOTE]
112 | > Code references
113 | >
114 | > - `rr`: `ModelRegistry`
115 |
116 | When Linear starts, it first generates metadata for models, including their properties, methods (actions), and computed values. To manage this metadata, LSE maintains a detailed dictionary called `ModelRegistry`.
117 |
118 | 
119 |
120 | > [!NOTE]
121 | > Uglified Names
122 | >
123 | > The names in the screenshots (e.g., Xs) may differ from those in the GitHub source code (rr). Additionally, names may vary across different screenshots. This is completely normal, as Linear ships nearly every half hour!
124 |
125 | `ModelRegistry` is a class with static members that store various types of metadata and provide methods for registering and retrieving this information. For example:
126 |
127 | - **`modelLookup`**: Maps a model's name to its constructor.
128 | - **`modelPropertyLookup`**: Stores metadata about a model's properties.
129 | - **`modelReferencedPropertyLookup`**: Stores metadata about a model's references.
130 | - etc.
131 |
132 | We will discuss how some of this metadata is registered in this chapter, focusing particularly on models and their properties.
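
To make the shape of this dictionary concrete, here is a minimal sketch of a registry with static lookup tables. Only the names `ModelRegistry`, `modelLookup`, `modelPropertyLookup`, `registerModel`, and `registerProperty` come from the code references in this post; the method bodies, the `PropertyMetadata` shape, and the signatures are assumptions made for illustration, not Linear's implementation.

```typescript
// Illustrative sketch only: bodies and signatures are assumptions.
type ModelConstructor = new () => object;

interface PropertyMetadata {
  type: string; // e.g. "property" or "reference"
  lazy?: boolean;
  indexed?: boolean;
  nullable?: boolean;
}

class ModelRegistry {
  // Maps a model's name to its constructor.
  static modelLookup: Record<string, ModelConstructor> = {};
  // Maps a model's name to the metadata of its properties, keyed by property name.
  static modelPropertyLookup: Record<string, Record<string, PropertyMetadata>> = {};

  static registerModel(modelName: string, ctor: ModelConstructor): void {
    ModelRegistry.modelLookup[modelName] = ctor;
  }

  static registerProperty(modelName: string, propertyName: string, metadata: PropertyMetadata): void {
    let properties = ModelRegistry.modelPropertyLookup[modelName];
    if (!properties) {
      properties = {};
      ModelRegistry.modelPropertyLookup[modelName] = properties;
    }
    properties[propertyName] = metadata;
  }
}

// Properties are registered first, then the model itself.
class Issue {}
ModelRegistry.registerProperty("Issue", "title", { type: "property" });
ModelRegistry.registerModel("Issue", Issue);
```

Keeping everything in static members means any module can look up a model's constructor or metadata by name without holding a registry instance.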
133 |
134 | ### Model
135 |
136 | > [!NOTE]
137 | > Code references
138 | >
139 | > - `We`: `ClientModel` decorator
140 | > - `as`: `Model` base model class
141 | > - `re` `Vs`: `Issue` model class
142 | > - `rr.registerModel`: `ModelRegistry.registerModel`
143 |
144 | 
145 |
146 | LSE uses JavaScript's `class` keyword to define models, with all model classes extending the base `Model` class. This base class provides the following key properties and methods:
147 |
148 | - **`id`**: A unique UUID assigned to each model, serving as the key for retrieving the model from the Object Pool.
149 | - **`_mobx`**: An empty object required to make the model observable, as detailed in the "Observability" section.
150 | - **`makeObservable`**: A method for enabling observability. By default, models are not observable upon construction, so this method must be invoked at the appropriate time.
151 | - **`store`**: A reference to `SyncedStore`, which will be explored in depth in later chapters.
152 | - **`propertyChanged`, `markPropertyChanged`, `changeSnapshot`**: Methods that track property changes and generate an `UpdateTransaction`.
153 | - **etc.**: Additional important properties and methods will be discussed in subsequent chapters.
154 |
155 | > [!NOTE]
156 | > While writing this post, the Linear team updated how properties are stored. The `_mobx` object was removed, and each model now uses a `__data` property to store property values. This change affects the implementation of certain decorators and the hydration process. However, it does not impact our understanding of LSE, so I have not revised the related sections of this post.
157 |
158 | Models' metadata includes:
159 |
160 | 1. **`loadStrategy`**: Defines how models are loaded into the client. There are five strategies:
161 | - **`instant`**: Models that are loaded during application bootstrapping (default strategy).
162 | - **`lazy`**: Models that do not load during bootstrapping but are fetched all at once when needed (e.g., `ExternalUser`).
163 | - **`partial`**: Models that are loaded on demand, meaning only a subset of instances is fetched from the server (e.g., `DocumentContent`).
164 | - **`explicitlyRequested`**: Models that are only loaded when explicitly requested (e.g., `DocumentContentHistory`).
165 | - **`local`**: Models that are stored exclusively in the local database. No models have been identified using this strategy.
166 | 2. **`partialLoadMode`**: Specifies how a model is hydrated, with three possible values: `full`, `regular`, and `lowPriority`.
167 | 3. **`usedForPartialIndexes`**: Relates to the functionality of partial indexing.
168 | 4. etc.
169 |
170 | 
171 |
172 | _When I started writing this post, there were 76 models in Linear. As I am about to finish, there are 80 models._
173 |
174 | > [!NOTE]
175 | > What is `local` used for?
176 | >
177 | > You might wonder what the `local` load strategy is used for, given that no models currently use it.
178 | > In his presentation at Local First Conf, Tuomas explained how new features can be developed without modifying server-side code. My guess is that this is achieved by initially setting a new model's load strategy to `local`, ensuring it persists only in the local IndexedDB. Once the model is finalized, syncing can be enabled by changing its load strategy to one of the other available strategies.
179 |
180 | LSE uses **TypeScript decorators** to register metadata in `ModelRegistry`. The decorator responsible for registering models' metadata is `ClientModel` (also known as `We`).
181 |
182 | For example, consider the `Issue` model:
183 |
184 | ```tsx
185 | re = Pe([We("Issue")], re);
186 | ```
187 |
188 | The original source code may look like this:
189 |
190 | ```typescript
191 | @ClientModel("Issue")
192 | class Issue extends Model {}
193 | ```
194 |
195 | In the implementation of `ClientModel`:
196 |
197 | 1. The model's name and constructor function are registered in `ModelRegistry`'s `modelLookup`.
198 | 2. The model's name, schema version, and property names are combined into a **hash value**, which is registered in `ModelRegistry` and used to check the database schema. If the model's `loadStrategy` is `partial`, this information is also included in the hash.
199 |
200 | You can refer to the source code for more details about how `ClientModel` works.
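
The second step can be sketched roughly as follows. This is a guess at the general idea, not the real algorithm: the hash function (32-bit FNV-1a), the `ModelMeta` shape, the `|` separator, and the decision to sort property names are all placeholders chosen for the example.

```typescript
// Placeholder string hash (32-bit FNV-1a). The hash LSE actually uses is an open question here.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical subset of a model's metadata that feeds the hash.
interface ModelMeta {
  schemaVersion: number;
  loadStrategy: "instant" | "lazy" | "partial" | "explicitlyRequested" | "local";
  propertyNames: string[];
}

// Combines name, schema version, and property names into one hash.
// Property names are sorted (an assumption) so registration order does not matter.
function modelHash(modelName: string, meta: ModelMeta): string {
  const parts = [modelName, String(meta.schemaVersion), ...[...meta.propertyNames].sort()];
  if (meta.loadStrategy === "partial") {
    parts.push("partial"); // partially loaded models contribute their strategy too
  }
  return fnv1a(parts.join("|"));
}

const issueMeta: ModelMeta = {
  schemaVersion: 1,
  loadStrategy: "instant",
  propertyNames: ["title", "priority"],
};
const issueHash = modelHash("Issue", issueMeta);
```

The useful property is that adding, removing, or renaming a property changes the hash, which gives the client a cheap way to notice that the local schema is stale.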
201 |
202 | ### Properties
203 |
204 | > [!NOTE]
205 | > Code references
206 | >
207 | > - `vn`: `PropertyTypeEnum` enumeration
208 | > - `w`: `Property` decorator
209 | > - `pe`: `Reference` decorator
210 | > - `A4`: `registerReference` helper function
211 | > - `rr.registerModel`: `ModelRegistry.registerModel`
212 | > - `rr.registerProperty`: `ModelRegistry.registerProperty`
213 |
214 | Models have properties that are implemented as JavaScript class properties. Each property is associated with property metadata, which includes key fields such as:
215 |
216 | 1. `type`: Specifies the property's type.
217 | 2. `lazy`: Specifies whether the property should be loaded only when the model is hydrated.
218 | 3. `serializer`: Defines how to serialize the property for data transfer or storage.
219 | 4. `indexed`: Determines whether the property should be indexed in the database. Used for references.
220 | 5. `nullable`: Specifies whether the property can be `null`, used for references.
221 | 6. etc.
222 |
223 | `type` is an enumeration that includes the following values:
224 |
225 | 1. **`property`**: A property that is "owned" by the model. For example, `title` is a `property` of `Issue`.
226 | 2. **`ephemeralProperty`**: Similar to a `property`, but it is not persisted in the database. This type is rarely used. For example, `lastUserInteraction` is an ephemeral property of `User`.
227 | 3. **`reference`**: A property used when a model holds a reference to another model. Its value is typically the ID of the referenced model. A reference can be lazy-loaded, meaning the referenced model is not loaded until this property is accessed. For example, `subscription` is a `reference` of `Team`.
228 | 4. **`referenceModel`**: When `reference` properties are registered, a `referenceModel` property is also created. This property defines getters and setters to access the referenced model using the corresponding `reference`.
229 | 5. **`referenceCollection`**: Similar to `reference`, but it refers to an array of models. For example, `templates` is a `referenceCollection` of `Team`.
230 | 6. **`backReference`**: A `backReference` is the inverse of a `reference`. For example, `favorite` is a `backReference` of `Issue`. The key difference is that a `backReference` is considered "owned" by the referenced model. When the referenced model (B) is deleted, the `backReference` (A) is also deleted.
231 | 7. **`referenceArray`**: Used for many-to-many relationships. For example, `members` of `Project` is a `referenceArray` that references `Users`, allowing users to be members of multiple projects.
232 |
233 | LSE uses a variety of decorators to register different types of properties. In this chapter, let's first look at three of them.
234 |
235 | #### `Property` (`w`)
236 |
237 | Let's take the `Issue` model as an example. `priority` and `title` are declared as properties of type `property` of `Issue`:
238 |
239 | ```tsx
240 | Pe([w()], re.prototype, "title", void 0);
241 | Pe(
242 | [
243 | w({
244 | serializer: P_,
245 | }),
246 | ],
247 | re.prototype,
248 | "priority",
249 | void 0
250 | );
251 | ```
252 |
253 | The original source code may look like this:
254 |
255 | ```tsx
256 | @ClientModel("Issue")
257 | class Issue extends Model {
258 | @Property()
259 | public title: string;
260 |
261 | @Property({ serializer: PrioritySerializer })
262 | public priority: Priority;
263 | }
264 | ```
265 |
266 | In the implementation of `Property`:
267 |
268 | 1. The property is made observable by calling `M1`, which will be covered in the [Observability](#observability-m1) section.
269 | 2. The property is registered in `ModelRegistry`.
270 |
271 | Please refer to the source code for more details.
272 |
273 | #### `Reference` (`pe`) and `OneToMany` (`Nt`)
274 |
275 | For example, `assignee` is a `reference` of `Issue`, as each issue can be assigned to only one user. On the other hand, `assignedIssues` is a `LazyReferenceCollection` of `User`, as a user can have many assigned issues.
276 |
277 | ```tsx
278 | Pe(
279 | [
280 | pe(() => K, "assignedIssues", {
281 | nullable: !0,
282 | indexed: !0,
283 | }),
284 | ],
285 | re.prototype,
286 | "assignee",
287 | void 0
288 | );
289 |
290 | st([Nt()], K.prototype, "assignedIssues", void 0);
291 | ```
292 |
293 | The original source code may look like this:
294 |
295 | ```tsx
296 | @ClientModel("Issue")
297 | class Issue extends Model {
298 | @Reference(() => User, "assignedIssues", {
299 | nullable: true,
300 | indexed: true,
301 | })
302 | assignee: User | null;
303 | }
304 |
305 | @ClientModel("User")
306 | class User extends Model {
307 | @OneToMany()
308 | assignedIssues: LazyReferenceCollection;
309 |
310 | constructor() {
311 | this.assignedIssues = new LazyReferenceCollection(Issue, this, "assigneeId", undefined, {
312 | canSkipNetworkHydration: () => this.canSkipNetworkHydration(Issue)
313 | });
314 | }
315 | }
316 | ```
317 |
318 | In the implementation of the `Reference` decorator (more specifically, the `registerReference` function), two properties are actually registered: `assignee` and `assigneeId`.
319 |
320 | They are of different types. `assignee` is of type `referenceModel`, while `assigneeId` is of type `reference`. The `assignee` property is not persisted in the database; only `assigneeId` is.
321 |
322 | LSE uses a getter and setter to link `assigneeId` and `assignee`. When the `assignee` value is set, `assigneeId` is updated with the new value's `ID`. Similarly, when `assignee` is accessed, the corresponding record is fetched from the data store using the `ID`.
323 |
324 | Additionally, `assigneeId` is made observable with `M1`.
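
The getter/setter link between the two properties can be sketched like this. The `objectPool` map and the `defineReference` helper are hypothetical names for this example (the real helper is `registerReference`, whose signature I am not reproducing), and observability is left out to keep the sketch short.

```typescript
// Hypothetical object pool: a map from UUID to a loaded model.
const objectPool = new Map<string, object>();

// Defines `<name>` on the prototype as an accessor backed by `<name>Id`.
function defineReference(prototype: object, name: string): void {
  const idKey = `${name}Id`;
  Object.defineProperty(prototype, name, {
    get() {
      // Resolve the stored ID to a model instance, or null if nothing is set or loaded.
      const id: unknown = this[idKey];
      return typeof id === "string" ? (objectPool.get(id) ?? null) : null;
    },
    set(value: { id: string } | null) {
      // Only the ID is stored; the model instance itself is never persisted.
      this[idKey] = value ? value.id : null;
    },
    configurable: true,
  });
}

class User {
  id: string;
  constructor(id: string) {
    this.id = id;
    objectPool.set(id, this);
  }
}

class Issue {
  assigneeId: string | null = null;
  declare assignee: User | null; // provided by defineReference below
}
defineReference(Issue.prototype, "assignee");

const alice = new User("user-1");
const issue = new Issue();
issue.assignee = alice; // writes "user-1" into issue.assigneeId
```

Because only `assigneeId` holds state, persisting an issue never requires serializing the referenced `User`.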
325 |
326 | 
327 |
328 | _There are lots of `referenceModel` and `reference` pairs in the `ModelRegistry`._
329 |
330 | ### Schema Hash
331 |
332 | `ModelRegistry` includes a special property called **`__schemaHash`**, which is a hash of all models' metadata and their properties' metadata. This hash is crucial for determining whether the local database requires migration, a topic covered in detail in a later chapter. I have already added comments in the source code explaining how it is calculated, so I won't repeat that here.
333 |
334 | > [!NOTE]
335 | > TypeScript Decorators
336 | >
337 | > When TypeScript transpiles decorators, it processes property decorators before model decorators. As a result, property decorators are executed first. By the time `ModelRegistry.registerModel` is called, all properties of that model have already been registered, and their metadata will also be included in the `__schemaHash`.
338 |
339 | ### Observability (`M1`)
340 |
341 | > [!NOTE]
342 | > Code references
343 | >
344 | > - `M1`: `observabilityHelper`
345 |
346 | The `M1` function plays a critical role in making models and properties observable.
347 |
348 | It uses `Object.defineProperty` to define a getter and setter for the property that needs to be observable. When a value is assigned to the property, the setter checks whether a MobX box needs to be created on `_mobx` and assigns the value to that box.
349 |
350 | The same logic applies to the getter, which ensures that if the box exists, it retrieves the value from it. By wrapping React components with `observer`, MobX can track which components subscribe to the observables and automatically refresh them when the observable values change.
351 |
352 | Additionally, when setting the value, the `propertyChanged` method is called to register the property change, along with the old and new values. This information will later be used to create an `UpdateTransaction`, which we'll discuss in the third chapter.
353 |
354 | Check the source code for more details.
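
The pattern described above can be sketched without MobX. In this example the tiny `Box` class is a stand-in for a MobX observable box, and `makePropertyObservable`, `boxes`, and `modifiedProperties` are names invented for the sketch; only `propertyChanged` is taken from the post, and its body here is an assumption.

```typescript
// Stand-in for a MobX box: holds a value and notifies listeners when it changes.
class Box {
  private value: unknown;
  private listeners: Array<(value: unknown) => void> = [];
  constructor(initial: unknown) {
    this.value = initial;
  }
  get(): unknown {
    return this.value;
  }
  set(next: unknown): void {
    this.value = next;
    this.listeners.forEach((listener) => listener(next));
  }
  observe(listener: (value: unknown) => void): void {
    this.listeners.push(listener);
  }
}

class Model {
  // Boxes are created lazily, one per observable property.
  boxes: Record<string, Box> = {};
  // Remembers the first old value and the latest new value of each changed property.
  modifiedProperties: Record<string, { oldValue: unknown; newValue: unknown }> = {};

  propertyChanged(name: string, oldValue: unknown, newValue: unknown): void {
    const existing = this.modifiedProperties[name];
    this.modifiedProperties[name] = { oldValue: existing ? existing.oldValue : oldValue, newValue };
  }
}

function makePropertyObservable(prototype: Model, name: string): void {
  Object.defineProperty(prototype, name, {
    get() {
      const box: Box | undefined = this.boxes[name];
      return box ? box.get() : undefined;
    },
    set(newValue: unknown) {
      const box: Box | undefined = this.boxes[name];
      const oldValue = box ? box.get() : undefined;
      if (box) {
        box.set(newValue);
      } else {
        this.boxes[name] = new Box(newValue); // create the box on first write
      }
      if (oldValue !== newValue) {
        this.propertyChanged(name, oldValue, newValue);
      }
    },
    configurable: true,
  });
}

class Issue extends Model {
  declare title: string; // provided by makePropertyObservable below
}
makePropertyObservable(Issue.prototype, "title");

const observedIssue = new Issue();
observedIssue.title = "Draft";
observedIssue.title = "New Title";
```

Recording the first old value alongside the latest new value is what would let a later step build a reversible update from the change record.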
355 |
356 | ### Takeaway of Chapter 1
357 |
358 | Let's summarize the key points covered in this chapter:
359 |
360 | - **Models and Properties in LSE**: Governed by metadata that defines their behavior.
361 | - **Model Definition**: LSE defines models using JavaScript classes and utilizes decorators to register models, properties, and references in the `ModelRegistry`.
362 | - **Load Strategies**: Models can be loaded using different strategies, including `instant`, `lazy`, `partial`, `explicitlyRequested`, and `local`.
363 | - **Property Types**: LSE categorizes properties into several types, such as `property`, `reference`, `referenceModel`, `referenceCollection`, `backReference`, and `referenceArray`.
364 | - **Reactive Data Handling**: LSE uses `Object.defineProperty` to implement getters and setters, enabling efficient reference handling and observability.
365 |
366 | In the upcoming chapters, we'll explore how this metadata is leveraged in practice. Keep reading! 🚀
367 |
368 | ## Chapter 2: Bootstrapping & Lazy Loading
369 |
370 | Once the models are defined, the next step is to **load them into the client**. In this chapter, we'll explore how LSE **bootstraps and lazily loads models**.
371 |
372 | We'll start with a high-level overview to establish a foundational understanding before diving into more intricate details. Since this process involves multiple modules, I'll also provide brief introductions to each for better context.
373 |
374 | 
375 |
376 | 1. `StoreManager` (`cce`) creates either a `PartialStore` (`jm`) or a `FullStore` (`TE`) for each model. These stores are responsible for synchronizing in-memory data with IndexedDB. Also, `SyncActionStore` (`oce`) will be created to store sync actions.
377 | 2. `Database` (`eg`) connects to IndexedDB and gets the databases and tables ready. If the databases don't exist, they will be created. If a migration is needed, it will be performed.
378 | 3. `Database` determines the type of bootstrapping to be performed.
379 | 4. The appropriate bootstrapping is executed. For full bootstrapping, models are retrieved from the server.
380 | 5. The retrieved model data will be stored in IndexedDB.
381 | 6. Data requiring immediate hydration is loaded into memory, and observability is activated.
382 | 7. A connection to the server is established to receive delta packets.
383 |
384 | There are three types of bootstrapping in LSE: **full bootstrapping**, **partial bootstrapping**, and **local bootstrapping**. In this post, I'll focus on providing a detailed explanation of **full bootstrapping**.
385 |
386 | ### Create `ObjectStore`s
387 |
388 | > [!NOTE]
389 | > Code references
390 | >
391 | > - `cce`: `StoreManager`
392 | > - `p3`: `PartialStore`
393 | > - `TE`: `FullStore`
394 | >
395 | > The bootstrapping process begins with `km.startBootstrap` (`SyncedStore.startBootstrap`). `StoreManager` is lazily created through the getter `eg.storeManager` (`Database.storeManager`).
396 |
397 | The first step in the bootstrapping process is the construction of `StoreManager`. This module is responsible for creating and managing `ObjectStore` instances for each model registered in the `ModelRegistry`. Each `ObjectStore` handles the corresponding table for its model in IndexedDB.
398 |
399 | 
400 |
401 | _There are 80 models, so there are 80 `ObjectStore`s._
402 |
403 | As mentioned earlier, models have `loadStrategy` metadata, and LSE generates corresponding store types based on this field. Models with a `partial` load strategy are managed by `PartialObjectStore` (`p3`, `Jm`), while all other models use `FullObjectStore` (`TE`).
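
As a rough illustration of that dispatch, the sketch below picks a store class per model from its load strategy. The class bodies, the `kind` field, and the `createStores` function are assumptions; only the store names and the strategy values come from this post.

```typescript
type LoadStrategy = "instant" | "lazy" | "partial" | "explicitlyRequested" | "local";

// Empty stand-ins for the two store types; the real classes manage IndexedDB tables.
class FullObjectStore {
  readonly kind = "full";
  modelName: string;
  constructor(modelName: string) {
    this.modelName = modelName;
  }
}

class PartialObjectStore {
  readonly kind = "partial";
  modelName: string;
  constructor(modelName: string) {
    this.modelName = modelName;
  }
}

type ObjectStore = FullObjectStore | PartialObjectStore;

// Creates one store per registered model, chosen by the model's load strategy.
function createStores(strategies: Record<string, LoadStrategy>): Map<string, ObjectStore> {
  const stores = new Map<string, ObjectStore>();
  for (const [modelName, strategy] of Object.entries(strategies)) {
    const store = strategy === "partial" ? new PartialObjectStore(modelName) : new FullObjectStore(modelName);
    stores.set(modelName, store);
  }
  return stores;
}

// Strategies taken from the examples in Chapter 1.
const stores = createStores({ ExternalUser: "lazy", DocumentContent: "partial" });
```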
404 |
405 | When an `ObjectStore` is created, it computes a hash for its associated model, which is used as the table's name in the database. For example, the `Issue` model has a `storeName` of `119b2a...`, corresponding to a table with the same hash.
406 |
407 | 
408 |
409 | 
410 |
411 | Notably, for models with a `loadStrategy` of `partial`, an additional database named `_partial` will be created. This database stores indexes that facilitate lazy loading of these models. We will explore this mechanism in detail at the end of this chapter.
412 |
413 | 
414 |
415 | ### Create Databases & Tables in IndexedDB
416 |
417 | > [!NOTE]
418 | > Code references
419 | >
420 | > - `eg.open`: `Database.open`
421 | > - `jn` or `Xn`: `DatabaseManager` - `databaseInfo`, `registerDatabase`, `database`
422 | > - `cce.checkReadinessOfStores`: `StoreManager.checkReadinessOfStores`
423 | > - `TE.checkIsReady`: `SyncStore.checkIsReady`
424 |
425 | After `ObjectStore`s are constructed, the next step is to prepare the database—creating the databases and tables if they don't already exist in IndexedDB.
426 |
427 | LSE maintains two types of databases in IndexedDB: `linear_databases` and others with names like `linear_(hash)`.
428 |
429 | **`linear_databases`**: This database stores information about other databases. LSE creates a separate database for each logged-in user in a workspace. If the user is part of multiple workspaces, LSE creates a database for each logged-in workspace.
430 |
431 | 
432 |
433 | The database information includes:
434 |
435 | 1. **`name`**: The database's name. It is derived from the `userId`, `version`, and `userVersion`. As a result, different user identities lead to multiple databases.
436 | 2. **`schemaHash`**: Used for database migration. This corresponds to the `__schemaHash` property in `ModelRegistry`.
437 | 3. **`schemaVersion`**: An incremental counter that determines whether a database migration is necessary. If the new `schemaHash` differs from the one stored in IndexedDB, the counter increments. The updated version is then passed as the second parameter to `indexedDB.open`, so that IndexedDB can tell whether a migration is needed.
438 |
439 | and so on. You can check out how this information is calculated in `jn.databaseInfo`.
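
The version-bump rule can be expressed as a small pure function. This is a sketch under assumptions: `DatabaseInfo` contains only the three fields listed above, `nextDatabaseInfo` is a name invented here, and starting a new database at version 1 is my choice for the example.

```typescript
interface DatabaseInfo {
  name: string;
  schemaHash: string;
  schemaVersion: number;
}

// Returns the info to store in `linear_databases` for the current schema hash.
function nextDatabaseInfo(
  stored: DatabaseInfo | undefined,
  databaseName: string,
  currentSchemaHash: string
): DatabaseInfo {
  if (!stored) {
    // First run: no record yet, start at version 1 (assumed starting point).
    return { name: databaseName, schemaHash: currentSchemaHash, schemaVersion: 1 };
  }
  if (stored.schemaHash === currentSchemaHash) {
    return stored; // schema unchanged, no migration
  }
  // Schema changed: bump the version so that opening the database triggers an upgrade.
  return { name: databaseName, schemaHash: currentSchemaHash, schemaVersion: stored.schemaVersion + 1 };
}

// In a browser, the result would then be used roughly like this:
//   const request = indexedDB.open(info.name, info.schemaVersion);
//   request.onupgradeneeded = () => { /* create or migrate tables */ };
// `upgradeneeded` fires only when the requested version is higher than the existing one.
```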
440 |
441 | **`linear_(hash)`**: This database contains the data of a workspace. For example, `linear_b4782b3125a816b51a44e59f2e939efa` stores the data for my private workspace.
442 |
443 | Inside these databases, there are tables for each model, as we discussed in the previous section. Additionally, it includes two special tables:
444 |
445 | The first table is **`_meta`**, which holds persistence details for each model, as well as the database's metadata.
446 |
447 | 
448 |
449 | _Model persistence state._
450 |
451 | Each model has a corresponding record in the `_meta` table. If the `persisted` field is set to `true`, it indicates that all instances of that model within the workspace have been loaded onto the client.
452 |
453 | 
454 |
455 | _Database's metadata_
456 |
457 | The database's metadata fields include:
458 |
459 | 1. `lastSyncId`.
460 |
461 | ---
462 |
463 | **`lastSyncId`** is a critical concept in LSE, so allow me to introduce it here. You might find that it ties into concepts like transactions and delta packets, which we will explore in greater detail in the later chapters. It's perfectly fine if you don't fully grasp this part right now. Keep reading and refer back to this section after you've covered the upcoming chapters—everything will come together.
464 |
465 | Linear is often regarded as a benchmark for [local-first software](https://www.inkandswitch.com/local-first/). Unlike most mainstream local-first applications that use CRDTs, Linear's collaboration model aligns more closely with OT, as it relies on a centralized server to establish the order of all transactions. Within the LSE framework, all transactions sent by clients follow a [total order](https://en.wikipedia.org/wiki/Total_order), whereas CRDTs typically require only a [partial order](https://en.wikipedia.org/wiki/Partially_ordered_set). This total order is represented by the `sync id`, an incrementing integer, and `lastSyncId` is, as its name suggests, the latest `sync id`.
466 |
467 | When a transaction is successfully executed by the server, the global **`lastSyncId`** increments by 1. This ID effectively serves as the **version number of the database**, ensuring that all changes are tracked in a sequential manner.
468 |
469 | 
470 |
471 | _Each change in the database increments the `lastSyncId` by 1. The `lastSyncId` is also associated with the transaction and the delta packet._
472 |
473 | The server includes the updated `lastSyncId` in its response to the client that initiated the transaction. Additionally, when the server broadcasts delta packets (which represent incremental changes) to all clients, these packets are also associated with the corresponding `lastSyncId`. This ensures that clients can synchronize their local state with the server using the latest database version.
474 |
475 | The concept of `sync id` is similar to a **file revision number** in operational transformation (OT) algorithms. (For more details, you can check out my [detailed article on OT](https://wzhu.dev/posts/ot).) However, unlike a file revision number that typically applies to a single file, **`lastSyncId` spans the entire database**, regardless of which workspace the changes occur in.
476 |
477 | This broader scope can be observed in practice: even if a single transaction happens in your workspace, the `lastSyncId` often increments significantly, indicating that it is tracking changes across all workspaces in the system.
478 |
479 | Clients use the **`lastSyncId`** to determine whether they are synchronized with the server. By comparing their local `lastSyncId` with the `lastSyncId` provided by the server, clients can identify if they are missing any transactions:
480 |
481 | - If the client's `lastSyncId` is **smaller** than the server's, it indicates that the client is out of sync and has not received some delta packets.
482 | - The server frequently includes the `lastSyncId` in its responses to help clients stay updated.
483 |
484 | The client's `lastSyncId` is initially set during the **full bootstrapping process**, where it retrieves the latest state of the database. As the client receives **delta packets** from the server, the `lastSyncId` is updated to reflect the new synchronized state.
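
The bookkeeping described above can be sketched as a small cursor object. This is an illustration only: `SyncCursor` and `DeltaPacket` are hypothetical names, and the rule that an older packet never moves the cursor backwards is an assumption rather than something confirmed from Linear's code.

```ts
// Hypothetical delta packet shape: only the field relevant to this sketch.
interface DeltaPacket {
  lastSyncId: number;
}

class SyncCursor {
  // Assumption: 0 means "nothing synchronized yet".
  private lastSyncId = 0;

  // Called once full bootstrapping has finished.
  setFromBootstrap(serverLastSyncId: number): void {
    this.lastSyncId = serverLastSyncId;
  }

  // True when the server has applied transactions this client has not seen.
  isBehind(serverLastSyncId: number): boolean {
    return this.lastSyncId < serverLastSyncId;
  }

  // Advances the cursor. Assumption: an older or duplicate packet never moves it backwards.
  applyDelta(packet: DeltaPacket): void {
    if (packet.lastSyncId > this.lastSyncId) {
      this.lastSyncId = packet.lastSyncId;
    }
  }

  current(): number {
    return this.lastSyncId;
  }
}
```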
485 |
486 | Now, back to the other fields of the database's metadata.
487 |
488 | ---
489 |
490 | 2. **`firstSyncId`**: Represents the `lastSyncId` value when the client performs a **full bootstrapping**. As we'll see later, this value is used to determine the starting point for incremental synchronization.
491 | 3. **`backendDatabaseVersion`**: The version of the backend database, used to track compatibility between the client and server databases.
492 | 4. **`updatedAt`**: A timestamp indicating the last time the database or its metadata was updated.
493 | 5. **`subscribedSyncGroups`**.
494 |
495 | ---
496 |
497 | This concept is crucial in LSE. While all workspaces share the same `lastSyncId` counter, you cannot access issues or receive delta packets from workspaces or teams to which you lack proper permissions. This restriction is enforced through an access control mechanism, with `subscribedSyncGroups` serving as the key component. The `subscribedSyncGroups` array contains UUIDs that represent your user ID, the teams you belong to, and predefined roles.
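
One way to picture this access control is as a set intersection. The sketch below assumes that each delta packet is tagged with the sync groups it belongs to; that tagging, and the names `TaggedDelta` and `canReceive`, are assumptions made for illustration and not details confirmed from Linear's server.

```ts
// Hypothetical: a delta packet tagged with the sync groups it belongs to.
interface TaggedDelta {
  syncId: number;
  syncGroups: string[];
}

// A client may receive a delta only if it subscribes to at least one of its groups.
function canReceive(subscribedSyncGroups: string[], delta: TaggedDelta): boolean {
  return delta.syncGroups.some(
    (group) => subscribedSyncGroups.indexOf(group) !== -1
  );
}
```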
498 |
499 | > [!NOTE]
500 | > Understanding how `SyncGroup` works was particularly challenging in earlier versions of LSE. However, the introduction of `userSyncGroup` and `teamSyncGroup` in recent updates by the Linear team has clarified its purpose. These changes reveal that a `SyncGroup` is essentially a collection of models linked to either a specific "User" or "Team."
501 |
502 | > [!NOTE]
503 | > Linear's Database Metadata Changes
504 | > In late 2024, Linear modified the database metadata fields. While this screenshot reflects the updated metadata, the source code excerpts were taken before the change. For example, `subscribedSyncGroups` is replaced by `userSyncGroups`. Since this update does not significantly impact the core concepts of how LSE works, I will omit these differences in this post.
505 |
506 | ---
507 |
508 | The explanation above covered the `_meta` table. Now, let's discuss the second special table: **`_transaction`**. This table stores unsent transactions or those queued for server synchronization. We'll delve deeper into the details of transactions in the next chapter.
509 |
510 | 
511 |
512 | _Cached transactions_
513 |
514 | Let's return to the bootstrapping process and explore how these two types of databases are created in IndexedDB. Please refer to `ng.initializeDatabase` (`SyncClient.initializeDatabase`) for source code and comments.
515 |
516 | **Step 1: Retrieve Workspace Metadata**
517 |
518 | The process begins by retrieving the metadata for the workspace being bootstrapped via the `Xn.databaseInfo` method. During this step, if the `linear_databases` database has not yet been created, it will be initialized. Once the metadata is successfully retrieved, it is stored in the `linear_databases` database using the `Xn.registerDatabase` method.
519 |
520 | **Step 2: Create the Workspace-Specific Database**
521 |
522 | Next, LSE prepares the workspace-specific database, such as `linear_b4782b3125a816b51a44e59f2e939efa`. It first establishes a connection to the database and evaluates whether it needs to be created or migrated. If creation or migration is required, the `StoreManager` invokes its `createStores` method (`this.storeManager.createStores(i, l)`) to initialize the necessary tables for the models.
523 |
524 | At this stage, LSE also attempts to read the database's metadata. However, during a full bootstrapping process, no metadata is stored yet, so all fields are initialized to `0` or other default values.
525 |
526 | **Step 3: Check Store Readiness**
527 |
528 | The final stage involves verifying the readiness of each store. During the first load, all tables are initially empty, so none of the stores will be ready.
529 |
530 | At this point, LSE has prepared the databases and is ready to load data from the server. Let's dive deeper into how this process works.
531 |
532 | ### Determine the Bootstrapping Type
533 |
534 | > [!NOTE]
535 | > Code references
536 | >
537 | > - `ng.bootstrap`: `SyncClient.bootstrap`
538 | > - `eg.requiredBootstrap`: `Database.requiredBootstrap`
539 |
540 | The next step in the process is determining the bootstrapping type and executing it.
541 |
542 | The `Database.requiredBootstrap` method identifies the appropriate bootstrapping type and supplies the necessary parameters for its execution. The method returns an object with the following fields:
543 |
544 | 1. **`type`**: The type of bootstrapping to perform.
545 |
546 | ---
547 |
548 | There are three types of bootstrapping:
549 |
550 | 1. **`full`**: LSE retrieves all required models from the server.
551 | 2. **`local`**: Data is loaded from the local database, and the application synchronizes with the server using incremental deltas.
552 | 3. **`partial`**: A subset of models is loaded from the server, depending on the load strategy.
553 |
554 | **LSE performs a full bootstrapping** in the following scenarios (excluding demo project logic for simplicity):
555 |
556 | 1. **No stores are ready**: Newly created tables, as discussed earlier, are still empty and unavailable.
557 | 2. **`lastSyncId` is undefined**: This indicates the database lacks a record of the last synchronization point.
558 | 3. **Models are outdated**: When the client is online, and some models are outdated, a full bootstrap refreshes all data.
559 |
560 | This post will focus on the **full bootstrapping** process.
561 |
562 | ---
563 |
564 | Additional fields in the object returned by `requiredBootstrap` include:
565 |
566 | 2. **`modelsToLoad`**: Names of the models with a load strategy marked as either **instant** or **lazy**.
567 | 3. **`lastSyncId`**: Indicates the database snapshot the client is currently synchronized to. During full bootstrapping, this value is zero, as no data has yet been loaded from the server. In this case, the server will return the latest snapshot.
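
The decision described in this section can be sketched as follows. Only the three conditions for a full bootstrap come from the text above; the input shape `BootstrapState` and the helper names are hypothetical, and the sketch deliberately does not guess how LSE chooses between `local` and `partial`.

```ts
type BootstrapType = "full" | "local" | "partial";

// Hypothetical inputs; the real method reads these from the database state.
interface BootstrapState {
  readyStoreCount: number; // number of stores that reported ready
  lastSyncId: number | undefined; // value read from the `_meta` table
  isOnline: boolean;
  hasOutdatedModels: boolean;
}

interface RequiredBootstrap {
  type: BootstrapType;
  modelsToLoad: string[];
  lastSyncId: number;
}

// The three conditions listed above (demo project logic excluded).
function needsFullBootstrap(state: BootstrapState): boolean {
  if (state.readyStoreCount === 0) return true;
  if (state.lastSyncId === undefined) return true;
  return state.isOnline && state.hasOutdatedModels;
}

// For a full bootstrap, `lastSyncId` is zero because nothing has been loaded yet.
function fullBootstrap(modelsToLoad: string[]): RequiredBootstrap {
  return { type: "full", modelsToLoad, lastSyncId: 0 };
}
```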
568 |
569 | ### Bootstrapping the Database
570 |
571 | > [!NOTE]
572 | > Code references
573 | >
574 | > 1. `ng.bootstrap`: `SyncClient.bootstrap`
575 | > 2. `eg.bootstrap`: `Database.bootstrap`
576 | > 3. `Xm.fullBootstrap`: `BootstrapHelper.fullBootstrap`
577 | > 4. `sd.restModelsJsonStreamGen`: `GraphQLClient.restModelsJsonStreamGen`
578 |
579 | When LSE initiates a full bootstrapping process, it sends a request through the `GraphQLClient.restModelsJsonStreamGen` method. This function is responsible for retrieving models from the server and will be referenced multiple times throughout the remainder of this article.
580 |
581 | The request would look like this:
582 |
583 | ```
584 | https://client-api.linear.app/sync/bootstrap?type=full&onlyModels=WorkflowState,IssueDraft,Initiative,ProjectMilestone,ProjectStatus,TextDraft,ProjectUpdate,IssueLabel,ExternalUser,CustomView,ViewPreferences,Roadmap,RoadmapToProject,Facet,Project,Document,Organization,Template,Team,Cycle,Favorite,CalendarEvent,User,Company,IssueImport,IssueRelation,TeamKey,UserSettings,PushSubscription,Activity,ApiKey,EmailIntakeAddress,Emoji,EntityExternalLink,GitAutomationTargetBranch,GitAutomationState,Integration,IntegrationsSettings,IntegrationTemplate,NotificationSubscription,OauthClientApproval,Notification,OauthClient,OrganizationDomain,OrganizationInvite,ProjectLink,ProjectUpdateInteraction,InitiativeToProject,Subscription,TeamMembership,TimeSchedule,TriageResponsibility,Webhook,WorkflowCronJobDefinition,WorkflowDefinition,ProjectRelation,DiaryEntry,Reminder
585 | ```
586 |
587 | It has two parameters:
588 |
589 | 1. **`type`**: In our case, it is `"full"`.
590 | 2. **`onlyModels`**: A comma-separated list of the model names to be loaded. This corresponds to the `modelsToLoad` returned by `requiredBootstrap`.
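
As a small illustration of how these two parameters combine, the sketch below assembles the URL from a `modelsToLoad` array. `buildBootstrapUrl` is a hypothetical helper, not a function from Linear's client.

```ts
const BOOTSTRAP_ENDPOINT = "https://client-api.linear.app/sync/bootstrap";

// Builds the bootstrap URL. Model names are plain identifiers, so they are
// joined with commas and left unescaped, matching the request shown above.
function buildBootstrapUrl(
  type: "full" | "partial",
  modelsToLoad: string[]
): string {
  return `${BOOTSTRAP_ENDPOINT}?type=${type}&onlyModels=${modelsToLoad.join(",")}`;
}
```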
591 |
592 | > [!NOTE]
593 | > Linear's bootstrapping requests have changed
594 | > As part of an optimization rolled out in late 2024, the Linear team split this single request into multiple requests to improve cache performance and loading speed in large workspaces. This change does not affect how LSE operates, so I will omit the details here. For more information, open your browser's debug tools and search for `splitToCacheableRequests` in the source code.
595 |
596 | An example response looks like this:
597 |
598 | ```jsx
599 | {"id":"8ce3d5fe-07c2-481c-bb68-cd22dd94e7de","createdAt":"2024-07-03T11:37:04.865Z","updatedAt":"2024-07-03T11:37:04.865Z","userId":"4e8622c7-0a24-412d-bf38-156e073ab384","issueId":"01a3c1cf-7dd5-4a13-b3ab-a9d064a3e31c","events":[{"type":"issue_deleted","issueId":"01a3c1cf-7dd5-4a13-b3ab-a9d064a3e31c","issueTitle":"Load data from remote sync engine."}],"__class":"Activity"}
600 | {"id":"ec9ec347-4f90-465c-b8bc-e41dae4e11f2","createdAt":"2024-07-03T11:37:06.944Z","updatedAt":"2024-07-03T11:37:06.944Z","userId":"4e8622c7-0a24-412d-bf38-156e073ab384","issueId":"39946254-511c-4226-914f-d1669c9e5914","events":[{"type":"issue_deleted","issueId":"39946254-511c-4226-914f-d1669c9e5914","issueTitle":"Reverse engineering Linear's Sync Engine"}],"__class":"Activity"}
601 | // ... many lines omitted here
602 | _metadata_={"method":"mongo","lastSyncId":2326713666,"subscribedSyncGroups":["89388c30-9823-4b14-8140-4e0650fbb9eb","4e8622c7-0a24-412d-bf38-156e073ab384","AD619ACC-AAAA-4D84-AD23-61DDCA8319A0","CDA201A7-AAAA-45C5-888B-3CE8B747D26B"],"databaseVersion":948,"returnedModelsCount":{"Activity":6,"Cycle":2,"DocumentContent":5,"Favorite":1,"GitAutomationState":3,"Integration":1,"Issue":3,"IssueLabel":4,"NotificationSubscription":2,"Organization":1,"Project":2,"ProjectStatus":5,"Team":1,"TeamKey":1,"TeamMembership":1,"User":1,"UserSettings":1,"WorkflowState":7,"Initiative":1,"SyncAction":0}}
603 | ```
604 |
605 | The response is a stream of JSON objects, with each line (except the last) representing the information of a model instance. For instance, here's an object describing an `Issue` model:
606 |
607 | ```json
608 | {
609 | "id": "556c8983-ca05-41a8-baa6-60b6e5d771c8",
610 | "createdAt": "2024-01-22T01:02:41.099Z",
611 | "updatedAt": "2024-05-16T08:23:31.724Z",
612 | "number": 1,
613 | "title": "Welcome to Linear 👋", // Text encoding issue. Here's actually an emoji.
614 | "priority": 1,
615 | "boardOrder": 0,
616 | "sortOrder": -84.71, // LSE uses fractional indexing for sorting
617 | "startedAt": "2024-05-16T08:16:57.239Z",
618 | "labelIds": ["30889eaf-fac5-4d4d-8085-a4c3bd80e588"],
619 | "teamId": "89388c30-9823-4b14-8140-4e0650fbb9eb",
620 | "projectId": "3e7ada3c-f833-4b9c-b325-6db37285fa11",
621 | "projectMilestoneId": "397b95c4-3ee2-47b0-bad1-d6b1c7003616",
622 | "subscriberIds": ["4e8622c7-0a24-412d-bf38-156e073ab384"],
623 | "previousIdentifiers": [],
624 | "assigneeId": "4e8622c7-0a24-412d-bf38-156e073ab384",
625 | "stateId": "030a7891-2ba5-4f5b-9597-b750950cd866",
626 | "reactionData": [],
627 | "__class": "Issue"
628 | }
629 | ```
630 |
631 | The last line of the response contains metadata specific to this bootstrapping request. Certain fields within this metadata are used to update the corresponding fields in the database's metadata.
632 |
633 | ```json
634 | {
635 | "method": "mongo",
636 | "lastSyncId": 2326713666,
637 | "subscribedSyncGroups": [
638 | "89388c30-9823-4b14-8140-4e0650fbb9eb",
639 | "4e8622c7-0a24-412d-bf38-156e073ab384",
640 | "AD619ACC-AAAA-4D84-AD23-61DDCA8319A0",
641 | "CDA201A7-AAAA-45C5-888B-3CE8B747D26B"
642 | ],
643 | "databaseVersion": 948,
644 | "returnedModelsCount": {
645 | "Activity": 6,
646 | "Cycle": 2,
647 | "DocumentContent": 5,
648 | "Favorite": 1,
649 | "GitAutomationState": 3,
650 | "Integration": 1,
651 | "Issue": 3,
652 | "IssueLabel": 4,
653 | "NotificationSubscription": 2,
654 | "Organization": 1,
655 | "Project": 2,
656 | "ProjectStatus": 5,
657 | "Team": 1,
658 | "TeamKey": 1,
659 | "TeamMembership": 1,
660 | "User": 1,
661 | "UserSettings": 1,
662 | "WorkflowState": 7,
663 | "Initiative": 1,
664 | "SyncAction": 0
665 | }
666 | }
667 | ```
668 |
669 | Key fields:
670 |
671 | - **`method`**: Indicates the source of the result, which is `"mongo"`. This signifies that the data was retrieved from the MongoDB cache. Tuomas discussed this strategy in a talk about scaling Linear's sync engine.
672 | - **`lastSyncId`**: Represents the snapshot the client is updated to after this bootstrapping request. For example, the snapshot ID is `2326713666`.
673 | - **`subscribedSyncGroups`**: Specifies the sync groups the client should subscribe to for accessing relevant incremental changes.
674 | - **`returnedModelsCount`**: Ensures request validity by verifying that the number of models in the response matches this count.
675 |
676 | > [!NOTE]
677 | > Linear's bootstrapping requests have changed
678 | >
679 | > In the aforementioned optimization, Linear moved `subscribedSyncGroups` from the response to a pre-request at `/sync/user_sync_groups`. In the `/sync/bootstrap` request, the sync groups are now included in the request parameters, which is what allows the bootstrapping request to be split.
680 |
681 | Finally, the retrieved models are written to their respective object stores, and the database metadata is updated accordingly to reflect the changes.
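
Parsing this line-delimited format can be sketched in a few lines. This is a simplified illustration that works on a complete response string rather than a stream; `parseBootstrapResponse` is a hypothetical name, and the exact validation rule (every announced count must match) is an assumption.

```ts
// A dehydrated model line; `__class` names the model, as in the response above.
interface RawModel {
  id: string;
  __class: string;
  [key: string]: unknown;
}

interface BootstrapMetadata {
  lastSyncId?: number;
  returnedModelsCount: Record<string, number>;
}

const METADATA_PREFIX = "_metadata_=";

function parseBootstrapResponse(
  body: string
): { models: RawModel[]; metadata: BootstrapMetadata } {
  const lines = body.split("\n").filter((line) => line.trim() !== "");
  // The last line carries the metadata; every other line is one model instance.
  const last = lines.pop();
  if (last === undefined || last.indexOf(METADATA_PREFIX) !== 0) {
    throw new Error("response does not end with a _metadata_ line");
  }
  const metadata = JSON.parse(
    last.slice(METADATA_PREFIX.length)
  ) as BootstrapMetadata;
  const models = lines.map((line) => JSON.parse(line) as RawModel);

  // Tally the received instances per model name.
  const seen: Record<string, number> = {};
  models.forEach((model) => {
    seen[model.__class] = (seen[model.__class] || 0) + 1;
  });

  // Assumed validation rule: every count announced by the server must match.
  Object.keys(metadata.returnedModelsCount).forEach((name) => {
    const expected = metadata.returnedModelsCount[name];
    const actual = seen[name] || 0;
    if (expected !== actual) {
      throw new Error(`expected ${expected} ${name} models, got ${actual}`);
    }
  });

  return { models, metadata };
}
```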
682 |
683 | ### Hydration and Object Pool
684 |
685 | > [!NOTE]
686 | > Code references
687 | >
688 | > 1. `ng.bootstrap`: `SyncClient.bootstrap`.
689 | > 2. `eg.getAllInitialHydratedModelData`: `Database.getAllInitialHydratedModelData`.
690 | > 3. `ng.addModelToLiveCollections`: `SyncClient.addModelToLiveCollections`.
691 | > 4. `as.updateFromData`: `ClientModel.updateFromData`.
692 | > 5. `as.updateReferencedModels`: `ClientModel.updateReferencedModels`.
693 |
694 | With the raw models written into the `ObjectStore`s, the next step is to construct these models in memory and add them to the **Object Pool**, making them accessible to other parts of the application. This process is known as **model hydration**.
695 |
696 | LSE initiates model hydration by calling `Database.getAllInitialHydratedModelData`. During this step, the `Database` loads models with a `loadStrategy` set to `instant`. For each of these models, LSE retrieves the constructors from the `ModelRegistry` and uses them to instantiate model objects. These objects are then added to the Object Pool via the `addModelToLiveCollections` method.
697 |
698 | The Object Pool is implemented as a map called `modelLookup` on `SyncClient`. This map links a model's ID to its corresponding model object, enabling other parts of Linear to efficiently retrieve models by their IDs.
699 |
700 | When constructing a model object, LSE does not pass the dehydrated model data directly to the constructor. Instead, it initializes the object first, then hydrates it by invoking the `updateFromData` method to populate the object with the data. Additionally, it calls `attachToReferencedProperties` to resolve and populate any references.
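
The construct-then-hydrate flow can be sketched like this. Everything here is a simplified stand-in: `registry` plays the role of the `ModelRegistry`, a plain object keyed by ID stands in for the `modelLookup` map, and reference resolution is left out entirely.

```ts
// Dehydrated record as stored in an object store.
interface RawModel {
  id: string;
  __class: string;
  [key: string]: unknown;
}

class BaseModel {
  id = "";

  // Stand-in for `updateFromData`: copies stored fields onto the instance.
  updateFromData(data: RawModel): void {
    const self = this as unknown as Record<string, unknown>;
    Object.keys(data).forEach((key) => {
      if (key !== "__class") self[key] = data[key];
    });
  }
}

class Issue extends BaseModel {
  title = "";
}

// Hypothetical registry mapping model names to constructors.
const registry: Record<string, new () => BaseModel> = { Issue };

// The Object Pool: model ID -> live model object.
const modelLookup: Record<string, BaseModel> = {};

function hydrateIntoPool(raw: RawModel): BaseModel {
  const Ctor = registry[raw.__class];
  if (Ctor === undefined) {
    throw new Error(`unknown model: ${raw.__class}`);
  }
  const model = new Ctor(); // 1. construct an empty instance
  model.updateFromData(raw); // 2. populate it with the stored data
  modelLookup[model.id] = model; // 3. make it reachable by ID
  return model;
}
```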
701 |
702 | ### Lazy Hydration
703 |
704 | > [!note]
705 | > Code references
706 | >
707 | > - `as.hydrate`: `Model.hydrate`.
708 | > - constructor of `Issue` (`re`)
709 | > - `Et`: `LazyReferenceCollection`
710 | > - `hydrate`
711 | > - `getCoveringPartialIndexValues`
712 | > - `Nt`: `LazyReferenceCollection`
713 | > - `Ku`: `PartialIndexHelper`
714 | > - `resolveCoveringPartialIndexValues`
715 | > - `partialIndexInfoForModel`
716 | > - `processPartialIndexInfoForModel`
717 |
718 | LSE does not load all data into memory during bootstrapping, regardless of the type. Instead, additional data is fetched via network requests or local database queries as the application is used. For example, when you view the details of an issue, LSE asynchronously loads the comments associated with that issue. This is called **lazy hydration**.
719 |
720 | Classes with a `hydrate` method can be hydrated, such as `Model`, `LazyReferenceCollection`, `LazyReference`, `RequestCollection`, and `LazyBackReference`, among others.
721 |
722 | Let's start by examining the `hydrate` method of the `Model`. It checks all of its properties that need hydration and calls their respective `hydrate` methods. There are four types of properties that require hydration:
723 |
724 | 1. `LazyReferenceCollection`
725 | 2. `LazyReference`
726 | 3. `Reference` and `ReferenceCollection`, which are set to be hydrated alongside the model.
727 |
728 | We won't dive too deep into the hydration of `Reference` and `ReferenceCollection`, as they simply call the `hydrate` method of other `Model` instances recursively. Instead, let's focus on `LazyReferenceCollection` and `LazyReference`, as these are responsible for lazy hydration.
729 |
730 | Now, let's discuss `LazyReferenceCollection`.
731 |
732 | Earlier, when we discussed the definition of properties, we saw that `referenceCollection` is one of the seven types of properties. Now, let's dive deeper into this. The `OneToMany` (`Nt`) decorator is used for such properties. For instance, `comments` is a `LazyReferenceCollection` property of the `Issue` model. This decorator registers the property's metadata in the `ModelRegistry`.
733 |
734 | ```ts
735 | Pe([Nt()], re.prototype, "comments", void 0);
736 | ```
737 |
738 | Additionally, a `LazyReferenceCollection` instance is initialized for the property. For example, in the constructor of `Issue`:
739 |
740 | ```ts
741 | this.comments = new Et(nt,this,"issueId"),
742 | ```
743 |
744 | The source code would be something like this:
745 |
746 | ```ts
747 | @ClientModel()
748 | class Issue extends BaseModel {
749 | @OneToMany()
750 | public comments = new LazyReferenceCollection(Comment, this, "issueId");
751 | }
752 | ```
753 |
754 | > [!NOTE]
755 | > Changes of decorators used here
756 | >
757 | > After I began writing this post, the Linear team introduced a new approach that eliminates the need for developers to manually call the constructor of `LazyReferenceCollection`. In essence, they added more decorators similar to `OneToMany` that automatically construct `LazyReferenceCollection` with various options. Since this change doesn't affect how lazy hydration works, I'll omit it from this post for simplicity.
758 |
759 | In the `hydrate` method of `LazyReferenceCollection`, the first step is to call `this.getCoveringPartialIndexValues` to get partial index values. So what is a partial index?
760 |
761 | ---
762 |
763 | > [!note]
764 | > Code references
765 | >
766 | > - `re.constructor`: `Issue.constructor`
767 | > - `LazyReferenceCollectionImpl` (`Et`)
768 | > - `hydrate`
769 | > - `getCoveringPartialIndexValues`
770 | > - `LazyReferenceCollection` (`Nt`)
771 | > - `PartialIndexHelper` (`Ku`)
772 | > - `resolveCoveringPartialIndexValues`
773 | > - `partialIndexInfoForModel`
774 | > - `processPartialIndexInfoForModel`
775 |
776 | **Partial Index** plays a crucial role in LSE by addressing a key question: **How should we determine which models need to be lazy-loaded?** In other words, when querying lazy-loaded models, what **parameters should the query use**? If we have the model IDs, the answer is straightforward. However, in cases where LSE needs to load assigned `Issues` for a `User`, it may not have the `Issue` IDs readily available.
777 |
778 | Imagine you're designing Linear's database schema. To query `Issues` assigned to a `User`, you would include an `assigneeId` field and create an index on it. This concept is applied similarly in LSE's frontend code. When defining the `Issue` model, a reference to the `User` model is created, and LSE automatically generates an index for that field.
779 |
780 | ```typescript
781 | Pe(
782 | [
783 | pe(() => K, "assignedIssues", {
784 | nullable: !0,
785 | indexed: !0,
786 | }),
787 | ],
788 | re.prototype,
789 | "assignee",
790 | void 0
791 | );
792 | ```
793 |
794 | The original source code may look like this:
795 |
796 | ```typescript
797 | @ClientModel("Issue")
798 | class Issue extends Model {
799 |   @Reference(() => User, "assignedIssues", {
800 | nullable: true,
801 | indexed: true,
802 | })
803 | public assignee: User | null;
804 | }
805 | ```
806 |
807 | And `User` model also references `Issue`:
808 |
809 | ```typescript
810 | st([Nt()], K.prototype, "assignedIssues", void 0);
811 |
812 | // In User's constructor:
813 | this.assignedIssues = new Et(re,this,"assigneeId",void 0,{
814 | canSkipNetworkHydration: ()=>this.canSkipNetworkHydration(re)
815 | }),
816 | ```
817 |
818 | The original source code may look like this:
819 |
820 | ```typescript
821 | @ClientModel("User")
822 | class User extends Model {
823 | @OneToMany()
824 | public assignedIssues = new LazyReferenceCollection(
825 | Issue,
826 | this,
827 | "assigneeId",
828 | undefined,
829 | {
830 | canSkipNetworkHydration: () => this.canSkipNetworkHydration(Issue),
831 | }
832 | );
833 | }
834 | ```
835 |
836 | LSE can load `Issues` by the assignee's ID. In other words, the query parameter to fetch `Issues` could be `assigneeId-`. Similarly, `Issues` can be loaded based on the `Team` they belong to or the `Project` they are associated with. To determine how a model can be referenced, LSE uses a `PartialIndexHelper` (`Ku`) class. This class returns an array that describes how a model can be referenced by other models.
837 |
838 | 
839 |
840 | LSE takes this approach even further. As discussed in the previous section, `Issue` references `Comment`s, meaning `Comments` are indirectly referenced by `Team`. This allows for nested references to `Comment`s. For example, if you send a query with the parameter `issue.cycleId-`, you can, theoretically, retrieve all comments for all issues associated with that cycle. In the method `Ku.processPartialIndexInfoForModel`, LSE calculates these nested references, supporting up to three levels of depth. The diagram below illustrates how models reference a `Comment`, either directly or indirectly.
841 |
842 | 
843 |
844 | Back to `getCoveringPartialIndexValues`. The partial indexes used to query a comment would look like `i` in the screenshot:
845 |
846 | 
847 |
848 | You can clearly see the relationship between the two images above. Essentially, LSE appends the ID of the referencing model to the end of each query parameter.
849 |
850 | As we'll explore later, partial indexes are used to query models from the server, and also used to check whether the target model has already been fetched from the server.
851 |
852 | > [!note]
853 | > Code references
854 | >
855 | > - `ng.hydrateModelsByIndexedKey`: `SyncClient.hydrateModelsByIndexedKey`
856 | > - `eg.getModelDataByIndexedKey`: `Database.getModelDataByIndexedKey`
857 | > - `Jm`: `PartialStore`
858 | > - `getAllForIndexedKey`
859 | > - `hasModelsForPartialIndexValues`
860 | > - `getAllFromIndex`
861 |
862 | After the partial indices are retrieved, the `hydrate` method of `LazyReferenceCollection` calls `SyncedStore.hydrateModels`, which in turn triggers `SyncClient.hydrateModelsByIndexedKey`.
863 |
864 | Let's assume we're lazy-loading `Comment` objects for an `Issue`. The method parameters would be as follows:
865 |
866 | 1. `e`: The class of the `Comment` model.
867 | 2. `t`: The parameters for the query. The `key` indicates that we're loading `Comment` references by `Issue`, and `coveringPartialIndexValues` signifies that these `Comment` objects may also be indirectly referenced by other models. `value` is the ID of the `Issue` that references the `Comment`.
868 |
869 | 
870 |
871 | In this method, LSE first checks if the models can be loaded from the local database by calling `Database.getModelDataByIndexedKey`. If not, it decides whether a **network hydration** is needed based on the following conditions:
872 |
873 | 1. **Missing `coveringPartialIndexValues`:** If the `coveringPartialIndexValues` parameter is absent, LSE can't determine if the requested models were previously fetched from the server.
874 | 2. **Absent partial index in the store:** If the `coveringPartialIndexValues` are not found in the partial store, a network hydration is necessary.
875 |
876 | Recall that when we discussed `ObjectStore`, we learned that for models with a `partial` load strategy, there exists a partial index store to track partial indices. This is the point at which the store comes into play. For example, if the `Comment` model's partial index store contains two records, it means LSE has previously attempted to load `Comment` objects using those indices, confirming that the corresponding comments were fetched from the server at some point. We will discuss when these partial indices get updated later.
877 |
878 | 
879 |
880 | 3. **`canSkipNetworkHydration` returns `true`:** If this option is set to `true`, LSE can skip the network hydration and proceed with loading the data locally.
881 |
882 | If no network hydration is necessary, LSE will query the IndexedDB by calling `getAllFromIndex`. If a network hydration is required, it will schedule the request via `BatchModelLoader.addRequest`.
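
The three conditions can be condensed into one predicate. This is a sketch under assumptions: the order of the checks, and the choice that a single recorded covering value is enough to skip the network, are my reading of the behavior and are not confirmed from the source. The names `HydrationQuery` and `needsNetworkHydration` are hypothetical.

```ts
interface HydrationQuery {
  coveringPartialIndexValues?: string[];
}

function needsNetworkHydration(
  query: HydrationQuery,
  storedPartialIndexValues: string[], // contents of the model's partial index store
  canSkipNetworkHydration: boolean
): boolean {
  // Condition 3: the collection explicitly allows skipping the network.
  if (canSkipNetworkHydration) return false;

  // Condition 1: without covering values there is no way to tell what was fetched.
  const covering = query.coveringPartialIndexValues;
  if (covering === undefined || covering.length === 0) return true;

  // Condition 2 (assumed semantics): one recorded covering value means the
  // requested models were already fetched at some point.
  const alreadyFetched = covering.some(
    (value) => storedPartialIndexValues.indexOf(value) !== -1
  );
  return !alreadyFetched;
}
```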
883 |
884 | > [!NOTE]
885 | > Code references
886 | >
887 | > - `BatchedRequest` (`PE`)
888 | > - `addRequest`
889 | > - `BatchModelLoader` (`wm`)
890 | > - `handleBatch`
891 | > - `loadSyncBatch`
892 | > - `loadFullModels`
893 | > - `handleLoadedModels`
894 | > - `Database` (`eg`)
895 | > - `setPartialIndexValueForModel`
896 |
897 | `BatchModelLoader`, as the name suggests, batches multiple network hydration requests into a single GraphQL request. While we won't dive into the details of how LSE deduplicates requests in this article (you can refer to the code, where I've added comments for clarity), the focus here will be on how LSE handles the batching process.
898 |
899 | In the `BatchModelLoader.handleBatch` method, Linear divides requests into three categories:
900 |
901 | 1. Requests associated with a partial index key.
902 | 2. Requests associated with a `SyncGroup`.
903 | 3. Requests that are neither associated with an `indexedKey` nor a `SyncGroup`.
904 |
905 | LSE handles each category using different methods: `loadSyncBatch`, `loadPartialModels`, and `loadFullModels`.
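
The routing of requests to these three methods can be sketched as below. The request shape borrows the `indexedKey`, `keyValue`, and `modelName` fields from the `/sync/batch` request body and the `syncGroup` field from `customNetworkHydration`; the names `HydrationRequest`, `categorize`, and `groupBatch` are hypothetical.

```ts
interface HydrationRequest {
  modelName: string;
  indexedKey?: string;
  keyValue?: string;
  syncGroup?: string;
}

type Loader = "loadSyncBatch" | "loadPartialModels" | "loadFullModels";

// Picks the loader for a single request, following the three categories above.
function categorize(request: HydrationRequest): Loader {
  if (request.indexedKey !== undefined) return "loadSyncBatch";
  if (request.syncGroup !== undefined) return "loadPartialModels";
  return "loadFullModels";
}

// Splits one batch into the three groups, preserving request order.
function groupBatch(
  requests: HydrationRequest[]
): Record<Loader, HydrationRequest[]> {
  const groups: Record<Loader, HydrationRequest[]> = {
    loadSyncBatch: [],
    loadPartialModels: [],
    loadFullModels: [],
  };
  requests.forEach((request) => {
    groups[categorize(request)].push(request);
  });
  return groups;
}
```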
906 |
907 | In `loadSyncBatch`, it calls `GraphQLClient.restModelsJsonStream` to send a request to `https://client-api.linear.app/sync/batch`. The request body will look like this:
908 |
909 | ```json
910 | {
911 | "firstSyncId": 3528373991,
912 | "requests": [
913 | {
914 | "indexedKey": "issueId",
915 | "keyValue": "bda1a998-91b0-4ceb-8f89-91b7f6608685",
916 | "modelName": "Comment"
917 | },
918 | {
919 | "indexedKey": "issueId",
920 | "keyValue": "bda1a998-91b0-4ceb-8f89-91b7f6608685",
921 | "modelName": "IssueHistory"
922 | }
923 | ]
924 | }
925 | ```
926 |
927 | And the response will look like this:
928 |
929 | ```json
930 | {"id":"9a4ea82f-bd0e-4a3d-a8f3-430ea570bbbb","createdAt":"2025-01-27T05:11:59.451Z","updatedAt":"2025-01-27T05:11:59.337Z","issueId":"bda1a998-91b0-4ceb-8f89-91b7f6608685","userId":"4e8622c7-0a24-412d-bf38-156e073ab384","bodyData":"{\"type\":\"doc\",\"content\":[{\"type\":\"paragraph\",\"content\":[{\"type\":\"text\",\"text\":\"Some comment.\"}]}]}","reactionData":[],"subscriberIds":["4e8622c7-0a24-412d-bf38-156e073ab384"],"__class":"Comment"}
931 | {"id":"6168d074-cfc0-45ef-9a14-2e5162cbf3dd","createdAt":"2025-01-19T16:02:24.806Z","updatedAt":"2025-01-19T16:02:24.806Z","issueId":"bda1a998-91b0-4ceb-8f89-91b7f6608685","fromCycleId":"43714e70-c8a5-44f8-b2ce-3b8613397955","toCycleId":"8c74d891-d9f3-4b47-b4cc-79cf27d9f93c","__class":"IssueHistory"}
932 | {"id":"c7726ecb-672f-4604-a0f5-f2bf1e420ba7","createdAt":"2025-01-06T11:49:21.638Z","updatedAt":"2025-01-06T11:49:21.638Z","issueId":"bda1a998-91b0-4ceb-8f89-91b7f6608685","actorId":"4e8622c7-0a24-412d-bf38-156e073ab384","fromPriority":3,"toPriority":1,"__class":"IssueHistory"}
933 | {"id":"b68d9b8a-6fb4-45ab-944e-7b45a8c30673","createdAt":"2025-01-26T02:17:17.771Z","updatedAt":"2025-01-26T02:17:17.771Z","issueId":"bda1a998-91b0-4ceb-8f89-91b7f6608685","actorId":"4e8622c7-0a24-412d-bf38-156e073ab384","fromPriority":1,"toPriority":2,"__class":"IssueHistory"}
934 | {"id":"4c8a41ec-b0b5-448e-a47d-34dfb11112d1","createdAt":"2024-12-30T07:34:42.354Z","updatedAt":"2024-12-30T08:08:27.668Z","issueId":"bda1a998-91b0-4ceb-8f89-91b7f6608685","actorId":"4e8622c7-0a24-412d-bf38-156e073ab384","fromPriority":0,"toPriority":3,"fromStateId":"dfbab132-44b7-47b9-a411-906426533033","toStateId":"7bd765d0-7fa4-40ed-9b62-ae963436682c","toCycleId":"398ba8c7-c523-43c8-b6c7-d748d0e171a4","addedLabelIds":["bd1ce6b0-b0d0-49c9-b4a3-f905674fe9ac"],"__class":"IssueHistory"}
935 | {"id":"cd0833d4-d92e-40ef-a7d2-223d6c7b4592","createdAt":"2025-01-05T16:01:14.558Z","updatedAt":"2025-01-05T16:01:14.558Z","issueId":"bda1a998-91b0-4ceb-8f89-91b7f6608685","fromCycleId":"398ba8c7-c523-43c8-b6c7-d748d0e171a4","toCycleId":"43714e70-c8a5-44f8-b2ce-3b8613397955","__class":"IssueHistory"}
936 | _metadata_={"returnedModelsCount":{"Comment":1,"IssueHistory":5}}
937 | ```
938 |
939 | Does this response look familiar to you? Yes, it follows the same format used in full bootstrapping. Later, in `handleLoadedModels`, the response will be parsed, the models will be written to the database, and objects will be created in memory. Importantly, the partial index of the request will be **saved in the database**, so the next time LSE tries to hydrate the model, it will know that network hydration is unnecessary.
940 |
941 | You might wonder: _Why is `firstSyncId` included in the `/sync/patch` request parameters, but not `lastSyncId`? After all, `lastSyncId` is used to determine if the client is up to date with the latest data. Won't these models be updated?_ The answer lies in how LSE handles incremental changes (delta packets), which I'll explain in Chapter 4. The basic idea is this: when a delta packet arrives, LSE checks which models are affected and haven't yet been loaded, and immediately loads those models. If lazy hydration completes after this process, LSE will not overwrite the existing models in memory. You can refer to the `createModelsFromData` method for implementation details.
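
As a rough illustration of the "do not overwrite" rule, the sketch below merges lazily hydrated records into an in-memory pool and skips any ID that is already present. The `RawModel` shape, the `pool` map, and the `createMissingModels` name are assumptions made for this example; the real `createModelsFromData` is minified and likely does more.

```typescript
// Hypothetical shape of a record returned by /sync/patch or a partial bootstrap.
interface RawModel {
  id: string;
  [key: string]: unknown;
}

// Adds loaded records to the in-memory pool, keeping any model that a delta
// packet (or an earlier load) has already created.
function createMissingModels(pool: Map<string, RawModel>, loaded: RawModel[]): RawModel[] {
  const created: RawModel[] = [];
  for (const record of loaded) {
    if (pool.has(record.id)) continue; // the in-memory state is at least as new
    pool.set(record.id, record);
    created.push(record);
  }
  return created;
}
```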
942 |
943 | ---
944 |
945 | > [!NOTE]
946 | > Code references
947 | >
948 | > - `BatchModelLoader` (`wm`)
949 | > - `handleBatch`
950 | > - `Database` (`eg`)
951 | > - `loadPartialModels`
952 |
953 | Under certain circumstances, `syncGroups` will be used as the query parameters instead of partial index keys. For example, a `Team` may have many associated `Issue`s, so it has an `issues` property of type `lazyReferenceCollection`. In this case, `customNetworkHydration` is used to define the query parameters for loading the `Issue`s of a `Team`.
954 |
955 | ```typescript
956 | this.issues = new Et(re, this, "teamId", void 0, {
957 | customNetworkHydration: () => [
958 | {
959 | modelClass: re, // Issue model
960 | syncGroup: this.id,
961 | },
962 | {
963 | modelClass: mr, // Attachment model
964 | syncGroup: this.id,
965 | },
966 | ],
967 | });
968 | ```
969 |
970 | When LSE loads the `Issue`s of a `Team`, `loadPartialModels` calls `BootstrapHelper.partialBootstrap`, which sends a request like this:
971 |
972 | ```
973 | https://client-api.linear.app/sync/bootstrap?type=partial&noSyncPackets=true&useCFCaching=true&noCache=true&firstSyncId=3577987809&syncGroups=aa788b7b-9b76-4caa-a439-36ca3b3d6820&onlyModels=Issue,Attachment&modelsHash=4f1dabd6151ad381a502c352b677d5c4
974 | ```
975 |
976 | As you can see from the request parameters, `modelClass` is mapped to the `onlyModels` property, and `syncGroups` are also included in the request.
977 |
978 | ---
979 |
980 | ### Takeaway of Chapter 2
981 |
982 | Let's sum up what we've learned in chapter 2:
983 |
984 | - LSE creates two types of databases:
985 | - A `linear_databases` database to store information about other databases.
986 | - A `linear_database_` database to store models, metadata, and transactions for a specific workspace.
987 | - There are three bootstrapping types: full, partial, and local. We've discussed full bootstrapping in detail.
988 | - The sync ID is the global version number of the database. It helps determine whether the client is up to date with the latest data.
989 | - LSE can lazily hydrate models into memory, either from the server or the local database.
990 |
991 | In the upcoming chapters, we'll explore how LSE synchronizes changes between clients and the server, starting with how local changes are sent to the server.
992 |
993 | ## Chapter 3: Transactions
994 |
995 | In the previous chapter, we explored how LSE loads existing models from the server. Now, we'll shift our focus to how LSE synchronizes changes between clients and the server. Specifically, this chapter will examine how client-side changes are synced to the server.
996 |
997 | Let's start with a fundamental question: **What happens when we change the assignee of an Issue?** How does LSE handle networking, offline caching, observability, and other underlying complexities—all in just two lines of code?
998 |
999 | ```jsx
1000 | issue.assignee = user;
1001 | issue.save();
1002 | ```
1003 |
1004 | In this chapter, I will use `UpdateTransaction` as an example. As before, let's start with a high-level overview of the process before diving into the details.
1005 |
1006 | 
1007 |
1008 | 1. When a property is assigned a new value, the system records key information: the name of the changed property and its previous value. **Models in memory** are updated **immediately** to reflect these changes.
1009 | 2. When `issue.save()` is called, an **`UpdateTransaction`** is created. This transaction captures the changes made to the model.
1010 | 3. The generated `UpdateTransaction` is then added to a request queue. Simultaneously, it is saved in the `__transactions` table in IndexedDB for **caching**.
1011 | 4. The `TransactionQueue` schedules timers (sometimes triggering them immediately) to send the queued transactions to the server in **batches**.
1012 | 5. Once a batch is successfully processed by the backend, LSE removes it from the `__transactions` table in IndexedDB, which clears the cached batch.
1013 | 6. A transaction is considered complete only after the delta packet carrying the matching `lastSyncId` arrives at the client.
1014 |
1015 | In addition to `UpdateTransaction`, there are four other types of transactions, and `TransactionQueue` provides corresponding methods to create them.
1016 |
1017 | | Minimized name | Original name | Description |
1018 | | -------------- | -------------------------- | -------------------------------------------------------------------------- |
1019 | | `M3` `Zo` | `BaseTransaction` | The base class for all transaction types. |
1020 | | `Hu` | `CreationTransaction` | The transaction for adding a new model object. |
1021 | | `zu`           | `UpdateTransaction`        | The transaction for updating properties of an existing model object.       |
1022 | | `g3` | `DeletionTransaction` | The transaction for deleting a model object (e.g., deleting a comment). |
1023 | | `m3` | `ArchivalTransaction` | The transaction for archiving a model object (e.g., archiving an issue). |
1024 | | `y3` | `UnarchiveTransaction` | The transaction for unarchiving a model object. |
1025 | | `Tc` | `LocalTransaction` | A simpler transaction wrapper for a model object that performs no actions. |
1026 |
1027 | ### Figuring out what has been changed
1028 |
1029 | > [!note]
1030 | > **Code references**
1031 | >
1032 | > - `M1`: The decorator used to add observability to LSE models.
1033 | > - `as.propertyChanged`: `ClientModel.propertyChanged`
1034 | > - `as.markPropertyChanged`: `ClientModel.markPropertyChanged`
1035 | > - `as.referencedPropertyChanged`: `ClientModel.referencedPropertyChanged`
1036 | > - `as.updateReferencedModel`: `ClientModel.updateReferencedModel`
1037 |
1038 | As discussed in the [Observability](#observability-m1) section, LSE leverages the `M1` function to make model properties observable. Beyond enabling observability, `M1` also plays a critical role in transaction generation. Here's how it works:
1039 |
1040 | When a property of a model is assigned a new value, the setter intercepts the assignment, triggering `propertyChanged`, which then calls `markPropertyChanged` with the **property's name**, the **old value**, and the **new value**. Next, `markPropertyChanged` serializes the old value and stores it in `modifiedProperties`. This serialized data will later be used to generate a transaction.
1041 |
1042 | 
1043 |
1044 | It's important to note that **before `save()` is called, the model in memory is already updated**. Creating a transaction does **not** update in-memory models; that happens immediately when a property is changed. Transactions do modify in-memory models in other situations, however, such as rollbacks and **undo** and **redo** operations. We'll explore undo and redo in greater detail in **Chapter 5**.
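
The change tracking described above can be sketched as follows. This is a simplified stand-in, not Linear's code: `TrackedModel`, `setProperty`, and `getProperty` are hypothetical names that play the role of the setter installed by `M1`, and keeping only the earliest old value per property is an assumption of this sketch.

```typescript
// Hypothetical, simplified model; names mirror the article, not the minified source.
class TrackedModel {
  // Property name -> serialized value it had before the first unsaved change.
  readonly modifiedProperties = new Map<string, string>();
  private readonly values = new Map<string, unknown>();

  // Stands in for the observable setter: record the old value, then update memory.
  setProperty(key: string, newValue: unknown): void {
    const oldValue = this.values.get(key);
    if (oldValue === newValue) return;
    this.markPropertyChanged(key, oldValue);
    this.values.set(key, newValue); // the in-memory model changes right away
  }

  getProperty(key: string): unknown {
    return this.values.get(key);
  }

  private markPropertyChanged(key: string, oldValue: unknown): void {
    // Assumption: only the earliest old value is kept, so a later transaction
    // can describe the change relative to the last saved state.
    if (!this.modifiedProperties.has(key)) {
      this.modifiedProperties.set(key, JSON.stringify(oldValue ?? null));
    }
  }
}
```

Note that nothing here touches a database or a network: by the time `save()` would run, the in-memory value is already the new one and `modifiedProperties` holds what is needed to build the transaction.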
1045 |
1046 | ### Generating an `UpdateTransaction`
1047 |
1048 | > [!note]
1049 | > Code references
1050 | >
1051 | > - `as.save`: `ClientModel.save` and it calls `SyncedStore.save`
1052 | > - `sg.save`: `SyncedStore.save` and it calls `SyncClient.update`
1053 | > - `ng.update`: `SyncClient.update`
1054 | > - `uce.update`: `TransactionQueue.update`
1055 |
1056 | In `SyncClient.update`, the `TransactionQueue.update` method is called to generate an `UpdateTransaction` instance. During the construction of the `UpdateTransaction`, the model's `changeSnapshot` function is invoked. Ultimately, an object is generated to represent the changes and is bound to the `changeSnapshot` property of the `UpdateTransaction`.
1057 |
1058 | 
1059 |
1060 | An `UpdateTransaction` has the following properties:
1061 |
1062 | - **`type`**: The type of transaction.
1063 | - **`model`**: The in-memory model object this transaction is related to.
1064 | - **`batchIndex`**: Each transaction has a `batchIndex`. The `TransactionQueue` also has a `batchIndex` property, and when creating a transaction, this index is assigned. Transactions with the same `batchIndex` are grouped together and sent to the server in a single batch.
1065 |
1066 | At the end of `TransactionQueue.update`, the `TransactionQueue.enqueueTransaction` method is called to add the transaction to the `createdTransactions` queue.
1067 |
1068 | ### Queueing transactions
1069 |
1070 | > [!note]
1071 | > Code references
1072 | >
1073 | > - `uce`: `TransactionQueue`
1074 | > - `enqueueTransaction`
1075 | > - `commitCreatedTransactions`
1076 | > - `dequeueNextTransactions`
1077 | > - `dequeueTransaction`
1078 | > - `ww`: `MicrotaskScheduler`
1079 | > - `as.prepare`: `ClientModel.prepare`
1080 | > - `as.updateMutation`: `ClientModel.updateMutation`
1081 | > - `zu.graphQLMutation`: `UpdateTransaction.graphQLMutation`
1082 |
1083 | Besides creating transaction instances, `TransactionQueue` is also responsible for managing transactions and sending them to the server. It uses four arrays to handle these transactions:
1084 |
1085 | 
1086 |
1087 | 1. **`createdTransactions`**: After a transaction is created, it is initially placed in this array.
1088 |
1089 |    A **`commitCreatedTransactions`** scheduler moves all transactions from this array to the end of `queuedTransactions` and increments the `batchIndex` of `TransactionQueue` by 1. This scheduler operates as a microtask, which means that **transactions created in the same synchronous run of code, before the microtask fires, will share the same `batchIndex`**.
1090 |
1091 | When transactions are moved to `queuedTransactions`, they are also stored in the `__transactions` table. If the client closes before these transactions are sent to the server, it can reload them from this table and resend them.
1092 |
1093 | 2. **`queuedTransactions`**: These transactions are waiting to be executed.
1094 |
1095 | A **`dequeueTransaction`** scheduler prepares transactions from this queue and moves them in batches to `executingTransactions`.
1096 |
1097 | Several factors determine which transactions are moved to `executingTransactions` in a batch:
1098 |
1099 | - **Transaction Limit:** If there are too many transactions in `queuedTransactions`, the scheduler will not move any to `executingTransactions` to avoid overwhelming the server with too many requests.
1100 |
1101 | - **Batch Index & Independence:** Transactions must have the same `batchIndex` and should be **independent** of each other to be grouped in a batch.
1102 |
1103 | Before executing, LSE calls the `prepare` method on each transaction. This method generates a GraphQL mutation query for the transaction. Each transaction object has a `graphQLMutation` function that generates these queries and binds them to the `graphQLMutationPrepared` property. For example, updating the assignee of an issue might generate a query like this:
1104 |
1105 | ```json
1106 | {
1107 | "mutationText": "issueUpdate(id: \"a3dad63b-8302-4f1f-a874-a80e6d9ed418\", input: $issueUpdateInput) { lastSyncId }",
1108 | "variables": {
1109 | "issueUpdateInput": {
1110 | "assigneeId": "4e8622c7-0a24-412d-bf38-156e073ab384"
1111 | }
1112 | },
1113 | "variableTypes": {
1114 | "issueUpdateInput": "IssueUpdateInput"
1115 | }
1116 | }
1117 | ```
1118 |
1119 |    - **GraphQL mutation size limit:** The `graphQLMutationPrepared` property is evaluated based on its size. If the accumulated size of the transactions at the beginning of `executingTransactions` exceeds a certain threshold, the scheduler will stop moving additional transactions to `executingTransactions`. This is done to prevent sending overly large GraphQL queries to the server.
1120 |
1121 | Finally, the scheduler calls `executeTransactionBatch` to move the next batch of transactions from `queuedTransactions` to `executingTransactions`.
1122 |
1123 | 3. **`executingTransactions`**: These transactions have been sent to the server but have not yet been accepted or rejected. In the next section, we will discuss how these transactions are executed.
1124 |
1125 | 4. **`persistedTransactionsEnqueue`**: When the database is bootstrapped, transactions saved in the `__transactions` table are loaded into this array. After remote updates are processed, they are moved to `queuedTransactions` and wait to be executed. We will cover this in the final section of this chapter.
1126 |
1127 | There is also a special array called **`completedButUnsyncedTransactions`**. I will explain how it works when we discuss **rebasing transactions** in Chapter 4.
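
To make the microtask batching concrete, here is a minimal sketch of the first two arrays. It assumes a resolved promise is used to schedule the commit (the article's `MicrotaskScheduler` is not reproduced here), and `SketchTransactionQueue` and `QueuedTransaction` are hypothetical names. Persistence to `__transactions` is only marked by a comment.

```typescript
interface QueuedTransaction {
  id: string;
  batchIndex: number;
}

class SketchTransactionQueue {
  batchIndex = 0;
  readonly createdTransactions: QueuedTransaction[] = [];
  readonly queuedTransactions: QueuedTransaction[] = [];
  private commitScheduled = false;

  enqueueTransaction(id: string): QueuedTransaction {
    // Every transaction created before the next commit gets the current index.
    const transaction: QueuedTransaction = { id, batchIndex: this.batchIndex };
    this.createdTransactions.push(transaction);
    if (!this.commitScheduled) {
      this.commitScheduled = true;
      // A resolved promise callback runs as a microtask, after the current
      // synchronous code has finished.
      Promise.resolve().then(() => this.commitCreatedTransactions());
    }
    return transaction;
  }

  commitCreatedTransactions(): void {
    this.commitScheduled = false;
    if (this.createdTransactions.length === 0) return;
    // The real engine also stores these in the `__transactions` table here.
    this.queuedTransactions.push(...this.createdTransactions.splice(0));
    this.batchIndex += 1;
  }
}
```

With this shape, two `enqueueTransaction` calls made back to back land in the same batch, while a call made after the commit has run starts a new one.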
1128 |
1129 | ### Executing transactions
1130 |
1131 | > [!NOTE]
1132 | > Code references
1133 | >
1134 | > - `uce.executeTransactionBatch`: `TransactionQueue.executeTransactionBatch`
1135 | > - `dce.execute`: `TransactionExecutor.execute`
1136 | > - `OE`: `WaitSyncQueue`
1137 |
1138 | LSE creates a `TransactionExecutor` to execute a batch of transactions. In the `TransactionExecutor.execute` method, the `graphQLMutationPrepared` property of each transaction in the batch is merged into a single GraphQL mutation query and sent to the server. For example:
1139 |
1140 | ```json
1141 | {
1142 | "query": "mutation IssueUpdate($issueUpdateInput: IssueUpdateInput!) { issueUpdate(id: \"a3dad63b-8302-4f1f-a874-a80e6d9ed418\", input: $issueUpdateInput) { lastSyncId } }",
1143 | "variables": {
1144 | "issueUpdateInput": {
1145 | "assigneeId": "4e8622c7-0a24-412d-bf38-156e073ab384"
1146 | }
1147 | },
1148 | "operationName": "IssueUpdate"
1149 | }
1150 | ```
1151 |
1152 | The response contains a `lastSyncId` for each mutation.
1153 |
1154 | ```json
1155 | {
1156 | "data": {
1157 | "issueUpdate": {
1158 | "lastSyncId": 3273967562
1159 | }
1160 | }
1161 | }
1162 | ```
1163 |
1164 | This example might seem too simple because it only contains a single mutation query. Let's look at a more complex example, such as creating a new `Project`. In this case, you can see clearly how LSE handles multiple transactions within a batch. The request would look like this:
1165 |
1166 | ```json
1167 | {
1168 | "query": "mutation ProjectCreate_DocumentContentCreate($projectCreateInput: ProjectCreateInput!, $documentContentCreateInput: DocumentContentCreateInput!) { o1:projectCreate(input: $projectCreateInput) { lastSyncId }, o2:documentContentCreate(input: $documentContentCreateInput) { lastSyncId } }",
1169 | "variables": {
1170 | "projectCreateInput": {
1171 | "id": "940e248c-2226-4b0a-a14e-8a410ccfaaa7",
1172 | "name": "New Project",
1173 | "description": "",
1174 | "color": "#bec2c8",
1175 | "statusId": "fd3884d6-b740-41d9-919c-796119d6c5ed",
1176 | "memberIds": [],
1177 | "sortOrder": 1005.71,
1178 | "prioritySortOrder": 0,
1179 | "priority": 0,
1180 | "teamIds": ["369af3b8-7d07-426f-aaad-773eccd97202"]
1181 | },
1182 | "documentContentCreateInput": {
1183 | "id": "bb75174e-1e26-46ae-94b6-67977be435c3",
1184 | "projectId": "940e248c-2226-4b0a-a14e-8a410ccfaaa7"
1185 | }
1186 | },
1187 | "operationName": "ProjectCreate_DocumentContentCreate"
1188 | }
1189 | ```
1190 |
1191 | And the response would be like this:
1192 |
1193 | ```json
1194 | {
1195 | "data": {
1196 | "o1": {
1197 | "lastSyncId": 0
1198 | },
1199 | "o2": {
1200 | "lastSyncId": 3588467486
1201 | }
1202 | }
1203 | }
1204 | ```
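
The merge step can be approximated with the sketch below, which rebuilds the request bodies shown above from per-transaction `graphQLMutationPrepared` objects. The `mergeMutations` function and both interfaces are hypothetical; deriving the operation name from the mutation field name and adding `o1`, `o2` aliases only for multi-mutation batches are inferences from the captured requests, not confirmed behavior. Variable name collisions (for example, two `issueUpdate` mutations in one batch) are not handled here.

```typescript
interface PreparedMutation {
  mutationText: string;
  variables: Record<string, unknown>;
  variableTypes: Record<string, string>;
}

interface GraphQLRequest {
  query: string;
  variables: Record<string, unknown>;
  operationName: string;
}

function mergeMutations(mutations: PreparedMutation[]): GraphQLRequest {
  const variables: Record<string, unknown> = {};
  const declarations: string[] = [];
  const operationNames: string[] = [];
  const fields: string[] = [];

  mutations.forEach((mutation, index) => {
    Object.keys(mutation.variableTypes).forEach((variableName) => {
      declarations.push(`$${variableName}: ${mutation.variableTypes[variableName]}!`);
      variables[variableName] = mutation.variables[variableName];
    });
    // "issueUpdate(id: ...)" -> "IssueUpdate"
    const fieldName = mutation.mutationText.slice(0, mutation.mutationText.indexOf("("));
    operationNames.push(fieldName.charAt(0).toUpperCase() + fieldName.slice(1));
    // Aliases keep the results apart when a batch holds several mutations.
    fields.push(
      mutations.length > 1 ? `o${index + 1}:${mutation.mutationText}` : mutation.mutationText,
    );
  });

  const operationName = operationNames.join("_");
  return {
    query: `mutation ${operationName}(${declarations.join(", ")}) { ${fields.join(", ")} }`,
    variables,
    operationName,
  };
}
```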
1205 |
1206 | When the response is received, LSE removes the transactions from the `executingTransactions` queue, clears their cache from the `__transactions` table, and calls the `transactionCompleted` method of each transaction to complete it. The transaction's `syncIdNeededForCompletion` property is set to the largest `lastSyncId` in the response, and the transaction will wait for the sync action with the matching `lastSyncId` to arrive at the client before it is considered complete. If the transaction cannot complete immediately, it is moved to the `completedButUnsyncedTransactions` queue. As we will see in the next chapter, this is crucial for performing **transaction rebasing**.
1207 |
1208 | An important point to note is that, up until now, LSE has **not** modified model tables (e.g., the `Issue` table) in IndexedDB. This is because, in Linear, the local database is a subset of the server database (the SSOT), and it cannot contain changes that have not been approved by the server. If the server rejects the transaction, modifying the model tables prematurely could make it difficult and error-prone to revert the changes in the local database.
1209 |
1210 | If the server rejects the mutation query, `transactionCompleted` is still called, but this time it will be invoked with an error. In response, the transaction will trigger its `rollback` method to undo any changes made on the client side and be removed from the `executingTransactions` queue.
1211 |
1212 | ### Persisted Transactions
1213 |
1214 | > [!NOTE]
1215 | > Code references
1216 | >
1217 | > - `eg.putTransaction`: `Database.putTransaction`
1218 | > - `ng.bootstrap`: `SyncClient.bootstrap`
1219 | > - `uce.loadPersistedTransactions`: `TransactionQueue.loadPersistedTransactions`
1220 | > - `uce.confirmPersistedTransactions`: `TransactionQueue.confirmPersistedTransactions`
1221 |
1222 | Earlier in this chapter, we discussed that when transactions are moved to `queuedTransactions`, they are also stored in the `__transactions` table for caching purposes. During this process, the `serialize` method of transactions is called. Each type of transaction has its own implementation of this method.
1223 |
1224 | When Linear starts again, cached transactions are loaded in `TransactionQueue.loadPersistedTransactions`, at which point they are deserialized. Similarly, each type of transaction also implements a static `fromSerializedData` method. This method **replays** the transaction and modifies the models in memory, effectively restoring the client's state after a restart.
1225 |
1226 | Finally, during the bootstrap process, `TransactionQueue.confirmPersistedTransactions` is called to move these transactions to `createdTransactions`.
1227 |
1228 | There is a small chance that this process could lead to unintended issues. For example, if the client sends a transaction to the server but the window closes before receiving a response, the transaction will be stored in the `__transactions` table. When the client restarts, the transaction will be reloaded and sent to the server again. Since some transactions are not idempotent, users may encounter errors like: "You can't delete a model that doesn't exist." While this is a rare occurrence and generally doesn't significantly affect the user experience, it's something to be aware of. (In OT systems, this could result in more serious issues, which is why transactions or operations usually include an incremental counter to ensure deduplication).
1229 |
1230 | ### Takeaway of Chapter 3
1231 |
1232 | Let's sum up this chapter.
1233 |
1234 | **Firstly, client-side operations will never directly modify the tables in the local database!** Instead, they only alter in-memory models, and the changes are sent as transactions to the server. As we will see in the next chapter, the tables in the local database are updated only after the corresponding delta packets arrive from the server.
1235 |
1236 | **Secondly, LSE uses a transaction queue to manage transactions.** The queue schedules transactions to be sent to the server in batches. This batching mechanism helps to reduce the number of requests sent to the server and improves efficiency.
1237 |
1238 | **Lastly, LSE handles transactions in a robust manner.** It ensures that transactions are persisted in the local database and can be restored after a client restart. This approach guarantees that the client will never lose any changes, even if the client crashes or the network connection is lost.
1239 |
1240 | ## Chapter 4: Delta Packets
1241 |
1242 | In this chapter, we will explore how LSE handles incremental updates and ensures that the client stays synchronized with the server.
1243 |
1244 | Let's begin with an overview, similar to what we did in the previous chapters!
1245 |
1246 | 1. At the end of the bootstrapping process, after the persisted transactions are loaded, the client establishes a WebSocket connection to the server to receive incremental updates, or delta packets.
1247 | 2. Handling delta packets involves several key tasks: updating models in memory and in the local database, rebasing transactions, and more.
1248 |
1249 | ### Establishing Connection
1250 |
1251 | > [!NOTE]
1252 | > Code references
1253 | >
1254 | > - `ng.startSyncing`: `SyncClient.startSyncing`
1255 | > - `ng.constructor`: `SyncClient.constructor`
1256 | > - `handshakeCallback` callback
1257 |
1258 | The final phase of bootstrapping involves establishing a WebSocket connection to the server to receive incremental updates after loading persisted transactions from the local database. In the `handshakeCallback`, which is executed once the connection is established, the client compares the `lastSyncId` from the callback's parameters with the local lastSyncId to determine whether any incremental changes have been **missed**. If a discrepancy is found, the client requests the missing delta packets from the server and applies them accordingly.
1259 |
1260 | The parameters of the callback would appear as follows:
1261 |
1262 | ```typescript
1263 | {
1264 | "userSyncGroups": {
1265 | "all": [
1266 | "89388c30-9823-4b14-8140-4e0650fbb9eb",
1267 | "094f76cf-b0c1-4f6c-9908-1801a6654f05",
1268 | "4e8622c7-0a24-412d-bf38-156e073ab384",
1269 | "E7E6104E-AAAA-42BC-9B8B-B91FCDD9946B",
1270 | "AD619ACC-AAAA-4D84-AD23-61DDCA8319A0",
1271 | "CDA201A7-AAAA-45C5-888B-3CE8B747D26B",
1272 | "B0B41C7E-AAAA-4C7D-A93D-CD9565DA4358"
1273 | ],
1274 | "optimized": [
1275 | "4e8622c7-0a24-412d-bf38-156e073ab384",
1276 | "E7E6104E-AAAA-42BC-9B8B-B91FCDD9946B",
1277 | "AD619ACC-AAAA-4D84-AD23-61DDCA8319A0",
1278 | "CDA201A7-AAAA-45C5-888B-3CE8B747D26B",
1279 | "B0B41C7E-AAAA-4C7D-A93D-CD9565DA4358"
1280 | ]
1281 | },
1282 | "lastSyncId": 3529152751,
1283 | "lastSequentialSyncId": 3529152751,
1284 | "databaseVersion": 1179
1285 | }
1286 | ```
1287 |
1288 | Additionally, the `SyncClient` module listens to the `SyncMessage` channel on the WebSocket connection, which emits delta packets. Upon receiving these packets, it invokes the `applyDelta` method to process and apply the updates.
1289 |
1290 | ### Applying Deltas
1291 |
1292 | > [!NOTE]
1293 | > Code references
1294 | >
1295 | > - `ng.applyDelta`: `SyncClient.applyDelta`
1296 | > - `ng.constructor`: `SyncClient.constructor`
1297 | > - `oce.addSyncPacket`: `SyncActionStore.addSyncPacket`
1298 | > - `Zu.supportedPacket`: `DependentsLoader.supportedPacket`
1299 | > - `uce.modelUpserted`: `TransactionQueue.modelUpserted`
1300 |
1301 | After a client sends a GraphQL mutation to the server, the server executes the query and generates a set of delta packets, which are then broadcast to all connected clients, **including the client that initiated the mutation**. Each delta packet contains the changes, or **sync actions**, that occurred on the server. For example, if the assignee of an `Issue` is changed, the client will receive delta packets like the following:
1302 |
1303 | ```jsx
1304 | [
1305 | {
1306 | id: 2361610825,
1307 | modelName: "Issue",
1308 | modelId: "a8e26eed-7ad4-43c6-a505-cc6a42b98117",
1309 | action: "U",
1310 | data: {
1311 | id: "a8e26eed-7ad4-43c6-a505-cc6a42b98117",
1312 | title: "Connect to Slack",
1313 | number: 3,
1314 | teamId: "369af3b8-7d07-426f-aaad-773eccd97202",
1315 | stateId: "28d78a58-9fc1-4bf1-b1a3-8887bdbebca4",
1316 | labelIds: [],
1317 | priority: 3,
1318 | createdAt: "2024-05-29T03:08:15.383Z",
1319 | sortOrder: -12246.37,
1320 | updatedAt: "2024-07-13T06:25:40.612Z",
1321 | assigneeId: "e86b9ddf-819e-4e77-8323-55dd488cb17c",
1322 | boardOrder: 0,
1323 | reactionData: [],
1324 | subscriberIds: ["e86b9ddf-819e-4e77-8323-55dd488cb17c"],
1325 | prioritySortOrder: -12246.37,
1326 | previousIdentifiers: [],
1327 | },
1328 | __class: "SyncAction",
1329 | },
1330 | {
1331 | id: 2361610826,
1332 | modelName: "IssueHistory",
1333 | modelId: "ac1c69bb-a37e-4148-9a35-94413dde172d",
1334 | action: "I",
1335 | data: {
1336 | id: "ac1c69bb-a37e-4148-9a35-94413dde172d",
1337 | actorId: "e86b9ddf-819e-4e77-8323-55dd488cb17c",
1338 | issueId: "a8e26eed-7ad4-43c6-a505-cc6a42b98117",
1339 | createdAt: "2024-07-13T06:25:40.581Z",
1340 | updatedAt: "2024-07-13T06:25:40.581Z",
1341 | toAssigneeId: "e86b9ddf-819e-4e77-8323-55dd488cb17c",
1342 | },
1343 | __class: "SyncAction",
1344 | },
1345 | {
1346 | id: 2361610854,
1347 | modelName: "Activity",
1348 | modelId: "1321dc17-cceb-4708-8485-2406d7efdfc5",
1349 | action: "I",
1350 | data: {
1351 | id: "1321dc17-cceb-4708-8485-2406d7efdfc5",
1352 | events: [
1353 | {
1354 | type: "issue_updated",
1355 | issueId: "a8e26eed-7ad4-43c6-a505-cc6a42b98117",
1356 | issueTitle: "Connect to Slack",
1357 | changedColumns: ["subscriberIds", "assigneeId"],
1358 | },
1359 | ],
1360 | userId: "e86b9ddf-819e-4e77-8323-55dd488cb17c",
1361 | issueId: "a8e26eed-7ad4-43c6-a505-cc6a42b98117",
1362 | createdAt: "2024-07-13T06:25:41.924Z",
1363 | updatedAt: "2024-07-13T06:25:41.924Z",
1364 | },
1365 | __class: "SyncAction",
1366 | },
1367 | ];
1368 | ```
1369 |
1370 | As shown in the example above, each action includes an integer `id` field, which corresponds to the sync ID associated with the sync action. (Note that the `id` of the third sync action is 28 greater than the `id` of the second sync action, indicating that the `lastSyncId` is tied to the entire database, not just the workspace.)
1371 |
1372 | Each action also has an `action` type. The possible types are as follows:
1373 |
1374 | 1. `I` - Insertion
1375 | 2. `U` - Update
1376 | 3. `A` - Archiving
1377 | 4. `D` - Deletion
1378 | 5. `C` - Covering
1379 | 6. `G` - Changing sync groups
1380 | 7. `S` - Changing sync groups (though the distinction from `G` remains unclear)
1381 | 8. `V` - Unarchiving
1382 |
1383 | The `ng.applyDelta` method is responsible for handling these sync actions. It performs the following tasks (refer to the source for implementation details):
1384 |
1387 | 1. Determine whether the user is added to or removed from sync groups. If the user is added to a sync group, LSE triggers a network request (essentially a partial bootstrapping) to fetch models associated with that sync group. LSE will wait for the response before continuing to process the sync actions.
1388 |
1389 | 2. Load dependencies of specific actions.
1390 |
1391 | ---
1392 |
1393 | > [!NOTE]
1394 | > **Code references**
1395 | > - `ng.applyDelta`: `SyncClient.applyDelta`
1396 | > - `Zu.supportedPacket`: `DependentsLoader.supportedPacket`
1397 |
1398 | The next step is to load references for certain models involved in the sync actions. For instance, if the child issues of an `Issue` are modified, LSE needs to load these child issues. Why? The answer lies in the `DependentsLoader.supportedPacket`. This method identifies which sync actions require loading of dependent models.
1399 |
1400 | Sync actions must meet the following conditions for their dependents to be loaded:
1401 |
1402 | - They should **not** be of type "I", "A", or "D".
1403 | - The model they manipulate must be either an `Issue` or a `Project`, and it must be used for partial indexes, which applies to both `Issue` and `Project`.
1404 | - The action type should be "V", or its associated model should have references that have changed, with LSE already storing the partial index keys of these changed references. (This is a bit tricky. I will do my best to explain it, but the most effective way to understand this is by reviewing the source code.)
1405 |
1406 | First, LSE retrieves the **transient partial indexed keys** of the model. These are the Cartesian product of the model's partial indexes and its dependencies. For example, `Issue` has transient partial indexed keys like this:
1407 |
1408 | 
1409 |
1410 | These keys represent the Cartesian product of its 9 partial indexes and 17 dependencies.
1411 |
1412 | Next, LSE checks whether any references in the model have changed. If they have, it will generate a new partial index value for the updated references.
1413 |
1414 | Finally, LSE checks if the new partial index value is already stored in the local database. If so, it means the dependents of the model need to be updated because the partial index value now points to a new dependent model.
1415 |
1416 | ---
1417 |
1418 | 3. Write data for the new sync groups and their dependents into the local database.
1419 |
1420 | 4. Loop through all sync actions and resolve them to update the local database.
1421 |
1422 | In this step, LSE calls `TransactionQueue.modelUpserted` to remove local `CreationTransaction`s that are no longer valid after the sync actions. If the `CreationTransaction`'s UUID matches the model's ID, the transaction is canceled. This step ensures that UUID conflicts are avoided, as the UUID is generated on the client side. Additionally, if a user leaves a sync group, the models associated with that group are also removed.
1423 |
1424 | As mentioned in the previous chapter, LSE will not modify the local database until the server confirms the changes.
1425 |
1426 | 5. Loop through all sync actions again to update in-memory data.
1427 |
1428 | ---
1429 |
1430 | In this step, LSE loops through the sync actions twice.
1431 |
1432 | The first loop prepares models to perform the sync actions:
1433 |
1434 | - For actions of type "I", "V", or "U", LSE creates corresponding model instances.
1435 | - For actions of type "A", LSE updates the models' properties.
1436 |
1437 | Next, LSE attaches references to the newly created models. However, before doing so, LSE checks if the newly created model has been deleted by a sync action in the same delta packet, in order to avoid unnecessary operations. It does this by comparing the `syncId`s of the action that created the model and the action that deletes it. If the `syncId` of the deleting action is larger, the model will not be created.
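
The "skip models that are deleted later in the same packet" check might look roughly like this. `SketchSyncAction` keeps only the fields the check needs, and `actionsCreatingModels` is a hypothetical helper written for this example, not a function from the source.

```typescript
type SyncActionType = "I" | "U" | "A" | "D" | "C" | "G" | "S" | "V";

interface SketchSyncAction {
  id: number; // the sync ID of this action
  modelId: string;
  action: SyncActionType;
}

// Returns the actions for which a model instance should be created: "I", "V",
// or "U" actions whose model is not deleted by a later action in the packet.
function actionsCreatingModels(actions: SketchSyncAction[]): SketchSyncAction[] {
  // Highest sync ID of a deletion, per model ID.
  const deletedAt = new Map<string, number>();
  actions.forEach((a) => {
    if (a.action === "D") {
      deletedAt.set(a.modelId, Math.max(a.id, deletedAt.get(a.modelId) ?? 0));
    }
  });
  return actions.filter((a) => {
    if (a.action !== "I" && a.action !== "V" && a.action !== "U") return false;
    const deletionId = deletedAt.get(a.modelId);
    // Create the model only if no deletion with a larger sync ID follows.
    return deletionId === undefined || deletionId < a.id;
  });
}
```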
1438 |
1439 | The second loop handles sync actions one by one.
1440 |
1441 | For actions of type "I", "V", "U", and "C", LSE will **rebase** `UpdateTransactions` onto them.
1442 |
1443 | > [!NOTE]
1444 | > **Code references**
1445 | > - `ng.applyDelta`: `SyncClient.applyDelta`
1446 | > - `uce.rebaseTransactions`: `TransactionQueue.rebaseTransactions`
1447 | > - `zu.rebase`: `UpdateTransaction.rebase`
1448 |
1449 | When applying a sync action, conflicts can arise with local transactions. For example, imagine your colleague changes the assignee to Alice, while you simultaneously change the assignee to Bob. The server processes your colleague's update first, so, according to the "last-writer-wins" principle, the assignee on the server ends up as Bob.
1450 |
1451 | Here's what happens on your client: you create an `UpdateTransaction` to change the assignee to Bob, but before the transaction is executed by the server, your client receives a delta packet that updates the assignee to Alice. At this point, LSE needs to perform a rebase. Following the "last-writer-wins" principle, the in-memory model must be reset to Bob.
1452 |
1453 | This rebasing occurs in the `rebaseTransactions` method, where all `UpdateTransaction` objects in the queue call the `rebase` method. The `original` value of each transaction is updated to reflect the value from the delta packet (in this case, Alice), and the in-memory model is reset to Bob. It is similar to Operational Transformation (OT).
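
A minimal sketch of the Alice and Bob scenario, assuming a transaction stores an `original` and an `updated` value per changed property. `SketchUpdateTransaction` and `PropertyChange` are hypothetical names, and the real `UpdateTransaction.rebase` works on serialized values and may differ in detail.

```typescript
interface PropertyChange {
  original: unknown; // value to restore if the server rejects the change
  updated: unknown; // value the local user chose
}

class SketchUpdateTransaction {
  constructor(
    readonly model: Record<string, unknown>,
    readonly changes: Record<string, PropertyChange>,
  ) {}

  // Called after a sync action for the same model has been applied, while
  // this transaction is still waiting for the server.
  rebase(remoteData: Record<string, unknown>): void {
    Object.keys(this.changes).forEach((key) => {
      const change = this.changes[key];
      if (change === undefined || !(key in remoteData)) return;
      change.original = remoteData[key]; // the server's value becomes the new baseline
      this.model[key] = change.updated; // the pending local edit stays visible in memory
    });
  }

  rollback(): void {
    Object.keys(this.changes).forEach((key) => {
      const change = this.changes[key];
      if (change !== undefined) this.model[key] = change.original;
    });
  }
}
```

If the server later rejects the local change, `rollback` restores Alice rather than the stale value the transaction was created with, which is why `original` has to be updated during the rebase.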
1454 |
1455 | Remember the `completedButUnsyncedTransactions` queue we discussed in the previous chapter? During rebasing, LSE checks if any transactions in this queue have a `syncIdNeededForCompletion` smaller than or equal to the `lastSyncId` of the delta packet. If so, these transactions are removed from the `completedButUnsyncedTransactions` queue.
1456 |
1457 | ---
1458 |
1459 | 6. Update `lastSyncId` on the client, and update `firstSyncId` if sync groups change.
1460 |
1461 | 7. Resolve completed transactions waiting for the `lastSyncId`.
1462 |
1463 | After receiving the delta packets, the client checks if any transactions are waiting for the `lastSyncId` of those packets. If such transactions exist, they will be resolved, as shown here:
1464 |
1465 | ```typescript
1466 | this.syncWaitQueue.progressQueue(this.lastSyncId);
1467 | ```
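
A wait queue with this behaviour could look roughly like the sketch below. Only `progressQueue` and `lastSyncId` come from the source. The `waitFor` method, the `Waiter` shape, and the returned count are assumptions added for illustration.

```typescript
// Hypothetical sketch of a queue of transactions waiting for a sync id.
interface Waiter {
  syncIdNeeded: number;
  resolve: () => void;
}

class SyncWaitQueueSketch {
  private waiters: Waiter[] = [];

  // Returns a promise that settles once the client has seen `syncIdNeeded`.
  waitFor(syncIdNeeded: number): Promise<void> {
    return new Promise<void>((resolve) => {
      this.waiters.push({ syncIdNeeded, resolve });
    });
  }

  // Resolves every waiter whose required sync id has been reached and
  // returns how many were released.
  progressQueue(lastSyncId: number): number {
    const ready = this.waiters.filter((w) => w.syncIdNeeded <= lastSyncId);
    this.waiters = this.waiters.filter((w) => w.syncIdNeeded > lastSyncId);
    ready.forEach((w) => w.resolve());
    return ready.length;
  }
}

const queue = new SyncWaitQueueSketch();
void queue.waitFor(101);
void queue.waitFor(105);
// A delta packet with lastSyncId 103 releases the first waiter only.
const releasedAt103 = queue.progressQueue(103);
const releasedAt105 = queue.progressQueue(105);
```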
1468 |
1469 | It's important to note that LSE performs all the above steps inside an `updateLock.runExclusive` callback. This ensures that LSE waits for a delta packet to be fully processed before processing the next one, maintaining consistency between the client's state and the server's.
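
To illustrate why an exclusive lock serializes packet handling, here is a minimal promise-chain lock. It is a sketch under assumptions, not Linear's implementation: only the method name `runExclusive` comes from the source, and `UpdateLockSketch` and `applyPacket` are hypothetical.

```typescript
// Hypothetical lock: each task starts only after the previous one settles.
class UpdateLockSketch {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Keep the chain alive even if a task rejects.
    this.tail = result.then(
      () => undefined,
      () => undefined
    );
    return result;
  }
}

const lock = new UpdateLockSketch();
const log: string[] = [];
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Stand-in for the asynchronous work of applying one delta packet.
async function applyPacket(name: string, ms: number): Promise<void> {
  log.push(`start ${name}`);
  await sleep(ms);
  log.push(`end ${name}`);
}

// packet-2 is faster, but it cannot start until packet-1 has finished.
const done = Promise.all([
  lock.runExclusive(() => applyPacket("packet-1", 20)),
  lock.runExclusive(() => applyPacket("packet-2", 0)),
]);
```

Without the lock, the two handlers would interleave and `packet-2` could finish before `packet-1`.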
1470 |
1471 | ### Takeaway of Chapter 4
1472 |
1475 | **LSE uses delta packets to keep the client synchronized with the server.** When a client sends a mutation to the server, the server processes the mutation and generates delta packets. These packets are broadcast to all connected clients, containing the changes that occurred on the server. The client then applies these changes to its in-memory models.
1476 |
1477 | Looking closer at the `UpdateTransaction` and the corresponding delta packets, we see that the delta packets carry more data than the transaction itself—specifically, an `IssueHistory` of the assignee change. Unlike Operational Transformation (OT), where the server primarily handles operation transformations, validates permissions, and executes operations to maintain a single source of truth, LSE's backend involves additional business logic alongside these tasks.
1478 |
1479 | In contrast to OT, which only sends an acknowledgment to the mutator, **LSE sends all modified model properties to all connected clients, including the client that made the modification.** This approach simplifies the management of WebSocket connections.
1480 |
1481 | Finally, LSE employs a simple **Last-Writer-Wins** strategy to resolve conflicts, specifically addressing conflicts in `UpdateTransaction` only.
1482 |
1483 | ## Chapter 5: Misc
1484 |
1485 | ### Undo & Redo
1486 |
1487 | Undos and redos in LSE are transaction-based. Each transaction type includes a specific `undoTransaction` method, which performs the undo logic and returns another transaction for redo purposes. For example, the `undoTransaction` method of an `UpdateTransaction` reverts the model's property to its previous value and returns another `UpdateTransaction` to the `UndoQueue`. It's important to note that when a transaction executes its undo logic, a new transaction is created and added to the `queuedTransactions` to ensure proper synchronization.
1488 |
1489 | But how does the `UndoManager` determine which transactions should be appended to the undo/redo stack? The answer lies in Linear's UI logic, which identifies the differences:
1490 |
1491 | ```jsx
1492 | n.title !== d &&
1493 | o.undoQueue.addOperation(
1494 | s.jsxs(s.Fragment, {
1495 | children: [
1496 | "update title of issue ",
1497 | s.jsx(Le, {
1498 | model: n,
1499 | }),
1500 | ],
1501 | }),
1502 | () => {
1503 | (n.title = d), n.save();
1504 | }
1505 | );
1506 | ```
1507 |
1508 | When an edit is made, the UI calls `UndoQueue.addOperation`, allowing the `UndoQueue` to subscribe to the next `transactionQueuedSignal` and create an undo item. This signal is emitted when transactions are added to `queuedTransactions`. Because `save()` is called inside the callback, any transactions it creates are captured and pushed to the undo/redo stack. The subscription ends once the callback finishes execution.
1509 |
1510 | However, when an undo operation is performed, although the signal is triggered, no additional undo item is created, because `UndoQueue` is not actively listening during that time.
1511 |
1512 | When performing an undo, the `undoTransaction` method returns the corresponding transaction for redo purposes.
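
The capture mechanism described in this section can be sketched as follows. The names `addOperation`, `undoTransaction`, and `transactionQueuedSignal` come from the text. Everything else (`Signal`, `TitleUpdate`, `UndoQueueSketch`, the stack layout) is a hypothetical simplification.

```typescript
interface Tx {
  undoTransaction(): Tx; // performs the undo and returns the redo transaction
}

class Signal {
  private listeners = new Set<(tx: Tx) => void>();
  subscribe(listener: (tx: Tx) => void): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
  emit(tx: Tx): void {
    this.listeners.forEach((listener) => listener(tx));
  }
}

const transactionQueuedSignal = new Signal();

class TitleUpdate implements Tx {
  private model: { title: string };
  private from: string;
  private to: string;
  constructor(model: { title: string }, from: string, to: string) {
    this.model = model;
    this.from = from;
    this.to = to;
    // Stands in for being added to queuedTransactions.
    transactionQueuedSignal.emit(this);
  }
  undoTransaction(): Tx {
    this.model.title = this.from;
    return new TitleUpdate(this.model, this.to, this.from);
  }
}

interface UndoItem {
  label: string;
  transactions: Tx[];
}

class UndoQueueSketch {
  undoStack: UndoItem[] = [];
  redoStack: UndoItem[] = [];

  // Listens for queued transactions only while the callback runs.
  addOperation(label: string, callback: () => void): void {
    const transactions: Tx[] = [];
    const unsubscribe = transactionQueuedSignal.subscribe((tx) => {
      transactions.push(tx);
    });
    try {
      callback();
    } finally {
      unsubscribe();
    }
    if (transactions.length > 0) this.undoStack.push({ label, transactions });
  }

  // Undoing emits the signal too, but nobody is subscribed, so no undo item is added.
  undo(): void {
    const item = this.undoStack.pop();
    if (!item) return;
    const redo = [...item.transactions].reverse().map((tx) => tx.undoTransaction());
    this.redoStack.push({ label: item.label, transactions: redo });
  }
}

const issueModel = { title: "Old" };
const undoQueue = new UndoQueueSketch();
undoQueue.addOperation("update title of issue", () => {
  const previous = issueModel.title;
  issueModel.title = "New";
  new TitleUpdate(issueModel, previous, "New"); // what save() would produce
});
undoQueue.undo();
```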
1513 |
1514 | ## Conclusion
1515 |
1516 | I'm so glad we've come this far in our journey exploring the Linear Sync Engine! In this post, we've covered a wide range of topics:
1517 |
1518 | 1. **Model Definition**: LSE uses decorators to define models, properties, and references. The metadata for these models is stored in the `ModelRegistry` class and is widely utilized across LSE.
1519 | 2. **Observability**: LSE leverages MobX to make models observable, enabling it to track changes to models and properties and generate transactions accordingly. Decorators are used to add observability.
1520 | 3. **Bootstrapping**: LSE supports three types of bootstrapping, and we've gone into detail on full bootstrapping.
1521 | 4. **Lazy Loading**: LSE hydrates lazily-loaded data as needed. We explored how partial indexes and sync groups are used to manage this process.
1522 | 5. **Syncing**: LSE uses transactions to synchronize clients with the server. We've discussed how transactions are generated, queued, and sent to the server, as well as how LSE handles cache, rebasing, conflict resolution, and more.
1523 | 6. **Undo and Redo**: LSE supports undo and redo operations based on transactions.
1524 |
1525 | I hope you now have a clearer understanding of how the Linear Sync Engine operates. While I've tried to cover as much as possible, there are still some topics left to explore. If you're interested in diving deeper, here are some recommendations for further reading:
1526 |
1527 | 1. **Other Transaction Types**: We've looked at `UpdateTransaction`, but LSE also supports other types such as `CreateTransaction`, `DeleteTransaction`, and `ArchiveTransaction`. How do these work, and how do metadata fields like `onDelete` and `onArchive` affect transactions?
1528 | 2. **Other Bootstrapping Types**: We focused on full bootstrapping, but LSE also supports partial bootstrapping and local bootstrapping. How do these differ from full bootstrapping, and when are they used?
1529 |
1530 | I highly encourage you to share your thoughts, questions, and findings in the Issues. I look forward to hearing from you!
1531 |
1532 | ## Appendix A: Actions and Computed Values
1533 |
1534 | Actions (`rt`) & Computed (`O`)
1535 |
1536 | Take `moveToTeam` and `parents` of `Issue` as examples: the former is registered with the `Action` decorator and the latter with the `Computed` decorator.
1537 |
1538 | ```jsx
1539 | Pe([rt], re.prototype, "moveToTeam", null);
1540 | Pe([O], re.prototype, "parents", null);
1541 | ```
1542 |
1543 | The original source code presumably looks like this:
1544 |
1545 | ```typescript
1546 | @ClientModel("Issue")
1547 | class Issue {
1548 | @Action
1549 | moveToTeam() {
1550 | // implementation
1551 | }
1552 |
1553 | @Computed
1554 | get parents() {
1555 | // implementation
1556 | }
1557 | }
1558 | ```
1559 |
1560 | **Action** and **computed** are core MobX primitives. During bootstrapping, these properties are made observable by directly calling MobX's `makeObservable` API.
1561 |
1562 | ## Credits
1563 |
1564 | Thanks to Tuomas Artman for generously sharing insights into how LSE works in talks and podcasts.
1565 |
1566 | Special thanks to [@zxch3n](https://github.com/zxch3n), [@vincentdchan](https://github.com/vincentdchan), and [@promer94](https://github.com/promer94) for their valuable reviews.
1567 |
1568 | ---
1569 |
1570 | © 2025 Wenzhao Hu. This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
1571 |
--------------------------------------------------------------------------------
/SUMMARY.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | This summary covers the reverse-engineering study of Linear's Sync Engine (LSE). It provides a conceptual overview of how LSE works and serves as a good starting point for understanding the source code.
4 |
5 | ## Introduction
6 |
7 | Here are the core concepts behind LSE:
8 |
9 | 
10 |
11 | **Model**
12 |
13 | Entities such as `Issue`, `Team`, `Organization`, and `Comment` are referred to as **models** in LSE. These models possess **properties** and **references** to other models, many of which are observable (via **MobX**) to automatically update views when changes occur. In essence, models and properties include **metadata** that dictate how they behave in LSE.
14 |
15 | Models can be loaded from either the **local database** (IndexedDB) or the server. Some models support **partial loading** and can be loaded on demand, either from the local database or by fetching additional data from the server. Once loaded, models are stored in an **Object Pool**, which serves as a large map for retrieving models by their **UUIDs**.
16 |
17 | Models can be **hydrated** lazily, meaning their properties are loaded only when accessed. This mechanism improves performance by loading only the necessary data.
18 |
19 | Operations—such as additions, deletions, updates, and archiving—on models, their properties, and references are encapsulated as **transactions**. These transactions are sent to the server, executed there, and then broadcast as **delta packets** to all connected clients. This ensures data consistency across multiple clients.
20 |
21 | **Transaction**
22 |
23 | Operations sent to the server are packaged as **transactions**. These transactions are intended to execute **exclusively** on the server and are designed to be **reversible** on the client in case of failure. If the client loses its connection to the server, transactions are temporarily **cached** in IndexedDB and automatically resent once the connection is reestablished.
24 |
25 | Transactions are associated with a **sync id**, which is a monotonically increasing number that ensures the correct order of operations. This number is crucial for maintaining consistency across all clients.
26 |
27 | Additionally, transactions play a key role in supporting **undo** and **redo** operations, enabling seamless changes and corrections in real-time collaborative workflows.
28 |
29 | **Delta packets**
30 |
31 | Once transactions are executed, the server broadcasts **delta packets** to all clients—including the client that initiated the transaction—to update the models. A delta packet contains several **sync action**s, and each action is associated with a **sync id** as well. This mechanism prevents clients from missing updates and helps identify any missing packets if discrepancies occur.
32 |
33 | The **delta packets** may differ from the original transactions sent by the client, as the server might perform **side effects** during execution (e.g., generating history).
34 |
35 | ## Defining Models and Metadata
36 |
37 | When Linear starts, it first generates metadata for models, including their properties, methods (actions), and computed values. To manage this metadata, LSE maintains a detailed dictionary called `ModelRegistry`.
38 |
39 | 
40 |
41 | LSE uses **decorators** to define models and properties, and record their metadata to the `ModelRegistry`.
42 |
43 | A model's metadata includes:
44 |
45 | 1. **`loadStrategy`**: Defines how models are loaded into the client. There are five strategies:
46 | - **`instant`**: Models that are loaded during application bootstrapping (default strategy).
47 | - **`lazy`**: Models that do not load during bootstrapping but are fetched all at once when needed (e.g., `ExternalUser`).
48 | - **`partial`**: Models that are loaded on demand, meaning only a subset of instances is fetched from the server (e.g., `DocumentContent`).
49 | - **`explicitlyRequested`**: Models that are only loaded when explicitly requested (e.g., `DocumentContentHistory`).
50 | - **`local`**: Models that are stored exclusively in the local database. No models have been identified using this strategy.
51 | 2. **`partialLoadMode`**: Specifies how a model is hydrated, with three possible values: `full`, `regular`, and `lowPriority`.
52 | 3. **`usedForPartialIndexes`**: Relates to the functionality of partial indexing.
53 |
54 | A property's metadata includes:
55 |
56 | 1. `type`: Specifies the property's type.
57 | 2. `lazy`: Specifies whether the property should be loaded only when the model is hydrated.
58 | 3. `serializer`: Defines how to serialize the property for data transfer or storage.
59 | 4. `indexed`: Determines whether the property should be indexed in the database. Used for references.
60 | 5. `nullable`: Specifies whether the property can be `null`, used for references.
61 | 6. etc.
62 |
63 | `type` is an enumeration that includes the following values:
64 |
65 | 1. **`property`**: A property that is "owned" by the model. For example, `title` is a `property` of `Issue`.
66 | 2. **`ephemeralProperty`**: Similar to a `property`, but it is not persisted in the database. This type is rarely used. For example, `lastUserInteraction` is an ephemeral property of `User`.
67 | 3. **`reference`**: A property used when a model holds a reference to another model. Its value is typically the ID of the referenced model. A reference can be lazy-loaded, meaning the referenced model is not loaded until this property is accessed. For example, `subscription` is a `reference` of `Team`.
68 | 4. **`referenceModel`**: When `reference` properties are registered, a `referenceModel` property is also created. This property defines getters and setters to access the referenced model using the corresponding `reference`.
69 | 5. **`referenceCollection`**: Similar to `reference`, but it refers to an array of models. For example, `templates` is a `referenceCollection` of `Team`.
70 | 6. **`backReference`**: A `backReference` is the inverse of a `reference`. For example, `favorite` is a `backReference` of `Issue`. The key difference is that a `backReference` is considered "owned" by the referenced model. When the referenced model (B) is deleted, the `backReference` (A) is also deleted.
71 | 7. **`referenceArray`**: Used for many-to-many relationships. For example, `members` of `Project` is a `referenceArray` that references `Users`, allowing users to be members of multiple projects.
72 |
73 | LSE uses a variety of decorators to register different types of properties.
74 |
75 | ### Schema Hash
76 |
77 | `ModelRegistry` includes a special property called **`__schemaHash`**, which is a hash of all models' metadata and their properties' metadata. This hash is crucial for determining whether the local database requires migration.
78 |
79 | ## Bootstrapping
80 |
81 | A full bootstrapping of Linear looks like this:
82 |
83 | 
84 |
85 | 1. `StoreManager` (`cce`) creates either a `PartialStore` (`jm`) or a `FullStore` (`TE`) for each model. These stores are responsible for synchronizing in-memory data with IndexedDB. Also, `SyncActionStore` (`oce`) will be created to store sync actions.
86 | 2. `Database` (`eg`) connects to IndexedDB and gets the databases and tables ready. If the databases don't exist, they are created, and if a migration is needed, it is performed.
87 | 3. `Database` determines the type of bootstrapping to be performed.
88 | 4. The appropriate bootstrapping is executed. For full bootstrapping, models are retrieved from the server.
89 | 5. The retrieved model data will be stored in IndexedDB.
90 | 6. Data requiring immediate hydration is loaded into memory, and observability is activated.
91 | 7. Build a connection to the server to receive delta packets.
92 |
93 | Linear creates a database for each logged-in workspace. The metadata of this database includes the following fields:
94 |
95 | 1. **`lastSyncId`**: Explained in a section below.
96 | 2. **`firstSyncId`**: Represents the `lastSyncId` value when the client performs a **full bootstrapping**. As we'll see later, this value is used to determine the starting point for incremental synchronization.
97 | 3. **`subscribedSyncGroups`**: Explained in a section below.
98 | 4. etc.
99 |
100 | During a full bootstrapping, the response contains this metadata, and LSE writes it into the database.
101 |
102 | ### `lastSyncId`
103 |
104 | **`lastSyncId`** is a critical concept in LSE. You might find that it ties into concepts like transactions and delta packets, which we will explore in greater detail in the later chapters. It's perfectly fine if you don't fully grasp this part right now. Keep reading and refer back to this section after you've covered the upcoming chapters—everything will come together.
105 |
106 | Linear is often regarded as a benchmark for [local-first software](https://www.inkandswitch.com/local-first/). Unlike most mainstream local-first applications that use CRDTs, Linear's collaboration model aligns more closely with OT, as it relies on a centralized server to establish the order of all transactions. Within the LSE framework, all transactions sent by clients follow a [total order](https://en.wikipedia.org/wiki/Total_order), whereas CRDTs typically require only a [partial order](https://en.wikipedia.org/wiki/Partially_ordered_set). This total order is represented by the `sync id`, which is an incremental integer. And `lastSyncId` is the latest `sync id` as you can tell from its name.
107 |
108 | When a transaction is successfully executed by the server, the global **`lastSyncId`** increments by 1. This ID effectively serves as the **version number of the database**, ensuring that all changes are tracked in a sequential manner.
109 |
110 | 
111 |
112 | To some extent, LSE leans more towards OT (Operational Transformation) rather than CRDT (Conflict-Free Replicated Data Types), because it requires a central server to arrange the order of transactions.
113 |
114 | ### SyncGroup
115 |
116 | This concept is crucial in LSE. While all workspaces share the same `lastSyncId` counter, you cannot access issues or receive delta packets from workspaces or teams to which you lack proper permissions. This restriction is enforced through an access control mechanism, with `subscribedSyncGroups` serving as the key component. The `subscribedSyncGroups` array contains UUIDs that represent your user ID, the teams you belong to, and predefined roles.
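
As a rough illustration of how sync groups can gate visibility, the sketch below filters sync actions by group membership. The source only states that `subscribedSyncGroups` holds UUIDs for the user, the user's teams, and predefined roles. The per-action `syncGroups` field, the `visibleActions` helper, and the placeholder ids are assumptions, and in practice this filtering presumably happens on the server.

```typescript
// Hypothetical shape of a sync action tagged with the groups allowed to see it.
interface GroupedSyncAction {
  syncId: number;
  modelName: string;
  syncGroups: string[];
}

// Keeps only the actions that share at least one group with the subscriber.
function visibleActions(
  actions: GroupedSyncAction[],
  subscribedSyncGroups: string[]
): GroupedSyncAction[] {
  const subscribed = new Set(subscribedSyncGroups);
  return actions.filter((a) => a.syncGroups.some((g) => subscribed.has(g)));
}

// Placeholder ids standing in for real UUIDs.
const subscribedSyncGroups = ["user-uuid", "team-a-uuid"];
const visible = visibleActions(
  [
    { syncId: 1, modelName: "Issue", syncGroups: ["team-a-uuid"] },
    { syncId: 2, modelName: "Issue", syncGroups: ["team-b-uuid"] },
    { syncId: 3, modelName: "Favorite", syncGroups: ["user-uuid"] },
  ],
  subscribedSyncGroups
).map((a) => a.syncId);
```

Because the sync id counter is shared, a client may observe gaps in the ids it receives (here, id 2 is skipped) without having missed anything it is allowed to see.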
117 |
118 | ### Lazy Hydration
119 |
120 | Linear does not load everything from the server at once during full bootstrapping, nor does it load everything into memory each time. Instead, it supports **lazy hydration**, meaning only the necessary data is loaded into memory when needed. This approach improves performance and reduces memory usage.
121 |
122 | Classes with a `hydrate` method, such as `Model`, `LazyReferenceCollection`, `LazyReference`, `RequestCollection`, and `LazyBackReference`, can be hydrated.
123 |
124 | LSE uses different approaches, including **partial indexes** and **sync groups**, as keys to load lazy models.
125 |
126 | ## Syncing
127 |
128 | LSE clients send transactions to the server to perform operations on models. Below is a brief overview of how transactions work in LSE, using `UpdateTransaction` as an example:
129 |
130 | 
131 |
132 | 1. When a property is assigned a new value, the system records key information: the name of the changed property and its previous value. **Models in memory** are updated **immediately** to reflect these changes.
133 | 2. When `issue.save()` is called, an **`UpdateTransaction`** is created. This transaction captures the changes made to the model.
134 | 3. The generated `UpdateTransaction` is then added to a request queue. Simultaneously, it is saved in the `__transactions` table in IndexedDB for **caching**.
135 | 4. The `TransactionQueue` schedules timers (sometimes triggering them immediately) to send the queued transactions to the server in **batches**.
136 | 5. Once a batch is successfully processed by the backend, LSE removes it from the `__transactions` table in IndexedDB, clearing the cached batch.
137 | 6. Each transaction then waits until the client has received a delta packet whose `lastSyncId` reaches the sync id the transaction needs. Only then is the transaction considered complete.
138 |
139 | Transactions offer the following key features:
140 |
141 | 1. **Caching** – If the client disconnects or closes, transactions can be resent upon reconnection.
142 | 2. **Undo & Redo** – Transactions can be undone, redone, and reverted on the client side, allowing smooth handling of server rejections.
143 | 3. **Conflict Resolution** – Uses a **last-writer-wins** strategy to resolve conflicts.
144 |
145 | LSE creates a WebSocket connection to the server to receive delta packets, and performs the following tasks each time a delta packet arrives:
146 |
147 | 1. Determine whether the user is added to or removed from sync groups.
148 | 2. Load dependencies of specific actions.
149 | 3. Write data for the new sync groups and their dependents into the local database.
150 | 4. Loop through all sync actions and resolve them to update the local database.
151 | 5. Loop through all sync actions again to update in-memory data.
152 | 6. Update `lastSyncId` on the client, and update `firstSyncId` if sync groups change.
153 | 7. Resolve completed transactions waiting for the `lastSyncId`.
154 |
--------------------------------------------------------------------------------
/imgs/bootstrapping-overview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/bootstrapping-overview.png
--------------------------------------------------------------------------------
/imgs/cached-transactions.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/cached-transactions.png
--------------------------------------------------------------------------------
/imgs/change-snapshot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/change-snapshot.png
--------------------------------------------------------------------------------
/imgs/count-of-models.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/count-of-models.png
--------------------------------------------------------------------------------
/imgs/covering-partial-index-values.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/covering-partial-index-values.png
--------------------------------------------------------------------------------
/imgs/direct-referencies.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/direct-referencies.png
--------------------------------------------------------------------------------
/imgs/introduction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/introduction.png
--------------------------------------------------------------------------------
/imgs/lastsyncid.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/lastsyncid.png
--------------------------------------------------------------------------------
/imgs/linear-databases.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/linear-databases.png
--------------------------------------------------------------------------------
/imgs/meta-meta.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/meta-meta.png
--------------------------------------------------------------------------------
/imgs/meta-persistence.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/meta-persistence.png
--------------------------------------------------------------------------------
/imgs/model-partial-store-db.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/model-partial-store-db.png
--------------------------------------------------------------------------------
/imgs/model-property-lookup.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/model-property-lookup.png
--------------------------------------------------------------------------------
/imgs/model-registry.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/model-registry.png
--------------------------------------------------------------------------------
/imgs/model-store-class.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/model-store-class.png
--------------------------------------------------------------------------------
/imgs/model-store-db.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/model-store-db.png
--------------------------------------------------------------------------------
/imgs/models.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/models.png
--------------------------------------------------------------------------------
/imgs/modified-properties.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/modified-properties.png
--------------------------------------------------------------------------------
/imgs/object-stores.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/object-stores.png
--------------------------------------------------------------------------------
/imgs/partial-index-stores.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/partial-index-stores.png
--------------------------------------------------------------------------------
/imgs/partial-index-values.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/partial-index-values.png
--------------------------------------------------------------------------------
/imgs/references.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/references.png
--------------------------------------------------------------------------------
/imgs/search-for-symbols.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/search-for-symbols.png
--------------------------------------------------------------------------------
/imgs/title-image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/title-image.png
--------------------------------------------------------------------------------
/imgs/transaction-overview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/transaction-overview.png
--------------------------------------------------------------------------------
/imgs/transaction-queues.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/transaction-queues.png
--------------------------------------------------------------------------------
/imgs/transient-partial-index-keys.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzhudev/reverse-linear-sync-engine/20abf7ba3185d67061ca7a6f878335d7288ac993/imgs/transient-partial-index-keys.png
--------------------------------------------------------------------------------