├── OPEN_PROBLEMS ├── MUTABLE_DATA.md ├── HUMAN_READABLE_NAMING.md ├── PRESERVE_USER_PRIVACY.md ├── ENHANCED_BITSWAP_GRAPHSYNC.md ├── HASH_LINKED_DATA_GRAPH_LAYOUTS.md └── brief-problem-statements.md ├── CRDT ├── applying-operations-overview.png ├── README.md └── json-crdt.md ├── VIDEO.md ├── .github ├── ISSUE_TEMPLATE │ ├── config.yml │ └── open_an_issue.md └── workflows │ └── stale.yml ├── BITSWAP.md ├── LICENSE └── README.md /OPEN_PROBLEMS/MUTABLE_DATA.md: -------------------------------------------------------------------------------- 1 | Moved to https://github.com/protocol/ResNetLab/blob/master/OPEN_PROBLEMS 2 | -------------------------------------------------------------------------------- /OPEN_PROBLEMS/HUMAN_READABLE_NAMING.md: -------------------------------------------------------------------------------- 1 | Moved to https://github.com/protocol/ResNetLab/blob/master/OPEN_PROBLEMS 2 | -------------------------------------------------------------------------------- /OPEN_PROBLEMS/PRESERVE_USER_PRIVACY.md: -------------------------------------------------------------------------------- 1 | Moved to https://github.com/protocol/ResNetLab/blob/master/OPEN_PROBLEMS 2 | -------------------------------------------------------------------------------- /CRDT/applying-operations-overview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ipfs/notes/HEAD/CRDT/applying-operations-overview.png -------------------------------------------------------------------------------- /OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md: -------------------------------------------------------------------------------- 1 | Moved to https://github.com/protocol/ResNetLab/blob/master/OPEN_PROBLEMS 2 | -------------------------------------------------------------------------------- /OPEN_PROBLEMS/HASH_LINKED_DATA_GRAPH_LAYOUTS.md: -------------------------------------------------------------------------------- 1 | Moved to 
https://github.com/protocol/ResNetLab/blob/master/OPEN_PROBLEMS 2 | -------------------------------------------------------------------------------- /VIDEO.md: -------------------------------------------------------------------------------- 1 | # P2P Video 2 | 3 | ## Projects 4 | 5 | - [Toronto Mesh Networks Live Streaming Setup Presentation](https://ipfs.infura.io/ipfs/QmWsKvBvXUKaHcHzrUS91XV4k3YjQFdywQ7bY9BZVX4ghk/) 6 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/config.yml: -------------------------------------------------------------------------------- 1 | blank_issues_enabled: false 2 | contact_links: 3 | - name: Getting Help on IPFS 4 | url: https://ipfs.io/help 5 | about: All information about how and where to get help on IPFS. 6 | - name: IPFS Official Forum 7 | url: https://discuss.ipfs.io 8 | about: Please post general questions, support requests, and discussions here. 9 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/open_an_issue.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Open an issue 3 | about: Only for actionable issues relevant to this repository. 
4 | title: '' 5 | labels: need/triage 6 | assignees: '' 7 | 8 | --- 9 | 20 | -------------------------------------------------------------------------------- /BITSWAP.md: -------------------------------------------------------------------------------- 1 | # Bitswap 2 | 3 | ## Discussions 4 | 5 | - Previous discussions on ipfs/notes 6 | - https://github.com/ipfs/notes/issues/20 7 | - https://github.com/ipfs/notes/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aopen%20Bitswap 8 | - Previous meetings notes https://github.com/ipfs/research-bitswap/tree/master/meeting-notes 9 | 10 | ## Bitswap Research Review (papers, books, talks, lectures, etc) 11 | 12 | ### Papers 13 | 14 | ### Books 15 | 16 | ### Lectures 17 | 18 | ### Talks 19 | 20 | - [Jeromy Coffee Talks - Bitswap](https://www.youtube.com/watch?v=9UjqJTCg_h4) 21 | 22 | ### Notes and blog posts 23 | 24 | ### Implementations: 25 | 26 | - [Go](https://github.com/ipfs/go-ipfs/tree/master/exchange/bitswap) 27 | - [JavaScript](https://github.com/ipfs/js-ipfs-bitswap) 28 | -------------------------------------------------------------------------------- /OPEN_PROBLEMS/brief-problem-statements.md: -------------------------------------------------------------------------------- 1 | The most important problems/technologies relevant to Protocol Labs that will/should exist 5-10 years from now: 2 | 3 | - apps on the local web 4 | - make internet applications able to run entirely on LAN 5 | - totally encrypted webapps 6 | - decrypted in the browser 7 | - becomes private computation 8 | - anonymous IPFS/private content 9 | - better names / human-readable names 10 | - namesystems 11 | - on mobile / low-power 12 | - office suite on IPFS 13 | - better represent data in hash-linked graphs 14 | - importers 15 | - certifiable archives (.car) 16 | 17 | 18 | --- 19 | This began as notes from a conversation on 2017-12-15 with @jbenet, @stebalien, @nicola, and @miyazono, but this list should grow and evolve. 
Please directly edit this to add additional items, submit issues to propose clarifications or removal of items, and convert these to more complete and formalized open problem statements. 20 | -------------------------------------------------------------------------------- /.github/workflows/stale.yml: -------------------------------------------------------------------------------- 1 | name: Close and mark stale issue 2 | 3 | on: 4 | schedule: 5 | - cron: '0 0 * * *' 6 | 7 | jobs: 8 | stale: 9 | 10 | runs-on: ubuntu-latest 11 | permissions: 12 | issues: write 13 | pull-requests: write 14 | 15 | steps: 16 | - uses: actions/stale@v3 17 | with: 18 | repo-token: ${{ secrets.GITHUB_TOKEN }} 19 | stale-issue-message: 'Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 7 days.' 20 | close-issue-message: 'This issue was closed because it is missing author input.' 21 | stale-issue-label: 'kind/stale' 22 | any-of-labels: 'need/author-input' 23 | exempt-issue-labels: 'need/triage,need/community-input,need/maintainer-input,need/maintainers-input,need/analysis,status/blocked,status/in-progress,status/ready,status/deferred,status/inactive' 24 | days-before-issue-stale: 6 25 | days-before-issue-close: 7 26 | enable-statistics: true 27 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2016 Protocol Labs, Inc. 
4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # IPFS Collaborative Notebook for Research 2 | 3 | [![](https://img.shields.io/badge/made%20by-Protocol%20Labs-blue.svg?style=flat-square)](http://protocol.ai) 4 | [![](https://img.shields.io/badge/project-libp2p-yellow.svg?style=flat-square)](http://ipfs.io/) 5 | [![](https://img.shields.io/badge/freenode-%23libp2p-yellow.svg?style=flat-square)](http://webchat.freenode.net/?channels=%23ipfs) 6 | 7 | ## What's in This Repo? 8 | 9 | We use this repo in two ways: 10 | 11 | - [Issues](https://github.com/ipfs/notes/issues) to track several kinds of discussion on topics related with Research and IPFS, random ideas and proposals for new systems or features that don't fall on a specific repo yet. 
All the discussion happens in the issues. 12 | - [OPEN_PROBLEMS](./OPEN_PROBLEMS) list and unpack the currently known Open Problems for IPFS. 13 | 14 | **Disclaimer:** While we work hard to document our work as it progresses, research progress may not be fully reflected here for some time, or may be worked out out-of-band. 15 | 16 | ## Request for Proposals 17 | 18 | Some of our Open Problems have open [RFPs](https://github.com/protocol/research-rfps#rfps-and-grants). For all, we welcome any collaborations (potentially leading to new constructions, discoveries and publications). Please reach out at research@protocol.ai. 19 | 20 | ## Funding 21 | 22 | [Protocol Labs runs an RFP (Request For Proposals)](https://github.com/protocol/research-rfps) Program with the goal of funding individuals and groups to come up with novel solutions to the Open Problems found in this and other repos. If interested, please follow the link to check the active RFPs. 23 | 24 | ## Related research repos 25 | 26 | - [Protocol Labs Research](https://github.com/protocol/research) 27 | - [libp2p Research](https://github.com/libp2p/notes) 28 | - [IPLD Research](https://github.com/ipld/research) 29 | 30 | ## Contribute 31 | 32 | Feel free to join in. All welcome. Open an [issue](https://github.com/libp2p/notes/issues)! 33 | 34 | This repository falls under the IPFS [Code of Conduct](https://github.com/ipfs/community/blob/master/code-of-conduct.md). 
35 | 36 | [![](https://cdn.rawgit.com/jbenet/contribute-ipfs-gif/master/img/contribute.gif)](https://github.com/ipfs/community/blob/master/CONTRIBUTING.md) 37 | 38 | ## License 39 | 40 | [MIT](LICENSE) 41 | -------------------------------------------------------------------------------- /CRDT/README.md: -------------------------------------------------------------------------------- 1 | # CRDT Research 2 | 3 | > Discussions and Planning about getting CRDT implementation on top of IPFS & libp2p 4 | 5 | # tl;dr; 6 | 7 | CRDT, or Conflict-Free Replicated Data Types, is a type of specially-designed data structure used to achieve strong eventual consistency (SEC) and monotonicity (absence of rollbacks). 8 | 9 | # Discussions 10 | 11 | - Previous discussions on ipfs/notes 12 | - https://github.com/ipfs/notes/issues/40#issuecomment-194899389 13 | 14 | # CRDT Research Review (papers, books, talks, lectures, etc) 15 | 16 | ### Background concepts 17 | 18 | It may be useful to be familiar with these concepts in order to be able to understand some of the literature: 19 | 20 | * [Partially Ordered Set](https://en.wikipedia.org/wiki/Partially_ordered_set) 21 | * [Lattice](https://en.wikipedia.org/wiki/Lattice_(order)) 22 | * [Semilattice](https://en.wikipedia.org/wiki/Semilattice) 23 | 24 | For a great explanation of these concepts plus what is a "Monotonic Join Semilattice", take a look at this great article: 25 | 26 | * [A CRDT Primer Part I: Defanging Order Theory](http://jtfmumm.com/blog/2015/11/17/crdt-primer-1-defanging-order-theory/) 27 | 28 | ### Papers 29 | 30 | - [Conflict-free replicated data types](https://scholar.google.pt/citations?view_op=view_citation&hl=en&user=NAUDTpMAAAAJ&citation_for_view=NAUDTpMAAAAJ:M3ejUd6NZC8C) 31 | - [A comprehensive study of Convergent and Commutative Replicated Data Types](http://hal.upmc.fr/inria-00555588/document) 32 | - [Merging OT and CRDT Algorithms](http://dl.acm.org/citation.cfm?id=2596636) 33 | - [Delta State Replicated Data 
Types](https://arxiv.org/abs/1603.01529) 34 | - [CRDTs: Making δ-CRDTs Delta-Based](http://novasys.di.fct.unl.pt/~alinde/publications/a12-van_der_linde.pdf) 35 | - [Key-CRDT Stores](https://run.unl.pt/bitstream/10362/7802/1/Sousa_2012.pdf) 36 | - [TRVE Data: Placing a bit less trust in the cloud](https://www.cl.cam.ac.uk/research/dtg/trve/) 37 | - [LSEQ: an Adaptive Structure for Sequences in Distributed Collaborative Editing](https://hal.archives-ouvertes.fr/hal-00921633/document) 38 | - [A Conflict-Free Replicated JSON Datatype](https://arxiv.org/pdf/1608.03960.pdf) 39 | - [OpSets: Sequential Specifications for Replicated Datatypes](https://arxiv.org/abs/1805.04263) 40 | - [Snapdoc: Authenticated snapshots with history privacy in peer-to-peer collaborative editing](https://martin.kleppmann.com/papers/snapdoc-pets19.pdf) 41 | 42 | #### Access Control 43 | 44 | - [Access Control for Weakly Consistent Data Stores](http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_25.pdf) 45 | - [ACGreGate: A Framework for Practical Access Control for Applications using Weakly Consistent Databases](https://arxiv.org/abs/1801.07005) 46 | 47 | 48 | ### Primers 49 | 50 | * [A CRDT Primer Part I: Defanging Order Theory](http://jtfmumm.com/blog/2015/11/17/crdt-primer-1-defanging-order-theory/) 51 | * [A CRDT Primer Part II: Convergent CRDTs](http://jtfmumm.com/blog/2015/11/24/crdt-primer-2-convergent-crdts/) 52 | 53 | ### Books 54 | 55 | ### Lectures 56 | 57 | ### Talks 58 | 59 | - [RedisConf18: CRDTs and Redis—From Sequential to Concurrent Executions by Carlos Baquero](https://www.youtube.com/watch?v=ZoMIzBM0nf4) 60 | - [QCon London 2018: CRDTs and the Quest for Distributed Consistency by Martin Kleppmann](https://www.infoq.com/presentations/crdt-distributed-consistency) 61 | - ["CRDTs Illustrated" by Arnout Engelen](https://www.youtube.com/watch?v=9xFfOhasiOE) 62 | - [Coding CRDT](https://www.youtube.com/playlist?list=PLzUeAPxtWcqxBXjUelmcm5ORVjEpbUlHH) 63 | - 
[Dmitry Ivanov & Nami Naserazad - Practical Demystification of CRDT (Lambda Days 2016)](https://www.youtube.com/watch?v=PQzNW8uQ_Y4) 64 | - [ElixirConf 2015 - CRDT: Datatype for the Apocalypse by Alexander Songe](https://www.youtube.com/watch?v=txD1tfyIIvY) 65 | - [GOTO 2016 • Conflict Resolution for Eventual Consistency • Martin Kleppmann](https://www.youtube.com/watch?v=yCcWpzY8dIA) 66 | - [CRDTs in IPFS](https://www.youtube.com/watch?v=2VOF-Z-nLnQ) 67 | - [Journal Club - 2018 06 13 CRDT JSON Datatype, by Gonçalo Pestana](https://www.youtube.com/watch?v=TRvQzwDyVro) 68 | 69 | ### Notes and blog posts 70 | 71 | - [CRDTs For Fun and Profit](https://github.com/el10savio/crdts-for-fun-and-profit/blob/main/crdts.md) 72 | - [CRDT Tutorial for Beginners](https://github.com/ljwagerfield/crdt) 73 | - [Conflict-Free Replicated Data Types (CRDTs), An Offline Camp passion talk](https://medium.com/offline-camp/conflict-free-replicated-data-types-crdts-2c6ae67ab9a4#.duh4g0r9k) 74 | - [CRDT Notes by Paul Frazee](https://github.com/pfrazee/crdt_notes) 75 | - [Towards a unified theory of Operational Transformation and CRDT by Raph Levien](https://medium.com/@raphlinus/towards-a-unified-theory-of-operational-transformation-and-crdt-70485876f72f) 76 | - [A simple approach to building a real-time collaborative text editor](http://digitalfreepen.com/2017/10/06/simple-real-time-collaborative-text-editor.html) 77 | - [Data Laced with History: Causal Trees & Operational CRDTs](http://archagon.net/blog/2018/03/24/data-laced-with-history/) 78 | 79 | 80 | ### Available libraries and systems using CRDT 81 | 82 | - http://y-js.org/ 83 | - http://swarmdb.net/ 84 | - http://gun.js.org/enterprise/ 85 | - http://scuttlebot.io/ 86 | - https://github.com/mafintosh/hyperlog 87 | - https://github.com/orbitdb 88 | - https://github.com/jboner/akka-crdt 89 | - https://github.com/ipfs-shipyard/peer-crdt and https://github.com/ipfs-shipyard/peer-crdt-ipfs 90 | 91 | ### CRDT libraries using IPFS 92 | 93 | 
- Yjs through [y-ipfs-connector](https://github.com/pgte/y-ipfs-connector) 94 | - [ipfs-log](https://github.com/orbitdb/ipfs-log), append-only log CRDT used in [OrbitDB](https://github.com/orbitdb/orbit-db) 95 | - [peer-crdt](https://github.com/ipfs-shipyard/peer-crdt) and [peer-crdt-ipfs](https://github.com/ipfs-shipyard/peer-crdt-ipfs) 96 | 97 | ### Terms 98 | 99 | There is an [IPFS Glossary](https://github.com/ipfs/glossary), a work in progress, which should have definitions for terms used in CRDT. If you are consistently running into terms that you do not know the meaning of, please open an issue on that repository and we can work on a definition that will help you (and others!) going forward. 100 | -------------------------------------------------------------------------------- /CRDT/json-crdt.md: -------------------------------------------------------------------------------- 1 | # JSON-CRDT 2 | 3 | This document explains the internals of JSON-CRDTs based on the 4 | [Conflict-Free Replicated JSON Datatype](https://arxiv.org/pdf/1608.03960.pdf) 5 | paper. 6 | 7 | Conflict-Free Replicated Datatypes (CRDTs) are a family of data structures that 8 | support concurrent local modifications in different replicas in such a way that 9 | no conflicts arise when replicas are merged. They also ensure that the state of 10 | each replica will eventually converge. A JSON CRDT is a data structure 11 | consistent with the [JSON](https://json.org) object standard in which the 12 | basic types - lists, maps and registers - are CRDTs and can be embedded. From 13 | the paper's abstract: 14 | 15 | > [...] an algorithm and formal semantics for a JSON data structure that 16 | > automatically resolves concurrent modifications such that no updates are lost, 17 | > and such that all replicas converge towards the same state (a conflict-free 18 | > replicated datatype or CRDT). 
It supports arbitrarily nested list and map 19 | > types, which can be modified by insertion, deletion and assignment. The 20 | > algorithm performs all merging client-side and does not depend on ordering 21 | > guarantees from the network, making it suitable for deployment on mobile 22 | > devices with poor network connectivity, in peer-to-peer networks, and in 23 | > messaging systems with end-to-end encryption. 24 | 25 | ## Document editing API 26 | 27 | The interaction with the local JSON document is done through an API, which 28 | enables the user to programmatically retrieve and modify the document. Read-only 29 | API calls do not produce any side effect in the data structure, but 30 | modifying the document produces `operations` which uniquely identify the 31 | modifications in the document. 32 | 33 | Example of how to use the API programmatically to construct and edit a JSON 34 | document as described by the paper: 35 | 36 | ```javascript 37 | doc.get("shopping") = [] 38 | let head = doc.get("shopping").idx(0) 39 | head.insertAfter("eggs") 40 | let eggs = doc.get("shopping").idx(1) 41 | head.insertAfter("cheese") 42 | eggs.insertAfter("milk") 43 | 44 | // current state: 45 | // {"shopping": ["cheese", "eggs", "milk"]} 46 | 47 | doc.get("bought") = [] 48 | doc.get("shopping").idx(1).delete() 49 | doc.get("bought").idx(0).insertAfter("eggs") 50 | 51 | // current state: 52 | // {"shopping": ["cheese", "milk"], "bought": ["eggs"]} 53 | 54 | doc.yield() // performs network transactions (sends/receives operations) 55 | ``` 56 | 57 | ## Supported types 58 | 59 | A JSON document is composed of maps, lists and registers, 60 | which can be embedded in one another. The JSON CRDT as presented in the paper is a JSON data 61 | type in which maps, lists and registers are CRDTs that can be embedded as 62 | expected by the [JSON specs](https://json.org). 
63 | 64 | A register as defined in the paper is represented as a multi-value CRDT register 65 | which keeps a mapping between the register values and the operation ID that set 66 | it. 67 | 68 | ## Diving deep 69 | 70 | In this section we'll describe in more detail the JSON CRDT data structure and 71 | algorithm that ensure strong eventual consistency and no user input loss across 72 | all replicas. 73 | 74 | ### Editing the JSON document and replica local state 75 | 76 | Each replica keeps its own state of the document. Changes in the document 77 | occur only locally. When the document is edited, it generates an `operation` 78 | which uniquely describes the mutation globally. The operations are distributed 79 | to all peers as a mechanism to reach state consistency across all peers. 80 | 81 | Once a replica receives an external operation, it applies the operation locally. 82 | The merging algorithm ensures that applying external operations locally will 83 | not conflict with the current local state. 84 | 85 | The state of the document at any point in time is a set of operations which were 86 | applied locally. 87 | 88 | ### Operations 89 | 90 | The paper defines the JSON CRDT as an operation-based CRDT, in which an 91 | operation - a tuple representing uniquely a mutation in the JSON document - can be 92 | propagated to other replicas without sending the whole local document state. The 93 | local state of the document is represented by a set of individual operations. 94 | 95 | The operation-based representation has the advantage that replicas need only to 96 | share and merge singular operations, instead of the entire local state. On the 97 | other hand, whenever a new node joins the network, it still needs to receive the 98 | whole document state, which grows linearly with the number of operations applied 99 | to the document. This problem can be addressed with, for example, 100 | [Delta State CRDTs](https://github.com/ipfs/research-CRDT/issues/31). 
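The trade-off described above can be illustrated with a minimal sketch. It is not taken from the paper: the `Replica` class and its `localEdit`, `receive` and `snapshot` methods are hypothetical names, and the operation is reduced to an identifier plus a payload. The point is only that a single edit puts one operation on the wire, while a newly joined replica has to fetch every operation applied so far.

```javascript
// Hypothetical sketch: Replica, localEdit, receive and snapshot are
// illustrative names and do not come from the paper.
class Replica {
  constructor(id) {
    this.id = id;
    this.counter = 0;
    this.ops = new Map(); // "counter@replica" -> operation
  }

  // A local edit produces exactly one operation to broadcast.
  localEdit(mut) {
    this.counter += 1;
    const op = { id: `${this.counter}@${this.id}`, mut };
    this.ops.set(op.id, op);
    return op;
  }

  // Receiving an operation adds it to the local set; duplicates are ignored.
  receive(op) {
    if (!this.ops.has(op.id)) this.ops.set(op.id, op);
  }

  // A newly joined replica needs every operation applied so far.
  snapshot() {
    return [...this.ops.values()];
  }
}

const a = new Replica("a");
const b = new Replica("b");
for (const item of ["eggs", "cheese", "milk"]) {
  b.receive(a.localEdit({ insert: item })); // one operation per edit on the wire
}

const c = new Replica("c");
const catchUp = a.snapshot(); // grows linearly with the number of edits
catchUp.forEach((op) => c.receive(op));
console.log(catchUp.length, c.ops.size); // 3 3
```

Causal ordering and dependency tracking are deliberately left out here; the sketch only contrasts the size of a single update with the size of a full catch-up.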
101 | 102 | #### Operation Representation 103 | 104 | An operation is a tuple of the form 105 | 106 | ``` 107 | op ( 108 | id: N x ReplicaID 109 | deps: P(N x ReplicaID) 110 | cur: cursor(<k1, ..., kn-1>, kn) 111 | mut: insert(v) | delete | assign(v) v: VAL 112 | ) 113 | ``` 114 | 115 | where the `id` is a Lamport timestamp which identifies the operation uniquely; 116 | `deps` is a set of causal dependencies of the operation; `cur` describes the 117 | position of the document to be modified; and `mut` describes the modification 118 | itself. 119 | 120 | - **Id** 121 | 122 | The operation `id` is a Lamport timestamp (see Further reading), which guarantees that 123 | every operation is uniquely identified globally and that the peers implicitly 124 | agree on the partial causal order in which the operations occurred, without any 125 | other synchronization mechanism. 126 | 127 | - **Cursor** 128 | 129 | A cursor `cur` describes the path from the root to a branch or leaf of the 130 | document in which the mutation will be applied. The cursor must consist only of 131 | immutable keys and identifiers. 132 | 133 | If we consider the following JSON document 134 | 135 | ``` 136 | {"shopping": ["cheese", "eggs", "milk"]} 137 | ``` 138 | 139 | the cursor 140 | 141 | ``` 142 | cursor(<"shopping">, 1) 143 | ``` 144 | 145 | evaluates to `eggs`. 146 | 147 | 148 | - **Dependencies** 149 | 150 | The dependencies `deps` is a set of identifiers of all operations that the current 151 | operation depends on. The purpose of the operation dependencies is to ensure 152 | that the partial order of the operations is respected. 153 | 154 | In order to maintain the partial ordering of operations, an operation is applied 155 | on a local replica only when the local state has applied all the 156 | operation's dependencies. If an external operation has been received and its set 157 | of dependencies is not fulfilled, the received operation is buffered until the 158 | dependency operations are applied to the document. 
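The buffering rule above can be sketched as follows. This is an illustration under assumptions rather than the paper's algorithm: `OpBuffer` and its methods are hypothetical names, operation ids are written as `counter@replica` strings, and "applying" an operation is reduced to recording its id.

```javascript
// Hypothetical sketch: OpBuffer and its method names are illustrative only.
// Operation ids are strings such as "2@a" (counter@replica).
class OpBuffer {
  constructor() {
    this.applied = new Set(); // ids of operations already applied locally
    this.pending = [];        // operations waiting for missing dependencies
    this.log = [];            // order in which operations were applied
  }

  // An operation is ready once every id in its deps set has been applied.
  isReady(op) {
    return op.deps.every((dep) => this.applied.has(dep));
  }

  receive(op) {
    const known =
      this.applied.has(op.id) || this.pending.some((p) => p.id === op.id);
    if (known) return; // duplicate delivery, nothing to do
    this.pending.push(op);
    this.flush();
  }

  // Apply every buffered operation whose dependencies are satisfied, and
  // repeat until no further progress is possible.
  flush() {
    let progressed = true;
    while (progressed) {
      progressed = false;
      for (const op of [...this.pending]) {
        if (this.isReady(op)) {
          this.applied.add(op.id);
          this.log.push(op.id);
          this.pending = this.pending.filter((p) => p.id !== op.id);
          progressed = true;
        }
      }
    }
  }
}

const buffer = new OpBuffer();
buffer.receive({ id: "3@a", deps: ["1@a", "2@a"] }); // buffered
buffer.receive({ id: "2@a", deps: ["1@a"] });        // buffered
buffer.receive({ id: "1@a", deps: [] });             // unblocks the other two
console.log(buffer.log.join(", ")); // 1@a, 2@a, 3@a
```

Although the operations arrive in reverse order, they are applied in an order that respects their dependencies, and nothing is left in the buffer.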
159 | 160 | The dependencies set can be implemented in several ways. The requirement is 161 | that `deps` contains information about all the operations that the current 162 | operation depends on. 163 | 164 | The operation identifiers which are part of the `deps` set can be represented 165 | as, for example, a set of Lamport timestamps, version vectors, state vectors or 166 | dotted version vectors. The choice of identifier type 167 | should take into consideration that the set `deps` may become impractically large if 168 | the number of elements in the set grows linearly with the number of 169 | operations applied to the document. 170 | 171 | - **Mutation** 172 | 173 | A mutation `mut` describes the modification to be applied to the node 174 | pointed to by the `cursor`. The mutation type may be one of `INSERT`, `DELETE` 175 | and `ASSIGN`. Each document type supports different types of mutation: 176 | 177 | | | MapT | ListT | RegisterT | 178 | |--------|------|-------|-----------| 179 | | INSERT | Yes | Yes | No | 180 | | ASSIGN | Yes | Yes | Yes | 181 | | DELETE | Yes | Yes | Yes | 182 | 183 | *Table: Mutation types supported by each document type* 184 | 185 | The `INSERT` mutation inserts a new element into a list or a new key-value pair into 186 | a map. The `ASSIGN` mutation assigns a new value to a map or list, overwriting 187 | the structure's current values. The `ASSIGN` mutation may also be applied to a 188 | register. The `DELETE` mutation deletes a map, list or register. 189 | 190 | #### Applying Operations 191 | 192 | Operations are always applied locally. The operations to apply on the 193 | local state may be generated locally or received from other replicas. Figure 194 | 1 shows an overview of the algorithm to apply local and remote operations on the 195 | JSON document in a way that guarantees conflict-free state updates with eventual 196 | consistency and no loss of data. 
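As a small aside, the mutation support table earlier in this section can be encoded as a lookup that is consulted before a mutation is applied. The names `SUPPORTED` and `canApply`, and the lowercase type labels, are assumptions made for this sketch.

```javascript
// Hypothetical sketch: SUPPORTED and canApply are illustrative names.
// The entries mirror the mutation support table (MapT, ListT, RegisterT).
const SUPPORTED = new Map([
  ["INSERT", new Set(["map", "list"])],
  ["ASSIGN", new Set(["map", "list", "register"])],
  ["DELETE", new Set(["map", "list", "register"])],
]);

// Returns true when `mutation` may be applied to a node of type `nodeType`.
function canApply(mutation, nodeType) {
  const allowed = SUPPORTED.get(mutation);
  return allowed !== undefined && allowed.has(nodeType);
}

console.log(canApply("INSERT", "list"));     // true
console.log(canApply("INSERT", "register")); // false
```

A check like this would typically run once the cursor has been resolved to a node, so that an unsupported combination is rejected before any state changes.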
197 | 198 | ![Applying operations overview](applying-operations-overview.png?raw=true "Figure 1. Applying operations overview") 199 | 200 | When an operation `op` is received from a remote replica (remote operation case), 201 | the CRDT first makes sure that the operation was not applied already by checking 202 | whether the `op` ID is part of a set containing all the operation IDs 203 | applied to the local state. If the operation was not applied yet, the next step 204 | is to make sure that the local state has applied all the operations that `op` 205 | depends on. This can be verified by comparing the `op` dependency set with 206 | the set containing all the operation IDs applied to the local state. If one or 207 | more dependencies are missing from the current local state, the operation is 208 | buffered and applied only when all the dependencies have been satisfied. 209 | 210 | The `apply_local` action performs the state update in the document. It starts 211 | by traversing the document from the root to the node identified by the `op` 212 | cursor, which is where the mutation should be applied. The traversal 213 | descends from the document's root until it reaches the node represented by the 214 | `op` cursor. While traversing the document tree, the `op` ID is added to the 215 | set of `deps` kept by each node traversed. The goal is to keep a log in 216 | each node of which operations have relied on it. 217 | 218 | A node is considered deleted from the local state when its set of dependency 219 | operations is empty. When traversing the document, if the next node does not 220 | exist, the CRDT creates the node, which may be of type `map` or `list`. 221 | 222 | Once the traversal is complete, the next step is to apply the `op` mutation to the 223 | node. The mutation can be one of `INSERT`, `ASSIGN` or `DELETE`. 224 | 225 | #### Deleting and clearing state 226 | 227 | The `DELETE` and `ASSIGN` operations are destructive mutations. 
In the operation-based 228 | JSON CRDT as presented in the paper, a deleted node is 229 | not removed from the document; instead, it is represented by a node whose 230 | `deps` set is empty. A document node which has an empty `deps` set is 231 | considered deleted and is called a 232 | [*tombstone*](https://github.com/ipfs/research-CRDT/issues/30). 233 | 234 | The process of clearing the state of a node depends on its type. A register 235 | maintains a map between the operation ID and the value assigned by 236 | the operation. Clearing a register consists of deleting the map entries in the 237 | register whose key is part of the operation's `deps` set. 238 | 239 | Maps and lists follow a similar rationale: the `deps` set must be updated in the 240 | same way on the cleared node and on all its children. 241 | 242 | **Tombstones and garbage-collection** 243 | 244 | The tombstone nodes must be kept in the data 245 | structure to avoid inconsistencies between replicas which delete nodes that are 246 | concurrently edited or used by other replicas. As a consequence, the data 247 | structure may keep unnecessary nodes once all replicas are in sync regarding 248 | deleted nodes. Ideally, the tombstone nodes should be safely removed from the 249 | data structure to avoid wasted bandwidth and wasted storage space, and to 250 | speed up new entities joining the network. A possible solution for this problem 251 | is to snapshot the data structure at times and safely garbage-collect 252 | tombstone nodes. 253 | 254 | For more details and discussion about CRDT snapshot and garbage collection 255 | in operation-based CRDTs, check [here](https://github.com/ipfs/dynamic-data-and-capabilities/issues/2) 256 | and [here](https://github.com/ipfs/dynamic-data-and-capabilities/issues/14). 
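The clearing and tombstone rules from this section can be sketched in a few lines. This is a simplified illustration, not the paper's formal semantics: `Node`, `touch`, `clear` and `isTombstone` are hypothetical names, and `presence` stands for the per-node set that the text calls the node's `deps` set.

```javascript
// Hypothetical sketch: Node, touch, clear and isTombstone are illustrative
// names, not the paper's notation.
class Node {
  constructor() {
    this.presence = new Set(); // ids of operations that rely on this node
    this.children = new Map(); // key -> Node (used by maps and lists)
    this.values = new Map();   // operation id -> value (used by registers)
  }

  // Called for every node visited while following an operation's cursor.
  touch(opId) {
    this.presence.add(opId);
  }

  // Forget everything the clearing operation causally depends on, here and
  // in all children. Concurrent operations are not in `deps`, so they survive.
  clear(deps) {
    for (const id of deps) {
      this.presence.delete(id);
      this.values.delete(id);
    }
    for (const child of this.children.values()) child.clear(deps);
  }

  // A node with an empty set is kept in the tree but treated as deleted.
  isTombstone() {
    return this.presence.size === 0;
  }
}

const register = new Node();
register.touch("1@a");
register.values.set("1@a", "eggs");

// Replica b assigns concurrently, so "1@b" is unknown to the delete below.
register.touch("1@b");
register.values.set("1@b", "milk");

register.clear(["1@a"]); // delete issued by a replica that had only seen 1@a
console.log([...register.values.values()].join(","), register.isTombstone()); // milk false

register.clear(["1@b"]); // a later delete that has also seen 1@b
console.log(register.isTombstone()); // true
```

The first delete leaves the concurrently assigned value in place, which is how the structure avoids losing user input. The second delete empties the set, and the node stays in the tree as a tombstone until some form of garbage collection removes it.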
257 | 258 | ### Data structure metadata 259 | 260 | In order for the JSON CRDT data structure to ensure strong eventual consistency 261 | between all replicas and no loss of user input, the data structure needs to 262 | maintain metadata describing the operations which have been applied previously 263 | at a document, node and operation level. The metadata supports 264 | applying operations locally. 265 | 266 | At a document level, the data structure needs to keep a list of the operation IDs 267 | which have been applied locally. 268 | 269 | Each operation must keep a unique network-wide ID and its dependency set. At an 270 | operation level, the set of dependencies represents the operations that were 271 | applied locally before the operation was created. 272 | 273 | At a node level, each node in the document must keep a list of the operation IDs 274 | which the node depends on. Every time a node is traversed or created, the 275 | ID of the operation should be added to the set. If a node is deleted, the 276 | operation ID and its dependencies must be removed from the node's dependencies 277 | set. 278 | 279 | ## Further reading 280 | 281 | [Conflict-Free Replicated JSON Datatype](https://arxiv.org/pdf/1608.03960.pdf) 282 | 283 | [Delta State CRDTs](https://github.com/ipfs/research-CRDT/issues/31) 284 | 285 | [Lamport Logical clocks](https://lamport.azurewebsites.net/pubs/time-clocks.pdf) 286 | 287 | [Garbage-collection in op-based CRDTs](https://github.com/ipfs/research-CRDT/issues/30) 288 | 289 | [CRDT snapshotting](https://github.com/ipfs/dynamic-data-and-capabilities/issues/14) 290 | --------------------------------------------------------------------------------